January, 1993
EDITORIAL


Double-digit Copy Protection




Jonathan Erickson


So you want to make nine copies each of your favorite computer programs?
That's probably not a problem. Make that tenth copy though, and the next thing
you'll be hearing is a Steve McGarrett wanna-be muttering "Book 'em, Danno."
That's the law. Under the new Software Copyright Protection Act (SCPA), signed
by President Bush shortly after the election, software piracy is a felony,
punishable by up to five years behind bars and a $250,000 fine. Repeat
offenders can expect ten years, along with the fine.
In fairness, the federal statute, which was sponsored by Sen. Orrin Hatch
(whose Utah constituents include Novell and WordPerfect), is targeted at
large-scale commercial crooks--counterfeiters, for instance, who rip off the
software industry for more than $1 billion a year by copying disks, manuals,
and boxes, then selling them as the real thing. Although designed to protect
big software vendors, the law also protects consumers who might unknowingly
buy counterfeit software and expect full vendor support, including upgrades.
A few years ago, Microsoft uncovered one of the first cases of large-scale
illegal copying right around the corner from me. My neighborhood mom-and-pop
computer store was caught with a storeroom full of counterfeit copies of
MS-DOS 3.1. At first glance, the shrink-wrapped packages looked real; under
scrutiny, they were not-so-clever fakes. Still, the crime wasn't much more
than a misdemeanor; the store is still in business and has expanded three
times over. Under the new law, the bums would be stamping out license plates
instead of duping disks.
More recently, Microsoft raids in California and New Jersey netted 16
tractor-trailer trucks' worth of fake MS-DOS 5.0 packages. Unlike my friendly
neighborhood counterfeiters, these crooks were sophisticated, right down to
the holograms on the boxes.
Nonetheless, certain aspects of the SCPA are bothersome. While the spirit of
the law is to curtail commercial copying, there's a chance that the letter of
the law--particularly in regard to its broad language--will be misapplied. The
SCPA states that "any person...shall be imprisoned not more than 5 years, or
fined in the amount set forth in this title, or both, if the offense consists
of the reproduction or distribution, during any 180-day period, of at last
[sic] 10 copies...of 1 or more copyrighted works, with a retail value of more
than $2,500." It goes on to say that "any person...shall be imprisoned not
more than 1 year, or fined in the amount set forth in this title, or both, in
any other case." Differentiating between reproduction and distribution
suggests that, as odd as this may seem, anyone making 10 copies of a program
for personal use could be in big trouble.
Likewise, the law states that if you make three or four copies of a program
for personal use when the documentation states you can only make one copy, you
can be fined and sent to jail for one year. There's little question that this
part of the SCPA is aimed at individuals making multiple copies for personal
use, not illegal commercial operations. Let's not forget that before software
vendors and the sheriff can break down your door, they have to have prior
knowledge of your duplication efforts, suggesting yet another invasion of our
privacy somewhere along the line. While enforcement in this case is admittedly
remote, it's still a possibility.
Perhaps of broader concern is what this says about our legal system: You can
be sent to prison for five to ten years for making a dozen copies of a disk,
while hardened criminals get much less time for far more heinous crimes. In
San Francisco last year, a pair of thugs charged with first-degree murder for
stomping a randomly chosen college student to death were sentenced to prison
for four and six years, respectively. And you can get five years for disk
copying? It seems that the lesson here is that murder doesn't go to the bottom
line, but software piracy does.
By no means am I belittling the issue of software piracy. It's a serious
problem that demands a serious response. Led by the Software Publishers
Association (SPA), software vendors have taken steps to fight piracy. In
particular, Apple's Developer Group over a year ago launched an enlightened
educational program called the "Apple Anti-piracy Initiative" which promotes
antipiracy technologies, security strategies for developers, and end-user
education.
While measuring success is difficult, the SPA is claiming that, compared to
1990, antipiracy efforts in 1991 made a difference. The SPA says piracy fell
41 percent, or about $800 million, a drop it attributes to education, legal
actions, and lower average software prices.
But considering that software piracy was supposedly cut almost in half a year
before the SCPA was signed, you have to wonder if we need the law at all. In
other words, if piracy is already on the decline, isn't enacting the law akin
to closing the drive door after the floppy is out? The answer is that even
when halved, piracy is a monumental problem costing all of us big bucks, and
solving it requires a law with teeth.
I guess we can only hope that the money we save is worth the price we might
pay.


LETTERS







New Patent Angles


Dear DDJ,
I've read the discussions on software patents with great interest. That the
current system is unworkable seems proven; I like Jeff Duntemann's idea of
replacing it with an ASCAP-like licensing scheme. (Ideally this would be
integrated with a software-component market.)
There is another trend which I would like discussed, an increasingly strident
tone concerning "software quality." For example, IEEE Spectrum recently
published an e-mail conference on "the security and vulnerability of
information technologies." ("A Security Roundtable," IEEE Spectrum, 8/'92, pp.
41-44.) The participants were six "outstanding security authorities."
At least four share a strong opinion that information systems are unacceptably
dangerous because of inadequate controls on their development. Only one
questions (or qualifies) this judgment; the rest argue about the mechanism for
imposing control. The preferred methods seem to be government regulation,
"consumer-oriented code-termination" and an "anti-czar" ("a Ralph Nader of
information technology...to start the whole process of public consciousness").
The extraordinary things about the Spectrum conference are the generality of
the indictment, the contempt for current practice, and the lack of critical
consideration of the proposed solutions.
Although Spectrum's opening remark asked whether systems are "increasingly
vulnerable to malicious action," the respondents have much broader concerns:
"There is no standard...on whether to design (and pay for) very resilient
systems;" "We must recognize the risks in trusting computers...to do jobs we
ourselves cannot do reliably on the scale and with the timeliness demanded;"
"I want to know exactly who wrote the Windows 2 code, DOS 4.01;" etc.
The actual systems mentioned, in addition to MS-DOS and MS Windows, are a
nonfunctional accounting system and a faulty radiation-therapy control
program. It is left unclear which solutions are meant to solve which problems.
The contempt surprised me. Programmers are "techies" who need to be prodded to
consider "values beyond making something work." (Apparently, profit is an
inadequate motive for us to properly value user interfaces.) The following
assertion is made and not challenged: "No designer of any essential software
did a reasonable analysis of the risk of an implementation." Is this true, for
example, of the Space Shuttle avionics? The participants ignore more than
criticize the existing standards and certification processes.
The ideas on how to better develop systems are the following: To have "users
take a more active role in the design," that "all professionals, just like
artists, should sign their work," a rather vague wish to see "the reward
structure changed for techies," and a mention of "good software-engineering
practice." There is more substance in the discussion of how to exert control.
One method involves using the fear of computer viruses to increase "user
consciousness" to the level required for a Ralph Nader type activism. Lawsuits
get favorable mention. The consensus, however, relies on government
regulation: "Legislative control is essential in security. Is information
technology different in this respect from chemical or nuclear engineering?
No!"
But chemical and nuclear engineering were crippled by politics. Is it in the
public interest to do the same to information technology? None of the proposed
mechanisms of control have good records of intelligently addressing problems.
Moreover, existing software-engineering practices are controversial and (in my
opinion) rather rapidly improving. Overreactions may lead to decreased safety
by requiring the wrong techniques and by preventing the development of
improved systems.
I fear that these "outstanding security authorities" will win if unopposed.
Please read the article.
Ed Butler
Reston, Virginia
Dear DDJ,
This is in defiant rage against Jeff Duntemann's irresponsible and dangerous
article in his August "Structured Programming" column. If Jeff has his way,
none of us will have a job as a professional programmer. Instead, we will all
be completely broke from researching for, applying for, and defending against,
patent violations. In fact, we will be so busy with these activities there
will be no time to do what we get paid to do, program computers.
Clearly Jeff doesn't get it. As a software developer I am paid to come up with
solutions to problems utilizing today's microprocessors. I believe that this
is what most software developers get paid to do. I also recall, from my
college education so long ago, that we were taught ways to solve problems
using various logical methods and algorithmic approaches, utilizing a solid
foundation in mathematics and logic.
I can still remember, from programming my computer just this morning, that
today's microprocessors provide a very limited form of expression. They still
execute a sequential set of instructions, performing simple math and logic
operations. Given this limited set of instructions, I try to solve
computational problems every day in my professional life. Now don't get me
wrong. I'm a very clever fellow, and I come up with ideas, almost every day,
that just make my little ego burst with pride. However, Jeff would have me
perform a patent search every single time I hack a clever piece of code!
The point is, nothing a good programmer does is obvious! I don't get paid for
being dumb! I get paid, like many other talented engineers, because I can hack
good code, trim cycles, and push the metal. Maybe you don't remember this kind
of programming, but for many of us it is our bread and butter, day-to-day,
patent-violating profession.
The concept of patenting software is ludicrous. A computer executes a simple
set of logic operations. It's like coming up with a good solution to a chess
problem. Given the constraints of the board, and limited ways that the pieces
may be moved, a chess master can come up with fantastic solutions to achieve a
winning game. However, he doesn't run out and try to patent his winning chess
move. But instead make that a fast sort algorithm, and watch out!
In many other professions, individuals achieve reward and advancement for good
ideas (people in advertising, marketing, sales, and investment, just to name a
few). Imagine the chaos if these professions filed patents for every new or
perceived new idea they ever had! Proposing that a series of add/subtract,
compare, and memory move instructions can be patented is both irresponsible
and dangerous. It puts engineers out of work while large corporations, which
treat patents like war chests, and lawyers assert their growing control over a
once-creative industry. (Let me point out right now that no patent filed
describes its process as a series of add/subtract and memory move
instructions, even though that is how we would ultimately express it with
today's machine architectures. Instead each patent is filled with such
obfuscation and technical mumbo-jumbo that describing how to boil a pot of
water would read like the technical reference manual for an Iraqi nuclear
device.)
If this letter sounds angry, you are getting the right tone. Am I being
ridiculous? Hell no! Think about it. Patents have been granted again and again
based on efficient forms of computation. Most of us make a living out of
efficient forms of computation. I've written self-modifying code on many
different microprocessors. The result of this coding effort was invariably a
hot product, not a hot patent!
The patent issue is being raised around me, at a personal level, constantly. I
develop simulation products for a game company. My colleagues and I make a
living developing computer games that do things the rest of the industry
doesn't feel is possible. Imagine writing a 7-frame per second 3-D flight
simulator on a 4.7-MHz IBM PC computer. My friend Ned Lerner did so when he
wrote Chuck Yeager's Flight Trainer in 1986. Not a single patent was filed, or
patent search performed. In Jeff's world people like Ned and I don't have
jobs.
In summary, I ask, "How can we perform our jobs as creative and talented
engineers, if we must, every day, perform patent searches, and apply for the
same?" Join the real world and help fight this madness, not foster it!
John W. Ratcliff
St. Charles, Missouri


Mac Attack


Dear DDJ,
Sitting at the bottom of Africa sometimes gives us the advantage of
objectivity. In answer to Jeff Duntemann's final question in his June
"Structured Programming" column ("Hey already, when is somebody going to do me
up a Visual Pascal?"), I have the following suggestion:
Switch your allegiance to the Macintosh. The Macintosh Toolbox and
operating-system interfaces were all specified in Pascal terms, and as a
result the Macintosh has a much stronger Pascal following than the dreaded PC.
Equip yourself with AppMaker 1.5 from Bowers Development, and THINK Pascal 4.0
from Symantec. An unbeatable combination.
Add to this copies of Inside Macintosh, volumes 1-4, and you're away. It's
programming heaven and about 100 times more professional and productive than
anything else I have ever worked with.
Mike O'Hanlon
Claremont, South Africa


Controlling CRC Values


Dear DDJ,
I refer to Mark R. Nelson's article "File Verification Using CRC" in the May
1992 DDJ. Mark suggests that the CRC calculation is noninvertible in the sense
that on changing the contents of a file it is difficult to avoid changing the
CRC and that trial and error is required to restore the former CRC. This is
not the case.
If a region of the file is modified and there are four extra bytes following
the region (without extending the file) which can also be modified, then you
simply calculate the CRC over that region before and after the changes,
calculate the difference with a bit-wise exclusive-OR, and then exclusive-OR
that value into those four following bytes. The CRC over the whole file will
then be unchanged.
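The first method can be sketched in C. This assumes the standard reflected CRC-32 polynomial (0xEDB88320, as used in ZIP-style file verification); the function names are mine, and for this reflected CRC the compensating value must be XORed in least-significant byte first:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Standard reflected CRC-32 (polynomial 0xEDB88320), computed bit by bit;
   init and final XOR are both 0xFFFFFFFF. */
static uint32_t crc32_buf(const uint8_t *p, size_t n)
{
    uint32_t crc = 0xFFFFFFFFu;
    while (n--) {
        crc ^= *p++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ ((crc & 1u) ? 0xEDB88320u : 0u);
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Replace `len` bytes at data+start, then XOR the CRC difference of the
   region (least-significant byte first) into the four bytes that follow.
   The CRC over the whole buffer is left unchanged. */
static void patch_keep_crc(uint8_t *data, size_t start,
                           const uint8_t *repl, size_t len)
{
    uint32_t before = crc32_buf(data + start, len);
    memcpy(data + start, repl, len);
    uint32_t delta = before ^ crc32_buf(data + start, len);
    for (int i = 0; i < 4; i++)
        data[start + len + (size_t)i] ^= (uint8_t)(delta >> (8 * i));
}
```

Because the CRC is linear over GF(2), the delta of the region alone equals the delta of the mid-file register, and the reflected shift cancels it within the next four bytes.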
Better still, the CRC of a file can be arranged to come out to any chosen
value, provided only that there are 32 consecutive bits that can be modified
somewhere in the file. The bits do not even have to be aligned on a byte
boundary.

The CRC that would result from each of the bits alone (without pre- or
post-inversion) is calculated and those values assembled as the rows of a
binary matrix. The matrix can also be obtained by powering a suitable
Sylvester matrix. The difference (exclusive-OR) between the current CRC of the
file and the desired CRC is calculated and is multiplied by the inverse of the
matrix. The nature of the polynomial guarantees the matrix is nonsingular. The
result is exclusive-ORed into the chosen field in the file.
If the location for the CRC patch is a fixed distance from the end of the
file, much of the calculation can be done in advance.
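The second method might look like the following sketch (again assuming standard CRC-32, with the chosen field byte-aligned for simplicity; the names are mine). The matrix is kept implicit: the CRC delta caused by each of the 32 candidate bit flips is measured, and the resulting GF(2) system is solved by Gaussian elimination rather than explicit matrix inversion:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Standard reflected CRC-32 (polynomial 0xEDB88320). */
static uint32_t crc32_buf(const uint8_t *p, size_t n)
{
    uint32_t crc = 0xFFFFFFFFu;
    while (n--) {
        crc ^= *p++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ ((crc & 1u) ? 0xEDB88320u : 0u);
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Force the CRC of data[0..n) to `target` by modifying only the 32 bits
   at data+pos.  The CRC is linear over GF(2), so each bit flip's delta can
   be measured independently; the 32 deltas are the rows of the letter's
   (nonsingular) matrix. */
static void force_crc(uint8_t *data, size_t n, size_t pos, uint32_t target)
{
    uint32_t base = crc32_buf(data, n);
    uint32_t piv_e[32] = {0}, piv_f[32] = {0};
    for (int j = 0; j < 32; j++) {
        data[pos + (size_t)(j / 8)] ^= (uint8_t)(1u << (j % 8));
        uint32_t e = crc32_buf(data, n) ^ base;   /* delta of this one flip */
        data[pos + (size_t)(j / 8)] ^= (uint8_t)(1u << (j % 8));
        uint32_t f = 1u << j;                     /* which bit was flipped  */
        while (e) {                               /* insert into echelon form */
            int hi = 31;
            while (!((e >> hi) & 1u)) hi--;
            if (!piv_e[hi]) { piv_e[hi] = e; piv_f[hi] = f; break; }
            e ^= piv_e[hi]; f ^= piv_f[hi];
        }
    }
    uint32_t need = base ^ target, flips = 0;
    while (need) {                                /* back-substitute */
        int hi = 31;
        while (!((need >> hi) & 1u)) hi--;
        assert(piv_e[hi]);      /* matrix is nonsingular, per the letter */
        need ^= piv_e[hi]; flips ^= piv_f[hi];
    }
    for (int j = 0; j < 32; j++)
        if ((flips >> j) & 1u)
            data[pos + (size_t)(j / 8)] ^= (uint8_t)(1u << (j % 8));
}
```

In a real tool the 32 deltas for a fixed position would be precomputed, as the letter notes, instead of rescanning the file for each flip.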
There are legitimate applications. The first method could be used to keep the
CRC over a database file constant after updating one record, without having to
read the whole file. The second could be used to arrange that an executable
file or a ROM had a particular CRC, say, 42 hexadecimal.
But the CRC is not a good defense against viruses. It is useful at present
only because viruses do not account for CRCs; as shown above, defeating a CRC
check is neither hard nor time consuming.
Gavin Puche
East Brisbane, Queensland
Australia


Island View


Dear DDJ,
As a participant in your C++ GUI "shoot-out" (see "Sizing up Application
Frameworks and Class Libraries," by Ray Valdes in the October 1992 issue of
DDJ), we felt that this type of forum does indeed provide the reader with a
broader view of an array of product solutions than could be accomplished by a
limited review. Additionally, the coverage for each product is better
balanced. We do, however, feel compelled to clarify the following
misconceptions noted regarding our product, object-Menu.
First, the article notes that our mouse icon is nonstandard from a Windows or
Mac point of view in that it points to the right. In fact, our mouse icon can
be dynamically customized to any shape, including allowing an animated mouse
icon. Several sample icons are provided with the product for those
less-creative folks. By using a right arrow in our demonstration programs we
attempt to illustrate our flexibility, not nonconformity to a standard.
The second note refers to the table of features implemented in the DDJ HWX
browser. Two of the items were "Select letter by keyboard" and "Select
instance by keyboard." It was erroneously noted that our implementation did
not have these characteristics. We expect this must have been an oversight
since object-Menu has extensive keyboard support built into all objects.
A final issue is a somewhat picky, but important distinction in the
characterization of the supported platform. object-Menu was noted to be the
only "DOS-based" product covered in the article. Earlier in the article,
Borland's TurboVision was referenced as also supporting "DOS apps." The
important distinction to be made is that not all DOS-based support is equal.
object-Menu supports DOS graphics apps, whereas TurboVision is suited only for
DOS text-based apps. This clarification is particularly significant in a
market already undereducated about DOS-based Microsoft Windows alternatives.
With the popularity of Microsoft Windows, OS/2, etc., there is strong demand
for an aesthetic user interface for every product. We salute DDJ for
presenting a non-Windows alternative in this coverage of user-interface
solutions.
Lisa Herman
Island Systems
Burlington, Massachusetts


EXTENDING WINDOWS TO 32 BITS


Programming benefits and pitfalls when moving to 32-bit Windows programming




Steven Baker


Steven works for the Oregon Department of Energy coaxing energy conservation
out of new state buildings. He was editor of Programmer's Journal and coauthor
of Extending DOS. Steven can be reached at msbaker@astute.com via Internet or
msbaker@tanelorn.UUCP via UUCP.


For years, I've been using 386 DOS extenders to take advantage of the native
power of the chips under the hood. While DOS extenders should have disappeared
long ago, their persistence is a tribute to just how painfully slowly system
software has developed. Now that Windows appears well established and offers a
modestly stable environment for users, it's time to move to Windows those
applications that can benefit from a GUI. Since OS/2 2.0 can also run Windows
programs, this seems a safe approach. But for some applications, the move
makes sense only if the potential performance of 32-bit CPUs can be exploited.
I'll share with you what I've learned about 32-bit Windows development and
some programming issues to consider.


Prospecting for the Golden App


The first challenge is choosing which problems are well served by moving to
32-bit Windows. To some extent, this depends on whether you plan to target
Windows 3.1, Windows NT, or OS/2 Presentation Manager. Omitting
OS/2, target platforms that can run 32-bit applications include 16-bit Windows
3.x hosts requiring a 32-bit Windows extender and 32-bit Windows NT in native
mode. For the time being, the most likely prospect is running a 32-bit Windows
application hosted atop the 16-bit Windows 3 runtime.
Full 32-bit programming works well with several types of computation problems:
CPU-intensive programs, memory-intensive programs, and programs ported from
32-bit environments (UNIX, VAX, and the like).
If your application chews up a gazillion CPU cycles, then running under 32
bits offers faster performance than 16-bit Windows or DOS code. Registers are
a full 32 bits wide and the 386 flat model reduces the necessity for
segment-register loads with their inherent performance penalty. Programs with
large memory needs also benefit dramatically from the wide open spaces that
32-bit programming provides. If you're porting to the PC from UNIX or VAX,
then you're starting with a 32-bit program. Targeting an environment like
32-bit Windows will be dramatically easier than attempting to shoehorn a large
UNIX application into the multiple-64K-segment world of 16-bit Windows and
DOS.
On the other hand, if your program is rife with user-interface code without
much computation or memory requirements, 32-bit Windows may not be
appropriate. In fact, running a 32-bit app that is mostly user interface can
actually be slower than the equivalent 16-bit Windows code. When running a
32-bit Windows program hosted on Windows 3.1, the Windows API calls are slower
than with a native Windows 3.1 app. The average penalty is estimated at 10
percent, but, like fuel efficiency, this number can vary depending on the mix
of API calls used. The degradation results from the thunk layer that must
convert 32-bit parameters, addresses, and results to segmented 16:16 pointers
and 16-bit integers to communicate with the underlying Windows 3.x API; see
Figure 1. (Aside from address translation, a thunk layer may also handle
realigning structures and the stack.)
If your target is true Windows NT, then the promise of 32-bit graphics device
drivers may provide a performance boost. Keep in mind that higher-performance
32-bit video drivers may soon be in the cards for Windows 3.1.


Options for the Big Picture Show


As Table 1 shows, several choices are available for creating 32-bit GUI
programs. For over a year now, both MetaWare and Watcom have bundled with
their 386 compilers a 32-bit Windows development kit and Windows extender as
free enhancements. Rational Systems offers BigWin, a 32-bit Windows extender,
to developers for use with MetaWare, Watcom, or Zortech 386 compilers. Like
Rational's DOS/16M product (used by Lotus), BigWin is priced more toward OEM
sales.
Table 1: 32-bit Windows options. MetaWare and Watcom have announced future
support for the Win32S Subset and the Win32 NT API. Since Windows NT is
supposed to run Windows 3.1 applications, programs built with one of the
Windows extenders may also run under Windows NT.

 Vendor Product Platform API supported
 ---------------------------------------------------------

 --Windows 3.1 Extenders--
 MetaWare High C/386 Windows 3.x Windows 3.0
 (extended)
 Rational BigWin Windows 3.x Windows 3.1
 (extended)
 Watcom C/386 Windows 3.x Windows 3.1
 (extended)
 Microsoft Win32S DLLs Windows 3.x Win32 Subset
 --Native 32-bit Windows--
 Microsoft Windows NT Windows NT Win32 and
 Win32S Subset

These three products follow the trail blazed by the 386 DOS extenders, but use
Windows 3.x and its underlying DOS protected-mode interface (DPMI) to extend
Windows to 32 bits. These tools allow programmers to write 32-bit Windows
applications that run under Windows 3.0 and Windows 3.1 in Enhanced mode.
To programmers, these Windows extenders look like the Windows 3.x API with
various parameters and functions widened to 32 bits or slightly modified for
performance. Each vendor supplies replacement Windows header files for
prototyping that handle the 32-bit version of the Windows 3.x API.


New Technology for Old


Microsoft began its real push in Windows NT with its July '92 Developers'
Conference and prerelease Win32 CD-ROM SDK. Soon afterwards, new manuals
documenting the evolving Win32 API were made available to developers.
Part of Microsoft's strategy for enticing developers to Windows NT is
providing the option of running 32-bit Windows apps on Windows 3.x. A
collection of DLLs supporting a subset of the full-blown Windows NT API called
Win32S will be available with the Windows NT SDK for use with Windows 3.1. An
initial version of the Win32S DLLs made its way onto the October '92 Win32 SDK
CD-ROM. Win32S offers the promise of creating a single executable that will
run under both Windows 3.1 and Windows NT.

The Win32S DLLs for Windows 3.1 provide a 32-bit Windows extender similar to
the MetaWare, Watcom, and Rational Windows extenders. The Win32S DLLs work
only with Windows 3.1. It's unlikely that Microsoft will add support for
Windows 3.0 -- the earlier version had too many bugs to patch and work around
for Microsoft's taste. In fact, some of the design decisions of the other
Windows extenders resulted from manipulating around serious Windows 3.0 bugs
for 32-bit programs.


A Map for All Seasons


Each Windows extender provides a slightly different twist to the memory model
that an application sees. All provide a thunking layer or Windows supervisor
that translates between 32-bit Windows calls and the native segment:offset
(16:16) format required by the 16-bit Windows 3 kernel; see Figure 2.
Both MetaWare and Watcom use a similar zero-based flat memory model with a
fixed 64K stack at the bottom, followed by code, static data, and heap; see
Figure 3. MetaWare and Watcom use DPMI to allocate memory from the Windows
kernel, manage this 32-bit segment, and trap various INT functions for special
handling. The stack size and its placement were designed to handle serious
bugs in Windows 3.0.
Under some circumstances, the Windows 3.0 kernel trashes the upper half of the
stack-pointer register (ESP). Limiting the stack to the bottom 64K of the
segment prevents this damage from occurring. This stack is also shared by the
16-bit side of the supervisor so parameters can be easily passed.
A design fault in the DPMI specification can preclude simultaneously running
more than one 32-bit application that requires floating-point emulation.
Watcom provides a virtual device driver (VxD) that provides floating-point
emulation and allows any number of 32-bit apps to run. The Watcom VxD uses the
first 256 bytes of the stack to store the environment, including the
floating-point registers when context is switched to another program using
floating-point emulation. This enhancement is lacking in the MetaWare and
Rational extenders. Under Windows NT and IBM OS/2 2.0, the operating system
provides floating-point emulation--the preferred strategy.
One outcome of using a zero-based model is that pointers returned by Windows
from, say, a DDE message must be turned into a 48-bit (segment:offset) data
pointer by the thunking layer for access. When an application gets this data,
segment registers must be reloaded. Callbacks in the 32-bit application must
also be passed as 48-bit pointers to the thunking layer.
Windows 3.0 can also trash the upper half of the instruction pointer (EIP)
when executing certain DOS and BIOS interrupt instructions. Keeping the 32-bit
library code that executes these INT instructions below 64K (intdos and
intdosx, for example) prevents these problems.
Rational's BigWin also presents a flat memory model to a 32-bit program, but
it is not zero based. Instead of asking the Windows kernel for memory using
DPMI, the BigWin VxD goes to the bottom of the chain and talks directly to the
low-level memory-manager device driver below Windows. BigWin uses various
page-allocate requests and page mapping to give its segment a more flexible
flat-model memory map. This allows BigWin to map its thunking layer and
Windows itself into this same segment. A DDE message pointer passed by Windows
from some other application also maps into this flat segment. Segment
registers in a BigWin application never need to be loaded.
The stack under Rational's BigWin is not limited to 64K. Rather than relying
on the Windows kernel for DOS and BIOS interrupt support with its resulting
quirks and bugs, Rational was able to just incorporate the INT handler
technology from their other DOS extenders. BigWin even demand-loads the 32-bit
application rather than reading the whole executable into memory at startup.
The Microsoft Win32S DLLs for Windows 3.1 are tightly integrated with the
Windows kernel code, a luxury that only Microsoft could have. This allows
Win32S to support a more flexible flat memory model similar to Rational's
BigWin. By ignoring Windows 3.0, Microsoft was able to eliminate a number of
difficult problems other vendors had to work around. Rather than using INT 7
(the Coprocessor Not Present exception) for floating-point emulation,
Microsoft compilers use other INT functions that point into the compiler
library for this emulation. With this strategy, Microsoft isn't limited to a
single 32-bit application that uses the emulator.


Heady Changes Ahead


With the exception of Microsoft's Win32S extender, the tools in Table 1 all
support a 32-bit version of the Windows 3.1 API, although the Windows header
files must be changed to reflect where parameters have been widened. Revised
header files also prototype functions that may require FAR (48-bit) pointers
from a 32-bit program. The thunking layer supplied by each vendor can hide
much of the requirement for 48-bit pointers. Still, the three Windows
extenders take different approaches to this issue.
At one end of the spectrum is the thunking layer in Rational's BigWin, which
eliminates the necessity of any far pointers for the programmer. Watcom is
closer to the middle since the programmer must explicitly convert function
callbacks and some returned data pointers (DDE messages from other apps, for
example) to far pointers with some supplied macros. MetaWare is at the other
extreme: It defines all Windows calls and data pointers to Windows parameters
as FAR (48-bit), rather than hiding much of this in the thunking layer.
Changes to the calling conventions necessitated by the thunking layer and any
widened parameters must also be reflected in revised header files. The file
requiring the most change is WINDOWS.H, supplied with the Microsoft Windows
3.1 SDK. Microsoft defined many of the Windows 3.1 structures in a nonportable
way by using int in place of short. Somehow Microsoft managed to clean up most
of the other headers supplied with the 3.1 SDK before its release. Windows NT
header files use the UINT macro in structures to hide this difference.
Watcom handles these header changes by actually #include-ing the 3.1 header
file intact within their own WINDOWS.H wrapper file, which temporarily
#defines int to short and dummies out FAR when _WINDOWS_16 is not defined.
Rational provides a revised WINDOWS.H file with the offending structure
members explicitly changed to short and the FAR modifier removed. MetaWare
provides a complete rewrite of the header files with even more changes since
parameters to Windows functions are extended to FAR (48-bit) data pointers.
Windows messages remain unchanged from the Windows 3.1 API.
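The effect of that temporary redefinition can be seen in a hypothetical miniature (the structure here is illustrative, not from any vendor's header): while int is mapped to short, a structure declared the way the 3.1 header declares it keeps its 16-bit layout even under a 32-bit compiler.

```c
#include <assert.h>

/* Sketch of the wrapper trick: redefine int while the 16-bit-era
   declarations are in scope, then restore it. */
#define int short
typedef struct { int x; int y; } PT16;   /* expands to short x; short y; */
#undef int

typedef struct { int x; int y; } PT32;   /* native 32-bit layout */
```
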
One of Microsoft's design goals for Win32 API was to minimize the impact on
existing code, so that 16-bit applications can be easily adapted. However,
some significant changes were made along the way for the larger address space;
see Table 2. The most problematic change for programmers is that some messages
are modified and repacked, most notably WM_COMMAND. Microsoft's Win32S
reflects a subset of the full Win32 API. The Win32S API functions and the
32-bit versions of the Win 3.1 API from these other vendors are reasonably
close, with a few exceptions. I've extracted excerpts from Windows.H for
comparison between the 3.1 SDK and these Windows extenders. These files are
available electronically; see "Availability" on page 5.
Table 2: Changes from the Windows 3.1 API. Remember that the Win32 API is a
moving target. This information is based on the October '92 NT SDK.

 Total Comparison with Windows 3.1 API:
 count Widened Changed Dropped New
 --------------------------------------------------------

 --Functions--
 Windows 3.1 973
 Win32S 838 711 6 256 121
 Win32 1449 829 6 138 614
 --Messages--
 Windows 3.1 271
 Win32S 287 250 21 0 16
 Win32 291 250 21 0 20

Win 3.1 and the Win32S API differ primarily in repacking messages. The other
three Windows extenders don't require such changes. The MetaWare and Watcom
extenders do require the programmer to explicitly convert some function and
data pointers to FAR using supplied macros. To illustrate how your source
might change, I've excerpted code from a converted version of the GENERIC
sample program comparing Win 3.1 and the Windows extenders. Listing One (page
88) is the code from the Windows 3.1 SDK, Listing Two (page 88) is the same
code converted to Rational Systems' BigWin, Listing Three (page 88) is code
for Watcom's C/386, Listing Four (page 88) is code for MetaWare's High C/386,
and Listing Five (page 88) is code for the Windows NT October '92 SDK beta.
The MetaWare, Rational, and Watcom Windows extenders offer one significant
advantage over Win32S when converting from existing 16-bit Windows programs.
All these other products support calling 16-bit DLLs while Win32S does not.
For Microsoft to support mixing 16-bit and 32-bit code would break the notion
of a single Win32S-compatible executable running on both Windows 3.1 and NT.
By supporting 16-bit DLLs, the transition to 32 bits can be made gradually,
module by module. Third-party libraries or existing assembly language modules
can be used until 32-bit code is available. With Win32S it's currently an
all-or-nothing proposition.


Testing the 32-bit Waters


When running Windows 3 executables on either Windows NT or OS/2, similar
thunking layers must be used, but in reverse. Consequently, I ran the 16-bit
version of PC Labs' Winbench program (a benchmark for video cards) under
Windows NT and OS/2 2.0; this should give a rough indication of the worst-case
thunk overhead for common GDI operations. This assumes that both OS/2 and NT
have comparable VGA drivers to Windows 3.1. As Table 3 shows, the OS/2
thunking layer surprisingly had less overhead than the October '92 version of
NT.
Table 3: Thunking overhead using Winbench 2.5. Benchmarks run on a 386/40 in
640x480 mode with a standard VGA device driver and an older Tseng 3000 card.
Larger numbers are better.

 Platform Performance

 Windows 3.1 1,285,900 pixels/sec
 OS/2 2.0 1,029,760 pixels/sec
 Win NT (Oct '92 SDK) 787,218 pixels/sec

I selected several program types for evaluating the benefits offered with
32-bit Windows, including simulation (intense number crunching), database
operations, large text-manipulation tools, and graphics (image) processing.
For each type, I tried to create a modest Windows program or test application
to determine the possible performance benefits. For comparison purposes, I had
to create equivalent 16-bit and 32-bit Windows programs -- a very
time-consuming process. (Source code for some of the conversion programs is
provided electronically; see "Availability," page 5.)
To represent the sheer number crunching of simulation, I modified a version of
the Linpack benchmark (translated to C) to run under Windows and display its
results. The execution timing differences are dominated by the computation
times since the text and graphics output-function overhead is modest.
For database operations, I used library routines from CodeBase 4.5 from
Sequiter Software (Edmonton, Alberta) for manipulating large database and
index files (FoxBase format). For text manipulation, I tried several different
programs. I pulled out and modified part of the PortTool sample code from the
Windows NT SDK to run under 16-bit Windows. This program scans a source-code
file for Windows functions that may require changes when porting to the Win32
API. PortTool reads a datafile of keywords into memory, creates a linked list
of keywords, and loops through this list, comparing every token in the source
file. The second text tool was the GNU diff program, which reads two files into
memory, compares them, and reports the differences. The algorithm used
by diff hashes the lines before making its comparisons. The problem under DOS
and Windows comes about when the files are large and huge pointers must be
used.

For graphics processing, I had originally planned to port a large UNIX
ray-tracing program. But after some false starts, I extracted parts of another
sample from the Windows NT SDK--the Julia (Mandel) program which calculates
and draws the Mandelbrot and corresponding Julia set. Julia can do its
calculations with either fixed-point or floating-point math and uses the
Windows GDI and USER functions to draw fractals.
Most of my test programs improved by at least a factor of two, going from 16-
to 32-bit Windows. Table 4 shows the most dramatic results of these tests for
Linpack. If huge pointers must be used to access data structures (like arrays)
greater than 64K, then the overhead under 16-bit Windows is truly enormous.
These 32-bit Windows extenders offer much better performance than their 16-bit
counterparts on these selected program types. I found porting code to 32-bit
Windows can be somewhat frustrating, but the performance benefits for your
application can be most rewarding.
Table 4: Using Linpack to compare DOS, 16-bit, and 32-bit Windows. Higher
numbers are better--floating-point operations per second (FLOPS) for
single-precision results on an AMI 486/33.

API Execution Rate Compiler Extender

MS-DOS 23,000 FLOPS Watcom C none
16-bit Windows 18,000 FLOPS Watcom C Windows 3.1
32-bit Windows 1,360,000 FLOPS Watcom C/386 BigWin & Windows 3.1
Extended DOS 1,440,000 FLOPS Watcom C/386 Watcom (Rational)



Products Mentioned


MetaWare Extended DOS High C/C++ w/ADK
MetaWare
2161 Delaware Avenue
Santa Cruz, CA 95060
408-429-6382

Watcom C9.0/386
Watcom
415 Philips Street
Waterloo, Ontario
Canada N2L 3X2
800-265-4555

BigWin SDK
Rational Systems
220 North Main Street
Natick, MA 01760
508-653-6006

Windows NT SDK
Microsoft
1 Microsoft Way
Redmond, WA 98052
206-882-8080

_EXTENDING WINDOWS TO 32 BITS_
by Steven Baker


[LISTING ONE]

/* Excerpts from GENERIC sample program supplied with Windows 3.0 & 3.1 SDK */

BOOL InitApplication(hInstance)
HANDLE hInstance; /* current instance */
{
 WNDCLASS wc;

 /* Fill in window class structure with parameters that describe the */
 /* main window. */

 wc.style = NULL; /* Class style(s). */
 wc.lpfnWndProc = MainWndProc; /* Function to retrieve messages */
 /* for windows of this class. */
 wc.cbClsExtra = 0; /* No per-class extra data. */
 wc.cbWndExtra = 0; /* No per-window extra data. */
 wc.hInstance = hInstance; /* Application that owns class. */
 wc.hIcon = LoadIcon(NULL, IDI_APPLICATION);
 wc.hCursor = LoadCursor(NULL, IDC_ARROW);
 wc.hbrBackground = GetStockObject(WHITE_BRUSH);
 wc.lpszMenuName = "GenericMenu"; /* Name of menu resource in RC file. */

 wc.lpszClassName = "GenericWClass"; /* Name in call to CreateWindow. */

 /* Register the window class and return success/failure code. */

 return (RegisterClass(&wc));
}
long FAR PASCAL MainWndProc(hWnd, message, wParam, lParam)
HWND hWnd; /* window handle */
unsigned message; /* type of message */
WORD wParam; /* additional information */
LONG lParam; /* additional information */
{
 FARPROC lpProcAbout; /* pointer to the "About" function */
 switch (message) {
 case WM_COMMAND: /* message: command from application menu */
 if (wParam == IDM_ABOUT) {
 lpProcAbout = MakeProcInstance(About, hInst);
 DialogBox(hInst, /* current instance */
 "AboutBox", /* resource to use */
 hWnd, /* parent handle */
 lpProcAbout); /* About() instance address */
 FreeProcInstance(lpProcAbout);
 break;
 }
 else /* Lets Windows process it */
 return (DefWindowProc(hWnd, message, wParam, lParam));
 case WM_DESTROY: /* message: window being destroyed */
 PostQuitMessage(0);
 break;
 default: /* Passes it on if unprocessed */
 return (DefWindowProc(hWnd, message, wParam, lParam));
 }
 return (NULL);
}





[LISTING TWO]

/* Excerpts from GENERIC.C modified for use with Rational Systems BigWin
 * 32-bit Windows extender. Class names must be unique for each "instance" to
 * run multiple instances of the same 32-bit Windows program. The WNDPROC
 * macro is used to handle the differences between the MetaWare, Watcom, and
 * Zortech compilers and their strictness when parsing multiple attributes.
 * These are the only changes required to recompile this program. */

static char szClassName[8] ; /* GENxxxx, where xxxx is instance hdl */

BOOL InitApplication(hInstance)
HANDLE hInstance; /* current instance */
{
 WNDCLASS wc;

 wc.style = NULL; /* Class style(s). */
 wc.lpfnWndProc = (WNDPROC) MainWndProc;
 /* windows of this class. */
 wc.cbClsExtra = 0; /* No per-class extra data. */

 wc.cbWndExtra = 0; /* No per-window extra data. */
 wc.hInstance = hInstance; /* Application that owns the class. */
 wc.hIcon = LoadIcon(NULL, IDI_APPLICATION);
 wc.hCursor = LoadCursor(NULL, IDC_ARROW);
 wc.hbrBackground = GetStockObject(WHITE_BRUSH);
 wc.lpszMenuName = "GenericMenu"; /* Name of menu resource in RC file. */
 wsprintf(szClassName, "GEN%4.4X", hInstance) ;
 wc.lpszClassName = szClassName ;

 return (RegisterClass(&wc));
}







[LISTING THREE]

/* Excerpts from GENERIC.C modified for use with Watcom C/386 Compiler and
 * 32-bit Windows extender. Class names must be unique for each "instance" to
 * run multiple instances of the same 32-bit Windows program. The LPVOID macro
 * for callback functions casts this function pointer as a far (16:32) pointer.
 * These are the only changes required to recompile this program. */

char _class[64];

BOOL InitApplication(hInstance)
HANDLE hInstance; /* current instance */
{
 WNDCLASS wc;
 wc.style = NULL; /* Class style(s). */
 wc.lpfnWndProc = (LPVOID) MainWndProc;
 /* windows of this class. */
 wc.cbClsExtra = 0; /* No per-class extra data. */
 wc.cbWndExtra = 0; /* No per-window extra data. */
 wc.hInstance = hInstance; /* Application that owns the class. */
 wc.hIcon = LoadIcon(NULL, IDI_APPLICATION);
 wc.hCursor = LoadCursor(NULL, IDC_ARROW);
 wc.hbrBackground = GetStockObject(WHITE_BRUSH);
 wc.lpszMenuName = "GenericMenu"; /* Name of menu resource in RC file. */
 sprintf( _class,"GenericWClass%d",hInstance );
 wc.lpszClassName = _class; /* Name used in call to CreateWindow. */
 return (RegisterClass(&wc));
}






[LISTING FOUR]

/* Excerpts from GENERIC.C modified for use with MetaWare High C/C++ 386
 * Compiler and 32-bit ADK Windows extender. The "unsigned" message parameter
 * is changed to a WORD message. This is the only change required to recompile
 * this program. The function declarations in the revised WINDOWS.H handle
 * any other changes. */


long FAR PASCAL MainWndProc(hWnd, message, wParam, lParam)
HWND hWnd; /* window handle */
WORD message; /* type of message */
WORD wParam; /* additional information */
LONG lParam; /* additional information */






[LISTING FIVE]

/* Excerpts from GENERIC.C modified for use with Windows NT October SDK beta.
 * Note the WNDPROC cast of MainWndProc and the use of the APIENTRY macro for
 * callback functions, since Windows NT currently uses the _stdcall calling
 * convention for Windows functions (a cross between C parameter-passing order
 * and Pascal's callee-clears-stack efficiency). The largest change is that
 * many messages are packed differently in the Win32 API and must be unpacked
 * differently from the Windows 3.1 API (see the WM_COMMAND message below and
 * the LOWORD macro). The DialogBox callback for the About box also has
 * messages that must be unpacked differently (not shown). */

BOOL InitApplication(HANDLE hInstance) /* current instance */
{
 WNDCLASS wc;

 wc.style = NULL; /* Class style(s). */
 wc.lpfnWndProc = (WNDPROC)MainWndProc; /* Function to retrieve messages */
 /* for windows of this class. */
 wc.cbClsExtra = 0; /* No per-class extra data. */
 wc.cbWndExtra = 0; /* No per-window extra data. */
 wc.hInstance = hInstance; /* Application that owns the class. */
 wc.hIcon = LoadIcon(NULL, IDI_APPLICATION);
 wc.hCursor = LoadCursor(NULL, IDC_ARROW);
 wc.hbrBackground = GetStockObject(WHITE_BRUSH);
 wc.lpszMenuName = "GenericMenu"; /* Name of menu resource in RC file. */
 wc.lpszClassName = "GenericWClass"; /* Name in call to CreateWindow. */

 return (RegisterClass(&wc));
}
LONG APIENTRY MainWndProc(
 HWND hWnd, /* window handle */
 UINT message, /* type of message */
 UINT wParam, /* additional information */
 LONG lParam) /* additional information */
{
 FARPROC lpProcAbout; /* pointer to the "About" function */
 switch (message) {
 case WM_COMMAND: /* message: command from application menu */
 if (LOWORD(wParam) == IDM_ABOUT) {
 lpProcAbout = MakeProcInstance((FARPROC)About, hInst);
 DialogBox(hInst, /* current instance */
 "AboutBox", /* resource to use */
 hWnd, /* parent handle */
 lpProcAbout); /* About() instance address */
 FreeProcInstance(lpProcAbout);
 break;

 }
 else /* Lets Windows process it */
 return (DefWindowProc(hWnd, message, wParam, lParam));
 case WM_DESTROY: /* message: window being destroyed */
 PostQuitMessage(0);
 break;
 default: /* Passes it on if unprocessed */
 return (DefWindowProc(hWnd, message, wParam, lParam));
 }
 return (NULL);
}



=====================================================

UNPUBLISHED SOURCE CODE FROM HERE ON:

/* Selected excerpts from WINDOWS.H supplied with Windows 3.1 SDK
 *
 * Note the use of nonportable "int" parameters in functions and structures
 * that will map to 16-bit values for a 16-bit compiler
 * or 32-bit values for a 386 compiler
 */

#define FAR _far
#define PASCAL _pascal
#define WINAPI _far _pascal

typedef unsigned int UINT;

typedef UINT HANDLE;
#define DECLARE_HANDLE(name) typedef UINT name

DECLARE_HANDLE(HWND); // after first expansion becomes "typedef UINT HWND"

HWND WINAPI CreateWindowEx(DWORD, LPCSTR, LPCSTR, DWORD, int, int, int, int,
HWND, HMENU, HINSTANCE, void FAR*);
HWND WINAPI CreateWindow(LPCSTR, LPCSTR, DWORD, int, int, int, int, HWND,
HMENU, HINSTANCE, void FAR*);

/* WM_CREATE/WM_NCCREATE lParam struct */
typedef struct tagCREATESTRUCT
{
 void FAR* lpCreateParams;
 HINSTANCE hInstance;
 HMENU hMenu;
 HWND hwndParent;
 int cy;
 int cx;
 int y;
 int x;
 LONG style;
 LPCSTR lpszName;
 LPCSTR lpszClass;
 DWORD dwExStyle;
} CREATESTRUCT;
typedef CREATESTRUCT FAR* LPCREATESTRUCT;

======================================================



/* Excerpts from WINDOWS.H supplied with Rational Systems BigWin
 *
 * note that "int" parameters in functions are left unchanged,
 * but are now widened to 32-bits when passed
 * and "int" declarations in structures have been changed to SHORT
 * to ensure they match with Windows 3.1 definitions
 */

#define _far

#define PASCAL _pascal
#define WINAPI _far _pascal

typedef unsigned short UINT;

typedef UINT HANDLE;
#define DECLARE_HANDLE(name) typedef UINT name

DECLARE_HANDLE(HWND);

HWND WINAPI CreateWindowEx(DWORD, LPCSTR, LPCSTR, DWORD, int, int, int, int,
HWND, HMENU, HINSTANCE, void FAR*);
HWND WINAPI CreateWindow(LPCSTR, LPCSTR, DWORD, int, int, int, int, HWND,
HMENU, HINSTANCE, void FAR*);

/* WM_CREATE/WM_NCCREATE lParam struct */
typedef struct tagCREATESTRUCT
{
 void FAR* lpCreateParams;
 HINSTANCE hInstance;
 HMENU hMenu;
 HWND hwndParent;
 SHORT cy;
 SHORT cx;
 SHORT y;
 SHORT x;
 LONG style;
 LPCSTR lpszName;
 LPCSTR lpszClass;
 DWORD dwExStyle;
} CREATESTRUCT;
typedef CREATESTRUCT FAR* LPCREATESTRUCT;


[LISTING THREE: UNPUBLISHED SOURCE]

/* Excerpts from WINDOWS.H supplied with Watcom C/386 Compiler
 *
 * note that "int" parameters in functions and structures are left unchanged,
 * but "int" is #defined as "short"
 * after inclusion of the original WINDOWS.H file int, far, etc are #undef
 */

#ifdef _WINDOWS_16_
#include <win16.h> // the original Windows 3.1 file left intact
#else
#include <_win386.h> // Watcom's 32-bit header file wrapper
#endif

// Excerpts from _win386.h

#define int short
#define __far
#define __huge
#define __export

#include <win16.h> // the original Windows 3.1 file left intact

#undef int
#undef __far
#undef __huge
#undef FAR
#define FAR far


[LISTING FOUR: UNPUBLISHED SOURCE]

/* Excerpts from WINDOWS.H supplied with MetaWare High C/386 Compiler
 *
 * note that "int" parameters in functions and structures are left unchanged,
 * but "int" is #defined as "short", and "far" disappears
 * after inclusion of the original WINDOWS.H file int, far, etc are #undef
 */

#ifdef __HIGHC__
#define int short
#define far

# define lpFAR _Dfar
#endif

#define FAR far
#define PASCAL pascal

typedef char lpFAR *LPSTR;
typedef char _Dfar *LP48STR;

HWND FAR PASCAL CreateWindow(LP48STR, LP48STR, DWORD, int, int, int, int,
HWND, HMENU, HANDLE, LP48STR);
HWND FAR PASCAL CreateWindowEx(DWORD, LP48STR, LP48STR, DWORD, int, int, int,
int, HWND, HMENU, HANDLE, LP48STR);

typedef struct tagCREATESTRUCT
 {
 LP48STR lpCreateParams; // a true 48-bit (16:32) far pointer
 HANDLE hInstance;
 HANDLE hMenu;
 HWND hwndParent;
 int cy; // members become defined as 16-bit short
 int cx;
 int y;
 int x;
 LONG style;
 LP48STR lpszName;
 LP48STR lpszClass;
 DWORD dwExStyle;
 } CREATESTRUCT;
typedef CREATESTRUCT FAR *LPCREATESTRUCT;
typedef CREATESTRUCT _Dfar *LP48CREATESTRUCT;

===========================================================

/* Selected excerpts from WINDOWS.H, WINBASE.H, and WINUSER.H supplied with
 * Windows NT October '92 SDK beta
 *
 * Note the use of nonportable "int" parameters in structure definitions
 * that will map to 16-bit values for a 16-bit compiler
 * or 32-bit values for a 386 compiler
 * Note that "int" parameters have now been widened to 32 bits
 * Two versions of this function exist in the Win32 API depending on
 * whether UNICODE (16-bit characters) is used
 */

#define far
#define near
#define pascal

#define WINAPI
#define APIENTRY WINAPI
#define PASCAL pascal
#define FAR far

typedef unsigned int UINT;

HWND WINAPI CreateWindowExA(
 DWORD dwExStyle,
 LPCSTR lpClassName,
 LPCSTR lpWindowName,
 DWORD dwStyle,
 int X,
 int Y,
 int nWidth,
 int nHeight,
 HWND hWndParent ,
 HMENU hMenu,
 HINSTANCE hInstance,
 LPVOID lpParam);

HWND WINAPI CreateWindowExW(
 DWORD dwExStyle,
 LPCWSTR lpClassName,
 LPCWSTR lpWindowName,
 DWORD dwStyle,
 int X,
 int Y,
 int nWidth,
 int nHeight,
 HWND hWndParent ,
 HMENU hMenu,
 HINSTANCE hInstance,
 LPVOID lpParam);
#ifdef UNICODE
#define CreateWindowEx CreateWindowExW
#else
#define CreateWindowEx CreateWindowExA
#endif // !UNICODE

typedef struct tagCREATESTRUCTA {
 LPVOID lpCreateParams;
 HINSTANCE hInstance;
 HMENU hMenu;
 HWND hwndParent;

 int cy;
 int cx;
 int y;
 int x;
 LONG style;
 LPCSTR lpszName;
 LPCSTR lpszClass;
 DWORD dwExStyle;
} CREATESTRUCTA, *LPCREATESTRUCTA;
typedef struct tagCREATESTRUCTW {
 LPVOID lpCreateParams;
 HINSTANCE hInstance;
 HMENU hMenu;
 HWND hwndParent;
 int cy;
 int cx;
 int y;
 int x;
 LONG style;
 LPCWSTR lpszName;
 LPCWSTR lpszClass;
 DWORD dwExStyle;
} CREATESTRUCTW, *LPCREATESTRUCTW;
#ifdef UNICODE
#define CREATESTRUCT CREATESTRUCTW
#define LPCREATESTRUCT LPCREATESTRUCTW
#else
#define CREATESTRUCT CREATESTRUCTA
#define LPCREATESTRUCT LPCREATESTRUCTA
#endif // UNICODE




January, 1993
PORTING FROM 16-BIT TO 32-BIT EXTENDED DOS


More speed and greater data capacity




Joe Huffman


Joe has a BSc and MS in electrical engineering and worked for Zortech for
several years writing C, C++, and assembly code for DOS, UNIX, 16-bit, and
32-bit DOS extenders. He then founded FlashTek, a 32-bit DOS extender vendor,
where he can be contacted at 121 Sweet Ave., Moscow, ID 83843.


Many of today's applications have outgrown MS-DOS's 640K limit, and
programmers are turning to DOS extenders for more speed, greater data
capacity, and the code size required by enhanced feature sets. While 16-bit
DOS extenders are relatively easy to port to because of the minimal number of
required code changes, they don't provide all of the advantages of 32-bit
code. Such advantages include the elimination of the 64-Kbyte segment limits
(it's a pleasure to call malloc() with a request for a megabyte and have it
comply!), the availability of programmer-transparent virtual memory (your
apparent RAM size is limited by the amount of free disk space), 32-bit
integers, and 32-bit near pointers.
If you've programmed exclusively for 16-bit DOS, 32-bit integers and near
pointers are both a blessing and a curse. A 32-bit integer means a single
instruction acting on a 32-bit register instead of slow, multi-instruction
long arithmetic (which frequently involves a compiler-supplied function call).
This gives you a new tool to speed up your program and, at the same time,
reduce executable size. Since many of the more subtle pitfalls of converting
16-bit applications to 32-bit are related to these two differences, I'll
discuss them in depth, insofar as they apply to 32-bit DOS extenders.
When converting 16-bit code to 32-bit DOS, you'll inevitably make some changes
in any "real" application. Intercepting hardware interrupts, writing directly
to video memory, catching DOS critical errors, and many other details will
mean consulting the DOS-extender documentation. You'll have to modify your
code to utilize the extender's built-in hooks so you can gracefully access all
the normally accessible resources. These hooks vary from vendor to vendor, but
vendors usually support all the normal things you do in 16-bit DOS to access
the hardware and services. Still, most porting problems are related to memory
protection, integer size, or structure size and padding.


Memory Protection


Memory-protection problems usually involve uninitialized pointers and
overwriting the end of an array. The symptoms are something many 16-bit
real-mode DOS programmers have never seen before--register dumps. The DOS
extender detects an out-of-bounds memory access, aborts the program, and
returns an error message, along with the contents of all the registers at the
time the violation occurred. A good debugger generally makes these problems
easy to find. Most debuggers will allow you to run the program to recreate the
fault and, instead of generating the register dump, the debugger will position
the cursor on the offending source line in your code.
Under 16-bit DOS, you might find that a stray pointer had made some rather
"creative" changes to DOS that you had no clue about until after the
program exited. I've spent days tracking down bugs that required a reboot of
the machine after every attempt to find them. DOS extenders provide the memory
protection to help catch many of these bugs long before they cause any
problems.


Integer Size


Integer-size problems can show up when you start doing bit twiddling like that
in Figure 1. In this example, the code will work properly for 16-bit code and
in many circumstances for 32-bit code. The problem will only occur when the
array being searched is larger than 64K, and then only sometimes. These types
of bugs are some of the most frustrating and difficult to find. Of course, your
test suite (which was designed for testing 16-bit code) will pass just fine.
Your customer with the killer data set will find the problem, but won't be
able to reproduce it on demand--and you can't fix it until you can duplicate
it. The solution is to either put in the code some conditional compilation
that tests for 32-bit compilation or change the algorithm such that it doesn't
depend on hardcoded bit twiddling.
Figure 1: Hardcoding bit masks can lead to subtle program bugs when converting
16-bit code to 32-bit code.

 /***** Use shifts and logical operators to speed up the binary search
 routine. Avoids the use of slow divides. *****/
 unsigned int binary_search(const char **array_p, const char *find_p,
 int size)
 {
 unsigned int mask = 1 << 15; /* Assumes 16-bit integers! */
 unsigned int index = mask;
 int comp_val = 1; /* Initialize the compare value to decrease */
 /* the index until index < size. */
 while(index >= size ||
 mask && (comp_val = strcmp(array_p[index], find_p)) != 0)
 {
 if(comp_val > 0) /* If the value is too high, clear this bit. */
 index ^= mask;
 mask >>= 1;
 index |= mask;
 }
 return index;
 }

Other integer-size problems can occur when transferring data via a serial port
or a binary file. If you're transferring data via a serial port, you are (at
some level) breaking down the data into bytes. Make sure the high word of the
integer is transferred on output and initialized properly on input. In
general, any place your code interfaces with the outside world needs to be
examined for potential integer-size problems. This includes disk files, serial
ports, parallel ports, shared direct memory access (including things like
video memory), sound cards, and game and I/O ports.


Structure Size and Padding



Structure size and padding issues are by far the most subtle problems you'll
encounter when porting code from 16-bit to 32-bit DOS. One of our customers
had to reformat his hard disk at least three times before we found problems
related to assumptions about structure size and layout made by a third-party
library vendor. That third-party libraries were causing such trouble
convinced us the problem was much more serious than a few novice programmers
fumbling in the dark. If experienced third-party vendors were having problems,
then there's a serious knowledge gap, and 32-bit DOS extenders could unfairly
receive a bad rap for being unreliable.
Different compilers have different default rules for structure padding. The
first structure in Figure 2, for example, can have a size varying from 3 to 8
depending on the compiler and memory model. You can even add more members to a
structure without changing the size--depending on the compiler and memory
model. The compiler may place padding between various members in the structure
to put the larger members on 2- or 4-byte boundaries. In 32-bit code, integers
are 4 bytes and may be placed on a 4-byte boundary (which can speed up memory
access). Short ints are generally 2 bytes and may be put on 2-byte boundaries.
By grouping different structure members together, you can pack a structure
more tightly, nearly halving the size of a structure; see Figure 3. Not only
does this have implications for your program's memory consumption, but, more
importantly, it can mean that code access (perhaps 16-bit code that reads
binary files or assembly language) to the structure and structure members will
not function correctly.
Figure 2: The default size of a structure varies, depending on memory model
and compiler.

 struct fig_2a
 {
 char c;
 int i;
 };
 struct fig_2b
 {
 char c1, c2;
 int i;
 };

 Compiler sizeof(struct fig_2a) sizeof(struct fig_2b)

 Borland C/C++ 3.1
 all models 3 4
 Microsoft C/C++ 7.0
 all models 4 4
 Watcom 386/9.0
 (32-bit code) 5 6
 Zortech 3.0
 T,S,C,M,L,Z models 4 4
 Zortech 3.0 X model 8 8

Figure 3: The size of structures can depend on the order of the members.

 struct fig_3a
 {
 char c1;
 int i1;
 char c2;
 int i2;
 char c3;
 int i3;
 char c4;
 };
struct fig_3b
{
 char c1;
 char c2;
 char c3;
 char c4;
 int i1;
 int i2;
 int i3;

 };

 Compiler sizeof(struct fig_3a) sizeof(struct fig_3b)

 Borland C/C++ 3.1
 all models 10 10
 Microsoft C/C++
 7.0 all models 14 10
 Watcom 386/9.0

 (32-bit code) 16 16
 Zortech 3.0
 T,S,C,M,L,Z models 14 10
 Zortech 3.0 X model 28 16

When code is recompiled for 32 bits, even functional C code (compiled under 16
bits) can break when assumptions about padding are violated. For example, a
structure similar to that in Figure 4 was used to store some attributes for an
array of graphical objects. This structure was initialized from a text file,
not a binary file, which normally would have made the code above suspicion.
But the programmer took a shortcut that cost his customer time, money, and a
trashed hard disk before the problem was found. The offending code looked
something like that in Figure 5.
Figure 4: A structure with different sizes depending on padding; similar to
one in an actual program.

 struct attribute
 {
 char fg_color, bg_color;
 };

Figure 5: Code used to read an array of structures, as defined in Figure 4,
from a text file.

 #define BYTES_PER_ATTR 2
 #define TABLE_SIZE 10
 struct attribute attr_table[TABLE_SIZE];

 void read_attr_table(FILE *fp)
 {
 int i;
 char *p = (char *)&attr_table[0];

 for(i = 0; i < TABLE_SIZE; i++)
 {
 int j;
 for(j = 0; j < BYTES_PER_ATTR; j++)
 {
 int tmp;
 fscanf(fp, "%d", &tmp);
 *p++ = tmp;
 }
 }
 }

The size of the structure was a multiple of 2, but not a multiple of 4. This
meant that for 16-bit code there was no padding, but for 32-bit code there
were two bytes of padding at the end of the structure. As the pointer p was
advanced past the end of the last member of the first structure, it did not
then point to the first member of the next structure; it pointed to the
padding instead. Only the first structure received the proper contents; all
the others received what amounted to garbage. Yet the 16-bit code worked
perfectly and avoided the need to address each of the members of the structure
by name, which would have made for much larger code. (There were about ten
members instead of two, as in the example.) Once we found the problem, we
fixed it by taking the address of the ith element in the array and assigning
it to p, thus realigning p at the start of every structure in the array. Even
this was somewhat risky, since there are no ANSI standards that specify the
padding, nor (I believe) even the order of the members in a structure.
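The fix described above can be sketched like this (with BYTES_PER_ATTR
written out as 2; note the byte-at-a-time fill still assumes no padding
*between* the two char members, only trailing padding):

```c
#include <stdio.h>

#define TABLE_SIZE 10

struct attribute
{
    char fg_color, bg_color;   /* the real structure had ~10 members */
};

struct attribute attr_table[TABLE_SIZE];

void read_attr_table(FILE *fp)
{
    int i;

    for (i = 0; i < TABLE_SIZE; i++)
    {
        /* Realign p at the start of every element of the array, so any
           trailing padding is skipped rather than written into. */
        char *p = (char *)&attr_table[i];
        int j;

        for (j = 0; j < 2; j++)
        {
            int tmp;
            fscanf(fp, "%d", &tmp);
            *p++ = (char)tmp;
        }
    }
}
```

Because p is recomputed from `&attr_table[i]` on every pass, the compiler's
padding choice no longer matters to the loop's correctness.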
There are other solutions to alignment and padding problems. Most compilers
have command-line flags and pragmas that can be used to control how structure
members are aligned. You can even mix alignment in the same module. One
structure can have 1-byte alignment, another can have 2-byte alignment, and
still another the default alignment. Be careful when using command-line flags
or making the entire project use an alignment other than the default. It's
almost certain that you'll be using run-time library structures, such as FILE,
or the time-related structures. If the default padding (with which the runtime
library was most likely built) results in a mis-aligned access to the
structure members, you could again be facing some obscure bugs. The solution
in this case is to bracket the include files that define the structures with
the proper pragmas to set the alignment type to that with which the runtime
library was built. The best solution is for the runtime library implementors
to do this in the header file or to lay out the structure so that padding is
not required for the alignment options.
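The bracketing described above can be sketched as follows. The wire_header
structure is a hypothetical example, and the pack(push/pop) pragma shown is
the Microsoft/Borland style (also accepted by gcc); check your compiler's
documentation for its exact syntax:

```c
#include <stdio.h>   /* FILE and friends, seen at the default alignment */
#include <time.h>    /* time-related structures, likewise */

/* Only the project's own wire-format structures are packed to 1 byte;
   the runtime-library headers above stay at the alignment the library
   was built with. */
#pragma pack(push, 1)
struct wire_header {
    char type;     /* 1 byte on the wire */
    int  length;   /* 4 bytes on the wire, no padding before it */
};
#pragma pack(pop)  /* restore the default for everything that follows */
```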


_PORTING FROM 16-BIT TO 32-BIT EXTENDED DOS_
by Joe Huffman

Figure 1:

/***** Use shifts and logical operators to speed up the binary search routine.
Avoids the use of slow divides. *****/
unsigned int binary_search(const char **array_p, const char *find_p, int size)
{
 unsigned int mask = 1 << 15; /* Assumes 16-bit integers! */
 unsigned int index = mask;
 int comp_val = 1; /* Initialize the compare value to decrease */
 /* the index until index < size. */
 while(index >= size ||
 mask && (comp_val = strcmp(array_p[index], find_p)) != 0)
 {
 if(comp_val > 0) /* If the value is too high, clear this bit. */
 index ^= mask;
 mask >>= 1;
 index |= mask;
 }
 return index;
}



Figure 2:


struct fig_2a
{
 char c;
 int i;
};
struct fig_2b
{
 char c1, c2;
 int i;
};

Compiler sizeof(struct fig_2a) sizeof(struct fig_2b)

Borland C/C++ 3.1 all models 3 4
Microsoft C/C++ 7.0 all models 4 4
Watcom 386/9.0 (32-bit code) 5 6
Zortech 3.0 T,S,C,M,L,Z models 4 4
Zortech 3.0 X model 8 8




Figure 3:

struct fig_3a
{
 char c1;
 int i1;
 char c2;
 int i2;
 char c3;
 int i3;
 char c4;
};
struct fig_3b
{
 char c1;
 char c2;
 char c3;
 char c4;
 int i1;
 int i2;
 int i3;

};


Compiler sizeof(struct fig_3a) sizeof(struct fig_3b)


Borland C/C++ 3.1 all models 10 10
Microsoft C/C++ 7.0 all models 14 10
Watcom 386/9.0 (32-bit code) 16 16
Zortech 3.0 T,S,C,M,L,Z models 14 10
Zortech 3.0 X model 28 16





Figure 4:

struct attribute
{
 char fg_color, bg_color;
};




Figure 5:


#define BYTES_PER_ATTR 2
#define TABLE_SIZE 10
struct attribute attr_table[TABLE_SIZE];

void read_attr_table(FILE *fp)
{
 int i;
 char *p = (char *)&attr_table[0];

 for(i = 0; i < TABLE_SIZE; i++)
 {
 int j;
 for(j = 0; j < BYTES_PER_ATTR; j++)
 {
 int tmp;
 fscanf(fp, "%d", &tmp);
 *p++ = tmp;
 }
 }
}



















January, 1993
64-BIT PROGRAMMING IN A 32-BIT WORLD


Writing portable code for 16-, 32-, and 64-bit architectures




Andy Nicholson


Andy is a computer scientist with Cray Research and can be contacted at
droid@cray.com.


Compared to 16-bit programming, 32 bits means faster programs, more memory
with straightforward addressing, and better processor architecture. Still,
many programmers are already thinking about the even greater advantages of
64-bit processors.
Cray Research computers have always used 64-bit words and addressed large
memories. However, as part of our ongoing effort to develop standard software
and to interoperate with other systems, we end up porting code which
originated on 32-bit processors. As a result, we regularly encounter what we
refer to as "32 bit-isms" -- code written under the assumption that a machine
word is 32 bits. Because of the difficulties in porting this code, we've
established a few simple guidelines for writing code portable across 16-, 32-,
or 64-bit (or more) processors.
Because of its heritage, C has a plethora of data types and data constructs.
You can use not only char, short, int, and long, but their unsigned brothers
as well, and you can mix them up in structures and unions. You can get really
busy making unions of structures of unions, and if your complex data types are
not complex enough, you can then throw in bitfields to spice things up. And of
course, you can cast your data elements to be any other kind of data type you
want. These are power tools, and as with other power tools, you must use them
safely or you'll end up cutting off your fingers and hands.


High-level Structures for High-level Code


In their classic book The Elements of Programming Style, Kernighan and Plauger
suggest that you "choose a data representation that makes the program simple."
To me, this means using high-level data structures for high-level programming,
and low-level data structures for low-level programming.
My favorite example of a 32 bit-ism is a bug I found in our port of version 1
of Gated, a routing protocol engine used for internetworking from Cornell
University. In a Berkeley networking environment, it's natural to use
inet_addr( ) to convert string representations of Internet addresses into a
useful binary format. Internet addresses happen to be 32 bits, the same word
size of many computers running Berkeley networking code.
There's also a high-level definition of an Internet address: struct in_addr.
For convenience, the structure definition includes the subfield s_addr, which
is a scalar type (unsigned long) containing the Internet address. inet_addr()
accepts a pointer to a char and returns an unsigned long; inet_addr will
return -1 on encountering an error in converting the address string.
Gated reads a configuration file that has Internet addresses in text format
and stores them in a sockaddr_in--a high-level structure that includes the
struct in_addr. The code in Example 1(a) worked on a 32-bit machine but failed
when we ported it to a Cray Research computer. Why?
Example 1: High-level structures for high-level code.

 (a)

 struct sockaddr_in saddrin;
 char *str;

 if ((saddrin.sin_addr.s_addr = inet_addr(str)) == (unsigned long)-1) {
 do_some_error_handling;
 }

 (b)

 struct sockaddr_in saddrin;
 char *str;

 if (inet_aton(str, &saddrin.sin_addr) != OK) {
 do_some_error_handling;
 }

Because as long as inet_addr can correctly interpret the string, everything is
fine. But this code never catches the situation where inet_addr returns an
error on our 64-bit machine. You have to consider the bit sizes of the items
being compared to determine what's wrong.
First, inet_addr returns its error value, (unsigned long)-1, which is a 64-bit
word of all 1 bits. This value is then stored in the s_addr field of an
in_addr. in_addr must be 32 bits to match an Internet address, so it is a
32-bit bitfield of an unsigned int (ints are 64 bits with our compiler). Now
we have 32 1 bits stored. The stored value is compared with (unsigned long)-1.
Since we have stored 32 1 bits in an unsigned int, the compiler automatically
promotes the 32 bits to 64; thus the comparison of 0x00000000 ffffffff to
0xffffffff ffffffff fails. This was a difficult bug to detect, particularly
because of the implicit promotion from 32 to 64 bits.
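The trap can be reproduced in miniature. Here uint32_t (a C99 type, not in
the original code) stands in for the 32-bit s_addr field:

```c
#include <stdint.h>

/* Returns nonzero if the stored error value still compares equal to
   (unsigned long)-1 after being narrowed to 32 bits and promoted back.
   With a 32-bit long this is true; with a 64-bit long the comparison is
   0x00000000ffffffff against 0xffffffffffffffff, and the error value is
   silently missed. */
static int error_detected(void)
{
    uint32_t s_addr = (uint32_t)(unsigned long)-1;  /* 32 one-bits stored */

    return s_addr == (unsigned long)-1;
}
```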
So what do you do about this bug? One fix is to compare against 0xffffffff
instead of -1, but that makes the code even more dependent on objects being a
particular size. Another is to use an intermediate unsigned long variable for
the result and comparison before storing the result in the sockaddr_in. But
that complicates the code.
The real problem is the expected equivalence of an unsigned long and a 32-bit
quantity, such as an Internet address. An Internet address must be stored as
32 bits, but it is sometimes convenient to access the parts of an address as a
scalar type. On a machine with a 32-bit word, it seems okay to access the
address as a long (which is thought to be 32 bits). Instead of assuming that a
low-level data item (32 bits of Internet address) is equivalent to a machine
word, the high-level data-type struct in_addr should be used consistently. And
since an in_addr has no invalid values, there should be a separate status
return value.
The solution is to define a new function that works like inet_addr but returns
a status value and accepts a struct in_addr as a result parameter; see Example
1(b). This code is portable across architectures regardless of word size
because high-level data elements are used consistently and return values are
not overloaded. Trying to change inet_addr( ) this way would break many
programs, although the NET2 release from Berkeley does define the new function
inet_aton( ).


Low-level Structures for Low-level Code



Low-level programming implies direct manipulation of physical devices or
protocol-specific wire formats. For example, device drivers often must
manipulate control registers with very specific bit patterns. Furthermore,
network protocols transmit data items with specific bit patterns that must be
interpreted properly.
This is where your data structures must exactly mirror the physical data item
to be manipulated. Bitfields are wonderful because they precisely specify the
number of bits and their arrangement. In fact, it's this precision which makes
bitfields superior to using shorts, ints, and longs to map physical
structures--short, int, and long may change from machine to machine, but
bitfields remain consistent.
When mapping a physical structure, the use of bitfields allows precision in
defining the format, forcing you to use a coding style consistent in its
accessing of the structure. Each field is named, and your code is written to
access those fields directly. One thing you don't want to do is use arrays of
scalar types (short, int, or long) when accessing physical structures. Code
which accesses these arrays assumes a particular bit size which may be
incorrect when porting to an architecture with different word-size
characteristics.
One problem we ran into when porting the PEX graphics library concerns a
structure which maps a protocol message. On a machine where ints are the same
size as the elements in the message, the code in Example 2(a) works great. In
this case the 32-bit data elements are fine on a 32-bit machine; on a 64-bit
Cray computer, they're terrible. It's necessary to change not only the
definition of the structure to Example 2(b), but all the code that references
the coord array as well. Thus we're faced with the choice of either rewriting
all the code referencing this message or defining a low-level structure and a
high-level structure and having special code to copy from one to the other. I
don't know about you, but I don't look forward to rooting out every reference
of zcoord = draw_ msg.coord[2];. Furthermore, hunting down code like Example
2(c) is a dirty job when it comes time to port to a new architecture. This
particular problem causes the same difficulties regardless of the word sizes.
You just can't assume that machine words, shorts, ints, and longs are a
particular size and have portable code.
Example 2: Low-level structures for low-level code.

 (a)

 struct draw_msg {
 int objectid;
 int coord[3];
 };

 (b)

 struct draw_msg {
 int objectid:32;
 int coord1:32;
 int coord2:32;
 int coord3:32;
 };

(c)

 int *iptr, *optr, *limit;
 int xyz[3];

 iptr = draw_msg.coord;
 limit = draw_msg.coord + sizeof(draw_msg.coord);

 optr = xyz;
 while (iptr < limit)
 *optr++ = *iptr++;



Structure Packing and Word Alignment


The variance of word sizes from machine to machine causes another problem
because of structure packing by compilers. C compilers align word-size data
items on word boundaries, which usually leaves holes between data elements
when a word-size item follows a smaller item (the exception being when there
are enough small items to exactly fill a word).
Clever programmers sometimes declare unions with two or more structures, fill
the union using one of the structures, and then use a different structure to
look at the union; see Example 3(a). But suppose that this code is written for
a 16-bit machine with 16-bit ints and 32-bit longs. Then code which accesses
the different structures can expect reasonable mappings (see Figure 1) and the
code in Example 3(b) will behave as expected. However, if this code is ported
to another machine with 32-bit words, the mappings change. If the new
machine's compiler allows you to use 16-bit ints, the alignment will change to
that shown in Figure 2. Or, if the compiler follows the K&R suggestion that
ints be the same as a word (32 bits), the alignment will be that shown in
Figure 3. In either case you'll have problems.
Example 3: Structure packing and word alignment.

 (a)
 union parse_hdr {
 struct hdr {
 char data1;
 char data2;
 int data3;
 int data4;

 } hdr;
 struct tkn {
 int class;
 long tag;
 } tkn;
 } parse_item;

 (b)


 char *ptr = msgbuf;

 parse_item.hdr.data1 = *ptr++;
 parse_item.hdr.data2 = *ptr++;
 parse_item.hdr.data3 = (*ptr++ << 8 | *ptr++);
 parse_item.hdr.data4 = (*ptr++ << 8 | *ptr++);

 if (parse_item.tkn.class >= MIN_TOKEN_CLASS &&
 parse_item.tkn.class <= MAX_TOKEN_CLASS) {
 interpret_tag(parse_item.tkn.tag);
 }

In the first case (Figure 2), the tag field no longer lines up as expected and
will be garbage. In the second case (Figure 3), neither the class nor the tag
fields will be meaningful and the code that relies on two chars packing an int
will be incorrect. The best way to write portable code is to once again not
make assumptions about the sizes of standard data types and how they map onto
other data types.
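A portable alternative to the union trick is to unpack wire bytes into named
fields by explicit shifts. The token structure and the 16-bit class / 32-bit
tag layout below are hypothetical, chosen only to mirror Example 3:

```c
/* Hypothetical wire format: a 16-bit class followed by a 32-bit tag,
   big-endian. The shifts depend only on the wire layout, never on how
   the compiler packs a structure or union. */
struct token {
    long tclass;   /* long is wide enough for 16 bits everywhere */
    long tag;      /* and wide enough for 32 bits everywhere */
};

static void unpack_token(const unsigned char *buf, struct token *t)
{
    t->tclass = (long)buf[0] << 8 | buf[1];
    t->tag    = (long)buf[2] << 24 | (long)buf[3] << 16
              | (long)buf[4] << 8  | buf[5];
}
```

The cost is a little copying code; the benefit is that the same source works
on 16-, 32-, and 64-bit machines without a union in sight.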


Machine Addressing Characteristics


All processors can address words in memory on word boundaries and are usually
optimized for this. Some processors allow other types of memory accesses (such
as byte addressing and half-word addressing on half-word boundaries), and
others even have extra hardware to allow word and half-word addressing on odd
boundaries.
While addressing mechanisms between machines vary, the fastest addressing mode
is word addressing on word boundaries. The addition of other modes requires
extra hardware, usually adding clock cycles to the memory reference. (These
extra modes and the special hardware support they require run counter to the
philosophy behind RISC processors. Cray computers, for example, support word
addressing on word boundaries--nothing more.)
On machines that do not offer a variety of data-type addressing modes, the
compiler may simulate some of them. For example, a compiler can simulate
half-word addressing within a word by generating instructions to read the word
and shift and mask the half-word into the expected position. This takes extra
clock cycles and generates bigger code.
In this regard, bitfields are inefficient because they generate the maximum
extra code to pull the field out of a word. When you then access another
bitfield in the same word, the process starts all over again by referencing
the word in memory which contains the bitfield. This generates a lot of code
to save a little space.
When designing data structures, we try to save space by using the smallest
data types capable of holding our data. chars and shorts are popular, and when
we really get stingy we pull out the bitfields. But this spends a dollar to
save a dime--all this storage efficiency has a hidden cost in program speed
and size.
Suppose you allocate only a few copies of a really compacted structure. You
have a lot of code that accesses the fields of those structures, and you
execute that code a lot. Then your code will be much slower because of the
overhead of nonword addressing--it may even be larger because of the extra
instructions necessary to pull the fields apart. The extra code generated may
take up more space than you originally saved.
This is where you can have your cake, eat it, and even lose weight in the
bargain. In high-level data structures where specific bit positioning isn't
necessary, you should use words for all the fields and not worry about the
extra space they take up. Somewhere in the machine-dependent section of the
program, you should have a typedef for a word like this:
/* an int is a word */
/* on this particular machine */
typedef int word;
Using all words for the fields of a high-level data structure has the
following benefits:
It's very portable to other machine architectures
The compiler generates the fastest possible code
The processor executes the fastest possible memory references
There are absolutely no structure-alignment surprises.
I admit there are times when you simply can't do this (if, for example, you
have a large structure which would be 25 percent larger using thousands of
words that you'll access infrequently). But using words will often save space
and increase speed, and it will always be more portable.
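The all-words style can be illustrated with a hypothetical symbol-table
entry; the names here are invented for the sketch:

```c
typedef int word;   /* an int is a word on this particular machine */

/* Every member is word-sized and word-aligned: no chars, shorts, or
   bitfields, hence no padding holes and no alignment surprises on any
   architecture. */
struct symbol {
    word name_index;    /* index into a string table */
    word type;          /* a small enum value, still a full word */
    word scope_level;   /* a small counter, still a full word */
    word is_defined;    /* a boolean flag, still a full word */
};
```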


Conclusion


Writing code that is portable across machine architectures is simple. The most
basic rule is to hide the machine word size as much as possible and to be very
specific about data element bit sizes when mapping physical data structures.
Or, as I suggested earlier, use high-level data structures for high-level
programming, and low-level data structures for low-level programming. Don't
make assumptions about the sizes of standard C scalar types, and work with the
machine--not against it--when formulating high-level data structures.


_64-BIT PROGRAMMING IN A 32-BIT WORLD_
by Andy Nicholson

Example 1:

(a)

struct sockaddr_in saddrin;
char *str;

if ((saddrin.sin_addr.s_addr = inet_addr(str)) == (unsigned long)-1) {
 do_some_error_handling;
}


(b)


struct sockaddr_in saddrin;
char *str;

if (inet_aton(str, &saddrin.sin_addr) != OK) {
 do_some_error_handling;
}


Example 2

(a)

struct draw_msg {
 int objectid;
 int coord[3];
};


(b)


struct draw_msg {
 int objectid:32;
 int coord1:32;
 int coord2:32;
 int coord3:32;
};


(c)


int *iptr, *optr, *limit;
int xyz[3];

iptr = draw_msg.coord;
limit = draw_msg.coord + sizeof(draw_msg.coord);

optr = xyz;
while (iptr < limit)
 *optr++ = *iptr++;


Example 3

(a)
union parse_hdr {
 struct hdr {
 char data1;
 char data2;
 int data3;
 int data4;

 } hdr;
 struct tkn {
 int class;
 long tag;

 } tkn;
} parse_item;


(b)

char *ptr = msgbuf;

parse_item.hdr.data1 = *ptr++;
parse_item.hdr.data2 = *ptr++;
parse_item.hdr.data3 = (*ptr++ << 8 | *ptr++);
parse_item.hdr.data4 = (*ptr++ << 8 | *ptr++);

if (parse_item.tkn.class >= MIN_TOKEN_CLASS &&
 parse_item.tkn.class <= MAX_TOKEN_CLASS) {
 interpret_tag(parse_item.tkn.tag);
}













































January, 1993
LUC PUBLIC-KEY ENCRYPTION


A secure alternative to RSA




Peter Smith


Peter has worked in the computer industry for 15 years as a programmer,
analyst, and consultant and has served as deputy editor of Asian Computer
Monthly. Peter's interest in number theory led to the invention of LUC in
1991. He can be reached at 25 Lawrence Street, Herne Bay, Auckland, New
Zealand.


According to former NSA director Bobby Inman, public-key cryptography was
discovered by the National Security Agency in the early seventies. At the
time, pundits remarked that public-key cryptography (PKC) was like binary
nerve gas--it was potent when two different substances were brought together,
but quite innocuous in its separate parts. Because the NSA promptly classified
it, not much was known about PKC until the mid-seventies when Martin Hellman
and Whitfield Diffie independently came up with the notion and published
papers about it.
Traditional cryptographic systems like the venerable Data Encryption Standard
(DES) use the same key at both ends of a message transmission. The problem of
ensuring correct keys leads to such expensive expedients as distributing the
keys physically with trusted couriers. Diffie and Hellman (and the NSA) had
the idea of making the keys different at each end. In addition to encryption,
they envisioned this scheme would also lead to a powerful means of source
authentication known as digital signatures.
RSA, developed in 1977, was the first reliable method of source
authentication. The RSA approach (patented in the early eighties) initiated
intense research in "number theory," one of the most recondite areas of
mathematics. Although C.F. Gauss studied this topic in the early 1800s
(referring to it then as "higher arithmetic"), very little real progress has
been made in solving the problem of factoring since then. The means available
today are essentially no better than exhaustive searching for prime factors.
In terms of intractability theory, however, no one has yet proved that the
problem is intractable, although researchers believe it to be so.


The RSA Algorithm


RSA works by raising a message block to a very large power, then reducing this
modulo N, where N (the product of two large prime numbers) is part of the key.
Typical systems use an N of 512 bits, and the exponent to which blocks are
raised in decryption is of the same order. An immediate problem in
implementing such a system is the representation and efficient manipulation of
such large integers. (Standard microprocessors have no integer type remotely
near this size; even numeric coprocessors are inadequate when integers of
this size are involved.)
RSA has dominated public-key encryption for the last 15 years as research has
failed to turn up a reliable alternative--until the advent of LUC. Based on
the same difficult mathematical problem as RSA, LUC uses the calculation of
Lucas functions instead of exponentiation. (See text box entitled, "How the
Lucas Alternative Works.")
Because we're working in the area of mathematics, we can formally prove that
LUC is a true alternative to RSA. Furthermore, we can show that a cipher based
on LUC will be at least as efficient. More importantly, we can show that LUC
is a stronger cipher than RSA. The reason is that under RSA, the digital
signature of a product is the product of the signatures making up the product;
in mathematical terms, M{e}L{e}=(ML){e}. This opens RSA to a cryptographic
attack known as adaptive chosen-message forgery. Ironically, this is outlined
in a paper co-authored by Ron Rivest (the "R" in RSA). LUC is not
multiplicative and therefore not susceptible to this attack. Using Lucas
functions, V[e](M,1)V[e](L,1) is not equal to V[e](ML,1). In other words, the
use of exponentiation leads to RSA being multiplicative in this way, while
LUC's use of Lucas functions avoids this weakness.
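The multiplicative property can be demonstrated with toy numbers; mod_pow
is a sketch of ordinary square-and-multiply modular exponentiation, and the
modulus 3233 = 61 x 53 with exponent 17 is a textbook toy, not a parameter
from the article:

```c
/* Square-and-multiply modular exponentiation: one squaring per bit of
   the exponent, one extra multiply per set bit. Toy sizes only; the
   products must fit in an unsigned long. */
static unsigned long mod_pow(unsigned long b, unsigned long e,
                             unsigned long n)
{
    unsigned long r = 1 % n;

    b %= n;
    while (e) {
        if (e & 1)              /* multiply only at the set bits */
            r = r * b % n;
        b = b * b % n;          /* one squaring per bit */
        e >>= 1;
    }
    return r;
}
```

With M = 42 and L = 5, mod_pow(42,17,3233) * mod_pow(5,17,3233) mod 3233
equals mod_pow(210,17,3233): the signature of a product is the product of
the signatures. The Lucas function V[e] satisfies no such identity.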


Choosing the Algorithms


Lucas functions have been studied mainly in relation to primality testing, and
it was to these sources we turned when researching efficient algorithms for
implementing LUC. For given parameters, the Lucas functions give rise to two
series, U[n] and V[n]. The first algorithm (see Listing One, page 90)
calculated both, even though we were only interested in V[n]. It was only in a
paper on factoring integers that we found a means of calculating V[n] alone
(see Listing Two, page 90). The pseudocode examples show that both algorithms
have two phases: The work done when the current bit is a 0 is half the work
necessary when the current bit is a 1.
Typically, in systems like LUC the exponent used for encryption is a much
smaller integer than that used for decryption. A commonly chosen encryption
exponent is the prime number 65,537. This is a good choice for fast encryption
as all but 2 of the 17 bits are 0s. We have no such control over the
decryption exponent, but there is a way of halving the work, and thus, of
introducing a limited degree of parallelism into the calculation.
Since LUC is a public-key cryptosystem, we can always assume that the
possessor of the private decrypting keys knows the two primes (p and q) which
make up the modulus, N. Consequently, we can reduce the exponent and message
with respect to the two primes, in each case at least halving the amount of
work. At the end of the calculation with respect to the primes, we bring the
results together to produce the final plain text (see Listing Three, page 90).
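The recombination step can be sketched with the Chinese Remainder Theorem.
crt_combine and q_inv are names introduced here for illustration (Listing
Three shows the authors' actual code), and the primes are the ones from the
worked example later in the article:

```c
/* Garner-style recombination: given a result rp modulo p and rq modulo
   q, rebuild the unique result modulo N = p*q. q_inv must be the
   inverse of q modulo p. Toy long arithmetic only. */
static long crt_combine(long rp, long rq, long p, long q, long q_inv)
{
    long h = (rp - rq) % p;

    if (h < 0)
        h += p;               /* keep the residue non-negative */
    h = h * q_inv % p;
    return rq + h * q;        /* congruent to rp mod p and rq mod q */
}
```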


Large-integer Arithmetic


There's really only one source of information about large-integer arithmetic:
Knuth's The Art of Computer Programming. We found that almost every time we
referred to his book, we came up with some new angle or way of tweaking some
extra performance out of our code.
We decided to represent the large integers as 256-byte arrays, with the low
byte giving the length (in bytes) of the integer. For instance, the 8-byte
hexadecimal number 1234567890ABCDEF would appear in a file view as 08 EF CD AB
90 78 56 34 12. These arrays became a Pascal type, har (for hexadecimal array).
We can store integers of over 600 decimal digits in our hars, but because the
hars must be able to hold the results of a multiplication, we are limited to
manipulating integers up to 300 decimal digits in length.
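The layout just described can be sketched in C; har_from_ullong is a helper
invented for the sketch, and unsigned long long stands in for whatever wide
type feeds the array:

```c
#include <string.h>

/* Length-prefixed little-endian layout as described above: byte 0 holds
   the length in bytes, bytes 1..length hold the magnitude, least
   significant byte first. ("har" follows the article's name.) */
typedef unsigned char har[256];

static void har_from_ullong(har h, unsigned long long v)
{
    unsigned char len = 0;

    memset(h, 0, sizeof(har));
    while (v) {                          /* emit low byte first */
        h[++len] = (unsigned char)(v & 0xFF);
        v >>= 8;
    }
    h[0] = len;
}
```

For the article's example, 1234567890ABCDEF hex produces the bytes
08 EF CD AB 90 78 56 34 12.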
Implementation of addition, subtraction, and multiplication went quite
smoothly; implementation of division took more effort. (We took comfort in not
being the first to encounter problems with division. Lady Ada Lovelace, the
first computer programmer, said, "I am still working at some most entangled
notations of division, but see my way through them at the expense of heavy
labor, from which I shall not shrink as long as my head can bear it.") We
tried various methods, including one based on Newton which calculated the
inverse of the divisor and then multiplied. (See Knuth's discussion.) We
finally opted for Knuth's Algorithm D, despite his warning that it contained
possible discontinuities. At that stage, we were working on a 16-bit 80286 PC;
see Listing Four, page 90.
Of course there was much more than the division routine to consider, but we
found that it was the critical routine in terms of getting LUC to run at a
reasonable speed. Once we had upgraded to an 80386, we converted to a full
32-bit implementation. The assembler code for the division (still Algorithm D)
is given in Listing Five (page 91). Although space constraints prevent a
complete presentation of the code, suffice to say that we have been able to
achieve a signing/decryption speed on a modulus of 512 bits of over 200 bits
per second (33-MHz 80386, 0 wait states).


Other Issues


Central to any cryptographic system are keys. In LUC, if an adversary is able
to find p and q, the prime factors of modulus N, then all messages sent with N
can be either read in the case of encryption or forged in the case of signing.
Since the days of Gauss, research on factoring has come up with various
so-called "aleatoric" methods of factoring some numbers. These methods are
like cures for poison ivy: numerous, and occasionally efficacious. One old
method, found by Pierre Fermat, is very quick at factoring some types of
composite numbers. If N is the product of two primes which are close together,
then it can be easily factored. For example, if p=1949, and q=1951, then
N=3802499. Taking the square root of N, we find that it is approximately
1949.999. Adding 1 to the integral part of this (giving 1950), we square this,
giving 3802500. If we now subtract N from this square, we get a difference of
1, which is the square of itself. This means that N has been expressed as the
difference of two squares. As we learned in high school, x{2}-y{2} =
(x-y)(x+y), and so we obtain the two factors.
Fermat's method works whenever the ratio of the factors is close to an
integer. (Note that the ratio is close to 1 in the above discussion.) This
attack, as cryptographers call methods used to break a cipher, has to be
guarded against in generating the modulus N.
Another guard is that neither (p + 1) and (q + 1) nor (p - 1) and (q - 1)
should be made up of small prime factors. There are many other guards of
varying degrees of importance, but the entire area needs consideration
depending on the level of security required, and how long the keys are meant
to last.
The basic idea behind LUC is that of providing an alternative to RSA by
substituting the calculation of Lucas functions for that of exponentiation.
While Lucas functions are somewhat more complex mathematically than
exponentiation, they produce superior ciphers.
This substitution process can be done with systems other than the RSA. Among
these are the Hellman-Diffie-Merkle key exchange system (U.S. Patent number
4,200,770), the El Gamal public-key cryptosystem, the El Gamal digital
signature, and the recently proposed Digital Signature Standard (DSS), all of
which use exponentiation.
The nonmultiplicative aspect of Lucas functions carries over, allowing us to
produce alternatives to all these. In the case of the DSS, Lucas functions
allow us to dispense with the one-way hashing cited (but not specified) in the
draft standard.
A New Zealand consortium has been set up to develop and license systems based
on LUC, which is protected by a provisional patent. For more information,
contact me or Horace R. Moore, 101 E. Bonita, Sierra Madre, California 91024.



References


Athanasiou, Tom. "Encryption Technology, Privacy, and National Security." MIT
Technology Review (August/September, 1986).
Diffie, W. and M.E. Hellman. "New Directions in Cryptography." IEEE
Transactions on Information Theory (November, 1976).
El Gamal, Taher. "A Public Key Cryptosystem and a Signature Scheme Based on
Discrete Logarithms." IEEE Transactions on Information Theory (July, 1985).
Gauss, C.F. "Disquisitiones Arithmeticae," Article 329.
Goldwasser, S., S. Micali, and R. Rivest. "A Digital Signature Scheme Secure
Against Adaptive Chosen Message Attack." SIAM J. COMPUT (April, 1988).
Kaliski, Burton S., Jr. "Multiple-precision Arithmetic in C." Dr. Dobb's
Journal (August, 1992).
Knuth, D.E. The Art of Computer Programming: Volume II: Semi-Numerical
Algorithms, second edition. Reading, MA: Addison-Wesley, 1981.
Schneier, Bruce. "Untangling Public Key Cryptography." Dr. Dobb's Journal
(May, 1992).
Williams, H.C. "A p + 1 method of factoring." Mathematics of Computation (vol.
39, 1982).


How the Lucas Alternative Works


As with RSA encryption, use of the Lucas alternative involves two public keys:
N and e. The number N is assumed to be the product of two large (odd) prime
numbers, p and q. Encryption and decryption of a message is achieved using
Lucas sequences, which may be defined as shown in Example 1. Note that P and Q
are integers.
If a message P is to be sent, it is encoded as the residue P1 modulo N of the
eth term of the Lucas sequence V[n](P,1), and then transmitted. The receiver
uses a secret key d (based on the prime factorization of N) to decode the
received message P1, by taking the residue modulo N of the dth term of the
Lucas sequence V[n](P1,1). The secret key d is determined so that
V[d](V[e](P,1),1) = P modulo N, ensuring the decryption of the received
message P1 as P. The existence of such a key d is based on the following
theorem.


Theorem


Suppose N is any odd positive integer, and P is any positive integer, such
that P{2}-4 is coprime to N. If r is the Lehmer totient function of N with
to D = P{2}-4 (see Example 2), then V[mr+1](P,1)=P modulo N for every positive
integer m. The condition that P{2}-4 be coprime to N is easily checked, as
P{2}-4=(P+2)(P-2). Also, because V[d](V[e](P,1),1)=V[de](P,1), according to
Example 4(e), the key d may simply be chosen so that de=1 modulo r.


The Lehmer Totient Function


Suppose P and Q are integers, and a and b are the zeros of X{2}-Px+Q (so that
P = a+b while Q = ab). Also, let D be the discriminant of x{2}-Px+Q. That is,
D = P{2}-4Q = (a-b){2}.
The Lucas sequences U[n] = U[n](P,Q) and V[n] = V[n](P,Q) are defined for n
= 0,1,2, and so on by the equation in Example 3.
In particular, U[0] = 0, U[1] = 1, and then U[n+1] = PU[n] - QU[n-1] (for n =
1,2,3,...), while V[0] = 2, V[1] = P, and similarly V[n+1] = PV[n]-QV[n-1] (for
n = 1,2,3,...). These sequences satisfy a number of identities, including those
shown in Example 4, which follow directly from the definitions.
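These recurrences transcribe directly into code. A minimal C sketch (the function name is mine), checked against the well-known Lucas identity V[n]{2} - DU[n]{2} = 4Q{n}:

```c
#include <assert.h>
#include <stdint.h>

/* Compute U[n](P,Q) and V[n](P,Q) by the defining recurrences:
   U[0] = 0, U[1] = 1, U[n+1] = P*U[n] - Q*U[n-1]
   V[0] = 2, V[1] = P, V[n+1] = P*V[n] - Q*V[n-1]  */
static void lucas_uv(int64_t P, int64_t Q, int n, int64_t *U, int64_t *V)
{
    int64_t u0 = 0, u1 = 1, v0 = 2, v1 = P;
    if (n == 0) { *U = 0; *V = 2; return; }
    for (int i = 1; i < n; i++) {
        int64_t u2 = P * u1 - Q * u0;
        int64_t v2 = P * v1 - Q * v0;
        u0 = u1; u1 = u2;
        v0 = v1; v1 = v2;
    }
    *U = u1; *V = v1;
}
```

For example, with P = 3 and Q = 1 (so D = 5), U runs 0, 1, 3, 8, 21, ... and V runs 2, 3, 7, 18, 47, ...; at n = 4 the identity gives 47*47 - 5*21*21 = 4.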
Next, suppose N is any positive integer, and let r be the Lehmer totient
function of N with respect to D = P{2}-4Q, defined the same way as in the
statement of the theorem. In the special case where N is an odd prime p, the
Lehmer totient function of p with respect to D is the number given by the
equation in Example 5(a). In this case, the Lucas-Lehmer theorem states that
if p does not divide Q then the equation in Example 5(b) holds true.
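Example 5 is not reproduced here, but the prime case is easy to check numerically. Assuming the usual formula r = p - (D/p) for the Lehmer totient of an odd prime p (consistent with the worked example later, where both Legendre values are -1 and r = lcm(p+1, q+1)), take P = 3 (so D = 5) and p = 7; since (5/7) = -1, r = 8, and the theorem predicts V[8m+1](3,1) = 3 mod 7 for every m. A minimal C check (the function name is mine):

```c
#include <assert.h>
#include <stdint.h>

/* V[n](P,1) mod N by the recurrence V[n+1] = P*V[n] - V[n-1]. */
static int64_t lucas_v_mod(int64_t P, int64_t n, int64_t N)
{
    int64_t v0 = 2 % N, v1 = P % N;
    if (n == 0) return v0;
    for (int64_t i = 1; i < n; i++) {
        /* add N before the final reduction so the result stays nonnegative */
        int64_t v2 = ((P % N) * v1 % N - v0 + N) % N;
        v0 = v1; v1 = v2;
    }
    return v1;
}
```

Here V[9] = 5778 = 7*825 + 3 and V[17] = 2 mod 7 ... mod-7 sequence has period 8, so every V[8m+1] reduces to 3, which is P mod 7.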


Example of LUC


Let N = p x q = 1949 x 2089 = 4071461, and P = 11111, the message to
encrypt/decrypt. The public keys will be e and N; the private key will be d.
First, calculate r, the Lehmer totient function of N with respect to D. To do
this, we need the Legendre symbols of D with respect to p and q. Let D =
P{2}-4; then (D/1949) = -1 and (D/2089) = -1 are the two Legendre values.
Hence r is the least
common multiple of 1949 + 1 and 2089 + 1; see Example 6(a). Choosing e = 1103
for our public key, we use the Extended Euclidean Algorithm to find the secret
key d, by solving the modular equation ed = 1 mod r. The key d turns out to be
24017.
To encrypt the message 11111, we make the calculation shown in Example 6(b).
To decrypt the encrypted message, we calculate as in Example 6(c). --P.S.
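Readers who want to reproduce Examples 6(b) and 6(c) can do so with the plain O(n) recurrence; a minimal C sketch (the function name is mine, and Listing One computes the same value far faster in O(log n)):

```c
#include <assert.h>
#include <stdint.h>

/* V[n](P,1) mod N via the Lucas recurrence V[n+1] = P*V[n] - V[n-1].
   Linear in n -- fine for this toy key size. */
static int64_t luc(int64_t P, int64_t n, int64_t N)
{
    int64_t v0 = 2 % N, v1 = P % N;
    if (n == 0) return v0;
    for (int64_t i = 1; i < n; i++) {
        int64_t v2 = ((P % N) * v1 % N - v0 + N) % N;
        v0 = v1; v1 = v2;
    }
    return v1;
}
```

Encrypting 11111 with e = 1103 and then decrypting with d = 24017 round-trips to the original message, since V[d](V[e](P,1),1) = V[de](P,1) and de = 1 mod r.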

_LUC PUBLIC-KEY ENCRYPTION_
by Peter Smith


[LISTING ONE]

{ To calculate Ve(P,1) modulo N }
 Procedure LUCcalc;
 BEGIN
 {Initialise}
 D := P*P - 4; ut := 1; vt := P; u := ut; v := vt;
 If not odd(e) then BEGIN u := 0; v := 2; END;
 e := e div 2;
 {Start main}
 While e > 0 do
 BEGIN
 ut := ut*vt mod N; vt := vt*vt mod N;
 If vt < 3 then vt := vt + N;
 vt := vt - 2;
 If odd(e) then
 BEGIN
 c := (ut*v + u*vt) mod N;
 v := (vt*v + D*u*ut) mod N;
 If odd(v) then v := v + N; v := v div 2; {integer halving}
 If odd(c) then c := c + N; u := c div 2;
 END;
 e := e div 2;
 END;
 END; {LUCcalc}

{ The required result is the value of v.}






[LISTING TWO]
Pseudocode for calculating Lucas Functions

Procedure wiluc ; { V = V[M] Mod N, the Mth Lucas number V[M](P,1) }
Var
 Va,Vb,Vc,Vd,Vf,P,N,M,NP : LargeInteger ;
 high_bit_set : boolean ;
 bz, j : word ;
 BEGIN
 Va := 2 ; { V[0] } Vb := P ; { V[1] }
 NP := N - P ; bz := bits(M) - 1 ; { test bits from high bit downwards }
 high_bit_set := next_bit_down(M) ; { prime with the bit below the leading 1 }
 For j := 1 to bz do
 BEGIN
 Vc := Vb * Vb ; Vf := Vc ; If Vf < 2 then Vf := Vf + N ;
 Vf := Vf - 2 ; Vd := Va * Vb ;
 { now Vc = Vb*Vb, Vd = Va*Vb, Vf = Vb*Vb - 2 }
 If high_bit_set Then
 BEGIN
 Vb := P * Vc ; If Vb < Vd then Vb := Vb + N ; Vb := Vb - Vd ;
 If Vb < P then Vb := Vb + N ; Vb := Vb - P ; Va := Vf ;
 END
 Else BEGIN { "even", i.e. high bit not set }
 Va := Vd ; If Va < P then Va := Va + N ; Va := Va - P ;
 Vb := Vf ;
 END ;
 high_bit_set := next_bit_down(M) ;
 { This boolean function returns the setting of the next bit down }
 Va := Va Mod N ; Vb := Vb Mod N ;
 END ; { for j to bz }
END ; { wiluc }






[LISTING THREE]


{ Pseudocode for splitting decryption/signing over p and q
 (N = p*q) }
Procedure hafluc ( var s,p,q,m,e : LargeInteger ; qix : word ) ;
var ep,emq,
 mp,mq,p2,q2,r,
 temp,pi,qi,
 b,n,pa,qa : LargeInteger ;

{ This procedure applies only to decipherment and signing, where the primes
 making up the modulus N ( = p * q) are known (or can be easily deduced,
 since both keys are known). Applying it allows us to halve the amount of
 work. Encipherment is usually done with a small key - standard is 65537. }
 Begin
 Qpr (pa,qa,p,q,m,qix) ; { assumes qix already calculated }
 ep := e ; ep := ep Mod pa ;
 emq := e ; emq := emq Mod qa ;
 mp := m ; mp := mp Mod p ;
 mq := m ; mq := mq Mod q ;
 wiluc(q2,mq,emq,q) ; wiluc(p2,mp,ep,p) ;
 if p2 < q2 then
 Begin
 temp := q ; q := p ; p := temp ;
 temp := q2 ; q2 := p2 ; p2 := temp ;
 End ;
 temp := p2 ; temp := temp - q2 ;
 n := p * q ;
{ Solve qi = 1/q Mod p with the Extended Euclidean algorithm; the algorithm
 for the Extended Euclidean calculation can be found in Knuth. }
 r := temp * p ;
 r := r Mod n ;
 s := r * qi ;
 s := s Mod n ;
 s := s + p2 ;
End ; { hafluc }
Procedure SignVerify ;
 Begin
 h4 := 4 ;
 p := large prime... ;
 q := large prime... ;
 n := p * q ;
 bz := bits(n) ;
 {write(cf,' generate 4 keysets (d,e) for p1,q1') ;}
{
 qix table for T[qix]
 Convention for qix
 This calculation is explained below.
 Lehmer totient qix Legendre values for p and q
 i.e. T[qix] = LCM
 (p - 1),(q - 1) 1 1 1
 (p - 1),(q + 1) 2 1 -1
 (p + 1),(q - 1) 3 -1 1
 (p + 1),(q + 1) 4 -1 -1
 e = encryption key, small prime e.g. 65537
 mu = message as large integer less than n
 Solve e * d[qix] = 1 Mod T[qix] using the Extended Euclidean Algorithm,
 where T[qix] is lcm(p1,q1), the Lehmer totient function of N
 with respect to mu, according to the above table.
 This gives 4 possible values of d, the decryption/signing key.
 The particular value used depends on the message mu, as follows:
 Let D = mu{2} - 4. Calculate the Legendre values of D with respect to
 both p and q. This value is -1 if D is a quadratic non-residue of
 p (or q), and equal to 1 if D is a quadratic residue of p (or q).
 N.B. This part is the most difficult part of LUC! Take care.

 Signing (Deciphering):
 hafluc (a,pu,qu,mu,d,qix)

 Verifying (Enciphering):
 Use wiluc.
}
End. { SignVerify }






[LISTING FOUR]

Algorithm D in 32-bit Intel assembler
Author: Christopher T. Skinner
Short version of Mod32.Txt with scalings just as comments
 Modulus routine for Large Integers
 u = u Mod v
Based on:
D.E.Knuth The Art of Computer Programming
 Vol 2 Semi-Numerical Algorithms 2ed 1981
 Algorithm D page 257
We use a Pascal Type called "har" ( for "hexadecimal array")
Type
 har = Array[0..255] of byte ;
Var u,v : har ;
Note that u[0] is the length of u and that the
integer begins in u[1]
It is desirable that u[1] is on a double word boundary.

; Turbo Pascal Usage: ( Turbo Pascal v6.0)
; {$L Mod32a} { contains mod32 far }
; {$F+} { far pointers }
; procedure Mod32 ( var u,v : har ) ;
; Turbo Assembler code: (TASM v2.01)--requires 32-bit chip ie 386 or 486
; nb FS and GS can be used as temporary storage. Don't try to use them as
; segment registers because Windows 3.0 restricts their allowed range, even
; after you have finished out of Windows. You will hang for sure, unless you
; have used a well-behaved protected-mode program to reset them, or cold boot.

Data Segment Word Public Use16
 vdz dw ? ; size v words
 va dd ? ; hi dword v
 vb dd ? ; 2nd " v
 vi dw ? ; ^v[1]
 savdi dw ? ; used in addback
Data EndS

Code Segment Word Public Use16
 Assume cs:Code, ds:Data ,es:Nothing
 Public mod32
; Pascal Parameters:
u Equ DWord Ptr ss:[bp+10] ; Parameter 1 of 2 (far)
v Equ DWord Ptr ss:[bp+ 6] ; parameter 2 of 2

uof equ word ptr ss:[bp+10]
vinof equ word ptr ss:[bp+ 6]

mod32 Proc far
 push bp
 mov bp,sp
 push di
 push si
 push ds ; save the DS

 ; Before using Mod32 check that:
 ; v > 0
 ; v < u u <= 125 words
 ; v[0] is a multiple of 4 and at least 8
 ; v[top] >= 80h (may need to scale u & v)
 ; make u[0] = 0 Mod 4 (add 1..3 if required)
domod:
 ; now point to our v
 mov ax,seg v
 mov ds,ax
 assume ds:Data
 mov si, offset v
 cld
 assume es:Nothing
 xor ah,ah
 mov al,es:[di] ; ax = size of u in bytes "uz"
 mov cx,ax ; cx = uz
 mov bx,ax ; bx = uz
 mov al,[si]
 mov dx,ax ; dx = size v bytes
 shr ax,2
 mov vdz,ax ; vdz " dwords vz = 0 mod 4
 sub bx,dx ; bx = uz - vz difference in bytes
 mov ax,bx ; ax = uz - vz
 sub ax,3 ; ax = uz - vz - 3 -> gs
 sub cx,3 ; cx = uz - 3
 add cx,di ; cx = ^top dword u
 add ax,di
 mov gs,ax ; gs = ^(uz-vz-3) u start (by -4 down to 1)
 inc di
 mov fs,di ; fs = uf = ^u[1] , end point
 inc si
 mov vi,si ; vi = ^v[1]
 add si,dx
 mov eax,[si-4]
 mov va,eax ; va = high word of v
 mov eax,[si-8]
 mov vb,eax ; vb = 2nd highest word v
 mov di,cx ; set di to ut , as at bottom of loop
d3:
 mov edx,es:[di] ; dx is current high dword of u
 sub di,4
 mov eax,es:[di] ; ax is current 2nd highest dword of u
 mov ecx,va
 cmp edx,ecx
 jae aa ; if high word u is 0 , never greater than
 div ecx
 mov ebx,eax ; save quotient estimate in ebx (cf. "mov qh,ax" in 16-bit version)
 mov esi,edx ; si = rh
 jmp short ad ; Normal route -- -- -- -- -->

aa: mov eax,0FFFFFFFFh
 mov edx,es:[di] ; 2nd highest wrd u
 jmp short ac
ab: mov eax,ebx ; q2
 dec eax
 mov edx,esi ; rh
ac: mov ebx,eax ; q3
 add edx,ecx
 jc d4 ; Knuth tests overflow,
 mov esi,edx
; normal route:
 ad:
 mul vb ; Quotient by 2nd digit of divisor
 cmp edx,esi ; high word of product : remainder
 jb d4 ; no correction to quot, drop thru to mulsub
 ja ab ; nb unsigned use ja/b not jg/l
 cmp eax,es:[di-4] ; low word of product : 3rd high of u
 ja ab
d4: ; Multiply & subtract * * * * * * *
 mov cx,gs
 mov di,cx ; low start pos in u for subtraction of q * v
 sub cx,4
 mov gs,cx
 xor ecx,ecx
 Mov cx,vdz ; word count for q * v
 mov si,vi ; si points to v[1]
 xor ebp,ebp ; carry 14Oct90 bp had problems in mu-lp
 even
; ** ** ** ** ** ** ** **
ba: lodsd ; eax <- ds[si]
 mul ebx ; dx:ax contains product carry set if dx > 0
 add eax,ebp
 adc edx,0
 sub es:[di],eax
 adc edx,0
 mov ebp,edx
 add di,4
 loop ba ; dec cx , jmp if not 0
; .. .. .. . .. .. . .. .. . .. . . ..
 sub es:[di],edx
 jnc d7

 mov si,vi ; add back (rare)
 mov savdi,di
 mov di,gs
 add di,4
 clc
 mov cx,vdz
bb: lodsd ; eax = ds[si] si + 2
 adc es:[di],eax
 inc di
 inc di
 inc di
 inc di
 loop bb
 xor eax,eax
 mov es:[di],eax
 mov di,savdi
 ; test with:

 ; 1,00000000,00000000,00000001/ 80000000,00000000,00000001
d7:
 mov bx,fs ; fs ^u[1]
 mov ax,gs ; gs = current u start position
 cmp ax,bx ; current - bottom
 jb d8
 sub di,4
 jmp d3
d8:
; here we would scale u down if it had been scaled up
quex: ; quick exit if v < u
 cld ; just in case
 pop ds
 pop si
 pop di
 pop bp
 ret 8 ; 2 pointers = 4 words = 8 bytes
mod32 EndP ;
Code Ends
 End





[LISTING FIVE]

Algorithm D in 16-bit Intel assembler
Author: Christopher T. Skinner
 mod16.txt 21 Aug 92 16-bit modulus
; divm Modulus
Data Segment Word Public
 vwz dw ? ; size v words
 va dw ? ; hi word v
 vb dw ? ; 2nd " v
 vi dw ? ; ^v[1]
 uf dw ? ; ^u[3]
 uz dw ? ; size u byte
 vz dw ? ; " v "
 ua dw ? ; ^( u[0] + uz - vz -1 ) , mul sub start
 ut dw ? ; ^ u[topword]
 qh dw ?
 uzofs dw ? ; ttt
 vzofs dw ? ; ttt
Data EndS
Code Segment Word Public
 Assume cs:Code, ds:Data
 Public diva

u Equ DWord Ptr [bp+10] ; ES:DI
v Equ DWord Ptr [bp+6] ; DS:SI
 ; NB v Must be Global, DS based...
diva Proc far
 push bp
 mov bp,sp
 push ds
 cld ; increment lodsw in mulsub
 lds si,v
 les di,u

 xor ah,ah
 mov al,es:[di] ; ax = uz size of u in bytes N.B. uz is not actually used
 mov cx,ax ; cx = uz
 mov bx,ax ; bx = uz
 mov al,ds:[si]
 mov dx,ax ; dx = size v bytes
 shr ax,1
 mov vwz,ax ; vwz " words
 sub bx,dx ; bx = uz - vz difference in bytes
 mov ax,bx ; ax = uz - vz
 dec ax ; ax = uz - vz - 1 -> ua
 dec cx ; cx = uz - 1
 add cx,di ; cx = ^top word u
 mov ut,cx ; ut = ^top word u
 add ax,di
 mov ua,ax ; ua = ^(uz-vz-1) u start (by -2 down to 1)
 inc di
 mov uf,di ; uf = ^u[1] , end point
 inc si
 mov vi,si ; vi = ^v[1]
 add si,dx
 mov ax,ds:[si-2]
 mov va,ax ; va = high word of v
 mov ax,ds:[si-4]
 mov vb,ax ; vb = 2nd highest word v
 mov di,cx ; set di to ut , as at bottom of loop
d3:
 mov dx,es:[di] ; dx is current high word of u
 dec di
 dec di
 mov ut,di
 mov ax,es:[di] ; ax is current 2nd highest word of u
 mov cx,va
 cmp dx,cx
 jae aa ;if high word u is 0 , never greater than
 div cx ;
 mov qh,ax
 mov si,dx ; si = rh
 jmp ad ; Normal route -- -- -- -- -->
aa: mov ax,0FFFFh
 mov dx,es:[di] ; 2nd highest wrd u
 jmp ac
ab: mov ax,qh
 dec ax
 mov dx,si ; rh
ac: mov qh,ax
 add dx,cx
 jc d4 ; Knuth tests overflow,
 mov si,dx
ad: mul vb ; Quotient by 2nd digit of divisor
 cmp dx,si ; high word of product : remainder
 jb d4 ; no correction to quot, drop thru to mulsub
 ja ab ; nb unsigned use ja/b not jg/l
 cmp ax,es:[di-2] ; low word of product : 3rd high of u
 ja ab
d4: ; Multiply & subtract * * * * * * *
 mov bx,ua
 mov di,bx ; low start pos in u for subtraction of q * v
 dec bx

 dec bx ;
 mov ua,bx
 Mov cx,vwz ; word count for q * v
 mov si,vi ; si points to v[1]
 mov bx,qh
 xor bp,bp
; ** ** ** ** ** ** ** **
ba: lodsw ; ax <- ds[si] si + 2 preserve carry over mul ?
 mul bx ; dx:ax contains product carry set if dx > 0
 add dx,bp
 xor bp,bp
 sub es:[di],ax
 inc di
 inc di
 sbb es:[di],dx
 rcl bp,1
 loop ba ; dec cx , jmp if not 0
; .. .. .. . .. .. . .. .. . .. . . ..
 rcr bp,1
 jnc d7

 mov si,vi ; add back (rare)
 mov di,ua
 inc di
 inc di
 clc
 mov cx,vwz
bb: lodsw ; ax = ds[si] si + 2
 adc es:[di],ax
 inc di
 inc di
 loop bb
 mov cx,ut
 add cx,4
 sub cx,di
 shr cx,1 ; word length of u
bc: mov Word Ptr es:[di],0
 inc di
 inc di
 loop bc ;
 dec di ;
 dec di ;
 clc
d7:
 mov ax,uf
 cmp ua,ax
 jb d8
 dec di ; New these are suspicious, with an add back and a
 dec di ; New
 jmp d3
d8:
 cld ; just in case
 pop ds
 pop bp
 ret 8 ; 2 pointers = 4 words = 8 bytes ???
diva EndP ;
Code Ends
 End







January, 1993
DDJ HANDPRINTING RECOGNITION CONTEST WRAP-UP


The envelope, please


 This article contains the following executables: HWXDT2.ARC HWXHR2.ARC
HWXRES.ARC


Ray Valdes


Ray is senior technical editor at DDJ. He can be reached through the DDJ
offices, at 76704,51 on CompuServe, or at rayval@well.sf.ca.us.


The challenge DDJ posed to you was formidable: Create a recognition engine for
handprinted text that rivals the one we published in April 1992. As always,
you rose to the occasion and acquitted yourselves honorably.
For those of you who missed previous issues, Ron Avitzur presented the
complete code to a handprinting recognizer in "Your Own Recognition Engine"
(DDJ, April 1992). In June, we released the code to a test harness for
evaluating recognizers, along with sample data files. Contestants had to
create a recognition engine that plugs into this test harness, following a
specified interface. The test harness is a simple command-line utility that
reads in some data during an initial training phase, reads test data during
the recognition phase, and finally displays an accuracy score and speed
results. Pen computers or pen operating systems weren't required, only an
algorithm and your coding chops.
Apple Computer generously provided a PowerBook 100 as first prize. The runners
up get to pick, in finish-line order, from a grab-bag of development tools
including C/C++ compilers from Borland, Microsoft, and Watcom, debuggers from
Symantec and Nu-Mega, and text editors from Borland and Lugaru.


Start Your Engines


Despite the specialized nature of the challenge, the contest generated much
interest. As many as 150 people requested or downloaded the test harness.
Requests came in from Japan, Germany, Brazil, Mexico, Belgium, and France.
The contest officially began on June 15th, with a September 15th deadline for
submissions. In retrospect, the three months allotted for the contest, which
seemed like a generous amount of time, turned into a tight deadline. Many hours
were spent during the third month, as contestants followed trails into the
blind alleys of disappointing algorithms or the brick walls of machine
constraints.
Of the initial group, about 20 readers pursued the challenge past the initial
stages. The difficulty of the problem winnowed this down to fewer than ten in
the last month of the contest. The final week was a frenzy of activity,
similar to the end of a marathon. Just before midnight on September 15th, for
example, we received a call from a contestant who was ready to drive to the
post office at the local airport so that the submission would be postmarked by
the 15th. For those who requested it, we granted extensions (of a couple of
days) freely. The extra time could not compensate for those whose particular
approach did not pan out.
Ultimately, five entrants made the cut: Don Branson, Philipp Hanes, Kipton
Moravec, David Randolph, and Allen Stenger. Their submissions were diverse, in
terms of approaches to the problem: neural net (Moravec), fuzzy logic (Hanes),
Pearson statistical correlation (Randolph), incremental improvements to
Avitzur's recognizer (Stenger), and a range of derived features (Branson).
Other interesting approaches that did not make it to the finish line included
Fourier analysis, functional link nets, and various statistical tests.
About the only thing in common was that all entries were written in C and
almost all were developed on PCs. Processing speeds ranged from 100 characters
per second (cps) down to one entry that needed four days to process the
alphabet. Our test platform
was a 25-MHz 80386-based PC running the DDJ test harness. In addition to the
sample data distributed with the harness, we collected additional data from
DDJ staffers.
Table 1 shows the entries ranked by accuracy, Table 2 ranks them by speed,
Table 3 shows the size of each program, and Table 4 shows the final rankings.
The following sections discuss each of the entries in turn.
Table 1: Rankings by percent accuracy. Stenger/DDJ entry represents Stenger's
version modified by DDJ to use all data points rather than filtering out some
points. Hanes's entry aborted with divide-by-zero exception on files C and D.
Moravec's entry could not process the data within the allotted time.

 Data Sets
--------------------------------------------------
 Average A B C D
 Randolph 92.4 95.2 91.7 92.3 90.5
 Branson 91.7 98.6 89.9 87.7 90.6
 Stenger/DDJ 82.5 93.1 74.9 84.9 77.2
 Stenger 65.4 99.7 97.2 38.5 26.3
 DDJ (Avitzur) 62.0 97.3 91.2 37.2 22.2
 Hanes 48.9 98.8 96.8 -- --
 Moravec -- -- -- -- --

Table 2: Rankings by speed (on 25-MHz 386) in cps. Stenger's entry running on
a Mac IIci averages 166 cps. Hanes's entry did not run all tests. It averaged
14.8 cps on the tests it was able to complete. Moravec's entries did not run
all the tests.

 DDJ (Avitzur) 106.5
 Stenger 105.5
 Hanes 7.4
 Randolph 2.6
 Branson 0.8
 Moravec --

Table 3: Size (in lines of code) of recognizer module.

 Randolph 520
 Hanes 764
 DDJ (Avitzur) 925
 Stenger 1041

 Moravec 1542
 Branson 2282

Table 4: The finish line (weighted rankings).

 Randolph 4.75
 Branson 4.35
 Stenger 4.10
 Hanes 3.25
 Moravec 1.05



Practice Over Theory


Allen Stenger came tantalizingly close to winning, if you consider only
results on the initial data set. Al's first task, however, was to port the DDJ
entry to the Macintosh. Actually, Ron Avitzur's recognizer and test harness
were originally developed on the Macintosh. However, when DDJ ported the code
to the PC, certain changes were made that broke the Macintosh-specific code.
These incompatibilities were discovered in compiling the code on the Mac,
independently by both DDJ and Allen. (We distributed these fixes to those who
requested Macintosh-format diskettes.)
Al then took a pragmatic, methodical approach, eschewing theory in favor of
practice. He began with Ron Avitzur's recognizer, identified specific
weaknesses or anomalies, and then fixed each one, boosting accuracy while
maintaining speed. He writes, "I made some changes to the original design,
added a new position function, and then added some ad hoc checks to
distinguish certain characters."
He noticed that the P4 and P8 routines in Avitzur's recognizer do not include
the last point of the stroke, which "seems to be an error." Also, Avitzur's
recognizer "uses a cumulating bounding-box scheme, in which the Nth stroke
carries the bounding box for strokes 1 through N. I modified this calculation
to propagate the gesture's bounding box back into all the strokes."
Avitzur's recognizer weighs each stroke in a character differently; later
strokes carry more weight. Stenger changed this scheme to weigh all strokes
equally. Al also added a new element to the feature set of a character: the
centroid of a stroke. He defines a centroid as "the average of all the points
in the (simplified) stroke, which gives an indication of where most of the ink
is. The centroid is quantized to a 4x4 grid."
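The centroid feature is straightforward to compute. A hypothetical C sketch (the struct and function names are mine; the 4x4 quantization assumes the stroke's bounding range is already known):

```c
#include <assert.h>

typedef struct { double x, y; } Point;

/* Average of all points in a (simplified) stroke --
   an indication of where most of the ink is. */
static Point centroid(const Point *pts, int n)
{
    Point c = { 0.0, 0.0 };
    for (int i = 0; i < n; i++) { c.x += pts[i].x; c.y += pts[i].y; }
    c.x /= n; c.y /= n;
    return c;
}

/* Quantize a coordinate into one of 4 cells over the bounding range [lo, hi]. */
static int grid_cell(double v, double lo, double hi)
{
    int cell = (int)((v - lo) / (hi - lo) * 4.0);
    return cell > 3 ? 3 : cell;   /* v == hi lands in the last cell */
}
```

Applying grid_cell to both coordinates of the centroid yields the cell of the 4x4 grid that Stenger's ad hoc checks test against.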
Finally, to push his accuracy to an astonishing 99.7 percent (on the "clean"
data file), he added some ad hoc checks: "Although the centroids are not the
same for each copy of the character, we can say a priori from the structure of
the character that certain centroid values should never appear, and we can use
this to disqualify some wrong choices. For example, if the character to guess
has a stroke with a centroid in the left column of the 4x4 grid (that is, if
the character is 'F') we know the character cannot be an 'I' since I's
centroids should always be in the middle two columns."
One reason for the high accuracy of Stenger's recognizer (and of Avitzur's
original version, which achieved 97 percent accuracy on the "clean" data
file), is that the algorithms were tuned to the sample data. This was a result
of circumstance rather than design. When Ron Avitzur was writing his program,
the only data he had easy access to was that of his own handwriting.
In the documentation accompanying the test harness, we said: "Our sample
recognizer works well with our current set of data, but will likely stumble on
other valid data that it has not previously encountered. In judging the
contest, therefore, we will attempt to run all recognizers on as broad a data
set as possible, including any data that you submit with your entry. Although
not a requirement, it is in your interest to submit a sample of handwriting
data (in the binary format used by the test harness), so that you can both
show your recognizer in the best possible light as well as stump other
people's recognizers."
Our new data stumped both Avitzur's and Stenger's versions of the DDJ
recognizer. As Table 1 shows, the accuracy plummeted from 99.7 percent to
below 40 percent on the new samples. As you can tell from the screen shot in
Figure 1, the handwriting samples are not radically different from Ron's
handwriting. However, the writing is a bit more elaborate, with more strokes
and more curves than Ron's sparse style. Data was gathered using the same
Wacom 510C digitizer used in the original data set; the difference is, instead
of a Mac, we used a PC running Microsoft Windows, which collects points at a
different rate than on the Mac.
Just for fun, we modified Stenger's recognizer to not throw away data points
during the initial filtering process. The results are shown in Table 1. Our ad
hoc modification boosted accuracy on the new data while reducing it on the
original data -- which is why Ron Avitzur had created the simplify() routine
in the first place. The overall accuracy improved, but not enough to overtake
the top two finalists.


The Mother of all Neural Nets


Moving from the 100 cps speeds to a different time scale entirely, we arrive
at Kipton Moravec's entry. Kip cheerfully introduced his submission with,
"Here it is: the mother of all neural nets!"
Kip's back-prop neural net is massive, using two hidden levels plus an input
level and an output level. The input level has 148 nodes representing
character features. The two intermediate levels have 440 nodes each, while the
final output level has 65 nodes, one for each character.
Kipton freely admits: "The learning is sloooow. There is a lot of
floating-point math. A coprocessor can make a big difference! My 33-MHz 386 is
a little over 100 times slower than my 33-MHz 486. The other advantage of the
486 is the built-in cache." One reason for the slowness is the size of the
neural net, which will not fit into the 640K RAM of a PC running in real mode.
Therefore the data must be written out to disk, slowing things by orders of
magnitude.
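A back-of-the-envelope count shows why the net cannot fit: the weights alone of a fully connected 148-440-440-65 network (bias terms add a little more) already overflow 640K when stored as 4-byte floats. A small C sketch of the arithmetic (the function name is mine):

```c
#include <assert.h>

/* Weight count for a fully connected feed-forward net; s[] holds the
   layer sizes, one entry per layer. */
static long weight_count(const int *s, int layers)
{
    long total = 0;
    for (int i = 0; i + 1 < layers; i++)
        total += (long)s[i] * s[i + 1];   /* full connections between layers */
    return total;
}
```

For Moravec's architecture this is 148*440 + 440*440 + 440*65 = 287,320 weights, about 1.1 Mbytes as 4-byte floats.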
We initially tried running Moravec's program on the same 386 used for the
other tests. After several days of waiting, we took Kip's advice and
commandeered a 33-MHz 486 and used 6 Mbytes as a disk cache. After a couple
more days, the program halted with a disk error. We suspected this was due to
an incompatibility between differing versions of HIMEM.SYS and SMARTDRV.EXE.
We updated the software and tried again, but at press time, the program was
still grinding away on the initial data set, after seven days.
Despite these problems, Kip's code is well worth looking over, if only for the
routines that derive character features. Among the 1600 lines of code are
routines that derive seven moments shown to be invariant to rotation, scaling,
and translation.


Fuzzy Logic Becomes Brittle


Handwritten on the disk containing Philipp Hanes's entry was the note:
"Recognize this, buster!" His submission looked very promising indeed when
processing the original data samples, achieving 98.8 percent accuracy on the
"clean" sample.
His recognition engine uses fuzzy logic, setting up an array of 55 attributes
per stroke (as well as a few attributes that are global to the character). In his
earlier approaches, Hanes followed a few blind alleys: "The choice of values
to use is the crucial part. It was pretty easy to recognize 80 percent of the
letters. 90 percent wasn't too bad, either. But every percentage point beyond
that was an uphill battle. A few things I had considered perfectly reasonable
turned out actually to be detrimental to my end results. In particular, any
calculations having to do with angles and with changes in angles (first and
second derivatives) seemed to make the results worse rather than better.... I
tried quite a few statistical techniques, all of which did nothing at all.
Most made it much worse." These metrics included variance, standard deviation,
squaring errors, and discarding extreme values.
Hanes identified an area that was problematic for other contenders as well:
"The other thing that has a surprising effect on recognition accuracy is the
simplify() routine.... Simplifying more than a minimal amount has negative
effects on recognition. On the other hand, going through all the points does
slow things down quite a bit. I went for higher accuracy." Even so, his
recognizer is quite fast, bested only by Ron Avitzur's (and Allen Stenger's)
blazing speed, as shown in Table 2.
One comment from Hanes, unfortunately, proved to be prescient: "It's quite
possible to give it data that will crash the recognizer.... It is vaguely
possible that something might, by chance, add up to exactly 0 at some
point.... This could be looked at as a major design flaw." Sadly, this is what
occurred when processing the additional samples from DDJ, aborting the program
with a divide-by-zero error, and thus lowering the previously high score. We
examined the code, looking for any obvious errors that could be fixed simply,
but such was not the case.


A No-nonsense Approach


Don Branson's entry is among the best performing, contending with David
Randolph's for first place in accuracy. Don's submission is also the largest,
weighing in at 2200 lines of code. Despite the voluminous size, the approach
seems to be reasonably straightforward.
During training, a set of features or attributes is derived for each
character. Prior to deriving attributes, each stroke is passed through a
smoothing routine. Short strokes are discarded. Branson breaks down the
character not just into strokes, but also into enclosed regions, which carry a
set of attributes similar to those borne by strokes. The derived features
include stroke orientation and position, as well as other attributes that
quantify a component's rough spatial relationship to other character
components. Like Allen Stenger, Branson calculates the centroid for each
stroke.
Although Branson almost tied with David Randolph for accuracy, Branson's entry
runs at less than half the speed of Randolph's. Also, Branson's code is four
times the size of Randolph's. For these reasons, the winner's nod must go to
David Randolph.
Table 4 shows weighted scores for the contestants. Our scoring system ranks
all entries in each of three categories: accuracy, speed, and "elegance." All
coding styles being roughly equal, not to mention subject to taste, we defined
"elegance" as being inversely proportional to program size (that is, less is
more). We emphasized accuracy over speed, and both of these over "elegance."
Accordingly, our weighting was 80 percent for accuracy, 15 percent for speed,
and 5 percent for coding conciseness. First place in a category got 6 points,
second place 5 points, and so on down the line. These values were multiplied
by category weights.
Table 5: The prizes.

 First Prize: Apple Powerbook 100

 Subsequent prizes (contestants pick from these, in finish-line order)
 Borland C++ 3.1 with Application Frameworks

 Microsoft C7, including Windows SDK
 Watcom C 9.0/386
 Nu-Mega Bounds Checker
 Multiscope Debugger 2.0 from Symantec
 Visual Basic 2.0 Professional Edition from Microsoft
 Borland Pascal 7.0 with Objects



By the Book, Literally


David Randolph's computer, a PC/AT clone, quite possibly might be the most
modest hardware configuration used in the contest. He writes, "Because of the
primitive nature of my own computer, I have not had very good luck trying to
display the samples sent with the contest materials, but that eliminates any
preconceived notions about the handwriting that I am analyzing. For all I
know, I could have been recognizing Kanji!"
These constraints turned out to be advantageous, because they led to the
choice of a simple, fast algorithm that worked well on a variety of data, at
speeds fast enough to edge out Don Branson's entry and gain first place in the
contest.
David's entry defines a 50-element array for each character. This size was
chosen "because it is all my own computer will handle with 640K." For each
character, multiple-stroke information is concatenated into a single,
variable-length stroke, which is then fitted into a pair of fixed-size arrays
(one for the X coordinates, the other for the Y coordinates). The recognizer
relies on one test, the Pearson Correlation formula, to determine if two
distributions are statistically similar. David writes: "I use it to see if two
pen strokes follow a similar path, by comparing pen coordinates which should
appear in roughly the same place at the same time." The X arrays are compared
using the formula, then the Y arrays, and then the results are averaged.
Interestingly, the code implementing the correlation function was borrowed
from page 506 of Numerical Recipes in C by Press, Flannery et al. (Cambridge
University Press, 1988). Listing One shows this routine, as well as its
caller.
David adds: "The more complex the characters, the better it works. If you need
to examine especially detailed handwriting, this method will hold its own
well.... It is forgiving when the writer makes slips or odd movements of the
pen."


Conclusion


Although we constrained the problem significantly by specifying
character-at-a-time recognition, the contestants still faced a very difficult
task.
Ron Avitzur's original recognizer was disarmingly short and straightforward,
comprising 900 lines of code and using easily understood algorithms. However,
it was the result of many months of work in chasing down more complicated
approaches and then abandoning them. It seems that our contestants had to
retrace some of these steps to arrive at the same conclusions.
The contestants' code shows the tremendous variety of approaches to this
problem. The blind alleys are just as interesting as the successful
approaches.

[LISTING ONE]
_DDJ HANDPRINTING RECOGNITION CONTEST WRAP-UP_
by Ray Valdes


/****************************************************************\
 RECOG.C
 Recognizer for Handprinted Text.
+-----------------------------------------------------------------
 This is a handwriting recognition engine for
 Ray Valdes' contest.
 This recognizer was written by David Randolph.
 (c) Copyright 1992 by David Randolph.
 All rights reserved.
\****************************************************************/

/****************************************************************\
This file is my submission for the handwritten character recognition
contest. Using the DDJ_HWX.DAT file, it achieves 90.8% accuracy or
better when trained on the first three samples, depending
upon the setting of the rGA_BUF_SIZE constant, and much higher
accuracy when scanning the CLEAN.DAT file. Unlike Ron Avitzur's
method, this algorithm does not benefit quite as much from training
on additional samples, but can still perform strongly (85+%) after
training on only one sample. As one might expect, it performs much
more accurately, even when analyzing poor-quality handwriting, if the
training samples are clean.

Unfortunately, I have no samples of my own to send, but that may be
for the best. Because of the primitive nature of my own computer, I
have not had very good luck trying to see the samples sent with the
contest materials, but that eliminates any preconceived notions about
the handwriting I am analyzing. For all I know, I could have been
recognizing Kanji!

The algorithm:
This is an on-line algorithm which cheats by taking advantage of the
direction information inherent in tablet data. Here is what it does:

1. The multiple-stroke information is concatenated into a single
 variable-length stroke.

2. The variable-length stroke is fitted into a pair of fixed-size
 arrays of size rGA_BUF_SIZE. The left half of the pair is for
 the X coordinates, the right half for the matching Y coordinates.
 I have #defined rGA_BUF_SIZE to 50 because that is all my own
 computer will handle with 640k, but other platforms with more
 memory might be able to increase its size. Keep in mind that
 the larger rGA_BUF_SIZE is, the longer the program takes to run.

3. During the training phase, each sample is run through steps one
 and two and inserted into a linked list along with the character
 code and the number of strokes in the character. The same character
 may appear multiple times in the list. This allows the program to
 train on multiple styles simultaneously, although I would not
 recommend doing this.

4. During the guessing phase, each sample is run through steps one and
 two and compared with each sample in the linked list which has the
 same number of strokes.

5. The comparison is done using the Pearson Correlation formula. The
 X arrays are compared using the formula, then the Y arrays, and then
 the results are averaged. The correlation formula is a test to see
 if two distributions are similar. I use it to see if two pen strokes
 follow a similar path by comparing pen coordinates which should appear
 in roughly the same place at the same time. By using this method, I
 am making some assumptions: A) that the computer has been trained on
 the handwriting of the person it is analyzing, and B) that that person
 writes a given character consistently (i.e., always starting at the top or
 bottom, etc.).


ADVANTAGES of the algorithm:
1. It may be trained on the handwriting of multiple people at the same
 time.
2. It is generic. It is not tuned to work on a specific set of data and
 should work consistently regardless of what it is recognizing.
3. The more complex the characters, the better it works. If you need
 to examine especially detailed handwriting, this method will hold
 its own well.
4. The program can return the set of characters which most closely
 match the test character along with their scores. These characters
 may be used to guide a spelling checker layer which can pick the
 best one based on context.
5. It is forgiving when the writer makes slips or odd movements with the
 pen.


DISADVANTAGES of the algorithm:
1. It cannot train on one person's handwriting and then necessarily
 recognize another's.

2. One of the odd quirks of this algorithm is that the more complex the
 character, the easier it is to recognize because it has more features.
 The reverse is true of simple characters, which give it trouble. In
 fact, the character in the test data which stumped the program the
 most was the minus sign! It does best with characters that have
 curves.
3. The algorithm must work under the assumption that there is some
 foreknowledge about the orientation of a character. But this is
 an assumption that all analyzers must make, or else one might not
 be able to distinguish between an "N" and a "Z", a "W" and an "M",
 a "p" and a "d", etc.
4. It is an on-line algorithm, dependent upon the ordering of the coordinates
 in the sample data. It cannot easily be translated to an off-line
 algorithm, which only uses a scanned picture of the character during the
 training and recognition phase, although it is possible.
5. The algorithm does not always distinguish upper and lowercase characters
 (i.e., it cannot always tell the difference between a "w" and a "W").
 I do not believe that a low-level recognizer should be expected to
 distinguish case without the help of a context-sensitive interpreter.
 I would like to ask that case-sensitivity not be made a requirement
 for this contest, since it cannot be practically achieved at this
 level, at least not when using real-world data.


WHY IT WORKS:
Since I am not mathematically inclined, I really have no technical
explanation for why the Pearson correlation formula works better
than other methods I have tried, other than to say it is highly tolerant
of error, so long as each character it trains on is sufficiently
different. With a 50-element array of points used to define 65 characters,
the leeway allowed before an input character is mistaken for another is
formidable. The larger the array and the more detailed the input,
the fewer the mistakes.


Although the contest places results over elegance, I have tried to keep the
code as simple and straightforward as possible to keep the algorithm clear.
I have made no changes to the harness code, so for testing, you only need
to plug in the recog.c file.

ONE NOTE before you compile. If you have a floating point processor, you
may want to convert all the variables in the "correlate" function which
are of type LONG back to type DOUBLE. I had to make them integers because
the floating point emulator software was too slow for my taste.

This code was created on a PC using Turbo C++ 1.0 running under DOS 3.2.

\****************************************************************/

/*** Include files ***/

#include "h_config.h"

#ifdef TURBO_C
#include <stdlib.h>
#include <math.h>
#endif
#ifdef MSOFT_C7
#include <search.h>

#include <math.h>
#endif

#include "h_stddef.h"
#include "h_mem.h"
#include "h_list.h"
#include "h_recog.h"

/****************************************************************/

#include <string.h>
#include <stdlib.h>
#include <stdio.h>
#include <math.h>


/****************************************************************/


/* RECOG internal definitions */

 /** This defines the length of the "gesture_xy_array", in which a */
 /** character is stored */

#define rGA_BUF_SIZE 50

 /** This defines the number of dimensions in the gesture_xy_array, which */
 /** should always be 2, one for X and one for Y */

#define rNUM_COORDINATES 2

 /** These constants define where in the array mentioned above the X and Y */
 /** information should be stored */

#define X_VALUE 0
#define Y_VALUE 1


/* The structure which contains the linked list of trained samples */

struct prototype
{
 INT16 char_num, /* The character code (ie "A") */
 num_strokes, /* The number of strokes in the character */
 gesture_xy_array [rNUM_COORDINATES] [rGA_BUF_SIZE]; /* the character*/
 struct prototype *proto_next_ptr; /* a pointer to the next record */
};

typedef struct prototype PROTOTYPE, *PROTOTYPE_PTR;



private PROTOTYPE_PTR proto_top = NULL; /* Always points to list top */

private PROTOTYPE_PTR proto_bot = NULL; /* Always points to list bot */
 /* during training */


/** Definitions of PRIVATE functions **/


private float correlate (INT16 *, INT16 *, int);
private void FatalError (char *);
private INT16 round (float);



/************************* Start of Code ************************/



/****************************************************************/
/* This is a dummy function; it is called by the harness */
/* program, but never used */

public void rec_InitRecognizer(void)
{

}


/****************************************************************/
/* Below is the TRAIN half of the program. This function is */
/* given a character, along with some information about the */
/* character such as how many strokes it has. The function */
/* takes the given character and converts it into its own */
/* format and adds it to a linked list of sample characters. */

public void rec_Train(lpList gesture,INT16 char_code)
{
 INT16 i , k , total_input_length = 0 , convert_stroke_length ,
 array_index_ptr ;
 PROTOTYPE_PTR proto_current;


 /* display character values */

#ifdef DEBUG_TRACE
#ifndef USE_BGI_GRAPHICS
 printf("\n>For char %d (%c), number strokes = %d",char_code,char_code,
 gesture->num_items);
#endif
#endif

 /* Allocate memory for a new link */

 proto_current = (PROTOTYPE_PTR) malloc ( sizeof (PROTOTYPE) );
 if (proto_current == (PROTOTYPE_PTR) NULL)
 FatalError("\n\nFATAL ERROR! Not Enough Memory to Train.\n\n");

 if (proto_top == (PROTOTYPE_PTR) NULL)
 proto_top = proto_current;
 else
 proto_bot->proto_next_ptr = proto_current;

 proto_bot = proto_current;

 /* set current link's next ptr to NULL */


 proto_current->proto_next_ptr = (PROTOTYPE_PTR) NULL;

 /* Load data into the new structure... */

 proto_current->char_num = char_code;
 proto_current->num_strokes = gesture->num_items;

 /* Get the total length of the set of strokes because we are going to */
 /* concatenate them into a single fixed-size array */

 for (i = 0; i < gesture->num_items; i++)
 {
 lpList stroke = gesture->items[i];
 total_input_length += stroke->num_items;
 }

 /* NOW, for each stroke, for the x and y values separately, */
 /* take the stroke and */
 /* force it to fit in an array of length rGA_BUF_SIZE, either by */
 /* stretching or compressing it... */
 /* To stretch, each point in the stroke is copied into multiple */
 /* elements of the new array. */
 /* To compress, only a sampling of new points in the stroke are copied */
 /* to the new array. */

 array_index_ptr = 0;
 for (i = 0; i < gesture->num_items; i++)
 {

 /* Get next stroke */

 lpList stroke = gesture->items[i];

 /* Fit the stroke into its slot in the fixed array by converting its */
 /* size */

 convert_stroke_length = round(( (float) stroke->num_items /
 (float) total_input_length ) *
 (float) rGA_BUF_SIZE);

 /* Fill each empty slot in the gesture_xy_array */

 for (k = 0; (k < convert_stroke_length) &&
 (array_index_ptr < rGA_BUF_SIZE); k++, array_index_ptr++)
 {
 VHPoint p = *(lpVHPoint)&stroke->items[(int) ((float)
 ((float) stroke->num_items / (float) convert_stroke_length) *
 (float) (k))];
 proto_current->gesture_xy_array[X_VALUE][array_index_ptr] = p.h;
 proto_current->gesture_xy_array[Y_VALUE][array_index_ptr] = p.v;
 }
 }

}



/******************************************************************/
/* This is the GUESS portion of the program. It is passed an */
/* unknown character, which is compared with the linked template */
/* list. The best fit is the guess. The program is passed the */
/* character and pointers to three guesses are passed back. This */
/* function only attempts to make a first guess. */

public void rec_Guess(lpList gesture,
 LPINT16 g1,LPINT16 g2,LPINT16 g3,
 LPINT16 w1,LPINT16 w2,LPINT16 w3)
{
 PROTOTYPE_PTR compare_ptr = (PROTOTYPE_PTR) NULL;
 float mean_compare;
 INT16 num_strokes, i, j, k, total_input_length = 0, convert_stroke_length,
 array_index_ptr,
 gesture_xy_array[rNUM_COORDINATES][rGA_BUF_SIZE];

 *g1 = *g2 = *g3 = 0;
 *w1 = *w2 = *w3 = 0;

 num_strokes = gesture->num_items;

 /* Get the total length of the set of strokes because we are going to */
 /* concatenate them into a single fixed-size array */

 for (i = 0; i < gesture->num_items; i++)
 {
 lpList stroke = gesture->items[i];
 total_input_length += stroke->num_items;
 }

 /* NOW, for each stroke, for the x and y values separately, */
 /* take the stroke and */
 /* force it to fit in an array of length rGA_BUF_SIZE, either by */
 /* stretching or compressing it... */
 /* To stretch, each point in the stroke is copied into multiple */
 /* elements of the new array. */
 /* To compress, only a sampling of new points in the stroke are copied */
 /* to the new array. */

 array_index_ptr = 0;
 for (i = 0; i < gesture->num_items; i++)
 {

 /* Get next stroke */

 lpList stroke = gesture->items[i];

 /* Fit the stroke into its slot in the fixed array by converting its */
 /* size */

 convert_stroke_length = round(( (float) stroke->num_items /
 (float) total_input_length ) *
 (float) rGA_BUF_SIZE);

 /* Fill each empty slot in the gesture_xy_array */

 for (k = 0; (k < convert_stroke_length) &&
 (array_index_ptr < rGA_BUF_SIZE); k++, array_index_ptr++)
 {
 VHPoint p = *(lpVHPoint)&stroke->items[(int) ((float)
 ((float) stroke->num_items / (float) convert_stroke_length) *
 (float) (k))];
 gesture_xy_array[X_VALUE][array_index_ptr] = p.h;
 gesture_xy_array[Y_VALUE][array_index_ptr] = p.v;
 }
 }

 compare_ptr = proto_top;

 /* Starting from the top compare the test character with each character */
 /* in the template linked list */

 while (compare_ptr != (PROTOTYPE_PTR) NULL)
 {
 /* Don't even bother to compare characters if they do not have */
 /* the same number of strokes */

 if (compare_ptr->num_strokes == num_strokes)
 {

 mean_compare = 0.0;

 mean_compare += correlate (gesture_xy_array[X_VALUE],
 compare_ptr->gesture_xy_array[X_VALUE],
 (int) (rGA_BUF_SIZE));
 mean_compare += correlate (gesture_xy_array[Y_VALUE],
 compare_ptr->gesture_xy_array[Y_VALUE],
 (int) (rGA_BUF_SIZE));

 mean_compare /= rNUM_COORDINATES;

 if ((mean_compare * 100) > *w1)
 {
 *w1 = mean_compare * 100;
 *g1 = compare_ptr->char_num;
 }

 } /* end if */

 compare_ptr = compare_ptr->proto_next_ptr;

 } /* end while */

#ifdef DEBUG_TRACE
#ifndef USE_BGI_GRAPHICS
 printf ("\n### CHAR finally evaluated as %c with prob %d",
 *g1, *w1);
#endif
#endif

}





/*
 The correlate function below was borrowed from the "pearsn" function
 found in the book "Numerical Recipes in C" by Press, Flannery,
 Teukolsky, and Vetterling (Cambridge University Press, 1988),
 page 506.
*/

/****************************************************************/
/* This function computes the correlation value between two distributions */

private float correlate (x,y,n)
INT16 x[], y[];
int n;
{
 int j;

 /* NOTE: You may want to change the word 'long' below to 'double' if
 you have a floating point processor. It should speed things up. */

 long yt, xt, syy=0, sxy=0, sxx=0, ay=0, ax=0;
 float r;

 for (j=0; j < n; j++)
 {
 ax += x[j];
 ay += y[j];
 }

 ax /= n;
 ay /= n;

 for (j=0; j < n; j++)
 {
 xt = x[j] - ax;
 yt = y[j] - ay;
 sxx += xt * xt;
 syy += yt * yt;
 sxy += xt * yt;
 }
 r = (double) sxy / sqrt( (double) ((double) sxx * (double) syy) );
 return (float) r;
}



/****************************************************************/
/* This is a dummy function; it is called by the harness */
/* program, but never used */

public void rec_EndTraining(void)
{
}



/****************************************************************/
/* Fatal Error Handler. Displays error message and terminates. */

private void FatalError (char* msg)
{
#ifdef USE_BGI_GRAPHICS
 gr_Close();

#endif

 fprintf (stderr,"\n*** Fatal Error:");
 fprintf (stderr, "%s", msg);
 exit (1);
}



/****************************************************************/
/* A small function to round a float to an int */

private INT16 round (float x)
{
 return ( (float) (x) - (float) ( (int) (x) ) >= 0.5 ) ? (int) (x) + 1 :
 (int) (x);
}


/************************* END OF RECOG.C ***********************/


January, 1993
WINDOWS DDE FOR REAL-TIME APPLICATIONS


Dynamic data exchange extends the Windows message protocol




Kamal Shah


Kamal Shah holds an MSEE from the University of Texas at Austin and is a
software engineer for Intel. He can be contacted at 5200 N E Elam Young
Parkway, Hillsboro, OR 97124-6497; 800-438-4769.


Windows dynamic data exchange (DDE) provides a powerful mechanism for
communicating among Windows applications. This communication takes place as
applications send messages to each other to initiate conversations, to request
and share data, and to terminate conversations.
This article examines how real-time applications can communicate with Windows
applications using Windows DDE protocol. I'll first examine the DDE mechanism
in the realm of Windows applications, then describe how this capability is
extended to include real-time applications.


Getting Up to Speed with DDE


DDE is one of several mechanisms of interprocess communication supported by
Windows. It is an extension of the messaging scheme around which Windows is
designed. Of the two applications involved in the data exchange, one is known
as the server and another as the client. The client application is the
consumer of the data, the server is the provider. It's also possible for an
application to act as both a client and a server.
The client program starts the conversation by posting a message that includes
the server's name (the application's name, such as the name of a database
program) and the name of a topic of interest. If the server acknowledges
this request positively, a link is established between the two. The client can
then request data from the server. The server provides the data if it has
access to the data requested; otherwise it replies negatively. The client can
also poke (write) data items to the server. The conversation stops when the
client and the server send termination messages to each other.


DDE Links


There are three types of links supported by DDE in Windows: cold, hot, and
warm. As Figure 1 illustrates, a cold-link conversation begins when a client
sends a WM_DDE_INITIATE message identifying the application and the topic it
is interested in. A server application that supports the requested topic
responds to the client with a WM_DDE_ACK (acknowledge) message. The client
then requests a particular data item by sending a WM_DDE_REQUEST message. The
server, if able to provide the requested data, responds with WM_DDE_DATA
message to the client. If the server does not have access to requested data,
it posts a negative WM_DDE_ACK message to the client. On receipt of the
requested data, the client can also optionally post a WM_DDE_ACK message to
the server. The conversation is terminated when the client and the server post
each other WM_DDE_TERMINATE messages.
The hot link (see Figure 2) allows the client to get updated data from the
server without explicit requests. The conversation begins in the same way as
in the cold link. Next, the client informs the server to send updated data by
sending a WM_DDE_ADVISE message. The server replies positively or negatively,
depending upon its access to the requested data. At this point, the server is
obliged to inform the client whenever the value of the data item changes by
posting WM_DDE_DATA messages to the client. When the client is no longer
interested in the particular data item, it sends WM_DDE_UNADVISE message to
the server. The conversation is terminated when the client and the server post
each other WM_DDE_TERMINATE messages.
As shown in Figure 3, the warm link falls somewhere in between the cold link
and the hot link. In this case, the client wants to be notified of changes in
the data item without immediately receiving the value of the new data item.
The conversation begins as usual. Next, the client sends a WM_DDE_ADVISE
message to the server with a special flag indicating deferred update of the
data item value. When the data item changes, the server posts a WM_DDE_DATA
message with NULL data value. To get the actual data value, the client posts a
WM_DDE_REQUEST message to the server, just as in the cold link. When the
client is no longer interested in the particular data item, it sends
WM_DDE_UNADVISE message to the server. The conversation is terminated when the
client and the server post each other WM_DDE_TERMINATE messages.
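The three link types differ mainly in which message the client sends after the handshake and in whether the server pushes values or only change notifications. A toy summary in C (our own illustration, not Windows API code; the fDeferUpd flag appears in the DDEADVISE structure used in Listing Two):

```c
#include <string.h>

/* The three DDE link types described above. */
typedef enum { COLD_LINK, HOT_LINK, WARM_LINK } LinkType;

/* After WM_DDE_INITIATE/WM_DDE_ACK, this is the message that
   characterizes each link type. */
static const char *link_request(LinkType t)
{
    switch (t) {
    case COLD_LINK:
        return "WM_DDE_REQUEST";              /* client pulls each value */
    case HOT_LINK:
        return "WM_DDE_ADVISE";               /* server pushes new values */
    case WARM_LINK:
        return "WM_DDE_ADVISE, fDeferUpd=1";  /* server sends NULL-data
                                                 notices; client requests
                                                 the value when it wants it */
    }
    return "";
}
```

All three conversations end the same way, with WM_DDE_TERMINATE messages in both directions.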


iRMX for Windows DDE Architecture


iRMX for Windows is a real-time, multitasking operating system that provides
preemptive, priority-based, interrupt-driven scheduling. iRMX for Windows
co-exists with DOS or Microsoft Windows on 386 or higher microprocessor-based
PCs, and users can run DOS/Windows applications and iRMX applications
simultaneously on a single system.
As Figure 4 illustrates, iRMX DDE consists of two components: a DDE library
used by iRMX applications and a NetBIOS-to-DDE Router application used by
Windows applications. This software is designed such that iRMX and Windows
applications running on the same machine, or running on different machines
connected by a network, can communicate with each other transparently. The
NetBIOS-to-DDE Router is a Windows application. NetBIOS is a peer-to-peer
session-layer LAN protocol that provides the primitives to conduct
process-to-process communications across the network nodes. As long as two
systems are running the same NetBIOS implementations at the transport layer,
the machines will be able to communicate.
The NetBIOS-to-DDE Router acts as a surrogate for the iRMX application for DDE
communication. As far as the Windows application is concerned, the iRMX
application is just another Windows application. The DDE library communicates
with the DDE Router using standard network interfaces. The DDE addressing
scheme, which includes an application name and a topic name, is extended to
include a third identifier, a machine name. This machine name is made
available by including it in the win.ini Windows configuration file. This
machine name, coupled with an application name and a topic name, uniquely
identifies a Windows application to the iRMX application. When an iRMX
application wants to initiate a DDE conversation with a Windows application,
it uses this machine name in addition to the application and the topic names.
Similarly, when a Windows application wants to initiate a DDE conversation
with an iRMX application, it uses an identifier of the form
machine_name%application_name\topic_name. In this case, machine_name is the
name under which an iRMX application registers itself. In essence, we have
expanded the application name to include the machine name. These two names are
separated by a user-configurable separator such as %.
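The parsing involved on the Router side is straightforward. As a hypothetical illustration (the function and parameter names here are ours, not part of the iRMX DDE library), splitting such an identifier at the configured separator might look like:

```c
#include <string.h>

/* Split an identifier of the form "machine<sep>application" into its
   two parts. Returns 1 on success, 0 if the separator is missing or
   either output buffer is too small. Illustrative only. */
static int split_dde_name(const char *id, char sep,
                          char *machine, size_t msize,
                          char *app, size_t asize)
{
    const char *p = strchr(id, sep);

    if (p == NULL || (size_t)(p - id) >= msize || strlen(p + 1) >= asize)
        return 0;
    memcpy(machine, id, (size_t)(p - id));
    machine[p - id] = '\0';
    strcpy(app, p + 1);
    return 1;
}
```

With the names used in the sample session, "servernode%rmxapp" would split into the machine name "servernode" and the application name "rmxapp".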
When the NetBIOS-to-DDE Router receives a WM_DDE_INITIATE message, it verifies
that the application-name string is of the form machine_name%app_name. Once
this is confirmed, the NetBIOS-to-DDE Router sends the DDE messages via
NetBIOS using the NCBs (Network Control Blocks, a format that NetBIOS
understands). The NetBIOS driver for iRMX for Windows picks up these NCBs,
converts them into transport-protocol information blocks, and ships them to
iRMX networking. The iRMX networking passes these blocks to the iRMX DDE
library, which delivers them to the requesting iRMX application in an
appropriate format.
Similarly, an iRMX application's DDE message first gets converted to
transport-protocol information blocks by the iRMX DDE library. These blocks
are sent to the NetBIOS driver. The NetBIOS driver sends these blocks to the
NetBIOS-to-DDE Router using the NCBs. Using the information in the NCBs, the
Router forms DDE messages and passes them on to the appropriate Windows
application.


Sample Session


A simple example illustrates the concepts involved in DDE programming. In the
example, the iRMX application is constructed as a DDE server (see Listing One,
page 94) and the Windows application is developed as a DDE client (see Listing
Two, page 94). The iRMX application has access to the share prices of four
different companies on a real-time basis. The Windows client application
displays these instantaneous price changes on the screen. Listing Three (page
98) is the DDE client header file, Listing Four (page 98) is the DDE
client-definition file, and Listing Five (page 98) is the Windows
client-generation batch file.
First, the iRMX application registers itself as a DDE server by providing a
machine name and addresses of its two callback functions. Then it waits for
the client application to establish a hot-link connection for the specific
data items.
The Windows client initiates the conversation with the server by sending the
server's machine name, an application name, and a topic name in the
WM_DDE_INITIATE message. Once the conversation is established, the client
sends a WM_DDE_ADVISE message to the server to receive instantaneous updates
of share prices. In the example program, the real-time changes in the prices
are simulated by sending periodic updates in increments of 25 cents. The
server sends these updates using the server_dde_update_link DDE library call.


Conclusions


iRMX for Windows has extended the Windows DDE protocol to include real-time
applications, thus enabling rich graphical user interfaces for a real-time
back end. Implemented on an industry-standard interface such as NetBIOS, the
iRMX DDE capability allows networks of PCs running Windows programs to
communicate and exchange information with real-time applications.


_WINDOWS DDE FOR REAL-TIME APPLICATIONS_
by Kamal Shah


[LISTING ONE]

/* iRMX for Windows DDE server */

#include <stdio.h>
#include <rmxc.h>
#include <string.h>
#include <rmxdde.h>

#define TRUE 0xFFFF
#define FALSE 0

typedef struct
 {
 char *szCompanyName;
 float stockprice;
 }
 STOCKPRICE;
 STOCKPRICE shareprice [] =
 {
 {"Intel", 50.0},
 {"Microsoft", 70.0},
 {"IBM", 90.0},
 {"Apple", 50.0}
 };
#define NUM_COMPANIES (sizeof(shareprice) /sizeof(shareprice[0]))
WORD hlink_status[NUM_COMPANIES] = {FALSE,FALSE,FALSE,FALSE};
WORD server_conv_id;
WORD hot_link_on = FALSE;
WORD link_closed = TRUE;
WORD status;
WORD i;
WORD link_cnt = 0;

/*----------------------------------------------------------
 err_check: error processing function
------------------------------------------------------------*/
void err_check (WORD status, const char *err_msg)
 {
 if (status != DDE_OK)
 {
 fprintf (stderr, "\n---%s Status = 0x%x\n", err_msg, status);
 exit (1);
 }
 }
/* Server Functions */
/*---------------------------------------------------------------------
 conv_callback: This is the conversation callback
 function that is registered at the server_dde_register call.
-----------------------------------------------------------------------*/
WORD
conv_callback (char *client_name_p, char *service_name_p, char *topic_name_p,
 WORD conversation_id, WORD function_code)
 {
 server_conv_id = conversation_id;

 if (function_code == DDE_INITIATE)
 link_closed = FALSE;
 else if (function_code == DDE_TERMINATE)
 link_closed = TRUE;
 return 0;
 }
/*---------------------------------------------------------------------------
 data_callback: This is the data callback function that is registered at the
 server_dde_register call. It processes DDE requests from the client.
-----------------------------------------------------------------------------*/
WORD
data_callback (char *client_name_p, char *service_name_p, char *topic_name_p,
 WORD conversation_id, char *item_p, char *data_buf_p,
 WORD data_buf_size, WORD function_code)
{
 if (function_code == DDE_HOT_LINK)
 {
 if(!strcmp(shareprice[link_cnt].szCompanyName,item_p))
 hlink_status[link_cnt++] = TRUE;
 }
 else if (function_code == DDE_CLOSE_LINK)
 {
 for (i = 0; i < NUM_COMPANIES; i++)
 {
 if(!strcmp(shareprice[i].szCompanyName,item_p))
 hlink_status[i] = FALSE;
 }
 }
 return 0;
}
main(int argc, char *argv[])
 {
 WORD dde_status;
 WORD conv_id;
 char price_buf[10];
 printf("Initializing dde library\n");
 dde_library_init (NULL, &dde_status);
 err_check (dde_status, "dde_library_init() failed");
 printf("\n---Registering self as DDE server"
 "\n machine name = %s"
 "\n application name = %s\n", "servernode", "rmxapp");
 server_dde_register ("servernode",
 "rmxapp",
 conv_callback,
 data_callback,
 &dde_status);
 err_check (dde_status, "server_dde_register() failed");
 while (link_closed == TRUE)
 {
 rqsleep(300,&status);
 printf("Waiting for conv. to be established\n");
 }
 while (link_closed == FALSE)
 {
 rqsleep(300,&status);
 for (i = 0; i < NUM_COMPANIES; i++)
 {

 if (hlink_status[i] == TRUE)
 {
 shareprice[i].stockprice = shareprice[i].stockprice +.25;
 sprintf(price_buf, "%f", shareprice[i].stockprice);
 printf("%s\n",shareprice[i].szCompanyName);
 server_dde_update_link(server_conv_id,
 shareprice[i].szCompanyName,
 price_buf,
 &dde_status);
 err_check (dde_status, "server_dde_update_link() failed");
 }

 }
 }
 printf("data xfer over\n");
 exit(0);
}





[LISTING TWO]



//**********************************************************
// DDE client application
//**********************************************************

#include <windows.h>
#include <dde.h>
#include <stdlib.h>

#include <string.h>
#include "hlcli.h"

 // Global variables

char szAppName [] = "ddecli";
HANDLE hInst ;
short cxChar, cyChar;
BOOL bFirstTime = TRUE;

STOCKPRICE shareprice [] =
 {
 {"Intel", 50.0},
 {"Microsoft", 70.0},
 {"IBM", 90.0},
 {"Apple", 50.0}
 };
#define NUM_COMPANIES (sizeof(shareprice)/sizeof(shareprice[0]))

//**********************************************************
// FUNCTION : WinMain(HANDLE, HANDLE, LPSTR, int)
// PURPOSE : Create the main application window
//**********************************************************
int PASCAL WinMain (HANDLE hInstance, HANDLE hPrevInstance,
 LPSTR lpszCmdLine, int nCmdShow)

{
 HWND hwnd ;
 MSG msg ;
 WNDCLASS wcDdeclient ;
 // Register the window class if this is the first instance
 if (!hPrevInstance)
 {
 wcDdeclient.style = CS_HREDRAW | CS_VREDRAW ;
 wcDdeclient.lpfnWndProc = WndProc ;
 wcDdeclient.cbClsExtra = 0 ;
 wcDdeclient.cbWndExtra = 0 ;
 wcDdeclient.hInstance = hInstance ;
 wcDdeclient.hIcon = LoadIcon (NULL,IDI_APPLICATION);
 wcDdeclient.hCursor = LoadCursor (NULL, IDC_ARROW) ;
 wcDdeclient.hbrBackground = GetStockObject (WHITE_BRUSH) ;

 wcDdeclient.lpszMenuName = NULL;
 wcDdeclient.lpszClassName = szAppName ;

 RegisterClass (&wcDdeclient);
 }
 // Create the main window
 hwnd = CreateWindow (szAppName, "DDE Client Application",
 WS_OVERLAPPEDWINDOW,
 CW_USEDEFAULT,
 CW_USEDEFAULT,
 CW_USEDEFAULT,
 CW_USEDEFAULT,
 NULL,
 NULL,
 hInstance,
 NULL) ;
 hInst = hInstance ;
 // Show the main window and send ourselves a paint message
 ShowWindow (hwnd, nCmdShow) ;
 UpdateWindow (hwnd) ;

 // Send ourselves a message that will cause a conversation
 // to be started with the server
 SendMessage (hwnd, WM_INITDDE, 0, 0L) ;

 // Enter the message loop
 while (GetMessage (&msg, NULL, 0, 0))
 {
 TranslateMessage (&msg) ;
 DispatchMessage (&msg) ;
 }

 return msg.wParam ;
}
//**********************************************************
// FUNCTION : WndProc (HWND, WORD, WORD, LONG)
// PURPOSE : Processes all messages
//**********************************************************
long FAR PASCAL WndProc (HWND hwnd, WORD message, WORD wParam, LONG lParam)
{
 static HWND hwndServer = NULL ;
 static BOOL fInit = TRUE ;
 static char szServName [] = "servernode%rmxapp",
 szTopicName [] = "stockprice" ;
 ATOM aAppName, aTopicName, aItemName ;
 char szBuf [24], szStockprice [16], szItemName [16] ;
 DDEACK DdeAck ;
 DDEDATA FAR *lpDdeData ;
 DDEADVISE FAR *lpDdeAdvise ;
 DWORD dwTime ;
 GLOBALHANDLE hDdeData,hDdeAdvise ;
 HDC hDC ;
 MSG msg ;
 TEXTMETRIC tm ;
 PAINTSTRUCT ps ;
 short i ;
 WORD wStatus;
 float stockprice;

 switch (message)
 {
 case WM_CREATE:
 // Get a handle to a device context
 hDC = GetDC (hwnd) ;

 // Get the metrics of the system font
 GetTextMetrics (hDC, &tm) ;

 // Save the character width and height
 cxChar = tm.tmAveCharWidth ;
 cyChar = tm.tmHeight + tm.tmExternalLeading ;

 // Release the device context handle
 ReleaseDC (hwnd, hDC) ;

 return (FALSE) ;
 case WM_INITDDE:
 // To start a DDE conversation, first create global atoms to send
 // with the WM_DDE_INITIATE message
 aAppName = GlobalAddAtom (szServName) ;
 aTopicName = GlobalAddAtom (szTopicName) ;

 // Now broadcast WM_DDE_INITIATE message, the value of -1 in the
 // first parameter means the message is sent to all windows.
 SendMessage (0xFFFF, WM_DDE_INITIATE, hwnd,
 MAKELONG (aAppName, aTopicName)) ;
 // SendMessage is a synchronous call, so we receive any response
 // in a WM_DDE_ACK message before the SendMessage call returns.
 // Attempt to load the server if we got no response.
 if (hwndServer == NULL)
 {
 WinExec (szServName, SW_SHOWMINNOACTIVE) ;
 SendMessage (0xFFFF, WM_DDE_INITIATE, hwnd,
 MAKELONG (aAppName, aTopicName)) ;
 }
 // Delete the atoms we created above and reset the flag for
 // WM_DDE_ACK processing
 GlobalDeleteAtom (aAppName) ;
 GlobalDeleteAtom (aTopicName) ;
 fInit = FALSE ;
 // Check again for a response. If we haven't received one,
 // display a message box and exit this routine.

 if (hwndServer == NULL)
 {
 MessageBox (hwnd, "Cannot communicate with iRMX SERVER!",
 szAppName, MB_ICONSTOP | MB_OK) ;
 return (FALSE) ;
 }
 // Post WM_DDE_ADVISE messages
 for (i = 0; i < NUM_COMPANIES; i++)
 {
 hDdeAdvise = GlobalAlloc (GHND | GMEM_DDESHARE, sizeof (DDEADVISE));
 lpDdeAdvise = (DDEADVISE FAR *) GlobalLock (hDdeAdvise);
 lpDdeAdvise->fAckReq = TRUE;
 lpDdeAdvise->fDeferUpd = FALSE;
 lpDdeAdvise->cfFormat = CF_TEXT;

 GlobalUnlock (hDdeAdvise);

 aItemName = GlobalAddAtom(shareprice[i].szCompanyName);
 if (!PostMessage (hwndServer,WM_DDE_ADVISE,hwnd,
 MAKELONG(hDdeAdvise,aItemName)))
 {
 GlobalFree (hDdeAdvise);
 GlobalDeleteAtom (aItemName);
 break ;
 }
 DdeAck.fAck = FALSE;
 dwTime = GetCurrentTime ();
 while (GetCurrentTime () - dwTime < TIMEOUT)
 {
 if (PeekMessage (&msg, hwnd, WM_DDE_ACK,WM_DDE_ACK, PM_REMOVE))
 {
 GlobalDeleteAtom (HIWORD (msg.lParam));
 wStatus = LOWORD (msg.lParam);
 DdeAck = * (DDEACK *) &wStatus;
 if (DdeAck.fAck == FALSE )
 GlobalFree (hDdeAdvise);
 break ;
 }
 }
 if (DdeAck.fAck == FALSE)
 break ;
 while (PeekMessage (&msg, hwnd, WM_DDE_FIRST,WM_DDE_LAST, PM_REMOVE))
 {
 DispatchMessage (&msg);
 }
 }
 if (i < NUM_COMPANIES)
 {
 MessageBox (hwnd, "WM_DDE_ADVISE Failed!",szAppName,
 MB_ICONEXCLAMATION | MB_OK);
 }
 return (FALSE);
 case WM_DDE_ACK:
 // We are only responding here when we are in the middle of a
 // WM_DDE_INITIATE send routine as determined by the fInit flag.
 // We save the server window handle and delete the global atoms.
 if (fInit)
 {
 hwndServer = wParam ;
 GlobalDeleteAtom (LOWORD (lParam)) ;

 GlobalDeleteAtom (HIWORD (lParam)) ;
 }
 return (FALSE) ;
 case WM_DDE_DATA:
 // wParam is the handle to the sending window. LOWORD (lParam) is the
 // handle to the DDEDATA memory. HIWORD (lParam) is the item atom.
 hDdeData = LOWORD (lParam) ;
 lpDdeData = (DDEDATA FAR *) GlobalLock (hDdeData);
 aItemName = HIWORD (lParam) ;
 DdeAck.bAppReturnCode = 0;
 DdeAck.reserved = 0;
 DdeAck.fBusy = FALSE;
 DdeAck.fAck = FALSE;

 // Check for matching format and data item
 if (lpDdeData->cfFormat == CF_TEXT)
 {
 GlobalGetAtomName (aItemName, szItemName, sizeof (szItemName)) ;
 for (i = 0; i < NUM_COMPANIES; i++)
 if (strcmp (szItemName, shareprice[i].szCompanyName) == 0)
 break;
 if (i < NUM_COMPANIES)
 {
 lstrcpy (szStockprice, lpDdeData->Value) ;
 shareprice[i].stockprice = atof(szStockprice) ;
 InvalidateRect (hwnd, NULL, FALSE) ;
 DdeAck.fAck = TRUE ;
 }
 }
 if (lpDdeData->fAckReq == TRUE)
 {
 wStatus = * (WORD *) & DdeAck ;
 if (!PostMessage (wParam, WM_DDE_ACK, hwnd,
 MAKELONG (wStatus,aItemName)))
 {
 GlobalDeleteAtom (aItemName) ;
 GlobalUnlock (hDdeData);
 GlobalFree (hDdeData );
 return 0;
 }
 }
 else
 {
 GlobalDeleteAtom (aItemName) ;
 }
 //clean up
 if (lpDdeData->fRelease == TRUE || DdeAck.fAck == FALSE)
 {
 GlobalUnlock (hDdeData);
 GlobalFree (hDdeData);
 }
 else
 {
 GlobalUnlock (hDdeData);
 }
 return 0;
 case WM_PAINT:
 if (bFirstTime)
 {

 bFirstTime = FALSE;
 hDC = BeginPaint (hwnd, &ps) ;
 TextOut(hDC, cxChar, cyChar,BANNER,strlen(BANNER));
 EndPaint (hwnd, &ps) ;
 return (FALSE) ;
 }
 else
 {
 hDC = BeginPaint (hwnd, &ps) ;
 for (i = 0; i < NUM_COMPANIES; i++)
 {
 sprintf(szBuf, "%f", shareprice[i].stockprice);
 TextOut(hDC,cxChar*(i*20), cyChar*2,szBuf,strlen(szBuf));
 }
 EndPaint (hwnd, &ps) ;

 return (FALSE) ;
 }
 case WM_DDE_TERMINATE:
 // Post a WM_DDE_TERMINATE message back and null our server handle
 PostMessage (hwndServer, WM_DDE_TERMINATE, hwnd, 0L) ;
 hwndServer = NULL ;
 return (FALSE) ;
 case WM_CLOSE:
 if (hwndServer == NULL)
 break ;
 // Post WM_DDE_UNADVISE message
 PostMessage (hwndServer, WM_DDE_UNADVISE,hwnd,MAKELONG (CF_TEXT,NULL));
 // Wait for server to acknowledge
 dwTime = GetCurrentTime () ;
 while (GetCurrentTime () - dwTime < TIMEOUT)
 {
 if (PeekMessage (&msg, hwnd, WM_DDE_ACK, WM_DDE_ACK, PM_REMOVE))
 break ;
 }
 // Tell server to terminate
 PostMessage (hwndServer, WM_DDE_TERMINATE, hwnd, 0L) ;
 // Wait for server to acknowledge
 dwTime = GetCurrentTime () ;
 while (GetCurrentTime () - dwTime < TIMEOUT)
 {
 if (PeekMessage (&msg, hwnd, WM_DDE_TERMINATE, WM_DDE_TERMINATE,
 PM_REMOVE))
 break ;
 }
 break ;
 case WM_DESTROY:
 PostQuitMessage (0) ;
 return (FALSE) ;
 }
 return DefWindowProc (hwnd, message, wParam, lParam) ;

}





[LISTING THREE]

// DDE Client Header File -- Define a structure type for stock prices


typedef struct
 {
 char *szCompanyName ;
 float stockprice;
 }
 STOCKPRICE ;


 // Define user defined messages

#define WM_INITDDE (WM_USER + 1)
#define TIMEOUT 1000
#define BANNER "Intel Microsoft IBM Apple"
 // Function declarations
int PASCAL WinMain (HANDLE, HANDLE, LPSTR, int) ;
long FAR PASCAL WndProc (HWND, WORD, WORD, LONG) ;






[LISTING FOUR]

; DDE Client Definition File -- Definition file for the link editor

NAME ddeclient
DESCRIPTION 'DDE Client application'
; Program executable type
EXETYPE WINDOWS
; Stub is a small program that will be displayed if this
; program is run from the DOS prompt
STUB 'WINSTUB.EXE'

CODE PRELOAD MOVABLE
DATA PRELOAD MOVABLE MULTIPLE

HEAPSIZE 1024
STACKSIZE 8192

; All Windows callback functions must be named here
EXPORTS
 WndProc @1






[LISTING FIVE]

cl -c -Gw -Zp %1.c
link /NOD /NOE /align:16 %1, %1.exe,,libw+slibcew,%1.def
rc %1.exe






January, 1993
SIMULATING HYPERCUBES IN UNIX PART II


Using the simulated hypercube




Jeffrey W. Hamilton and Eileen M. Ormsby


Jeff was lead programmer for IBM's W4 Multiprocessing Adapter. He can be
contacted at jeffh@vnet.ibm.com. Eileen is a staff programmer for IBM's FSD,
working with W4 application development. She can be reached at
eileen@vnet.ibm.com.


In last month's installment, we introduced the hypercube concept and examined
how you can simulate one under UNIX. We continue this month by presenting the
SIMCUBE program and discussing ways in which you can use it. Listing Three
(page 99) is the source code listing for simulate.c, the main program. Note
that the source for the include file (cube.h, Listing One) and partition
manager (pm.c, Listing Two) were presented last month.


Using the Simulated Hypercube


Before the first call to any hypercube function, the application program must
call init_simulator. It would have been nice to hide init_simulator inside a
standard hypercube routine, as we did with the load function. However, no
single hypercube routine is guaranteed to be called before all the others, so
there is no reliable place to bury the call. We were compelled to compromise
our design goal (discussed last month) and require one alteration to the
hypercube application program near its beginning.
As each application process executes init_simulator, it determines all the
information needed to use the simulated hypercube. It reads the
environment-variable information that PM assembled and determines its position
within the simulation. Specifically, it determines how many nodes are in the
simulation and the node number of this particular process.


Application-environment Functions


Frequently, an application process needs to know who and where it is. Several
functions relay this essential information about the execution environment to
the application program. The mynode function returns the number of the node
that the application is executing on. The first node is 0 and the remaining
nodes are numbered sequentially.
The myhost function returns the node number of the host program. For SIMCUBE, the
host number is always one greater than the highest-number node in the
simulated partition.
The mypid function returns the partition number (group number) assigned by the host
application with the setpid command. Since SIMCUBE only handles one hypercube
partition, this number is largely ignored, but is provided to the application
when requested.
The numnodes function returns the total number of nodes in the partition,
excluding the host application's node, while nodedim returns the size of the
partition expressed as the dimension of the simulated hypercube.
The availmem function keeps track of available memory. Intel hypercubes currently do
not implement virtual memory. Therefore, large applications should keep track
of the amount of available memory in order not to exceed the system's
resources. Since SIMCUBE runs under an operating system that supplies virtual
memory, we chose to ignore this function. If your UNIX system has limited
memory resources, alter this function to return the amount of memory currently
available for the application to use.


Hypercube Communications


Basic communication is accomplished with the csend and crecv routines for
synchronous communications and the isend, irecv, and msgwait routines for
asynchronous communications. In a hypercube, these routines transmit messages
via special hardware between the nodes.
We can simulate the communications between nodes on the same UNIX system
(partition) with a combination of shared memory and semaphores. Communication
between nodes on different partitions is done with sockets. For local
communications, transmission of messages between nodes is much faster than on
a hypercube because the data does not have to be moved to reach the other
node. Communication between nodes on different systems is much slower than on
a hypercube because of the overhead incurred on a LAN.
Each partition allocates an array of buffers for communications. There is one
buffer per node plus one buffer for the partition manager (PM). The first
partition contains one additional buffer for communications with the host
application. Buffer use is controlled by two sets of semaphores: msgavail and
msgfree. There is one element in each semaphore set for each buffer in the
partition. The msgavail semaphores indicate that a message is available for a
given node, partition manager, or host application. The msgfree semaphores are
used to notify the sender of a message that the message was received.


Synchronous Communications


The csend routine takes five parameters: message type, message contents,
message length, destination node, and destination-partition (group) number.
The partition number is ignored in SIMCUBE since we assume there is only one
hypercube partition.
The message type must be a positive number. It is used to coordinate the order
in which messages are received between nodes.
Since messages can exceed the size of the shared buffer area, each message is
broken down into submessages. The t_length field in the message header tells
the receiving node how many bytes to expect. The length field indicates the
number of bytes in this submessage. Since the communication between local
nodes is ordered, we do not have to worry about the order in which messages
need to be put together at the receiving node.
To send the message, a node places the message into its shared buffer area.
The dnode field indicates which node this message is being sent to. The snode
and spid fields contain the information about which node is sending the
message. If the destination node is -1, the message is broadcast to all nodes
within the partition. Otherwise, the message is sent to a specific node. The
last field set is the valid-flag array, which indicates to potential receivers
that the message buffer contains valid data for that node, PM, or host
application.
If the message is to be sent to a specific node within the partition, then the
msgavail-semaphore element for that node is incremented. If the destination
node is in another partition, the msgavail-semaphore element for the PM is
incremented. If the message is to be broadcast to all nodes, then msgavail
semaphores for every node in the partition and the partition manager are
incremented simultaneously. This allows every node and PM to receive the
message at the same time without duplicating it. Notice that the host
application never receives a broadcast message.
The order of the semaphore elements assigned to each process is important to
efficiently broadcast the message. The local nodes come first, the PM is next,
and the host application, if present, is last. The semcall_all function
handles incrementing multiple elements at the same time. Since our code will
only increment all nodes at once or all nodes plus the partition manager at
once, we only need to specify how many elements to increment. We do not have
to be concerned with specifying which elements to increment.
Since the sender of a broadcast message is not expected to receive its own
message, the sender's msgavail-semaphore element is immediately decremented.
The sender then waits for the receiving node to acknowledge receipt of the
message before continuing. If the message was sent to a specific node, the
sender attempts to decrement its msgfree-semaphore element by 1. Until the
receiver increments the msgfree-semaphore element, the sending process is
blocked. If the message was a broadcast, the sender attempts to decrement its
msgfree-semaphore element by the number of processes that will be receiving
the message. Until all the receiving processes increment the semaphore, the
sending process will be blocked.
The cprobe function is used by a receiving process to wait for a message
without actually accepting the message. A particularly useful function of
cprobe is the ability to determine the size of the next message coming before
receiving it. This gives the receiving process a chance to allocate sufficient
space to hold the message.
Each time cprobe is invoked, it will do one of three things:
If a message has already been selected, but has not been processed, the
function immediately returns.
If the caller is looking for a message of any type, cprobe attempts to
decrement the process's msgavail-semaphore element. If a message is not
available, it will block until some process does a csend to the node. Once the
semaphore has been successfully decremented, each buffer in the partition is
scanned for a message with the valid flag set for this process. The
destination node is also checked as an added safety device. The global
variable next_message is set to the array index of the message buffer.

If the caller is looking for a message of a specific type, cprobe operates as
it did in the previous case, but if the message type does not match, we mark
the message and continue waiting for another. Once a message of the correct
type is found, all the skipped messages are restored and the
msgavail-semaphore element is incremented so that the next time cprobe is
called, we will immediately search for the message.
A deadlock not possible on a hypercube can occur in SIMCUBE. Suppose the
receiver is looking for a specific message type that will come from another
partition. At the same time, a node in another partition sends a message of a
different type ahead of the desired message. The result is a deadlocked
system. The PM cannot put another message into the buffer until the current
message is accepted. The current message will not be accepted until a pending
message is placed into the buffer.
Once cprobe locates a message, you can request information about the message.
The infocount function returns the number of bytes contained in the message.
The infonode function returns the sender's node number. The infopid function
returns the sender's partition number. As a safety feature, these functions
call cprobe to ensure that a message is pending before returning information
about the message.
The crecv function receives a message from a sending node. The function first
calls cprobe to ensure that a valid message is pending. If the message comes
in pieces, due to its length, crecv pastes the message back together before
returning to the caller. One danger that must be carefully handled is
receiving two messages of the same type at nearly the same time. It is
possible to mix the messages if we do not check that the sending node is the
same. We use the same trick that cprobe uses to temporarily skip messages from
the wrong node.
Each submessage is received and copied into the destination buffer. Care is
taken so that a large message does not exceed the space provided by the
caller. If a message is too big, only the first part is copied into the
caller's buffer. The remainder of the message is discarded. We also make sure
that we still receive the entire message, even when we must discard a portion
of it.
Once the message or submessage is copied, the valid flag for this process is
set to 0 so that future cprobe calls do not accidentally pick up a message
already received. The message is acknowledged by incrementing the
msgfree-semaphore element of the sending process. The next_message index is
reset to indicate that no message has been selected.


Asynchronous Communications


The asynchronous communication routines, isend and irecv, work exactly like
their synchronous counterparts, except for acknowledgments. The isend routine
does not wait for acknowledgment of acceptance of a message, and irecv does
not immediately acknowledge receipt of the last submessage. The acknowledgment
handshaking for both isend and irecv is handled by msgwait. The only parameter
msgwait needs is the offset of the buffer needing acknowledgment. This offset
is returned by isend and irecv and should be passed unaltered to the msgwait
routine. We can determine which half of the acknowledgment handshaking is
needed by comparing the buffer given with the process's buffer. If they are
the same, we need to complete the send half of the acknowledgment. If they are
different, we need to complete the receive half of the acknowledgment. The
msgwait function must be called before executing either isend or irecv a
second time, or messages can be corrupted. The application programmer must
make sure msgwait is called, as SIMCUBE does not enforce this rule.
The flushmsg routine allows the sender to cancel messages being sent to a
node. The idea is to interrupt a process so that a high-priority message can
be sent. We could not think of a way to safely implement this function, so it
is a placeholder for now.


Handling Global Sums


Besides messages, applications running on nodes can communicate through global
summations. Each process on a node can compute partial answers independently,
then combine all the partial answers into a final result. Every process gets a
copy of the final results. The summation can be performed on a single value or
on an array of values. A major restriction is that all nodes in the partition
must perform the summation and the vector size must be the same.
Once the basic message-passing system is in place, the summation process is
fairly straightforward. We use gdsum (global double summation) to illustrate
the process. In each partition, the nodes send their vectors to the
lowest-number node within the partition. These nodes then send their partial
results to node 0, which computes the final answer. Node 0 then broadcasts the
results directly back to every node.
By having each partition first calculate a partial answer, we reduce the
number of actual network messages sent between the systems. The broadcast
mechanism only sends one network message per partition, so we already limit
network traffic in the return direction.
The summation messages use negative message types to avoid accidental
conflicts with user messages.


Conclusion


SIMCUBE only mimics a single hypercube partition. Multiple-hypercube partition
support can be quickly added to the base code. Many pieces needed to implement
multiple partitions are already present in the code.
The simulator implements only the functions we needed for a particular
application; a number of similar functions, such as gfsum (global
floating-point summation), were left out but can be easily added.
Adding support for heterogeneous programs running on the hypercube may take a
bit more work, but is feasible within the framework that we have presented.
SIMCUBE was implemented on an IBM W4 MultiProcessing Adapter which contains
four i860 processors with 32 Mbytes of shared memory. W4 executes a parallel
version of IBM's AIX operating system called AIX/860.
We ported four hypercube applications that a customer had written for the
Intel iPSC/2 hypercube to AIX/860. The port required adding the call to
init_simulator to a common library and recompiling the source modules. Two of
the applications used hypercube partitions with only one node. One application
used a four-node partition. The last application was implemented to use any
number of nodes. For this application, we set up a network of two IBM PS/2s,
each containing a W4 adapter, and a RISC Station/6000 containing two W4
adapters. With this environment, we were able to simulate 4-, 8-, 12-, and
16-node partitions.
In certain cases, the speed of the simulation exceeded our expectations. Since
the W4 has four i860 processors, we defined a partition to have four nodes.
This meant that shared memory would be used for communications between each of
a partition's four nodes. For applications that used four or fewer nodes, our
execution times equaled or slightly exceeded the execution times of the same
application on the real hypercube. We believe this is because communication
via shared memory is much faster than moving data between nodes. For
applications that require more than four nodes, we ran about two times slower
than the real hypercube. A general-purpose Ethernet LAN cannot compete with
the dedicated channels within a hypercube. Of course, your mileage may vary
with the type of systems, networks, and applications you use.

_SIMULATING HYPERCUBES IN UNIX_
by Jeffery W. Hamilton and Eileen M. Ormsby


[LISTING ONE]

/***** cube.h *****/

/* Hypercube Simulation definitions */
#define NUMBER_IN_PART 4 /* number of nodes in partition */
#define PM_PORT 6000

/* Maximum message sent between nodes */
#define MAX_MESSAGE_SIZE (1024 * 16)

typedef struct {
 char *name; /* network name of the computer hosting partition */
 int socket; /* file descriptor for the socket */
 int errfdp; /* file descriptor for sending "kill" values */
 struct sockaddr_in addr;
} subpart;

typedef struct {
 int type; /* message type sent with the message -1 or greater */
 int spid; /* sender's group number (pid) */
 int snode; /* sender's node number */
 int dnode; /* node this message is destined for */
 int t_length; /* total length of message */
 int length; /* length of the message */
 char valid[NUMBER_IN_PART+2]; /* 0= no message */
 char msg[MAX_MESSAGE_SIZE]; /* Actual message contents */
} message;






[LISTING TWO]

/***** pm.c *****/
/* PARTITION MANAGER -- This program will run on all partitions used for an
** application. It is started via a remote execution call from "load".
** The main program gets the input arguments, sets a few variables and calls
** the Partition Manager subroutine which performs the following functions:
** determines local partition information; allocates necessary partition
** structures; sets up interrupt handling to free system resources when the
** application is terminated; sets up server portion of socket communications;
** forks a client PM that sets up client portion of the socket communications,
** waits for a node to request data to be sent to a partition, and sends data
** over the sockets; forks and execs application children; performs server PM
** functions that waits to receive data from the sockets and notifies
** appropriate nodes when data has arrived.
** The PM server only receives data for its nodes, and the PM client sends
** data to a remote partition.
** BASIC SOFTWARE ARCHITECTURE: The load module in the simcube library will:
** 1) Read the .pmrc file; 2) Determine how many partitions will be used
** for this application; 3) Fork and exec a local PM and the appropriate
** number of remote PMs. Each PM is passed the name of the application program,
** its partition number, the key value for PM to communicate with the host
** process, the group number, the total number of application nodes, and
** the names of the other partitions running this application.
** The initialization portion (init_simulator) which is called by the
** application processes (nodes) will set up interrupt handling and create
** the shared memory and semaphores necessary for communications
** between local nodes and the Partition Manager.
*/

/* These functions allow a UNIX system to simulate a hypercube environment. */
#include <sys/types.h>
#include <errno.h>
#include <stdio.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/sem.h>
#include <sys/shm.h>
#include <sys/ipc.h>
#include <sys/wait.h>
#include <signal.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>
#include <string.h>
#include <sys/time.h>

#include <sys/utsname.h>
#include <sys/param.h>
#include "cube.h"


#define NUM_TRIES 60
#define min(x,y) (((x) < (y)) ? (x) : (y))

/* Function Prototypes */
void *malloc(int size);
void *shmat(int, void*, int);

int pm (char *filename, char *pmsites[]);
void setup_server_sockets(void);
void pm_server(void);
void setup_client_sockets(void);
void pm_getmsg(void);
void abort_prog(void);
void sig_terminate(int sig, int code, struct sigcontext *scp);
void unexpected_death(int sig, int code, struct sigcontext *scp);
int _killcube(int node, int pid);
int init_shared_mem(void **pointer, int size, int key);
int init_semaphore(int *semid, int size, int value, int key);
int semcall_all(int semid, int size, int operation);
int semcall_one(int semid, int num, int operation);
int numnodes(void);
int numparts(void);
int mypart(void);
int partof(int node);
int pm_partof(int node);
int numbuffers(void);
int mybuffer(void);
int bufferof(int node);
int mynode(void);
int myhost(void);
void pm_client(void);

/* Local, Private Information */
fd_set node_part_set, temp_set;
 /* node_part_set is the variable that FD_XXX commands are */
 /* applied to. Definitions of fd_set structure, and FD_ZERO, */
 /* FD_SET, FD_CLR, and FD_ISSET macros are in <sys/types.h> */
 /* node_part_set will have socket file descriptors.*/
static int num_parts; /* number of partitions */
static int my_part; /* partition this process is in */
static int nodes_in_part; /* number of nodes in this partition */
static subpart *partition; /* list of partition information */
static int base; /* base key value for allocating shared data */
static int my_node; /* node number for this process */
static int my_group; /* group id for this process */

 /* There are two groups, host communications */
 /* and inter-node communications */
static int num_nodes; /* total number of nodes in all partitions */

static int msgavail = -1; /* semaphores indicating message is available */
static int msgfree = -1; /* semaphores indicating buffer is free */
static int next_message = -1; /* which message is to be received next */
static int shmid_m = -1; /* id of shared area for messages */

static message *buffer = NULL;/* communication areas */
static int *children = NULL; /* process ids of all child processes */
static int child_index = 0; /* number of children created */
static int pmserver_pid = 0; /* pid of pmserver */

/* Main: reads arguments from command line, places them in local variables
** and calls pm. (Local variables are not necessary, but enhances readability)
** NOTE: AT MOST SIXTEEN PM SITES ARE READ FROM THE COMMAND LINE */
int main(int argc, char *argv[])
{
 char *filename;
 char *pmsites[16];
 int i;
 if (argc < 7 ) {
 fprintf (stderr, "PM main: error not enough arguments\n");
 fflush(stderr);
 exit(-1);
 }
 filename = argv[1];
 my_part = atoi(argv[2]);
 base = atoi(argv[3]);
 my_group = atoi(argv[4]);
 num_nodes = atoi(argv[5]);
 for (i = 0; i < argc - 6; i++) {
 pmsites[i] = argv[i + 6];
 }
 pm (filename, pmsites);
}

/* pm -- Determines partition information, sets up signal handling, sets up
** server sockets, forks client pm, forks application children. PM splits the
** application into NUMBER_IN_PART processes. The partition number is passed
** as an input parameter. The starting node number is the partition number
** times NUMBER_IN_PART, and the remaining processes are numbered consecutively.
** Shared memory is allocated to serve as a communications vehicle within a
** partition. Sockets are used between partitions, allowing multiple UNIX
** systems to be combined into a larger set of CPUs applied to a problem.
*/
int pm (char *filename, char *pmsites[])
{
 register int i, pid;
 char temp[128]; /* used to set up environment variables */
 char part_names[64];
 int start_node;
 int dest_node;
 /* Determine how many other partitions exist */
 num_parts = (num_nodes + NUMBER_IN_PART - 1) / NUMBER_IN_PART;
 /* Determine which node is the first for this partition */
 start_node = mypart() * NUMBER_IN_PART;
 /* Determine how many nodes are in this partition (1-4) */
 nodes_in_part = numnodes() - (mypart() * NUMBER_IN_PART);
 nodes_in_part = min(NUMBER_IN_PART, nodes_in_part);
 /* Set PM's node to be the last node on this partition */
 /* (The children will be start_node through start_node + nodes_in_part-1) */
 my_node = nodes_in_part;
 /* Create the structure to hold the partition names and socket fds */
 if ((partition = malloc(num_parts * sizeof(subpart))) == NULL) {
 fprintf(stderr,"PM %d SERVER: insufficient memory\n", mypart());
 fflush(stderr);
 return -1;

 }
 memset(partition, 0, num_parts * sizeof(subpart));
 /* Catch these signals so PM can notify children to clean up */
 signal(SIGINT,sig_terminate);
 signal(SIGTERM,sig_terminate);
 signal(SIGQUIT,sig_terminate);
 /* Watch for unexpected deaths */
 signal(SIGCHLD, unexpected_death);
 /* Create, bind, and listen on sockets */
 setup_server_sockets();
 if (mypart() != 0) {
 /* Only change the base on partitions that are not the one that includes
 ** host. That partition requires same base that host session is using. */

 base = getpid();
 }
 /* Allocate shared memory */
 shmid_m = init_shared_mem(&buffer, sizeof(message) * numbuffers(), base);
 if (mypart() != 0) {
 memset(buffer, 0, sizeof(message) * numbuffers());
 }
 /* Allocate communications semaphores */
 init_semaphore(&msgavail, numbuffers(), 0, base+10000);
 init_semaphore(&msgfree, numbuffers(), 0, base+20000);
 /* Flush stdout and stderr before doing a fork, so child doesn't inherit */
 fflush(stdout);
 fflush(stderr);
 /* Fork PM CLIENT here */
 if ((pmserver_pid = fork()) < 0) {
 /* Can't create the PM CLIENT */
 _killcube(0, 0);
 fprintf(stderr, "PM %d SERVER: unable to create PM CLIENT process\n",
 mypart());
 fflush(stderr);
 return -1;
 } else if (pmserver_pid == 0) {
 /* Fill in the names of the other sites in the partition structure and
 ** close the socket file descriptors that this process just inherited. */
 for (i = 0; i < num_parts; i++) {
 if (mypart() != i) {
 partition[i].name = pmsites[i];
 close(partition[i].socket);
 }
 }
 /* CALL CLIENT SUBROUTINES */
 setup_client_sockets();
 pm_client();
 } else {
 /* SERVER: forks application children then calls pm_server subroutine */
 /* Read from pmsites array, create a comma delimited string for env */
 part_names[0] = '\0';
 for (i = 0; i < num_parts; i ++) {
 strcat(part_names, pmsites[i]);
 strcat(part_names, ",");
 }
 /* Allocate space for child pids */
 if ((children = malloc(nodes_in_part * sizeof(int))) == NULL) {
 fprintf(stderr,"PM %d SERVER: insufficient memory\n", mypart());
 fflush(stderr);
 return -1;
 }
 /* Load all nodes within this partition */
 for (i = start_node; (i < start_node + nodes_in_part); i++) {
 if ((pid = fork()) < 0) {
 /* Can't create all the children! */
 _killcube(0, 0);
 fprintf(stderr, "PM %d SERVER: unable to create node process %d\n",
 mypart(), i);
 fflush(stderr);
 return -1;
 } else if (pid == 0) {
 /* I'm the child process */
 /* Start the node program */
 my_node = i;
 sprintf(temp, "SIM_INFO=%d,%d,%d,%d,%s",
 base,my_node,my_group,num_nodes,part_names);
 if (putenv(temp) != 0) {
 fprintf(stderr,
 "PM %d SERVER: Insufficient room to add env variable\n",
 my_node);
 fflush(stderr);
 return -1;
 }
 execlp(filename,filename,NULL);
 /* If we get here, we had a problem */
 perror("execlp");
 fprintf(stderr,"PM %d SERVER: error execing node=%d file=%s errno=%d\n",
 mypart(), my_node, filename, errno);
 fflush(stderr);
 return -1;
 } else {
 /* I'm the parent process */
 children[child_index++] = pid;
 }
 }
 /* CALL SERVER SUBROUTINE */
 pm_server();
 } /* end if PM SERVER */
}
/* setup_server_sockets -- SERVER SOCKETS- for all partitions except ourself:
** Create a socket Bind the socket to a unique PORT id. (If the socket was
** in use in a prior iteration, it may not have been reset yet - therefore we
** loop a fixed number of times retrying.) Put a listen on socket. Put new
** socket file descriptor into our set */
static void setup_server_sockets(void)
{
 int i, j;
 struct sockaddr_in part_sock, tempaddr;
 /* Zero out the set of partition sockets */
 FD_ZERO(&node_part_set);
 FD_ZERO(&temp_set);
 for (i = 0; i < num_parts; i++) {
 /* Skip ourself */
 if (i == mypart () )
 continue;
 for (j = 0; j < NUM_TRIES; j++) {
 /* Create a SERVER socket to receive data */
 if ((partition[i].socket = socket(AF_INET, SOCK_STREAM, 0))
 < 0) {
 fprintf(stderr, "PM %d SERVER: can't open stream socket, errno=%d\n",
 mypart(), errno);
 fflush(stderr);
 exit (100);
 }
 /* Bind SERVER socket to local addr so partitions can send to it */
 bzero((char*)&part_sock, sizeof(part_sock));
 part_sock.sin_family = AF_INET;
 part_sock.sin_addr.s_addr = htonl (INADDR_ANY);
 /* Create unique SERVER socket port address, up to 16 per computer */
 part_sock.sin_port = htons (PM_PORT + (mypart() << 4) + i);
 /* If socket is still in use from prev iter, keep trying to bind */
 if ((bind(partition[i].socket, (struct sockaddr *)&part_sock,
 sizeof(part_sock))) < 0) {
 if ((errno == EADDRINUSE) || (errno == EINTR)) {
 /* Previous load hasn't shutdown yet, or we were interrupted. */
 close(partition[i].socket);
 sleep(2);
 } else {
 fprintf(stderr,"PM %d SERVER: can't bind local addr, errno=%d\n",
 mypart(), errno);
 fflush(stderr);
 exit(100);
 }
 } else {
 /* It worked, exit the loop */
 break;
 }
 }
 if (j == NUM_TRIES) {
 /* Exceeded retry limit */
 fprintf(stderr,"PM %d SERVER: can't bind local addr, errno=%d\n",
 mypart(), errno);
 fflush(stderr);
 exit(100);
 }

 /* Issue a listen for the server sockets */
 if (listen(partition[i].socket, 1) < 0) {
 fprintf(stderr,"PM %d SERVER: can't listen on %d, errno = %d\n",
 mypart(), partition[i].socket, errno);
 fflush(stderr);
 exit(100);
 }
 /* Set the bit for the socket file descriptor */
 FD_SET(partition[i].socket, &node_part_set);
 } /* end for setting up SERVER sockets */
}
/* pm_server -- SERVER- go into a receiving loop: Copy file desciptors to a
** temporary set. Determine how many sockets are ready to be accepted. For
** each file descriptor that is ready: Find file descriptor that is ready.
** If it is found in a partition's array of fd's then it is a base socket and
** it is "accept"ed and added to the fd set. Else it is an fd that has data to
** be received. Receive the size of the message. Loop until entire message is
** received. Clear the valid indicator bits. Inform nodes that a message has
** arrived. If a broadcast message, set everyone's valid bit, and wait until
** everyone receives it. Else verify that message belongs to a node on this
** part and set that node's valid bit, wait until it is recvd. */
static void pm_server(void)
{
 int i, j;
 int accept_rdy;
 int newsockfd, templen;
 int size, count, partial;
 char *target;
 struct sockaddr_in tempaddr;
 /* forever, accept sockets and receive data */
 for ( ; ; ) {
 temp_set = node_part_set;
 /* Determine how many sockets are ready to be accepted */
 /* FD_SETSIZE is defined in <sys/types.h> to be 200 */
 if ((accept_rdy = select( FD_SETSIZE, &temp_set, 0, 0, 0)) == -1) {
 if (errno != EINTR) {
 fprintf(stderr, "PM %d SERVER: error in select, errno = %d\n",
 mypart(), errno);
 perror( "pm select" ) ;
 fflush(stderr);
 _killcube(0,0);
 exit(-1);
 } else {
 /* We were interrupted, try again */
 continue;
 }
 }
 for (i = 1; (accept_rdy != 0) && (i < FD_SETSIZE) ; i++) {
 /* Find the file descriptor that needs servicing */
 if ( FD_ISSET( i, &temp_set)) {
 /* temporary modification */
 /* accept_rdy--; */
 accept_rdy = 0;
 /* Examine each partition's array of fd's to find ready one */
 for (j = 0; j < num_parts; j++) {
 /* Skip examining our own partition */
 if (j == mypart() )
 continue;
 /* Since this matches our "base" socket, accept the socket */
 if (i == partition[j].socket) {
 templen = sizeof(tempaddr);
 newsockfd = accept(partition[j].socket,
 (struct sockaddr *)&tempaddr, &templen);
 FD_SET (newsockfd, &node_part_set);
 /* Found "base" socket, break out of for each part loop */
 break;
 } /* end if base socket */
 } /* end for check file descriptors in partition's array */
 /* If it was a base socket, it has just been accepted; skip the receive */
 if (j != num_parts) {
 continue;
 } else /* receive the data from the socket */ {
 /* First receive the size of the message */
 while (recv(i, &size, sizeof(size), 0) < 0) {
 if (errno != EINVAL) {
 fprintf(stderr, "PM %d SERVER: recv size err, errno=%d, fd=%d\n",
 mypart(), errno, i);
 fflush(stderr);
 _killcube(0,0);
 exit(-1);
 } else {
 fprintf(stderr, "PM %d SERVER: recv size err, errno=%d, fd=%d\n",
 mypart(), errno, i);
 fflush(stderr);
 }
 } /* end while recv msg */

 target = (char *) &buffer[nodes_in_part];
 count = 0;
 /* Now receive the message, it could come in pieces */
 while (count < size) {
 if ((partial = recv(i, target, size - count, 0)) < 0) {
 fprintf(stderr, "PM %d SERVER: Error recvng msg; errno=%d\n",
 mypart(), errno);
 fflush(stderr);
 exit(-1);
 }
 count += partial;
 target += partial;
 }
 /* Make sure all valid bits are cleared */
 memset(buffer[nodes_in_part].valid,0,
 sizeof(buffer[nodes_in_part].valid));
 /* Tell the node(s) the message is there */
 if (buffer[nodes_in_part].dnode == -1) {
 /* Broadcast the message to nodes in this partition */
 for (j=0; j < nodes_in_part; j++) {
 buffer[nodes_in_part].valid[j] = 1;
 }
 semcall_all(msgavail,nodes_in_part, 1);
 /* Wait until everyone receives the message */
 semcall_one(msgfree, nodes_in_part, -nodes_in_part);
 } else {
 if (mypart() != partof(buffer[nodes_in_part].dnode))
 {
 fprintf(stderr, "PM %d SERVER: Recvd msg for node %d not this partition\n",
 mypart(), buffer[nodes_in_part].dnode);
 fflush(stderr);
 } else {
 /* Point to point to another node in same partition */
 j = bufferof(buffer[nodes_in_part].dnode);
 buffer[nodes_in_part].valid[j] = 1;
 semcall_one(msgavail, j, 1);
 /* Wait until it is received */
 semcall_one(msgfree, nodes_in_part, -1);
 }
 } /* endif broadcast message */
 } /* endif receiving data from this socket */
 } /* endif this socket */
 } /* endfor */
 } /* end forever receive messages on sockets */
}
/* setup_client_sockets -- Setting up CLIENT sockets- for all partitions
** except ourself: Create a socket to send data. Look up address of host,
** place in the sockaddr_in structure. Determine appropriate PORT id (needs to
** match with SERVER). (If socket was in use in a prior iteration, it may not
** have been reset yet - therefore we loop a fixed number of times retrying.).

** Issue a connect for the socket */
static void setup_client_sockets(void)
{
 int i, j;
 struct hostent *hent;
 /* Establish socket communications with other partitions */
 for (i = 0; i < num_parts; i++) {
 /* Skip ourself */
 if (i == mypart () )
 continue;
 for (j = 0; j < NUM_TRIES; j++) {
 /* Create a CLIENT socket to send data */
 partition[i].socket = socket(AF_INET, SOCK_STREAM, 0);
 /* Lookup host address and place in the socket address structure */
 memset(&partition[i].addr, 0, sizeof(struct sockaddr_in));
 partition[i].addr.sin_family = AF_INET;
 if ((hent = gethostbyname(partition[i].name))
 == NULL) {
 fprintf(stderr,"PM %d CLIENT: No entry for %s in /etc/hosts\n",
 mypart(), partition[i].name);
 fflush(stderr);
 exit(100);
 }
 memcpy(&partition[i].addr.sin_addr, hent->h_addr,
 hent->h_length);
 partition[i].addr.sin_port = htons(PM_PORT + (i << 4) +
 mypart());
 /* Connect to the socket */
 if (connect(partition[i].socket, (struct sockaddr *)&partition[i].addr,
 sizeof(struct sockaddr_in)) < 0) {
 if (errno == ECONNREFUSED) {
 /* unsuccessful connect, sleep and try again */
 sleep(3);
 } else {
 /* another error occurred, quit trying to connect */
 j = NUM_TRIES;
 break;
 }
 } else {
 /* successful connect, break out of loop */
 break;
 } /* endif connect */
 } /* endfor NUM_TRIES */
 if (j == NUM_TRIES) {
 fprintf(stderr,
 "PM %d CLIENT: Unable to connect sock to %s, errno %d\n",
 mypart(), partition[i].name, errno);
 fflush(stderr);
 exit(100);
 }
 } /* end for setting up CLIENT sockets */
}
/* pm_client -- The PM CLIENT process sends data to partitions. Set up client
** sockets. Send messages over the sockets: Get message. Send message (if it
** is a broadcast message send it to all partitions, if not send it to
** appropriate partition). Acknowledge sending of message. Release buffer.
** Reset next message indicator. */
static void pm_client(void)
{
 int i, size;
 /* CLIENT- GO INTO INFINITE SENDING LOOP */
 /* Initial setting to indicate the next message has not been selected */
 next_message = -1;
 /* Forever, wait for messages to send over socket */
 for ( ; ; ) {
 /* Get the message */
 pm_getmsg();
 /* Determine where to send the message */
 if (buffer[next_message].dnode == -1) {
 /* BROADCAST MESSAGE, SEND TO ALL PARTITIONS */
 for (i = 0; i < numparts(); i++) {
 /* Don't send broadcast to self */
 if (i == mypart () )
 continue;
 /* First send the size of the message */
 size = buffer[next_message].length;
 if (send(partition[i].socket, &size, sizeof(size),0)
 < 0) {
 fprintf(stderr,
 "PM %d CLIENT: send to PM %d failed, errno=%d\n",
 mypart(), i, errno);
 fflush(stderr);
 exit(-1);
 }
 /* Then send the actual message */
 if (send(partition[i].socket,
 &buffer[next_message], size, 0) < 0) {
 fprintf(stderr,
 "PM %d CLIENT: send to PM %d failed, errno=%d\n",
 mypart(), i, errno);
 fflush(stderr);
 exit(-1);
 }
 } /* endfor SEND BROADCAST TO ALL PARTITIONS */
 } else {
 /* SEND TO A SPECIFIC PARTITION */
 /* First send the size of the message */
 size = buffer[next_message].length;
 i = partof(buffer[next_message].dnode);
 if (send(partition[i].socket, &size, sizeof(size),0)
 < 0) {
 fprintf(stderr,
 "PM %d CLIENT: send to PM %d failed, errno=%d\n",
 mypart(), i, errno);
 fflush(stderr);
 exit(-1);
 }
 /* Then send the actual message */
 if (send(partition[i].socket,
 &buffer[next_message], size, 0) < 0) {
 fprintf(stderr,
 "PM %d CLIENT: send to PM %d failed, errno=%d\n",
 mypart(), i, errno);
 fflush(stderr);
 exit(-1);
 }
 }

 /* FOR BOTH BROADCAST AND REGULAR MESSAGES */
 /* acknowledge the sending of the message */
 buffer[next_message].valid[mybuffer()] = 0;
 /* release (free) the buffer */
 semcall_one(msgfree, next_message, 1);
 /* reset next_message so the next getmsg will work */
 next_message = -1;
 } /* end forever CLIENT PROCESS sending messages over socket */
}
/***** Initialization and Termination routines *****/
/* abort_prog -- Clean up in the case of an error */
static void abort_prog(void)
{
 int i;
 /* Remove the sets of semaphores */
 if (pmserver_pid != 0) {
 if (msgavail != -1) {
 semctl(msgavail, 0, IPC_RMID, 0);
 msgavail = -1;
 }
 if (msgfree != -1) {
 semctl(msgfree, 0, IPC_RMID, 0);
 msgfree = -1;
 }
 }
 /* Remove the shared memory */
 if (buffer != NULL) {
 shmdt(buffer);
 buffer = NULL;
 }
 /* Only PM SERVER process should execute this code */
 if (pmserver_pid != 0) {
 if (shmid_m != -1) {
 shmctl(shmid_m, IPC_RMID, 0);
 shmid_m = -1;
 }
 }
 /* Close the sockets */
 for (i = 0; i < num_parts; i++) {
 if (i != mypart() ) {
 close (partition[i].socket);
 partition[i].socket = 0;
 }
 }
 /* Make sure all pending output gets out */
 fflush(stdout);
 fflush(stderr);
}
/* Handle termination signals */
void sig_terminate(int sig, int code, struct sigcontext *scp)
{
 int i;
 /* Send termination signal to each of PM SERVER's children */

 if (pmserver_pid != 0) {
 for (i = 0; i < child_index; i++) {
 kill(children[i], SIGTERM);
 }
 child_index = 0;

 kill(pmserver_pid, SIGTERM);
 }
 /* Clean up the use of semaphores and shared memory */
 abort_prog();
 exit(100);
}
/* Handle unexpected termination signals */
void unexpected_death(int sig, int code, struct sigcontext *scp)
{
 int statval;
 int waitpid;
 /* Only PM SERVER process should execute this code */
 if (pmserver_pid != 0) {
 waitpid = wait(&statval);
 if (waitpid < 0) {
 printf("Error determining who died unexpectedly. Errno=%d\n", errno);
 } else {
 if (WIFSIGNALED(statval) != 0) {
 printf("Process %d did not catch signal %d.\n",
 waitpid, WTERMSIG(statval));
 } else if (WIFSTOPPED(statval) != 0) {
 printf("Process %d stopped due to signal %d.\n",
 waitpid, WSTOPSIG(statval));
 } else if (WIFEXITED(statval) == 0) {
 /* Normal termination */
 } else {
 /* Terminated with exit code */
 }
 }
 }
 fflush(stdout);
}
/* killcube -- On abort, kill off all children on the hypercube partition */
int _killcube(int node, int pid)
{
 int i;
 int statval;
 int waitpid;

 /* Only PM SERVER process should execute this code */
 if (pmserver_pid != 0) {
 for (i = 0; i < child_index; i++) {
 kill(children[i], SIGTERM);
 }
 kill(pmserver_pid, SIGTERM);
 for (i = 0; i <= child_index; i++) {
 waitpid = wait(&statval);
 if (waitpid < 0) {
 /* No more children left */
 break;
 } else {
 if (WIFSIGNALED(statval) != 0) {
 printf("Process %d did not catch signal %d.\n",
 waitpid, WTERMSIG(statval));
 } else if (WIFSTOPPED(statval) != 0) {
 printf("Process %d stopped due to signal %d.\n",
 waitpid, WSTOPSIG(statval));
 } else if (WIFEXITED(statval) == 0) {
 /* Normal termination */

 } else {
 /* Terminated with exit code */
 }
 }
 }
 }
 /* Clean up after ourself */
 abort_prog();
 child_index = 0;
 return 0;
}
/* init_shared_mem -- Allocates a shared memory region. Sets pointer to region
** in this process's memory space and returns the shared memory identifier. */
static int init_shared_mem(void **pointer, int size, int key)
{
 int shmid;
 if ((shmid = shmget(key, size, 0666 | IPC_CREAT)) < 0) {
 printf("init_shm: allocation of shared memory failed. Errno=%d\n",errno);
 printf(" mynode=%d key=%d size=%d\n",my_node,key,size);
 _killcube(0,0);
 exit(-1);
 }
 *pointer = shmat(shmid, NULL, 0);
 return shmid;
}

/* init_semaphore -- Allocates a set of semaphores and initializes them */
static int init_semaphore(int *semid, int size, int value, int key)
{
 register int i;
 if ((*semid = semget(key, size, 0666 | IPC_CREAT)) < 0) {
 printf("init_sem: allocation of semaphores failed. Errno=%d\n",errno);
 printf(" mynode=%d key=%d size=%d\n",my_node,key,size);
 _killcube(0,0);
 exit(-1);
 }
 for (i = 0; i < size; i++) {
 if (semctl(*semid, i, SETVAL, value) < 0) {
 printf("init_sem: init of semaphores failed. Errno=%d\n",errno);
 printf(" mynode=%d offset=%d value=%d\n",my_node,i,value);
 _killcube(0,0);
 exit(-1);
 }
 }
 return *semid;
}
/* semcall_all -- Perform same operation on all elements of a semaphore at
** once. */
static int semcall_all(int semid, int size, int operation)
{
 struct sembuf sbuf[NUMBER_IN_PART+1];
 register int i;
 for (i = 0; i < size; i++) {
 sbuf[i].sem_num = i;
 sbuf[i].sem_op = operation;
 sbuf[i].sem_flg = 0;
 }
 while (semop(semid, sbuf, size) < 0) {
 /* repeat operation if interrupted */
 if (errno != EINTR) {
 printf("PM %d: Semaphore broadcast failed. Errno = %d\n",
 mypart(), errno);
 fflush(stdout);
 return -1;
 }
 }
 return 0;
}
/* semcall_one -- Perform an operation on an element of a semaphore. */
static int semcall_one(int semid, int num, int operation)
{
 struct sembuf sbuf;
 sbuf.sem_num = num;
 sbuf.sem_op = operation;
 sbuf.sem_flg = 0;
 while (semop(semid, &sbuf, 1) < 0) {
 /* repeat operation if interrupted */
 if (errno != EINTR) {
 printf("PM %d: Semaphore failed. Errno = %d\n", mypart(), errno);
 fflush(stdout);
 return -1;
 }
 }
 return 0;
}
/***** Environment Information (External and Internal) *****/
/* numnodes -- Returns the number of simulated nodes */
int numnodes(void)
{
 return num_nodes;
}
/* numparts -- number of partitions */
static int numparts(void)
{
 return num_parts;
}
/* mypart -- Partition this process is in */
static int mypart(void)
{
 return my_part;
}

/* partof -- Determines which partition a given node is a member of */
static int partof(int n)
{
 if (n == myhost()) {
 return 0;
 } else {
 return n / NUMBER_IN_PART;
 }
}
/* pm_partof -- Determines which subpartition a given node is a member of
** A -1 can be passed if a destination node is broadcast, return -1. */
static int pm_partof(int n)
{
 if (n == myhost()) {
 return 0;
 } else if (n == -1) {
 return -1;
 } else {
 return n / NUMBER_IN_PART;
 }
}
/* numbuffers -- Number of buffers in this partition */
static int numbuffers(void)
{
 if (mypart() == 0) {
 return (nodes_in_part + 2);
 } else {
 return (nodes_in_part + 1);
 }
}
/* mybuffer -- returns the index for this process's buffer */
static int mybuffer(void)
{
 return (nodes_in_part);
}

/* bufferof -- Returns the buffer offset of the given node. Host is always
** second to last buffer in partition 0. The PM is always the last buffer */
static int bufferof(int n)
{
 if (mypart() != partof(n)) {
 return nodes_in_part; /* Return the buffer of PM */
 } else if (n == myhost()) {
 return nodes_in_part + 1; /* This partition, buffer of host */
 } else {
 return n % NUMBER_IN_PART; /* This partition, buffer of node */
 }
}
/* mynode -- Returns the node number for this process */
int mynode(void)
{
 return my_node;
}
/* myhost -- Returns the node number of the host */
int myhost(void)
{
 return numnodes();
}
/***** Communications *****/
/* pm_getmsg -- Wait until a message is available. This routine differs from
** getmsg, in that it checks to ensure that destination node is not in this
** partition. (Getmsg checks that current node equals destination node.)
** OUTPUT: next_message - set to the message found of the proper type */
static void pm_getmsg(void)
{
 int i;
 /* Only wait if a message is not already selected */
 if (next_message != -1) return;
 /* Wait for a message for me */
 semcall_one(msgavail, mybuffer(), -1);
 /* Search for those messages that are for me */
 for (i = 0; i < numbuffers(); i++) {
 if (buffer[i].valid[mybuffer()] != 0) {
 next_message = i;
 return;
 }
 }
}





[LISTING THREE]


/***** simulate.c *****/
/* These functions allow a UNIX system to simulate a hypercube environment. */
#include <stdio.h>
#include <ctype.h>
#include <sys/types.h>
#include <errno.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/shm.h>
#include <sys/signal.h>
#include <sys/wait.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>
#include "cube.h"

/* Prototypes */
char *getenv(char *variable);
void *shmat(int shmid, void *shmaddr, int shmflg);
char *strtok(char *, char *);
char *strcpy(char *, char *);
void *malloc(int size);
#define min(x,y) (((x) < (y)) ? (x) : (y))
int csend(int type, void *msg, int length, int target_node, int group);
int crecv(int type, void *buf, int len);
int killcube(int, int);
int numnodes(void);
int myhost(void);
int mynode(void);
int numparts(void);
int numbuffers(void);
int mybuffer(void);
int bufferof(int node);
int mypart(void);
int partof(int node);

/* Local, Private Information */
static int num_parts; /* number of partitions */
static int my_part; /* partition this process is in */
static int nodes_in_part; /* number of nodes in this partition */
static subpart *partition = NULL; /* list of partition information */
static int base; /* base key value for allocating shared data */
static int my_node; /* node number for this process */
static int my_group; /* group id for this process */
 /* There are two groups, host communications */
 /* and inter-node communications */
static int num_nodes; /* total number of nodes in all partitions */
static int msgavail = -1; /* semaphores indicating message is available */

static int msgfree = -1; /* semaphores indicating buffer is free */
static int next_message; /* which message is to be received next */
static int shmid_m = -1; /* id of shared area for messages */
static message *buffer = NULL;/* communication areas */
static int *children = NULL; /* process ids of all child processes */
static int child_index = 0; /* number of children created */

/***** Initialization and Termination routines *****/
/* abort_prog -- Clean up when the program terminates */
void abort_prog(void)
{
 /* Remove the sets of semaphores */
 if (mynode() == myhost()) {
 if (msgavail != -1) {
 semctl(msgavail, 0, IPC_RMID, 0);
 msgavail = -1;
 }
 if (msgfree != -1) {
 semctl(msgfree, 0, IPC_RMID, 0);
 msgfree = -1;
 }
 }
 /* Remove the shared memory */
 if (buffer != NULL) {
 shmdt(buffer);
 buffer = NULL;
 }
 if (mynode() == myhost()) {
 if (shmid_m != -1) {
 shmctl(shmid_m, IPC_RMID, 0);
 shmid_m = -1;
 }
 }
 /* Make sure all pending output gets out */
 fflush(stdout);
 fflush(stderr);
}
/* Handle termination signals */
void sig_terminate(int sig, int code, struct sigcontext *scp)
{
 if (mynode() == myhost()) {
 /* Pass on the termination signal to the node processes */
 killcube(0,0);
 } else {
 /* This is executed by the node processes */
 /* Clean up the use of semaphores and shared memory */
 abort_prog();
 }
 exit(100);
}
/* Handle unexpected termination signals. Used by the host process. */
void unexpected_death(int sig, int code, struct sigcontext *scp)
{
 int statval;
 int waitpid;
 waitpid = wait(&statval);
 if (waitpid < 0) {
 printf("Error determining who died unexpectedly. Errno=%d\n", errno);
 } else {

 if (WIFSIGNALED(statval) != 0) {
 printf("Process %d did not catch signal %d.\n",
 waitpid, WTERMSIG(statval));
 } else if (WIFSTOPPED(statval) != 0) {
 printf("Process %d stopped due to signal %d.\n",
 waitpid, WSTOPSIG(statval));
 } else if (WIFEXITED(statval) == 0) {
 /* Normal termination */
 } else {
 /* Terminated with exit code */
 }
 }
 fflush(stdout);
}
/* handler -- handles hypercube specific errors that do not map to UNIX. */
void handler(int type, void (*proc)())
{
 /* ignore this */
}
/* getcube -- Called by host process to gain possession of a partition in a
** hypercube. Note: Assuming getcube is only called once per host process. */
void getcube(char *cubename, char *cubetype, char *srmname, int keep,
 char *account)
{
 char size[8];
 int is_dimension = 0;
 int i;
 char *ptr;
 char *target;
 /* Pull out the requested number of nodes */
 ptr = cubetype;
 if (*ptr == 'd') {
 ptr++;
 is_dimension = 1;
 }
 target = size;
 i = 4;
 while (isdigit(*ptr) && (i-- != 0)) {
 *target++ = *ptr++;
 }

 *target = '\0';
 /* The rest of the parameters don't matter */
 /* Determine the total number of nodes */
 num_nodes = NUMBER_IN_PART; /* default size */
 sscanf(size,"%d",&num_nodes);
 if (is_dimension) {
 num_nodes = 1 << num_nodes;
 }
}
/* cubeinfo -- Passes back information about the partitions on a hypercube.
** Input: global=0 current attached cube; 1. all cubes you own and allocated
** by the current host; 2. all cubes on the system from which the command was
** executed; 3. how cubes are allocated on all SRMs; 4. 1 addition parameter
** (srmname) returns info for that SRM */
int cubeinfo(struct cubetable *ct, int numslots, int global, ...)
{
 /* returns the number of cubes for which information is available */
 /* Ignore this for now */

 return 0;
}
/* relcube -- release cube gained by the getcube call. */
void relcube(char *cubename)
{
 /* Ignore this for now */
}
/* killcube -- On abort, kill off all processes in the hypercube partition */
int killcube(int node, int pid)
{
 int i;
 int statval;
 int waitpid;
 /* Force everyone to terminate */
 for (i = 0; i < child_index; i++) {
 kill(children[i], SIGTERM);
 }

 /* Give the children a chance to terminate */
 if (child_index > 0) sleep(1);
 /* Wait for everyone to exit, check status in case */
 for (i = 0; i < child_index; i++) {
 waitpid = wait(&statval);
 if (waitpid < 0) {
 /* No more children left */
 break;
 } else {
 if (WIFSIGNALED(statval) != 0) {
 printf("Process %d did not catch signal %d.\n",
 waitpid, WTERMSIG(statval));
 } else if (WIFSTOPPED(statval) != 0) {
 printf("Process %d stopped due to signal %d.\n",
 waitpid, WSTOPSIG(statval));
 } else if (WIFEXITED(statval) == 0) {
 /* Normal termination */
 } else {
 /* Terminated with exit code */
 }
 }
 }
 /* Clean up after ourself */
 abort_prog();
 child_index = 0;
 return 0;
}
/* init_shared_mem -- Allocates a shared memory region. Sets pointer to region
** in this process's memory space and returns the shared memory identifier. */
static int init_shared_mem(void **pointer, int size, int key)
{
 int shmid;
 if ((shmid = shmget(key, size, 0666 | IPC_CREAT)) < 0) {
 printf("init: allocation of shared memory failed. Errno=%d\n",errno);
 printf(" mynode=%d key=%d size=%d\n",my_node,key,size);
 fflush(stdout);
 sig_terminate(0,0,NULL);
 }
 *pointer = shmat(shmid, NULL, 0);
 return shmid;
}

/* init_semaphore -- Allocates a set of semaphores and initializes them */
static int init_semaphore(int *semid, int size, int value, int key)
{
 register int i;
 if ((*semid = semget(key, size, 0666 | IPC_CREAT)) < 0) {
 printf("init: allocation of semaphores failed. Errno=%d\n",errno);
 printf(" mynode=%d key=%d size=%d\n",my_node,key,size);
 fflush(stdout);
 sig_terminate(0,0,NULL);
 }
 for (i = 0; i < size; i++) {
 if (semctl(*semid, i, SETVAL, value) < 0) {
 printf("init: initialization of semaphores failed. Errno=%d\n",errno);
 printf(" mynode=%d offset=%d value=%d\n",my_node,i,value);
 fflush(stdout);
 sig_terminate(0,0,NULL);
 }
 }
 return *semid;
}
/* semcall_all -- Perform same operation on all elements of a semaphore. */
static int semcall_all(int semid, int size, int operation)
{
 struct sembuf sbuf[NUMBER_IN_PART+1];
 register int i;
 for (i = 0; i < size; i++) {
 sbuf[i].sem_num = i;
 sbuf[i].sem_op = operation;
 sbuf[i].sem_flg = 0;
 }
 while (semop(semid, sbuf, size) < 0) {
 /* repeat operation if interrupted */
 if (errno != EINTR) {
 printf("%d: Semaphore broadcast failed. Errno = %d\n",mynode(),errno);
 abort_prog();
 exit(-1);
 }
 }
 return 0;
}
/* semcall_one -- Perform an operation on an element of a semaphore. */
static int semcall_one(int semid, int num, int operation)
{
 struct sembuf sbuf;
 sbuf.sem_num = num;
 sbuf.sem_op = operation;
 sbuf.sem_flg = 0;
 while (semop(semid, &sbuf, 1) < 0) {
 /* repeat operation if interrupted */
 if (errno != EINTR) {
 printf("%d: Semaphore failed. Errno = %d\n",mynode(), errno);
 abort_prog();
 exit(-1);
 }
 }
 return 0;
}
/* setpid -- Assigns a partition identifier to the simulated partition. */
int setpid(int id)

{
 my_group = id;
 return 0;
}
/* init_simulator -- Should be called near the beginning of an application
** before any hypercube-related functions are called. */
void init_simulator(void)
{
 register int i, pid;
 char filename[20];
 char *temp;
 static char env[256]; /* must be static */
 struct hostent *hent;
 /* parent cm will send child cm SIGINT when a CTRL-BREAK is pressed */
 signal(SIGINT,sig_terminate);
 signal(SIGTERM,sig_terminate);
 signal(SIGQUIT,sig_terminate);
 /* Pick up the base key value from the environment */
 if ((temp = getenv("SIM_INFO")) == NULL) {
 fprintf(stderr,"init_sim: Missing environment variable\n");
 fflush(stderr);
 exit(-1);
 }

 strcpy(env,temp);
 if ((temp = strtok(env,",")) == NULL) {
 fprintf(stderr, "init_sim: Missing information in environment variable\n");
 fflush(stderr);
 exit(-1);
 }
 sscanf(temp,"%d",&base);
 if ((temp = strtok(NULL,",")) == NULL) {
 fprintf(stderr,"init_sim: Missing node info in environment variable\n");
 fflush(stderr);
 exit(-1);
 }
 sscanf(temp,"%d",&my_node);
 if ((temp = strtok(NULL,",")) == NULL) {
 fprintf(stderr,"init_sim: Missing pid info in environment variable\n");
 fflush(stderr);
 exit(-1);
 }
 sscanf(temp,"%d",&my_group);
 if ((temp = strtok(NULL,",")) == NULL) {
 fprintf(stderr,
 "init_sim: Missing node count in environment variable\n");
 fflush(stderr);
 exit(-1);
 }
 sscanf(temp,"%d",&num_nodes);
 num_parts = (num_nodes + NUMBER_IN_PART - 1) / NUMBER_IN_PART;
 my_part = my_node / NUMBER_IN_PART;
 /* Calculate the number of nodes in this and remaining partitions */
 i = numnodes() - (mypart() * NUMBER_IN_PART);
 nodes_in_part = min(NUMBER_IN_PART, i);
 /* Allocate shared memory */
 shmid_m = init_shared_mem((void **)&buffer, sizeof(message) * numbuffers(), base);
 /* Allocate communications semaphores */
 init_semaphore(&msgavail, numbuffers(), 0, base+10000);

 init_semaphore(&msgfree, numbuffers(), 0, base+20000);
}
/* load -- Should be called near the beginning of a host application before any
** hypercube-related functions are called, except for getcube. It will start
** the appropriate number of PMs on the appropriate systems (as read from the
** .pmrc file.) Parent process will be node 0, which has special roles on a
** hypercube. Remaining processes will be numbered consecutively. */
int load(char *filename, int which_node, int group_id)
{
 register int i, j, pid, size;
 char *argv[20];

 char base_string[20];
 char partition_number[20];
 char group_string[20];
 char number_of_nodes[20];
 char temp[256];
 char *ptr;
 struct servent *sp;
 FILE *fd;
 /* Allocate space for child pids */
 if (children == NULL) {
 if ((children = malloc(numnodes() * sizeof(int))) == NULL) {
 fprintf(stderr,"load: insufficient memory\n");
 fflush(stderr);
 return -1;
 }
 }
 /* parent will send us SIGINT when CTRL-BREAK is pressed */
 signal(SIGINT,sig_terminate);
 signal(SIGTERM,sig_terminate);
 signal(SIGQUIT,sig_terminate);
 signal(SIGCHLD, unexpected_death);
 base = getpid();
 num_parts = (num_nodes + NUMBER_IN_PART - 1) / NUMBER_IN_PART;
 if (partition == NULL) {
 if ((partition = malloc(num_parts * sizeof(subpart))) == NULL) {
 fprintf(stderr,"load: insufficient memory\n");
 fflush(stderr);
 return -1;
 }
 memset(partition, 0, num_parts * sizeof(subpart));
 if ((fd = fopen(".pmrc","r")) == NULL) {
 fprintf(stderr,"load: Missing configuration file \".pmrc\"\n");
 fflush(stderr);
 return -1;
 }
 for (i = 0; i < num_parts; i++) {
 temp[0] = '\0';
 fscanf(fd," %[^ \n] \n",temp);
 size = strlen(temp);
 if ((ptr = malloc(size+1)) == NULL) {
 fprintf(stderr,"load: Insufficient memory\n");
 fflush(stderr);
 return -1;
 }
 strcpy(ptr,temp);
 partition[i].name = ptr;
 }

 fclose(fd);
 }

 /* Host program's node number is the same as the number of nodes */
 my_node = numnodes();
 my_part = 0;
 /* Calculate the number of nodes in this and remaining partitions */
 i = numnodes() - (mypart() * NUMBER_IN_PART);
 nodes_in_part = min(NUMBER_IN_PART, i);
 /* Allocate shared memory */
 if (shmid_m == -1) {
 shmid_m = init_shared_mem((void **)&buffer, sizeof(message) * numbuffers(),base);
 }
 memset(buffer,0,sizeof(message) * numbuffers());
 /* Allocate communications semaphores */
 if (msgavail == -1) {
 init_semaphore(&msgavail, numbuffers(), 0, base+10000);
 }
 if (msgfree == -1) {
 init_semaphore(&msgfree, numbuffers(), 0, base+20000);
 }
 /* Split into node processes */
 fflush(stdout);
 fflush(stderr);
 /* Start the local and remote Partition Managers */
 for (i = 0; i < num_parts; i++) {
 if ((pid = fork()) < 0) {
 /* Can't create all the children! */
 killcube(0,0);
 fprintf(stderr, "LOAD: unable to create Partition Managers\n");
 return -1;
 } else if (pid == 0) {
 /* I'm the child process */
 my_node = -1;
 /* Start the Partition Managers */
 if (i == 0) {
 argv[0] = "pm";
 argv[1] = filename;
 sprintf(partition_number, "%d", i);
 argv[2] = partition_number;
 sprintf(base_string,"%d", base);
 argv[3] = base_string;
 sprintf(group_string, "%d", group_id);
 argv[4] = group_string;
 sprintf(number_of_nodes, "%d", numnodes());
 argv[5] = number_of_nodes;
 for (i = 0; i < num_parts; i++) {
 argv[i+6] = partition[i].name;
 }
 argv[i+6] = NULL;
 execvp("pm",argv);
 /* If we get here, we had a problem */
 printf("execvp of PM 0 failed. errno=%d\n",errno);
 fflush(stdout);
 exit(-1);
 } else {
 argv[0] = "rsh";
 argv[1] = partition[i].name;
 argv[2] = "pm";

 argv[3] = filename;
 sprintf(partition_number, "%d", i);
 argv[4] = partition_number;
 sprintf(base_string,"%d", base);
 argv[5] = base_string;
 sprintf(group_string, "%d", group_id);
 argv[6] = group_string;
 sprintf(number_of_nodes, "%d", numnodes());
 argv[7] = number_of_nodes;
 for (i = 0; i < num_parts; i++) {
 argv[i+8] = partition[i].name;
 }
 argv[i+8] = NULL;
 execvp("rsh",argv);
 /* If we get here, we had a problem */
 printf("execvp of rsh for remote PM failed. errno=%d\n",errno);
 fflush(stdout);
 exit(-1);
 }
 } else {
 /* I'm the parent process */
 children[child_index++] = pid;
 }
 }
 return 0;
}
/** Environment Information (External and Internal) **/
/* availmem -- returns amount of memory available */
int availmem(void)
{
 return 0;
}
/* nodedim -- Returns the dimension of the simulated hypercube */
int nodedim(void)
{
 unsigned int i, temp;
 temp = num_nodes;
 i = 0;
 while (temp > 1) {
 temp >>= 1;
 i++;
 }
 return i;
}
/* numnodes -- Returns the number of simulated nodes */
int numnodes(void)
{
 return num_nodes;
}
/* numparts -- number of simulator partitions */
static int numparts(void)
{
 return num_parts;
}
/* mypart -- Simulator partition this process is in */
static int mypart(void)
{
 return my_part;
}
/* partof -- Determines which simulator partition a given node is member of */

static int partof(int n)
{
 if (n == myhost()) {
 return 0;
 } else {

 return n / NUMBER_IN_PART;
 }
}
/* numbuffers -- Number of buffers in this simulator partition */
static int numbuffers(void)
{
 if (mypart() == 0) {
 return nodes_in_part + 2;
 } else {
 return nodes_in_part + 1;
 }
}
/* mybuffer -- returns the index for this process's buffer */
static int mybuffer(void)
{
 if (mynode() == myhost()) {
 return nodes_in_part+1;
 } else {
 return mynode() % NUMBER_IN_PART;
 }
}
/* bufferof -- Returns the buffer offset of the given node. The host is always
** the last buffer in partition 0. The PM is always second to last buffer */
static int bufferof(int n)
{
 if (mypart() != partof(n)) {
 return nodes_in_part; /* Return the buffer of PM */
 } else if (n == myhost()) {
 return nodes_in_part + 1; /* This partition, buffer of host */
 } else {
 return n % NUMBER_IN_PART; /* This partition, buffer of node */
 }
}
/* mynode -- Returns the node number for this process */

int mynode(void)
{
 return my_node;
}
/* mypid -- Returns the group number */
int mypid(void)
{
 return my_group;
}
/* myhost -- Returns the node number of the host */
int myhost(void)
{
 return numnodes();
}
/** Communications **/
/* cread -- Special read for files on hypercube's high-speed disk system. We
just issue a standard read instead. */
int cread(int fd, void *buffer, int size)

{
 return read(fd, buffer, size);
}
/* gdsum -- Sum individual elements of an array on all processes */
void gdsum(double x[], long elements, double work[])
{
 register int i,j;
 double temp;
 if ((mybuffer()) == 0) {
 /* The first node in each partition sums the local data */
 if (nodes_in_part > 1) {
 /* Only sum when we aren't the only ones in the partition */
 for (i = 1; i < nodes_in_part; i++) {

 /* Get the next set of numbers to sum */
 crecv(-2, work, elements * sizeof(double));
 for (j = 0; j < elements; j++) {
 x[j] += work[j];
 }
 }
 }
 /* Node 0 sums for all partitions */
 if (mynode() == 0) {
 /* Only sum if there are more than one partition */
 if (numparts() > 1) {
 for (i = 1; i < numparts(); i++) {
 /* Get the next set of numbers to sum */
 crecv(-3, work, elements * sizeof(double));

 for (j = 0; j < elements; j++) {
 x[j] += work[j];
 }
 }
 }
 /* Only broadcast if there is more than one node */
 if (nodes_in_part > 1) {
 /* Broadcast the results */
 csend(-4,x,elements * sizeof(double),-1,mypid());
 }
 } else {
 /* Each partition needs to send the partial sum to node 0 */
 csend(-3,x,elements * sizeof(double),0,mypid());
 /* Wait for the answer */
 crecv(-4,x,elements * sizeof(double));
 }
 } else {
 /* Send the data to local node to do the summation */
 csend(-2,x,elements * sizeof(double),mypart()*NUMBER_IN_PART,mypid());
 /* Wait for the answer */
 crecv(-4,x,elements * sizeof(double));
 }
}
/* getmsg -- Wait until a message is available. OUTPUT: next_message, set to
** the message found of the proper type */
static void getmsg(void)
{
 int i;
 /* Only wait if a message is not already selected */
 if (next_message != -1) return;


 /* Wait for a message for me */
 semcall_one(msgavail, mybuffer(), -1);
 /* Search for those messages that are for me */
 for (i = 0; i < numbuffers(); i++) {
 if (buffer[i].valid[mybuffer()] == 1) {
 if ((buffer[i].dnode == mynode()) || (buffer[i].dnode == -1)) {
 next_message = i;
 return;
 }
 }
 }
}
/* cprobe -- Wait until a message of a specific type is available. OUTPUT:
** next_message, set to the message found of the proper type */
void cprobe(int type)
{
 int i,j;
 /* Make sure all pending writes in application have occurred */
 fflush(stdout);
 fflush(stderr);
 /* See if a specific type was requested */
 if (type == -1) {
 getmsg();
 return;
 } else if ((next_message != -1) && (type == buffer[next_message].type)) {
 /* message was already located */
 return;
 } else {
 while (1) {
 /* Wait for a message for me */
 semcall_one(msgavail, mybuffer(), -1);
 /* Search for those messages that are for me and is the type I need */
 for (i = 0; i < numbuffers(); i++) {
 if (buffer[i].valid[mybuffer()] == 1) {
 if ((buffer[i].dnode == mynode()) || (buffer[i].dnode == -1)) {
 if (buffer[i].type == type) {
 next_message = i;
 /* Put back all skipped messages back */
 for (j = 0; j < numbuffers(); j++) {
 if (buffer[j].valid[mybuffer()] == 2) {
 buffer[j].valid[mybuffer()] = 1;
 semcall_one(msgavail, mybuffer(), 1);
 }

 }
 return;
 } else {
 /* Mark the message so that we don't look at it again */
 buffer[i].valid[mybuffer()] = 2;
 }
 }
 }
 }
 }
 }
}
/* infocount -- Return the length of the message that will be received. */
int infocount(void)

{
 getmsg();
 return buffer[next_message].t_length;
}
/* infonode -- Returns the node that sent the message */
int infonode(void)
{
 getmsg();
 return buffer[next_message].snode;
}
/* infopid -- Returns the group (pid) of the node that sent the message */
int infopid(void)
{
 getmsg();
 return buffer[next_message].spid;
}
/* csend -- Synchronous message sending between two nodes. If the target node
** number is -1, then the message is broadcast to all nodes. Limitations:
** Assumes that the message buffer is free to use. In other words, if
** an asynchronous send was previously done, we assume that a msgwait
** was done to ensure the previous message reached its destination. */
int csend(int type, void *msg, int length, int target_node, int group)
{
 int i,j, sent_length = 0;
 char *source;
 i = mybuffer();
 /* Fill in the message */
 source = msg;
 buffer[i].type = type;
 buffer[i].dnode = target_node;
 buffer[i].spid = mypid();
 buffer[i].snode = mynode();
 buffer[i].t_length = length;
 while (length > 0) {
 /* Divide the message into smaller chunks */
 buffer[i].length = min(MAX_MESSAGE_SIZE, length);
 memcpy(buffer[i].msg, source, buffer[i].length);
 source += buffer[i].length;
 sent_length += buffer[i].length;
 length -= buffer[i].length;
 /* Tell the node(s) the message is there */
 if (target_node == -1) {
 /* Broadcast the message to nodes in this partition */
 /* and to the process manager */
 for (j=0; j < nodes_in_part + 1; j++) {
 buffer[i].valid[j] = 1;
 }
 semcall_all(msgavail,nodes_in_part+1, 1);
 /* Of course, we already have the message */
 semcall_one(msgavail,i, -1);
 /* Wait until everyone receives the message */
 semcall_one(msgfree, i, -nodes_in_part);
 } else {
 /* Point to point to another node */
 j = bufferof(target_node);
 buffer[i].valid[j] = 1;
 semcall_one(msgavail, j, 1);
 /* Wait until it is received */
 semcall_one(msgfree, i, -1);

 }
 }
 return sent_length;
}
/* crecv -- Synchronous message reception between two nodes. */
int crecv(int type, void *buf, int len)
{
 int i, recv_len = 0, copy_len = 0, temp_len, total_len;
 int recv_node;
 char *target;
 /* Get a message of this type */
 cprobe(type);
 target = buf;
 total_len = buffer[next_message].t_length;
 recv_node = buffer[next_message].snode;
 do {
 if (recv_node != buffer[next_message].snode) {
 /* Message is from another node, put off receiving */
 buffer[next_message].valid[mybuffer()] = 3;
 } else {
 /* Message is from same node, add it to the previous messages */
 recv_len += buffer[next_message].length;
 temp_len = min(len - copy_len, buffer[next_message].length);
 if (temp_len > 0) {
 memcpy(target, buffer[next_message].msg, temp_len);
 target += temp_len;
 }
 copy_len += buffer[next_message].length;
 /* Acknowledge the receipt of the message */
 buffer[next_message].valid[mybuffer()] = 0;
 semcall_one(msgfree, next_message, 1);
 }
 /* Indicate that no message has been selected */
 next_message = -1;
 if (recv_len < total_len) {
 cprobe(type);
 }
 } while (recv_len < total_len);
 /* Scan buffers to restore any skipped messages */
 for (i = 0; i < numbuffers(); i++) {
 if (buffer[i].valid[mybuffer()] == 3) {
 buffer[i].valid[mybuffer()] = 1;
 semcall_one(msgavail, mybuffer(), 1);
 }
 }
 return total_len;
}
/* isend -- Asynchronous message sending between two nodes. If the target node
** number is -1, then the message is broadcast to all nodes. Limitations:
** Assumes that the message buffer is free to use. In other words, if
** an asynchronous send was previously done, we assume that a msgwait
** was done to ensure the previous message reached its destination. */
int isend(int type, void *msg, int length, int target_node, int group)
{
 int i,j, sent_length = 0;
 char *source;
 i = mybuffer();
 source = msg; /* point at the caller's data, as in csend() */
 buffer[i].type = type;
 buffer[i].dnode = target_node;

 buffer[i].spid = mypid();
 buffer[i].snode = mynode();
 buffer[i].t_length = length;
 while (length > 0) {
 /* Divide the message into smaller chunks */
 buffer[i].length = min(MAX_MESSAGE_SIZE, length);
 memcpy(buffer[i].msg, source, buffer[i].length);
 source += buffer[i].length;
 sent_length += buffer[i].length;
 length -= buffer[i].length;
 /* Tell the node(s) the message is there */
 if (target_node == -1) {
 /* Broadcast the message to nodes in this partition */
 /* and to the process manager */
 for (j=0; j < nodes_in_part+1; j++) {
 buffer[i].valid[j] = 1;
 }
 semcall_all(msgavail,nodes_in_part+1, 1);
 /* Of course, we already have the message */
 semcall_one(msgavail,i, -1);
 /* Wait for acknowledge on all but the last part */
 if (length > 0) {
 /* Wait until everyone receives the message */
 semcall_one(msgfree, i, -nodes_in_part);
 }
 } else {
 /* Point to point to another node */
 j = bufferof(target_node);
 buffer[i].valid[j] = 1;
 semcall_one(msgavail, j, 1);
 /* Wait for acknowledge on all but the last part */
 if (length > 0) {
 /* Wait until it is received */
 semcall_one(msgfree, i, -1);
 }
 }

 }
 /* Return which buffer needs to be waited on */
 return i;
}
/* irecv -- Asynchronous message reception between two nodes. Returns message
** identifier for acknowledging the message. */
int irecv(int type, void *buf, int len)
{
 int mid;
 int recv_len = 0, copy_len = 0, temp_len, total_len;
 char *target;
 /* Get a message of this type */
 cprobe(type);
 mid = next_message;
 target = buf;
 total_len = buffer[next_message].t_length;
 do {
 recv_len += buffer[next_message].length;
 temp_len = min(len - copy_len, buffer[next_message].length);
 if (temp_len > 0) {
 memcpy(target, buffer[next_message].msg, temp_len);
 target += temp_len;

 }
 copy_len += buffer[next_message].length;
 /* Acknowledge all but last partial message */
 if (recv_len < total_len) {
 /* Acknowledge the receipt of the message */
 buffer[next_message].valid[mybuffer()] = 0;
 semcall_one(msgfree, next_message, 1);
 }
 /* Indicate that no message has been selected */
 next_message = -1;
 if (recv_len < total_len) {
 cprobe(type);
 }
 } while (recv_len < total_len);
 return mid;
}
/* msgwait -- Wait for a message to be received by the target node(s) */
void msgwait(int mid)

{
 if (mid == mybuffer()) {
 /* Then it was a send to another node */
 if (buffer[mid].dnode == -1) {
 /* Wait for everyone to receive the message */
 semcall_all(msgfree, mid, -nodes_in_part);
 } else {
 semcall_one(msgfree, mid, -1);
 }
 } else {
 /* It was a receive from another node */
 semcall_one(msgfree, mid, 1);
 }
}
/* flushmsg -- Forces the removal of pending messages to a node */
void flushmsg(int type, int target_node, int group)
{
 /* Do nothing for now */
 fflush(stdout);
 fflush(stderr);
}
/* mclock -- Return time in milliseconds. */
unsigned long mclock(void)
{
 unsigned long current_time;
 time(&current_time);
 return current_time * 1000;
}

January, 1993
PORTING TO THE WIN32 API


Porting from Windows 3 to Windows NT


 This article contains the following executables: PORT32.ARC


Peter Handsman


Peter is senior software engineer at Inmark Development and co-architect of
zApp. He can be contacted at 2065 Landings Drive, Mountain View, CA 94043 or
via the Internet at tab@netcom.com.


This article discusses my experiences in porting the zApp application
framework from the Windows 3.x 16-bit API to the Windows NT 32-bit API. Once
ported to the Win32 API, programs can take advantage of NT features such as
multiple-processor support, distributed processing, networking, and the like.
Still, 16-bit Windows apps can coexist and communicate with 32-bit programs
via dynamic data exchange, object linking and embedding, and the clipboard.
For its part, zApp is a C++ application framework for Windows, DOS,
Presentation Manager, and Motif programming. The version of zApp we ported to
NT had 170 classes, 73 source and header files, and over 28,000 lines of code.
We also ported 38 sample programs (5000 lines) and the zApp demo/information
program (2800 lines). (For more information on zApp programming, see "Sizing
Up Application Frameworks and Class Libraries," by Ray Valdes, DDJ, October
1992.)
We received a working Windows NT C++ compiler days before the NT developers'
conference in July 1992. By the time the conference started, we'd ported the
framework and had sample programs running. Shortly thereafter, we began
shipping our Windows NT version. In this article, I'll cover some of the
problems we encountered and discuss what you can expect when dealing with the
new compiler and operating system.


Compiler and API Issues


The C++ code in zApp is fairly noncompiler specific. Under DOS/Windows, it
compiles with Borland C++, Zortech C++, and Microsoft C7. Avoiding
compiler-specific code gave us some hope of porting zApp without worrying
about how much of the latest C++ specification a compiler implemented or about
some vendor's compiler extensions. This allowed us to limit the scope of the
problems when dealing with the Windows NT C++ compiler. If the compiler did
not like our code, it was the compiler's fault, and I knew to look for a
workaround. The first version of C++ available for Windows NT had major
limitations: no iostreams, limited documentation, internal compiler errors,
and syntax peculiarities you might expect from a first version of a prerelease
compiler. By the time you read this, many of these will likely have been
corrected.
As an application framework, zApp uses most of the Windows API. Consequently,
we probably had to deal with more API calls in the porting process than you'd
expect of an average application. While this forced us to deal with a wide
variety of problems, many modified API calls only required a single change in
the zApp code because much of the Windows API is encapsulated in specific
classes. Thirty-four of the source files actually compiled and ran without
changes, and none of the sample programs needed changing (other than
workarounds for what look like Windows NT bugs). Unfortunately, internal to
zApp, this luck didn't hold. We had to change every piece of code that dealt
with Window creation, message handling, window-class creation, or
superclassing. According to Microsoft, only a few API calls and messages have
changed, but if you use an API call 20 to 40 times (or more), it takes time to
wade through the code.


Starting the Port


I started this port as I do with all ports: ripping out every bit of
functionality that's not needed for the simplest zApp sample program, then
getting it to work. The sample program required only simple message
dispatching and window creation, and very few internal classes. I then added
the rest of the subsystems, class-by-class, file-by-file, one sample program
at a time. Reducing the problem set this way is especially important in the
beginning when attempting to force the C++ compiler and linker to work.
One of my first problems was the missing global operators new() and delete(),
usually defined in the standard libs. Naturally, the linker didn't know about
C++ mangled names and complained that "??2@ZAPEXI@Z" and "??3@ZAXPEX@Z" (new
and delete) were unresolved externals. I figured out the problem by adding a
few lines of C++ code to Microsoft's Generic sample program, then examining
line-by-line what triggered the need for these decorated external functions.
It turned out I needed to explicitly link in a library that the NT SDK didn't
even install! The missing functions were in \mstools\mfc\lib\libcxx.lib.
Clearly, zApp does not use the Microsoft Foundation Class library, so it took
some time to track this problem. (In the October '92 beta version of NT, the
library has been renamed to libcx32.lib and is not located with MFC--but you
still must link it in explicitly.)
Equally frustrating was that the compiler and linker have different
command-line options and syntax from their MSC7 predecessors. Naturally, they
were only documented in the Tools manual -- which we didn't have. Needless to
say, language/compiler problems like these crop up when using new software.
More significant, however, are the specific problems we ran across with the NT
implementation of the Win32 API.


API Changes


Some API changes -- the move from GetTextExtent() to GetTextExtentPoint() and
from MoveTo() to MoveToEx()--are simple. Our member functions that use these
calls were easily updated with an #ifdef, as shown in Examples 1(a) and 1(b).
While not obvious from the SDK API documentation, these changes cause a big
performance hit. By requiring the coordinates that MoveToEx() returns, the GDI
is forced to do the work immediately. This causes cached GDI commands to be
flushed, as if GdiFlush() were called. The cached code in Example 1(c) is much
faster because 0 is passed as the fourth parameter to MoveToEx(). Dropping the
return value this way would require an interface change to zApp, even though
our sample code never used it. After questioning our beta testers, we found no
one was using the return value, so we may make this change in zApp.
Example 1: Using #ifdef to update zApp member functions.

 (a)
 zDimension zDisplay:: getTextDim(char *s, int c=0)
 {
 #ifdef __NT__
 SIZE sizeRect;
 GetTextExtentPoint(hDC, s, c==0?strlen(s):c, &sizeRect);
 return zDimension (sizeRect.cx, sizeRect.cy);
 #else
 return zDimension(GetTextExtent(hDC, s, c==0?strlen(s):c));
 #endif
 }

 (b)
 inline zPoint zDisplay::moveTo(zPoint p)
 {

 #ifdef __NT__
 zPoint pt;
 MoveToEx(hDC, p.x(), p.y(),&pt);
 return pt;
 #else
 return (zPoint)MoveTo(hDC, p.x(), p.y());
 #endif
 }

 (c)
 inline void zDisplay::moveTo(zPoint p)
 {
 #ifdef __NT__
 MoveToEx(hDC, p.x(), p.y(), 0);
 #else
 MoveTo(hDC, p.x(), p.y());
 #endif
 }

Until Windows NT is more mature, we're not going to do much performance
testing. However, it's clear from preliminary testing that cached vs.
noncached GDI calls do make a difference. A cached GDI call goes from your
application to a DLL, and if there is room in the queue, it's added to the
buffer and returns to your app. However, a noncached GDI call must go into the
DLL, wait on a shared-memory region mutex, copy any cached calls into the
shared-memory region, make an LPC into the kernel through the NT Executive,
move up into the Win32 subsystem process, perform the GDI operation, and then
retrace these steps back into your app. Even with optimizations Microsoft may
undertake, this is going to take more time than just returning from the DLL.
Unless your app absolutely requires it, don't use a noncached variant of a GDI
call.
One thing to keep in mind regarding this queued GDI operation is that if
you're trying to do highly interactive operations or animation, nothing will
be visible until the queue is filled or flushed. It may be necessary to call
GdiFlush() explicitly; while debugging it can help to call GdiSetBatchLimit(1)
to turn off this mechanism.
Problems also arise during compilation. In a complex new environment, it can
take a while to become familiar with cryptic syntax errors. Take, for example,
the CreateWindow() call for a control window in Example 2. When compiling, I
got a syntax error over the tenth parameter to CreateWindowExA, even though I
knew it was cast to the proper value and that I'd never seen CreateWindowExA.
After examining the Windows header files, it turned out CreateWindow is a
#define for CreateWindowA, and CreateWindowA is a #define for CreateWindowExA.
The tenth parameter of CreateWindowExA() corresponds to the ninth parameter of
my call. Okay, but there still didn't seem to be anything wrong with my ninth
parameter; that's how control IDs are set. It turned out that the parameter
needed to be cast into an HMENU: (HMENU)10, /* use a control id of 10 */.
Example 2: This call to CreateWindow() generated an unexpected syntax error
under NT, but not under Windows 3.

 hWnd = CreateWindow("EDIT", (char*)WndText,style,
 100,100,100,100,(HWND)WndParent,
 10, /* use a control id of 10 */
 (HINSTANCE)hInst,NULL);

Casts like this are needed throughout the code. The assumptions we made about
an int being used as a HANDLE are no longer valid -- even when the API
requires an int. Likewise, code like inline BOOL zDisplay::isValid() { return
hDC; } is no longer valid when checking the return value of GetDC(), or trying
to check if any HDC is valid. The statement must explicitly check the value,
as the compiler cannot automatically convert a HANDLE into a BOOL. However,
this code does work: inline BOOL zDisplay::isValid() { return (hDC != 0); }.
Problems like this, as well as API calls implemented as macros to other calls,
can be confusing when porting code. If at all possible, try to use your
current development environment and move your Win 3.1 code to compile with
STRICT first, as many of the same problems will be caught with much better
syntax-error messages. However, STRICT will not solve all these problems.
Another one to look out for is CallWindowProc().


Coping with Dialog-box Resources


The one major part of zApp we knew would have to be completely rewritten was
the code that reads and decodes the dialog-box resources. zApp does not use
CreateDialog() or DialogBox(); we handle dialog functionality ourselves. As
expected, the resource format changed, and the section that dealt with this
had to be scrapped. Fortunately, the dialog format for Windows NT is
documented (unlike with previous versions of Windows) in the DLGFMT.ZIP file
in CompuServe's MSWIN32 Forum (although still not in the current printed or
online documentation).
In Windows NT, all resources are stored in Unicode. This was my first exposure
to Unicode and its 16-bit characters. While it may sound simple, doing
double-byte char string manipulations on variable-size structures is different
from using single-byte strings. It's easy to get lost, and old
string-manipulation code cannot just be copied from old programs.
Another problem with decoding dialogs was that structures in Windows NT are
not the same easy-packed structs of earlier versions of Windows. The default
struct alignment is double word (4 bytes), and the padding bytes can be
difficult to track down if you're not expecting any. This manifested itself as
things not working properly. Without any support for C++ source-level
debugging, I had to take things one step at a time.


Runtime Problems


After substantial parts of zApp were compiling and running, we started
encountering a number of runtime problems. The first was a tough one: I had
the simple zApp sample running such that, when it closed, it looked as if it
were going away. However, Windows NT was keeping the EXE file locked under
some circumstances. This caused sharing violations whenever I tried to
recompile, forcing me to reboot every time I wanted to relink the application.
After getting frustrated with the limited debugging capabilities of Windows
NT, I began using calls to OutputDebugString() to trace the execution path. It
turned out the program was not always getting WM_QUIT; therefore, the message
loop was not terminating and the program never exited, even though the window
disappeared from the screen.
After yet more tracing and reverting to the generic sample, I realized that
PostMessage(hWnd, WM_QUIT,0,0), where hWnd is the top-level window, isn't the
same as PostQuitMessage(0). Under Windows 2/3.0/3.1, this always works--but
not under Windows NT.
An easier-to-find problem--but one without an easy workaround--occurs when you
try to bring up a printer-configuration dialog for a printer. The code in
Example 3, for instance, works fine under previous versions of Windows. Under
Windows NT, however, this GetProcAddress() call always returns 0, which is
technically a legal value under earlier versions of Windows. In practice, this
never happened, as Windows printer drivers export a configuration dialog box.
With Windows NT it appears necessary to use the printer-setup common dialog,
or use the NT-specific OpenPrinter() and PrinterProperties() calls, both of
which I was unable to get working under the July '92 version of Windows NT.
Example 3: This code worked fine under Windows 3, but with Windows NT it
appears necessary to use the printer-setup common dialog or the NT-specific
OpenPrinter() and PrinterProperties() calls.

 strcpy (buf, (char *)_driverName);
 strcat (buf,".DRV");
 HANDLE hLib = LoadLibrary (buf);

 if ((int)hLib >= 32) /* This cast to int was added for WindowsNT */
 {
 SETUPPROC dm = (SETUPPROC)GetProcAddress(hLib, "DEVICEMODE");
 if (dm != 0)
 {
 (*dm) (app->rootWindow()->sysId(),hLib,

 (char *)_deviceName,(char *)_portName);
 }
 FreeLibrary(hLib);
 }

Beyond these specific issues, many more changes had to be made to our zApp NT
implementation. To give you an idea of the magnitude and frequency of these
changes, consider that all together, I used 128 #ifdefs which fell into a few
distinct groups:
Eleven were required for changes in the wParam/lParam message splitting.
Seven were inside SendMessage() calls telling controls to do things.
Ten were involved with the sometimes bizarre organization of [Get,Set]Class
[Word,Long] calls, when word==long. It's difficult to figure out how the GCW_
and GCL_ defines are split up. It would have been easier if Microsoft had
mapped all of them to one or the other.
Eleven were changed parameters and calling conventions for zApp's internal
Window procedures.
Twelve were casts required inside CallWindowProc(), due to FARPROC!= WNDPROC.
In the October '92 release of Windows NT, the need for those 12 was removed.
The last major group of #ifdefs occurs in parts of zApp which have not yet
been ported over to Win32: the zApp memory manager, which makes extensive use
of 16-bit ints for speed, and the custom-control interface, which won't be
needed until there are custom controls for Windows NT. Those comprised an
additional 18 #ifdefs. Out of the original 128 #ifdefs, approximately half are
multicase.


Conclusion


After doing the preliminary work and getting almost everything running under
Windows NT, it's important to begin optimizing the code and use of the new
platform's facilities. As NT-specific code gets added, it unfortunately
becomes more and more difficult to maintain a common code base. A Windows NT
program should not be using PeekMessage loops for polling, and tasks such as
printing, file I/O, and compute-intensive work are best performed in separate
threads. As I said earlier, even simple GDI calls can have different kinds of
repercussions than when running with Windows 3.1.
Even with these and other problems, the port to Windows NT went well. We were
able to take a nontrivial amount of code and get it running in a relatively
short time. Porting to NT has some rough edges, and it's not as easy as simply
recompiling. Still, getting an application to run on this completely new
platform doesn't involve as much work as you'd expect.





January, 1993
PROFILING FOR PERFORMANCE


Simple performance measurement can be misleading




Joseph M. Newcomer


Joseph M. Newcomer received his PhD in the area of compiler optimization from
Carnegie Mellon University in 1975. He is currently a consultant and has been
doing performance measurement of systems for 27 years.


Profiling tools were developed to help identify a program's "hot spots"--those
parts that consume significant computing resources. Hot spots are rarely where
you expect to find them. In one compiler, for example, we found after
profiling that 30 percent of the total execution time went into copying the
input buffer to the lexical-analysis buffer, then copying the input buffer to
the output-listing buffer (even when the listing was turned off). Avoiding
these copy operations gained us considerable performance.
When studying the output from a profiling tool, it's important to understand
what the tool is measuring, know the limits of the technique's accuracy, and
determine whether it is in fact measuring what you think it is measuring. This
article discusses some of the pitfalls of performance measurement.


Common Measures


The two most important performance-measurement techniques are event
measurement and program-counter sampling. Because it is the easiest to
implement, the program-counter sampling method is the most common. With this
technique, a timer interrupts the main execution, and the value of the program
counter is noted. A histogram of the sampling density identifies the hot
spots.
One problem with program-counter sampling is that it only tells you that your
program is spending, say, 30 percent of its time in the storage allocator. It
tells you where you are spending the time, but it doesn't say why. For
example, it could be that the storage allocator is incredibly slow. It could
also be that it is called 1,384,278 times. Without the additional information
about call frequency, the amount of time per call cannot be known.
In one case, we found the storage allocator to be the major bottleneck in the
system. The implementors of the system complained about the allocator
performance. After some measurement, we found that performance was so abysmal
because a subroutine in the inner loop of a critical computation called a
library routine that did a malloc/free. We recoded the loop to avoid the
library routine, using storage stack-allocated outside the loop. The
number of storage allocations went down from over 1,000,000 to under 1000.
In another case, we had a profiler that recorded the time of each subroutine
entry and exit, then computed the amount of time spent in the subroutine.
However, a subroutine actually has two time values: the amount of time the
program counter spends in the subroutine, and the amount of time the program
executes from subroutine entry to subroutine exit, including all the
subroutines it calls. This profiler gave misleading results because the time
spent in subroutines it called was not counted. The entire system execution
profile looked rather "flat," with no distinguishable hot spots when, in fact,
two critical procedures accounted for 25 percent of the time.


Timer Resolution and Accuracy


Performance measurement on most systems is limited by the timer's accuracy. A
timer with resolution that is too coarse may produce meaningless results. Most
PC profiling tools reprogram the interval timer to get sufficiently fine
resolution. A resolution more than an order of magnitude coarser than a
typical instruction-execution time is too coarse for doing procedure
entry-to-exit style timing.
In a sampling system, a timer that interrupts too frequently induces too much
overhead and introduces other problems with too-frequent diversions from the
mainline program execution, such as breaking pipelining, changing caching
patterns, and so on. Most systems (other than current personal computers) do
not have a readily accessible, "on demand," high-resolution timer.
There is an additional problem: Four Gigasomethings may seem like a lot when
it represents bytes of address space, but with a 20-μs timer, an unsigned
32-bit number is not large enough to count the number of ticks in 24 hours!
There are 4.32 Gigaticks per day using a 20-μs timer, but only 4.29 × 10^9
values can be represented in a 32-bit number. So we can trade off higher
resolution against shorter experimental times; to get 1-μs timer resolution
we can run experiments for only 4294 seconds, or slightly over 71 minutes,
before we have to resort to 64-bit arithmetic. In the profiling tool I wrote,
we had 36-bit signed integers (35 bits of positive value) and a 10-μs timer.
This allowed us to accumulate times of 343,200+ seconds (or 95+ hours), which
seemed comfortable. But after a 15-hour experiment, the results were somewhat
bogus because the percentage computation (n*100/total) generated integer
overflows.
Timers in protected operating systems have additional problems. For example,
in most mainframe protected operating systems, time accounting is done for
purposes of billing. Therefore, time spent in interrupt processing is charged
to the running program. From a billing viewpoint this may be fair, but the
concept of billing for computer time has little relevance to personal
workstations. Nonetheless, the operating systems still seem to be designed
with this philosophy. High-resolution timing is not generally available to
applications running as user processes.
On another occasion, our TOPS-10 operating system had been modified to use a
custom-designed 10-μs timer to provide precision timing to applications. I
measured a simple counting loop using a series of tests. The sequence of
values I got for each iteration had a bimodal distribution: one peak around
100 μs and a second peak around 5 ms. About 10 percent of the samples
reported 5 ms, but they were so large that they made the average loop time
seem to be about 500 μs. Apparently, somewhere in the code an interrupt
sneaked through without recording its presence. In protected-mode operating
systems, especially multitasking or multithreaded systems, it is very
important that each process or thread have its associated high-resolution
time.
It is not sufficient to measure even a single loop, as just demonstrated. On
another system, I created a subroutine containing a short loop, about 25 μs.
I then called it from an outer loop that called the subroutine 10, 100, 1000,
and so on times, then printed out the average time. I ended up with an outer
loop value that ran the program almost two seconds between reports. The
average loop-traversal time went up after a certain value was reached, with a
peculiar two-step modality. It turns out that this was related to the context
swap time, which was charged against the process. If I had a loop that ran
long enough to expend a time slice, I got charged for the context swap at the
end of the time slice. However, since the machine was lightly loaded, my
process, as the highest-priority process, was immediately rescheduled. I got
charged for the set-up time as well.
When the loop got long enough, however, the relationship of my priority to
other processes was changed, and a different process was scheduled; I got
charged for the significant time to do a complete process-scheduling
operation! This created another step function in the reported average timing.
When I modified the program to keep a histogram looking for modalities of
distribution, there were three peaks; the bulk were near 100 μs, but there
was a peak in the histogram near 5 ms and another near 11 ms. When I
accumulated these as a time line I found the 5-ms peak appeared every
time-slice quantum (150 ms) and the 11-ms peak every three time-slice quanta
(450 ms). The lesson here is to calibrate not only the short-term accuracy of
the timer, but its long-term accuracy as well.


Gating Error


In any discrete-event measurement, there's a phenomenon known as "gating
error" that was first recognized when vacuum-tube-based counters were used to
measure radioactive-decay events.
Consider the case where we read the system time at the start and end of an
interval we wish to measure. In Figure 1(a), the time line represents the
interval we are measuring. We read the clock shown on the time line in Figure
1(b) (with resolution 10 μs) at time t when the interval starts. The clock
ticks twice, and 25 μs later we read the clock again and get time t + 20. The
apparent time of the interval is 20 μs. The next time we read the clock,
however, the clock is as shown on the time line in Figure 1(c). The initial
time is reported as time t. The clock ticks three times, so when we read the
clock at the end of the interval, the time is t+30. The interval appears to be
30 μs in length. Since statistically the distribution of these errors is
uniform, if we take enough samples the two values will average out, and the
result will be very close to the actual value of 25 μs.
Gating error is significant only when the clock resolution is coarse and the
number of samples is small, but it is particularly important when attempting
to deal with subroutines whose time is very much less than the
clock resolution. For example, if the clock resolution is 250 μs, both a
10-μs subroutine and a 240-μs subroutine could appear as taking 0 time.
However, over a large sample size it is more likely that the 10-μs subroutine
will register as taking 0 time. If the sample size is too small, there will
not be enough values to get a suitable sample. The 240-μs subroutine may
appear to take 0 μs, and the 10-μs subroutine may appear to take 250 μs.
Based on a small sample size, you may be inclined to believe either that the
subroutine, which will be called a lot for "real" data, is taking an order of
magnitude too much time, or that the subroutine is taking 0 time, so it
doesn't need any work. Either conclusion may be false. You cannot even assign
a significance to the data, because there is not enough information to compute
a meaningful standard deviation. (See the text box entitled, "Running Mean and
Standard Deviation.") Consider the ill fortune of having three samples of a
10-mus subroutine each straddle one clock tick. There will be three samples,
each registering 250 mus, and a standard deviation of 0. What is important is
that the expected error is not the standard deviation, but the larger of the
standard deviation and half of the clock resolution (because the gating errors
are uniformly distributed across the samples). This is one reason that
high-resolution timers produce more meaningful results.


Application vs. System Time


A timer that records only the time spent in user space, or in the application
space, gives an equally unfair view of program performance. We once had a
system whose performance was quite poor, so we profiled it. There did not,
according to the author who used my profiling tool, appear to be any
significant hot spot that could account for the timing. Examination of the
code showed that an input loop was reading from the input file one character
at a time, unbuffered. When we changed it to use buffered I/O, it ran almost
ten times faster. The failure here was that the timing values returned by the
operating system dealt only with the time spent in user space, not with time
spent in the operating system on behalf of the user. Unless you understand
what the time values mean, you won't know when they are going to mislead you.


What Time is Being Measured?



In comparing the performance of two UNIX products, a company created a shell
script that issued a time command, ran the product, and issued another time
command. After timing, product A was reported to be 30 percent faster than
product B. Unfortunately, they had only measured this performance once.
Running a dozen or so measurements of product A revealed that they had about
30 percent variation between the smallest and largest numbers. Similar testing
on product B revealed that product A was on average about 10 percent faster
than product B and that the significance of this difference was quite low. The
reason for the difference is that the time reported by the UNIX time command
typically includes all sorts of operating-system overhead. The numbers were
very sensitive to the amount of network activity. In particular, a large FTP
file copy or receipt of a large mail message would generate abnormally large
numbers. Pulling the Ethernet connector and performing the same measurements
showed that the 10-percent difference appeared to be significant; the probable
error was much smaller. (See the text box entitled, "Probable Error.")
On the other hand, ignoring the "wall clock time" can be as misleading as
believing it. When overall system impact is considered, in a program that
consumes too many resources (that is, has a very large working set, serious
paging performance problems, or poor cache-hit ratios), local performance
information may not be terribly helpful. A locally optimized program that has
to coexist with other applications may have to be modified to be locally less
efficient; in the context of the entire system, however, it may result in a
more balanced system.
You can now see the importance of timer reliability. Knowing that one sample
out of every 10,000 is a hundred times too large is important. If that one
5-ms sample falls in an infrequently called 20-μs subroutine, it could make
that subroutine look vastly more expensive than it actually is, resulting in
efforts to optimize an irrelevant portion of the program.


System Perturbation


The only precise way to measure a system is to have a totally separate system
monitoring it and accumulating the data. This monitoring must be done in such
a way as to not affect the system being measured. A full-clock-rate in-circuit
emulator is a nice tool for this purpose, but it can be costly, especially for
the high-end 80x86 and 680x0 machines. So we must often make do with a tool
that runs on the same system we are measuring. Any measurement tool of this
nature must necessarily perturb the system it is measuring. A system which
logs events to a disk file would perturb the disk-head positioning. This might
significantly affect the actual performance of the system. A performance tool
that recorded the amount of time spent in DOS, and also logged each event,
would quite likely give an incorrect result. This is why most
performance-measurement tools do not perform any I/O during the time the
measurement is being made. However, in the presence of caches and instruction
pipelines, the control transfers into the measurement tool, for whatever style
is used, will change the information in the pipeline and cache, with possible
serious impact on the system being measured. Results will be artificially
high.


Paging Performance


Most performance tools do not report anything about paging performance. Users
of virtual-memory systems have, on the whole, ignored this. In the old days,
this mattered a lot. I once wrote a performance tool solely to collect
connectivity information. This information was used to determine which
subroutines should be packed close to other subroutines so page faults could
be minimized by minimizing the working set. We thought a lot about designing
an automatic packer that would take the data from this and produce a linker
file that would give us the best packing. But in practice (even for the
several hundred subroutines we had) we found that "eyeballing" the data gave
us enough information to do the job manually--enough to get significant
performance improvement.
In a now-classic case, one vendor presented its programmers with a 32-bit
virtual address space, and told them to write a compiler. These programmers
were very sensitive to performance issues. It was well known that symbol-table
performance is critical to compiler performance, that hashed symbol tables
perform well, that performance of a hash table degraded with frequent hash
collisions, and that larger hash tables reduced the number of collisions.
Studies of a large number of sample programs let them determine the hashing
parameters for the most uniform distribution of symbols in the table for
typical programs. "Hey, this virtual-memory stuff is great!" we imagined the
programmers saying, "Let's make a big hash table!" So the hash table was a
megabyte. Their hashing algorithm was so well tuned that it nicely avoided
collisions. In fact, the symbol table almost always ended up with one symbol
per page. Since physical memory on this machine was only 750K, any serious
compilation resulted in about two page faults per line of code compiled due to
symbol-table lookups. The symbol table and the compiler code contended for
working-set space, and the result was one of the slowest compilers ever
delivered. The next release fixed this, using a 4K symbol table that could
grow as needed.
One useful experiment is to malloc a large amount of memory, for example a
couple megabytes if you are on a workstation. Run your standard calibration
test for short, medium, and long loops. Then run the same test, but access
bytes one page apart to start forcing paging to occur. Run the test accessing
5 pages, 10 pages, 100 pages, and so on. Note from the run codes whether you
are getting any anomalies in your measurements induced by the paging. If so,
be aware that when you examine the numbers, a poorly performing algorithm
might represent a bad algorithm, a bad data layout (causing too many page
faults), or some policy in your operating system that is artificially reducing
your working-set size. If you get no anomalies, it means that an algorithm
that is performing poorly because of page faulting will not be visible to you
with the profiling tool you are using. If you suspect page faulting as a
performance problem, you will have to use some other tool or technique to
determine paging behavior. Knowing what the system doesn't measure can be as
important as knowing what it does.


Architectural Impact


With modern machines, it's no longer sufficient to think about issues such as
clock cycles per instruction, nor is it even possible, by classical static
analysis of the instruction stream, to predict a program's performance with
any accuracy. Instruction look-ahead, instruction caches, pipelining,
on-and-off-chip memory caching, interacting with interrupts, preemptive
scheduling, dynamic register renaming, and so on mean that the cost of a
particular instruction depends considerably on the instruction and data
history which preceded it. As I already indicated, use of a
performance-monitoring tool may have significant impact on a program tuned
close to the limits of the architecture.
The impact of architecture cannot be minimized. The CDC Star 100 computer, a
supercomputer of its day, used floating point for lexical analysis in its
Fortran compiler. This was because a matrix multiply between floating-point
numbers (built from the individual characters) and a character-class matrix
would produce a new floating-point vector that represented the character
classes for each character position. The floating-point unit was heavily
pipelined; the character and integer handling was pathetic by comparison. No
amount of performance measurement or profiling would have suggested this
algorithmic change. In another example, a Fortran program which executed at 20
SPECmarks on a contemporary RISC machine was optimized to 40 SPECmarks with
conventional optimization. When the program was massaged by a
program-transformation system that arranged computations to use cache-line
sized array slices and to otherwise attempt to maximize cache hits, the
performance rose by more than an order of magnitude. Performance numbers such
as cache flush and cache load times are almost never available to the
application developer except with incredibly sophisticated (and consequently
expensive) equipment. Yet these can have a more significant impact than the
small algorithmic perturbations that profilers would suggest.
Optimizations are becoming more architecture dependent. Assembly-level coding
tricks that made sense on a non-pipelined 8088 may not only be irrelevant on
higher machines, they may even have a negative value. This isn't a new
problem. IBM published a programmer's guide to the 360/91 in the early '70s
that illustrated how code sequences that improved performance on lower
machines in the series actually caused performance degradation in the heavily
pipelined 360/91. This was because they caused gratuitous serialization of
results in the pipe when what appeared to be cruder, less-efficient code
sequences did not force serialization of the pipelined operations. The
parallelism which resulted could double the speed of small loops. I must say
that I have not investigated similar situations on the 386/486 architectures,
mostly because I now leave these fiddling details to compilers.


Instruction Histograms


In one extreme case, a company I worked for needed very precise data on
instruction execution. We wanted to factor out all architectural issues, such
as caching and pipelining. The purpose was to study, in a very particular
manner, the effects of code motion in an optimizing compiler. I took advantage
of the 32-bit virtual memory in a singularly profligate fashion. I computed
the size of the TEXT region--the program area in which the code appears--in
bytes, then used calloc to allocate that many long values. I then ran the
program with the trace-trap bit set, so I got a signal on every instruction
execution. In the signal handler, I computed the offset of the instruction in
the TEXT region and used that as an array index to determine which long value
to increment. When I was done, I went through the array and for every nonzero
counter value I called a disassembly subroutine to print the instruction,
followed by a histogram of its frequency. The results were amazingly
informative, but the method was not practical for anything other than small
subroutines. The slowdown in performance was over 10,000:1. Running in an
unprotected MS-DOS environment, I suspect this ratio would be substantially
smaller, since most of the instructions expended were done in the
trace-trap-to-operating-system interface and the
operating-system-to-user-signal-routine interface.


Summary


In any performance measurement, it is essential to understand what is being
measured, how accurately it is being measured, and the reliability of the
resulting numbers. You need to know such important information as the
resolution of the timer and its reproducibility over time. You need to
understand the influence both of external events and local events (such as
page faulting in your program) on the measurements.
Having established the accuracy of your measuring tools, you must then
determine what, exactly, you are measuring. Finally, you must determine
whether this measurement is actually meaningful, in the sense that this value
will help you optimize your program. Only then will you know what all those
numbers mean.
What you cannot tell without serious investigation is what should be measured,
because those factors are often specific to a particular architecture (such as
the 80x86) or a particular instance of that architecture (such as a 486 DX50
with 256K outboard cache). You may be measuring instruction cycles, and what
you should be measuring is the cache hit ratio. Without understanding the
global system issues, the significance of the numbers is unknown.
In my own profiling tool, I recorded the time spent in the profiling tool
itself. Because of the algorithm I implemented to handle recursive procedures,
the time spent doing performance measurement was almost 25 times the time
spent executing the program to be measured! A rewrite of the algorithm reduced
this to a factor of around ten. So even performance-measurement tools have to
be measured!


Running Mean and Standard Deviation


If you use the conventional algorithm for computing mean and standard
deviation, for n samples
you need n locations to hold the samples so that the variance can be computed.
The standard deviation is the square root of the variance, s^2. The
traditional formula is shown in Example 1(a). We have to compute the mean, x,
then go back and compute for each sample the square of the difference between
the mean and the sample.
There is a simple way to keep a running mean and standard deviation without
having to retain all the data. The variance formula can be rewritten as shown
in Example 1(b). Each time we get a new sample we can add to our running
totals. We start with the variables sum, sum2, and the counter n. Simply
compute, for each new value sample, the sum and sum-of-squares, as shown in
Figure 2. Any time you need the variance or standard deviation, you can
instantly compute them, as shown in Figure 3. This makes it easy to run a very
large sample on a small machine. It also makes your measurement program less
susceptible to paging faults. If you don't need a gigantic array to store the
data, you won't page fault while storing into it.
Figure 2: Sum and sum of squares computation.

 float var;
 float sd;
 float sum = 0.0;
 float sum2 = 0.0;
 float sample;
 int n = 0;
 ...
 compute 'sample'

 ...
 n = n + 1;
 sum = sum + sample;
 sum2 = sum2 + (sample * sample);

Figure 3: Variance and standard deviation from running sums.

 var = ( (float) n * sum2 - (sum * sum) ) / (float) (n * (n - 1));
 sd = sqrt(var);



Run-coded Modal Statistics


Computing modal distributions normally requires keeping a lot of data, such as
one table entry per sample. When I did this I was working on small machines,
so the ability to keep long runs in memory was limited by the machine size. I
got around this by using a run code. Of course, there was some variation in
the data, so a conventional run code wouldn't work. Instead, I used a
thresholding feature; I kept a running average of the current run, and any
value within 5 percent was considered to be part of the run. Whenever the
value changed beyond this limit, a new run was started with a new average
being kept. A program fragment for the code that computes mean, standard
deviation, and time-line runs is shown in Listing One, page 106.
--J.M.N.


Probable Error


Errors in measurements fall into two categories: accidental and systematic. A
useful measure for physicists is the probable error (p.e.). This is a value
such that half of the errors are greater than this value and half are less. It
is computed as shown in Example 2(a). The value e[i] is the error term, the
difference between the mean, x and the sample. The constant 0.8453 is derived
from the mathematics of the standard distribution. This gives the probable
error of any one measurement. The probable error of the mean value x, P.E., is
given
by the formula shown in Example 2(b). Thus, the larger the number of samples
the more credence the number has. If the probable error is too large, the
value can be disregarded.
--J.M.N.

_PROFILING FOR PERFORMANCE_
by Joseph Newcomer


[LISTING ONE]

 {
 float sum; // sum of values
 float sum2; // sum-of-squares
 float var; // variance
#define NUM_RUNS 10 // choose a number
 struct {float v;
 int count;
 } runs[NUM_RUNS]; // run table
 int n_run = 0; // current run entry in table
 int n; // number of samples so far
 int i; // index for printing run table
 float sample; // current sample

 float suma; // sum of run
 int na; // number of samples in run

 // initialization
 sum = 0.0;
 sum2 = 0.0;
 n = 0;

 suma = 0.0;
 na = 0;

 /*******************************************************************
 * The loop below generates samples and keeps the averages in a *
 * run-compressed sequence. The assumption here is that the values *
 * should not deviate much from the mean. Used in timer calibration*
 ********************************************************************/
 for(;;)

 { /* runcode */
 if(...) // test if end of samples
 sample = ... ; // compute sample here
 else
 break; // done with computation

 /*********************************************
 * Compute sum and sum-of-squares *
 *********************************************/
 n++;
 sum = sum + sample;
 sum2 = sum2 + (sample * sample);

 if(n > 1)
 { /* add to run? */

 /*********************************************************
 * We add to the current run if the sample is within 5% *
 * of the current average *
 *********************************************************/
 if ( (fabs(runs[n_run].v - sample)) < 0.05 * suma / na )
 { /* run code */
 /*************************************************
 * We are within 5% of the current run average. *
 * Increment the count of elements in the run. *
 *************************************************/

 runs[n_run].count++;

 /*****************************************************
 * Factor the new value into the running average. *
 * Increment the number of elements in the average. *
 *****************************************************/
 suma = suma + sample;
 na++;

 /*************************************************
 * Store the new average as the current average *
 *************************************************/
 runs[n_run].v = suma / na;
 } /* run code */
 else
 { /* start new run */
 /*************************************************
 * We are going to start a new run. *
 * The new average is the sample value *
 *************************************************/
 n_run++;
 runs[n_run].v = sample;
 runs[n_run].count = 1;

 na = 1;
 suma = sample;
 } /* start new run */
 } /* add to run code? */
 else
 { /* first one */
 suma = sample;
 na = 1;

 runs[n_run].count = 1;
 runs[n_run].v = suma;
 } /* first one */
 } /* runcode */

 /****************************************************************
 * We have completed the measurements. Print out the mean, *
 * variance, and standard deviation of the samples *
 ****************************************************************/
 var = ( (float) n * sum2 - (sum * sum) ) / (float) (n * (n - 1));
 printf("mean = %g, var = %g, s.d. = %g\n", sum / n, var, sqrt(var));

 /****************************************************************
 * Print out the run table *
 ****************************************************************/
 for(i=0; i <= n_run; i++)
 { /* run distribution */
 printf("[%2d] %g (%d)\n", i, runs[i].v, runs[i].count);
 } /* run distribution */
 }





January, 1993
PROGRAMMING PARADIGMS


Stephen Wolfram: Multiparadigm Man




Michael Swaine


Stephen Wolfram is very bright. The kind of bright that impresses Nobel
laureates like Richard Feynman. The 33-year-old full professor at the
University of Illinois has been making original contributions in physics since
his first published paper at age 16. His pioneering work in cellular automata
opened up the study of complex phenomena.
He's also impatient. Deciding that he needed a better tool for doing
mathematics than what then existed, he built one, much as Donald Knuth built
TeX so his published papers would be more attractive. Unlike Knuth, however,
Wolfram erected his no-compromises system in an impressively short time while
simultaneously putting together a company to support and market it. In four
years, Wolfram Research has grown to over 100 employees, and Mathematica,
Wolfram's "system for doing mathematics," is in use on Sun workstations, NeXT
machines, Macintoshes, and PCs in over 70 countries.
This two-part column chronicles the conversation Ray Valdes and I had with
Wolfram in the DDJ offices on a variety of topics, from programming paradigms
to the thought processes of mathematicians.
DDJ: It's hardly a new idea to incorporate a programming language into an
application program. Lotus 1-2-3 has its macro language and dBase its database
language. But the language aspect of Mathematica strikes us as considerably
more ambitious. It's even taught as a first language in some college programs.
Did you set out to create an application program or a programming language?
SW: I viewed the intellectually most significant [part] of the enterprise as
being the creation of the elements of a programming language.
DDJ: And yet you sell it as an application.
SW: It has to do with the practical problem of introducing programming
languages. Programming languages are a surprisingly slow-moving field. Fortran
was invented before I was born and C is more than 20 years old now. It's kind
of strange, in a world where computer hardware and the uses that computers are
put to have advanced so rapidly, that programming languages have advanced so
slowly. If you have some ideas about how programming languages should be set
up, and you want people to actually try using them, there's a question of how
you get [them] to do that. Once people have gotten used to using a programming
language, you have to do an awful lot to convince them that they should switch
to something else. We were lucky. People started off using Mathematica like an
extremely enhanced calculator. And if you get a few hundred thousand people
using your thing for whatever reason, then you have a reasonable community to
work on in developing the language for its own sake.
DDJ: What does "for its own sake" mean?
SW: If you ask yourself, "What are the languages that have a chance in the
next century?" there aren't very many of them. And I think that Mathematica
has more than a chance. That means that we have an example of a language that
has pretty modern ideas--it is certainly a big step beyond C and Fortran--and
that is already widely used today. One of the things that I consider an
exciting direction is to what extent we can expand the use of the language
itself, independent of the application side of Mathematica. We've considered
making a thing that will probably be called M, that is essentially Mathematica
without the mathematics.
DDJ: You've considered it. But how seriously?
SW: We've built little Ms. There is no doubt that Mathematica without the
mathematics will exist one day. The main issue for us is to figure out how it
makes sense to distribute the thing. Right now there are particular
application areas where people have written programs in Mathematica that don't
use the mathematical side of Mathematica, and those are the places where you
start. But I believe that every application program should have a language
underneath it, and it would be great if that language was a modern, highly
capable language, not an imitation of Basic or some specially crafted language
that just does things for databases, for example. That's the niche I'm
interested in seeing the Mathematica language go into in the future.
DDJ: You mean extension languages, like macro or scripting languages?
SW: Exactly. The issue is, if you're programming a spreadsheet, would you
prefer to be programming in a sort of Mathematica language or in Lotus macros?
And the answer is not too hard to figure out. Lotus macros are fine if you're
just doing a few simple things, but if you actually want to write a serious
program, they're far from fine. My interest in this direction is to see if one
can use the Mathematica language as a basis for a wide range of different
kinds of application programs. And the symbolic nature of the Mathematica
language is crucial in that.
DDJ: How so?
SW: If you have a word processor and you want to represent a paragraph, that's
an easy thing to do in a symbolic language. It's a pretty hard thing to do if
you're stuck with a language like Basic. The other point is that it's only in
the next couple of years that it becomes realistic to have a language as
sophisticated as Mathematica underneath applications. By the time the M
language is likely to be out, it will be a small fraction of the size of the
typical application. Mathematica itself is a pretty huge program, but the
language part of it is not so big.
DDJ: Could we go a bit more into the virtues of symbolic languages like
Mathematica vs. procedurally based languages like Basic?
SW: When you're working with a procedurally based numerical language, there's
a lot of mysterious hidden state associated with what's happening. For
example, you have a standard program written in C, and you have various data
structures, and you have subroutines that call each other and pass pointers to
these data structures. If you want to look at one subroutine on its own and
see what it's doing, [to] feed this kind of input in and see what comes out,
that's pretty difficult to do in C. But in a symbolic language there's no
[problem], because whatever input might be given, you can always explicitly
write it down; whatever output might come out, you can always explicitly see
it. It's always the same kind of object, always a symbolic data structure that
you can explicitly see. There's no idea that it's some sort of mysterious
pointer encoded in such and such a way.
DDJ: Don't symbolic languages have a name-of operator, a reference operator?
SW: If you're using that stuff when you're programming in Mathematica, you're
almost certainly doing something wrong. What's great about programming in
Mathematica is that you don't have to think about any of that stuff.
Everything you pass around is explicitly right there. It's essentially passed
by value.
DDJ: But there is a reference construct.
SW: Not really. There's no need. Now in Mathematica there are ways of passing
structures unevaluated. And there are some purposes for which--for example,
when you do assignments--you need that. The left-hand side has to remain
unevaluated; otherwise the wrong thing will happen. I would love to figure out
a way to avoid having to do that. I haven't succeeded. But in doing general
programming in Mathematica you shouldn't ever have to keep things unevaluated.
DDJ: We've worked some with Mathematica, but many of our readers haven't. We
really should step back and talk about what sort of language Mathematica is:
the ideas and paradigms it embodies and where it came from. Maybe you could
tell us about the intellectual roots of Mathematica.
SW: I got to do a test run of some of the ideas in Mathematica in a system
called SMP that I built in the late '70s or early '80s. It was more oriented
toward computer algebra; it wasn't as ambitious a system as Mathematica. What
I did there was a very educational experience. I tried to impose on people
what I thought to be a good, but rather an unusual model of programming.
DDJ: What was that?
SW: The model of programming was that of pattern matching and transformation
rules. Pretty much everything in that system was done with pattern matching
and transformation rules. If you were going to write programs in SMP they
pretty much had to be in that paradigm.
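The rule-based model can be caricatured in a few lines of C++. This is a toy string-rewriting sketch for illustration only, nothing like SMP's or Mathematica's structural pattern matcher, but it captures the evaluation loop: rules are applied over and over until the expression stops changing.

```cpp
#include <string>
#include <vector>
#include <cassert>

// A toy rewrite rule: replace every occurrence of lhs with rhs.
struct Rule { std::string lhs, rhs; };

// Apply rules repeatedly until the expression reaches a fixed point,
// loosely mimicking how a transformation-rule system keeps applying
// rules until no rule matches. (Caution: rules like x -> f(x) would
// never terminate; real systems need more discipline than this.)
std::string rewrite(std::string expr, const std::vector<Rule>& rules)
{
    bool changed = true;
    while (changed) {
        changed = false;
        for (const Rule& r : rules) {
            std::string::size_type pos;
            while ((pos = expr.find(r.lhs)) != std::string::npos) {
                expr.replace(pos, r.lhs.size(), r.rhs);
                changed = true;
            }
        }
    }
    return expr;
}
```

For example, with the single rule f(x) -> x, the expression f(f(x)) rewrites to f(x) and then to x before the loop stops.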
DDJ: The late '70s and early '80s would have been about the time Clocksin and
Mellish were bringing Prolog to a wider audience with their book. Were you
influenced by Prolog at that time?
SW: No, actually I wasn't. I had never written a program in Prolog. I'd read
the manual. The main thing that I was trying to do was to imitate what seemed
to be what happens when you do mathematical calculations; that is, that you
are continually applying rules of mathematics. The transformation-rule model
has not been widely adopted. Prolog was an attempt to adopt it.
DDJ: An attempt? You consider it a failure?
SW: Prolog [has a] fatal flaw. A language where fundamental operations give
you no clue as to how long they might take or what's going on isn't going to
cut it. You have to give the user a reasonable conceptual model of what the
computer is doing. It doesn't matter if they're a factor of ten wrong in
knowing how many instructions it's going to take, but it does matter if they
can't estimate whether this is an exponential time algorithm or something
else.
DDJ: How did SMP influence Mathematica?
SW: One of the ideas I had in SMP was, "Figure out a good programming paradigm
and just stick to it." This was a mistake. I think it's not a trivial mistake.
You might think, "If there is a natural way to specify how programs should
work, one that maybe hooks into how the brain processes ideas about things,
then you should just figure out that way and stick to it." But it turns out
that while there are some kinds of programs
that can be written very nicely using this [transformation rule] paradigm,
there are others that are horrendous to write using it, but that are
straightforward to write using, say, procedural programming or functional
programming.
DDJ: So you built multiple paradigms into Mathematica?
SW: What I decided to do in building Mathematica, and have been very happy
with, is to admit that there is going to be more than one paradigm for writing
programs. Then the trick is to put in those paradigms in such a way that the
edges fit together properly, so that you can move easily from one paradigm to
another. So you can have pure functions and have them interact with
transformation rules and interactive procedural programming and so on, and
have a fairly seamless interface.
DDJ: In the development of Mathematica, were you explicitly thinking in those
terms?
SW: Oh yeah. The development of Mathematica was in some ways boring because it
was extremely deliberate. I knew what I was trying to do and what the steps
were. It hasn't been as educational as it might have been because it's gone
pretty much as expected. It has been interesting, by the way, to look at the
programs people actually write in Mathematica. The idea of multiple paradigms
really works out, because if you look at the programs, there are some that are
20,000 lines of transformation rules and that work just great in that form,
and there are others that are a bunch of functional programming constructs and
again work just great in that form, and then there are still other things that
people end up writing as procedural programs, though it's rarely a good idea.
DDJ: What kinds of design ideas went into the writing of Mathematica?
SW: One way I tried to design Mathematica was the following: Think about
computations that one wants to do, and think about well-defined chunks of
those computations that one could give a definite name to and do lots of
times. A very simple one might be nest, a function in Mathematica that is sort
of an iteration construct. There are a lot of programs one writes where one
wants to do that, so it makes sense to give that thing a definite name, and
say, "This is a chunk of computation that this language provides a primitive
for doing." In a sense it's like [making] up the instruction set for a RISC
machine. So [in developing] Mathematica I wrote a lot of sample programs in
Mathematica, and my principle was if I keep on having to use an idiom it
should have a name.
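The nest idiom is easy to mimic outside Mathematica; here is a minimal C++ sketch of the same named "chunk of computation," written for illustration:

```cpp
#include <cassert>

// Nest(f, x, n): apply f to x, n times -- the iteration "chunk"
// that Wolfram describes naming as a language primitive.
template <typename F, typename T>
T Nest(F f, T x, int n)
{
    for (int i = 0; i < n; ++i)
        x = f(x);
    return x;
}
```

For instance, Nest with a doubling function applied ten times to 1 yields 1024.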
DDJ: That's one design principle. Were there others?
SW: One principle is to keep the number of fundamentally different ideas
fairly small, and then with each of those ideas to pin a lot of actual
elements of the system on top of [it], because if you pin enough stuff on top
of an idea, people are going to have to understand that idea to use the
system. One of the mistakes that one has to fight in designing is to say, "For
this particular thing we want to do, maybe there's a nice mechanism we can
make up, a special mechanism, say, for the way Poisson series work." This will
be a big mistake, because nobody will understand this mechanism. But if you
have that mechanism be the mechanism that's used for all list-like objects,
say, then anybody who can use the system is going to understand the basic
mechanism. Moreover, their understanding of the mechanism is going to grow if
they see it used in a whole variety of different places.
DDJ: Is there anything you'd do differently if you were writing Mathematica
today?
SW: Were I to build Mathematica again I would probably have 5 percent less
stuff in it.
DDJ: What would you leave out?
SW: I'll give you an example of something that I put into Mathematica that I
thought was a good idea but that turned out not to be. It was this function
called short. It just has to do with printing out expressions...
DDJ: With the head and tail kind of thing?
SW: Yeah, it's actually a bit cleverer than that. It goes through the
expression [as] a tree and it has a certain amount of energy that starts off
at the top of the tree, and it allocates the energy in different ways as it
goes down the branches of the tree. It does a fairly nice job of showing you
the structure of the expression with some little ellipses. As I say, it seemed
like a good idea. The only catch is, nobody uses it. I haven't used it in
eons. Why do people not use it? I don't know. But that's an example of a
"Designers Beware."
DDJ: Can you ask users for their feedback about design features like that
during the design process?

SW: If you ask a user, "What do you think of the design of such and such an
aspect of Mathematica?" the chances are that you won't get a sensible answer.
If the person actually uses it, they'll say, "Yeah, I can get my work done
with it." And they will have adapted to the language to make it work for what
they want done. If you talk to people who work on the theory of
programming-language design, they have all kinds of things to say, but I don't
believe their theories, so I'm not interested in them.
DDJ: You've spent a significant amount of time doing language design. What
does a language designer really spend the bulk of the time doing?
SW: Almost all the time is spent trying to simplify the construct one comes up
with. You start off with this idea about what capability you want it to have.
Then the trick is, find the simplest, most transparent way to represent that.
That often takes an incredibly long time, and it's kind of frustrating,
because at the end, you [wind up doing] it the obvious way. But the problem is
finding the obvious way.
Next month, Stephen Wolfram talks about science, programming, business, and
why mathematicians don't like him.


January, 1993
C PROGRAMMING


Generic D-Flat++ Windows


 This article contains the following executables: DFLT15.ARC D15TXT.ARC


Al Stevens


This month continues the development of D-Flat++ with the base DFWindow class
from which all other windows derive. Because this is the first issue of the
new year, I'll begin by reviewing what D-Flat++ is and where it came from.
About two years ago I went looking for a C function library that implemented
the CUA interface for DOS text-mode applications. CUA (IBM's Common User
Access) is the common term for the menu-bar/pop-down-menu/dialog-box,
mouse/keyboard user interface common
to so many applications and operating environments. Windows and Presentation
Manager are CUA-compliant in their graphical user interfaces. Most Microsoft
and Borland text-mode programs use the CUA conventions. Even if it is not
perfect, CUA could have the beneficial effect of killing all those stupid
look-and-feel lawsuits. More about them later. If your application uses CUA,
then no one--not even those nit-picking magazine reviewers--can criticize the
user interface. Good or bad, it's a standard.
In my search, I found a few DOS text-mode CUA libraries, but each of them was
either more or less than what I needed. So, with only a smattering of Windows
programming experience, I decided to build such a library, one that uses an
event-driven, message-based programming model similar to the one Windows uses.
Thus, D-Flat was born. It started out to be a simple, small package to support
my requirements, and I decided to publish it in the column. Reader response
was positive, and the library grew accordingly, going through 15 versions and
taking a year and a half to publish. Many readers asked about a C++ version,
and those requests launched an experiment to see how the event-driven,
message-based model fits into the C++ object-oriented paradigm. D-Flat++ was
the result.
The previous two columns dispensed with the DF++ desktop and the
platform-dependent device objects. This month we look at the portability layer
to normalize the code for different compilers and the definition of the base
DFWindow class.


The Portability Layer


Listing One, page 136, is dflatdef.h, a header file that defines global values
and converts Borland C++ idioms to Microsoft C++. The WndType enumerated type
identifies the window types for the various window classes in the DF++ class
hierarchy. You can see from the list which window types are in the design so
far. I'll probably add to it as the project continues. The Bool enum and the
min and max macros are defined here. The statements that follow allow me to
compile the DF++ that I developed under Borland C++ with Microsoft C++. The
differences are in function calls specific to the PC. The two compilers are
compatible with respect to pure C++ code. This portability layer assures that
the FP_SEG, FP_OFF, and MK_FP macros work the same, as do the functions that
read and write I/O ports, access BIOS keyboard functions, and get and set
interrupt vectors. The two compilers implement interrupt functions differently, and the
INTERRUPTARGS macro supports the difference, as do some compile-time
conditionals wherever the program declares pointers to interrupt functions.
The peek and poke macros in dflatdef.h provide those functions for Microsoft C
users.
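The far-pointer macros in Listing One are pure bit arithmetic on a 16:16 segment:offset value. Stripped of the far keyword (and of real-mode addressing), the same math can be shown with plain unsigned longs; the function names here are invented for illustration:

```cpp
#include <cassert>

// The seg:off arithmetic behind FP_SEG, FP_OFF, and MK_FP:
// the segment lives in the high 16 bits, the offset in the low 16.
unsigned fpSeg(unsigned long fp) { return (unsigned)(fp >> 16); }
unsigned fpOff(unsigned long fp) { return (unsigned)(fp & 0xFFFFu); }
unsigned long mkFp(unsigned seg, unsigned off)
{
    return ((unsigned long)seg << 16) | off;  // note the OR, not a cast
}
```

So the classic text-video address B800:0010 packs into 0xB8000010 and unpacks back into its segment and offset.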
Later, when I port DF++ to other C++ compilers, this file is where the
majority of the changes will occur. Also, as the project develops, I'll put
any new portability dependencies here.


The DFWindow Class


Listing Two, page 136, is dfwindow.h, the header file that defines the
DFWindow base class, the class from which all DF++ windows will derive. The
class encapsulates all that is common to all windows, regardless of their
type. The header file defines the values for a window's attribute flags. Each
flag is its own bit position, and they specify whether a window may be moved
or resized; whether it has a border, title, control box, min box, max box,
shadow, menu bar, status bar, or scroll bars; whether it saves the video
memory it occupies or simply repaints itself on request; whether its display
is clipped to the client area of its parent; and whether or not it is a
component of its parent's frame, such as a scroll bar or status bar.
The dfwindow.h file defines window-display characteristics such as the
structure for specifying window colors, the colors for a window's shadow, and
the text characters that form a window's frame. The DFWindow class definition
is the base class for all windows. It includes the data members that specify
the window type, its size and screen position, title, attributes and state
words, video save buffer address, colors, and the address of the window's
parent window. There is a list head for a linked list of child windows, and
next and previous pointers to adjacent sibling windows.
Then there are the member functions. A DFWindow object has several
constructors which allow users to instantiate a window by combinations of
parameters that include the title, size, position, and parent. There are
functions that allow derived windows to write characters and strings to the
window's video space. There are functions that both return the window's
position and size, colors, parent, attributes, state, and type, and that
change those things. Lastly, there are the API functions, which are the
equivalent of messages in the D-Flat message-based programming model.
Listing Three, page 138, is dfwindow.cpp, the source code for the DFWindow
member functions that are not inline. These functions construct and destroy
the window, open and close it, display and hide it, and process its messages,
including moving, resizing, minimizing, maximizing, restoring the window, and
processing the keyboard and mouse messages that are passed on by derived
window classes.
You must understand the relationships among constructing, opening, closing,
and destroying a window. A program constructs a window by
instantiating it as an object--either as a global object, an object within the
scope of a brace-surrounded block, or with the C++ new operator. The window's
constructor also opens the window. The program destroys the window by allowing
it to go out of scope or, in the case of a window constructed with the new
operator, by using the C++ delete operator. The window's destructor also
closes the window. The program can close a constructed window without
destroying it. The program can then reopen the same window without invoking
the window's constructor. A constructed window is one that has been
instantiated. An open window is one that is available for use by the program.
This distinction is necessary because of the CUA convention that allows the
user to close a window with a keyboard or mouse action. This could happen in a
manner asynchronous to the program's instantiation of the window and the
window going out of scope. It is not possible to define a C++ class whose
objects can be created only with the new operator. Therefore, because DF++ cannot
enforce a convention that says users will only instantiate windows with the
new operator, DF++ cannot assume that it can always use the delete operator
when the user closes the window. Furthermore, there is no way for the system
to know where the using program might save the address of the instantiated
window, so there is no way to set it to NULL. It would be cumbersome to
require a window to notify its software "owner" that it has been destroyed
from within as the result of a user action. For this reason, DF++ implements a
model where the window is constructed and the window may be closed prior to
destruction. The C++ object still exists, but the logical window is closed. A
window knows that it is being closed and will receive no further event
messages, even though the object is still within scope.
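As a hypothetical simplification (not the actual D-Flat++ source), the lifecycle rules above reduce to a class whose constructor opens, whose destructor closes only if the window is still open, and whose Open and Close can be called independently in between:

```cpp
#include <cassert>

// A stripped-down model of the D-Flat++ window lifecycle: construction
// opens the window, destruction closes it if still open, and the user
// (via keyboard or mouse) may close it at any time before destruction.
class Window {
    bool open_ = false;
public:
    Window()  { Open(); }              // constructing also opens
    ~Window() { if (open_) Close(); }  // destroying closes if needed
    void Open()  { open_ = true; }
    void Close() { open_ = false; }    // e.g. user clicks the control box
    bool IsOpen() const { return open_; }
};
```

This is the behavior the pop-down menu system exploits: the menus are constructed once with the menu bar, then merely opened and closed as the user selects them.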
To further complicate matters, a constructed open window may be visible or it
may be hidden. A hidden window is not in the view of the user and will,
therefore, receive no device-related event messages.
You will learn in later columns how the pop-down menu system takes advantage
of this behavior. The system constructs all of the pop-down menus when it
constructs the menu bar. Then it opens and closes the pop-down menu windows
when the user selects them.


How to Get the Source Code


D-Flat++ is available as an early version from the DDJ Forum libraries on
CompuServe and M&T Online or directly from me. Send a stamped, addressed
diskette mailer and a formatted 360K or 720K diskette to me at Dr. Dobb's
Journal, 411 Borel Ave., San Mateo, CA 94402-3522. Specify that you want
D-Flat++. If you also want the D-Flat C library, specify that, too. The
software is free, but if you wish, you can participate in our Careware program
by including a dollar, which I will donate to the Brevard County Food Bank. A
word now about that.
Your support of careware in the past year helped the Food Bank to provide
much-needed assistance this past summer to the victims of Hurricane Andrew in
southern Florida. There are a lot of people still staying in tents and
wondering where they will live and where they will work. Imagine owning a
mortgaged parcel of ground with a pile of rubble in a neighborhood of rubble
as far as the eye can see. If you are one of the lucky ones, you have an
insurance check in your pocket. How do you rebuild? Why would you? You lived
there because there was a cooperative society consisting of neighbors, shops,
schools, and jobs. Now it's all gone, reduced to debris. I went there and saw
it. I do not have the words to describe it. The effects of this tragedy will
be felt for a long time to come. The folks at the Food Bank asked me to thank
you for your support of their effort to make the recovery just a little bit
easier.


Book Report: C++ Programming Style


C++ Programming Style by Tom Cargill (Addison-Wesley, 1992) is yet another
addition to the growing body of C++ literature, this one from the new
generation of good C++ books. The author takes a unique approach. He presents a number of
programs taken from what we can presume to be the earlier generation of C++
books when things were not quite so good, but from the works of some respected
authors. Then he proceeds to critique the code, pointing out where it is
flawed, identifying what needs to be fixed, and rewriting it using the C++
styles he recommends. Each example makes a point about a particular issue of
style, and every one makes good sense.
The examples show how C++ programmers can misuse such things as inheritance,
virtual functions, and function overloading, applying them in ways that might
work but that do not appropriately model what the objects represent. You learn
the essential differences between value and behavior, implementation and
interface, and data and functions in a class design. You learn the inherent
strength of a loosely coupled class design. You learn to use default arguments
instead of function overloading. You are lured away from clever operator
overloading. The author develops one-liner rules that will lead you into a
style of programming that adds durability and clarity to your design. For
example, "When overloading operator =, remember x=x." If that doesn't make
sense, then you need this book.
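The x=x rule is about self-assignment: an operator= that releases its own resource before copying will read freed memory when source and target are the same object. A sketch with a hypothetical string class (not an example from the book):

```cpp
#include <cstring>
#include <cassert>

// "When overloading operator=, remember x=x": guard against the case
// where source and target are the same object, or the assignment will
// delete the very buffer it is about to copy from.
class Str {
    char *p;
public:
    Str(const char *s) : p(new char[std::strlen(s) + 1])
        { std::strcpy(p, s); }
    Str(const Str &other) : p(new char[std::strlen(other.p) + 1])
        { std::strcpy(p, other.p); }
    ~Str() { delete [] p; }
    Str &operator=(const Str &other)
    {
        if (this != &other) {          // the x = x guard
            delete [] p;
            p = new char[std::strlen(other.p) + 1];
            std::strcpy(p, other.p);
        }
        return *this;
    }
    const char *c_str() const { return p; }
};
```

Without the guard, `a = a` deletes a's buffer and then copies from the dangling pointer; with it, self-assignment is harmless.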
One of the most valuable of Cargill's rules is the one that says, "Do not try
to learn the semantics of multiple inheritance from your compiler." The author
cites his experience with three compilers, none of which worked properly with
virtual base classes. During this time when C++ is not formally defined and
the compilers are subject to the vendor's interpretation of whatever ambiguous
language definition does exist, that rule might be simplified to say, "Do not
try to learn anything from your compiler."
Cargill is a good influence on C++ programmers. This book moved me to examine
some of my own work and change it. The examples went right to the heart of
some of my habits, revealing coding idioms better expressed in other ways.
This is highly recommended reading. No matter how good a C++ programmer you
are, you will be a better one after you read this book.


Clarence and Friends


Write to your Congressman or woman. Insist that the next confirmation hearing
for a Supreme Court justice include an in-depth investigation into the
nominee's computer literacy. We need folks on the bench who understand the
complexities and intricacies of computers and the law. I read today that Lotus
won another round in its look-and-feel lawsuit against Borland's alleged
infringement of the long-since obsolete Lotus user interface. Borland promises
to press on with appeals. Lotus vows to carry it all the way to the Supreme
Court. When it gets there, wouldn't it be just dandy if there was someone on
the bench who understood what the heck they were talking about?


_C PROGRAMMING COLUMN_
by Al Stevens


[LISTING ONE]

// -------- dflatdef.h

#ifndef DFLATDEF_H
#define DFLATDEF_H

// -------- window class types
enum WndType {
 DFlatWindow,
 ApplicationWindow,
 TextboxWindow,
 FrameWindow,
 ScrollbarWindow,
 MenubarWindow,
 ListboxWindow,
 PopdownWindow,
 EditboxWindow,
 DialogboxWindow,
 PushButtonWindow,
 RadioButtonWindow,
 StatusbarWindow
};
enum Bool {False, True};
inline int max(int a, int b) { return a > b ? a : b; }
inline int min(int a, int b) { return a < b ? a : b; }

// ----- portability layer for MSC C++
#ifdef MSC

#define keyhit kbhit
#undef FP_OFF
#undef FP_SEG
#undef MK_FP
#define FP_OFF(p) ((unsigned)(p))
#define FP_SEG(p) ((unsigned)((unsigned long)(p) >> 16))
#define MK_FP(s,o) ((void far *) \
 (((unsigned long)(s) << 16) | (unsigned)(o)))
#define outp _outp
#define inp _inp
#define bioskey _bios_keybrd
#define getvect(v) _dos_getvect(v)
#define setvect(v,f) _dos_setvect(v,f)

#define INTERRUPTARGS void
#else
#define INTERRUPTARGS ...
#endif

#define poke(a,b,c) (*((int far*)MK_FP((a),(b))) = (int)(c))
#define peek(a,b) (*((int far*)MK_FP((a),(b))))

#endif








[LISTING TWO]

// ------------ dfwindow.h

#ifndef DFWINDOW_H
#define DFWINDOW_H

#include <stdio.h>
#include <string.h>
#include <dos.h>
#include <stdlib.h>

#include "dflatdef.h"
#include "rectangl.h"
#include "strings.h"

// -------- window attribute flags
const int MOVEABLE = 0x0001;
const int SIZEABLE = 0x0002;
const int BORDER = 0x0004;
const int TITLEBAR = 0x0008;
const int CONTROLBOX = 0x0010;
const int MINBOX = 0x0020;
const int MAXBOX = 0x0040;
const int SHADOW = 0x0080;
const int SAVESELF = 0x0100;
const int NOCLIP = 0x0200;
const int MENUBAR = 0x0400;
const int STATUSBAR = 0x0800;
const int VSCROLLBAR = 0x1000;
const int HSCROLLBAR = 0x2000;
const int FRAMEWND = 0x4000;

// --------------- Color Macros
enum Colors {
 BLACK,
 BLUE,
 GREEN,
 CYAN,
 RED,
 MAGENTA,
 BROWN,
 LIGHTGRAY,
 DARKGRAY,
 LIGHTBLUE,
 LIGHTGREEN,
 LIGHTCYAN,
 LIGHTRED,
 LIGHTMAGENTA,
 YELLOW,
 WHITE
};

// ------ window shadow attributes

const unsigned char ShadowFG = DARKGRAY;
const unsigned char ShadowBG = BLACK;

// ----- minimized icon dimensions
const int IconWidth = 10;
const int IconHeight = 3;

// --------------- border characters
const unsigned char FOCUS_NW = '\xc9';
const unsigned char FOCUS_NE = '\xbb';
const unsigned char FOCUS_SE = '\xbc';
const unsigned char FOCUS_SW = '\xc8';
const unsigned char FOCUS_SIDE = '\xba';
const unsigned char FOCUS_LINE = '\xcd';
const unsigned char NW = '\xda';
const unsigned char NE = '\xbf';
const unsigned char SE = '\xd9';
const unsigned char SW = '\xc0';
const unsigned char SIDE = '\xb3';
const unsigned char LINE = '\xc4';

// ----------------- title bar characters
const unsigned char CONTROLBOXCHAR = '\xf0';
const unsigned char MAXPOINTER = 24;
const unsigned char MINPOINTER = 25;
const unsigned char RESTOREPOINTER = 18;

// ----------- window states
enum WndState {
 OPENING,
 ISRESTORED,
 ISMINIMIZED,
 ISMAXIMIZED,
 ISCLOSING,
 CLOSED
};
// ---------- window colors
struct Color {
 Colors fg, bg; // standard colors
 Colors sfg, sbg; // selected text colors
 Colors ffg, fbg; // window frame colors
 Colors hfg, hbg; // highlighted text colors
};
class Application;
class StatusBar;
class PopDown;

class DFWindow {
protected:
 WndType windowtype; // window type
 int prevmouseline; // holders for
 int prevmousecol; // mouse coordinates
private:
 String *title; // window title
 // -------------- window attributes
 int restored_attrib; // attributes when restored
 Bool clipoverride; // True to override clipping
 Rect restored_rc; // restored state rect
 // ------- for painting overlapping windows

 void PaintOverLappers();
 // ----- control menu
 PopDown *ctlmenu;
 void OpenCtlMenu();
 void DeleteCtlMenu();
 // --------- common window constructor code
 void InitWindow(char *ttl,
 int lf, int tp, int ht, int wd, DFWindow *par);
 void InitWindow(int lf, int tp, int ht, int wd,
 DFWindow *par);
 virtual void SetColors();
 void Enqueue();
 void Dequeue();
 Bool ClipParent(int &x, int y, String *ln);
 void WriteChar(int ch, int x, int y,
 Rect &rc, int fg, int bg);
 void WriteString(String &ln, int x, int y,
 Rect &rc, int fg, int bg);
 Rect PositionIcon();
 friend class StatusBar;
protected:
 // --------------- video memory save data
 char *videosave; // video save buffer
 Bool visible; // True = window has been shown
 int attrib; // Window attribute flags
 Bool DblBorder; // True = dbl border on focus
 Color colors; // window colors
 WndState windowstate; // Restored, Maximized, Minimized, Closing
 Rect rect; // window coordinates (0/0 to 79/24)
 char clearch; // for clearing the window
 // ------ previous capture focus handle
 DFWindow *prevcapture;
 // -------------- window genealogy
 DFWindow *parent; // parent window
 // -------- children
 DFWindow *first; // first child window
 DFWindow *last; // last child window
 // -------- siblings
 DFWindow *next; // next sibling window
 DFWindow *prev; // previous sibling window
 // -------- system-wide
 void NextSiblingFocus();
 void WriteClientString(String &ln,int x,int y,int fg,int bg);
 void WriteWindowString(String &ln,int x,int y,int fg,int bg);
 void WriteWindowChar(int ch,int x,int y,int fg,int bg);
 void WriteClientChar(int ch,int x,int y,int fg,int bg);
 // ------------- client window coordinate adjustments
 virtual void AdjustBorders();
 int BorderAdj; // adjust for border
 int TopBorderAdj; // adjust for top border
 int BottomBorderAdj; // adjust for bottom border
 // -----
 Bool HitControlBox(int x, int y)
 { return (Bool)(x-Left() == 2 && y-Top() == 0 &&
 (attrib & CONTROLBOX)); }
 friend void DispatchEvents(Application *ApWnd);
 friend DFWindow *MouseWindow();
 friend DFWindow *inWindow(int x, int y, int &fg, int &bg);
public:

 // -------- constructors
 DFWindow(char *ttl, int lf, int tp, int ht, int wd,
 DFWindow *par)
 { InitWindow(ttl, lf, tp, ht, wd, par); }
 DFWindow(char *ttl, int ht, int wd, DFWindow *par)
 { InitWindow(ttl, -1, -1, ht, wd, par); }
 DFWindow(int lf, int tp, int ht, int wd, DFWindow *par)
 { InitWindow(lf, tp, ht, wd, par); }
 DFWindow(int ht, int wd, DFWindow *par)
 { InitWindow(-1, -1, ht, wd, par); }
 DFWindow(char *ttl, DFWindow *par = NULL)
 { InitWindow(ttl, 0, 0, -1, -1, par); }
 // -------- destructor
 virtual ~DFWindow()
 { if (windowstate != CLOSED) CloseWindow(); }
 // ------- window dimensions and position
 Rect WindowRect() { return rect; }
 Rect ShadowedRect();
 int Right() { return rect.Right(); }
 int Left() { return rect.Left(); }
 int Top() { return rect.Top(); }
 int Bottom() { return rect.Bottom(); }
 int Height() { return Bottom() - Top() + 1; }
 int Width() { return Right() - Left() + 1; }
 // ------ client space dimensions and position
 Rect ClientRect();
 int ClientRight() { return Right()-BorderAdj; }
 int ClientLeft() { return Left()+BorderAdj; }
 int ClientTop() { return Top()+TopBorderAdj; }
 int ClientBottom() { return Bottom()-BottomBorderAdj; }
 int ClientHeight() { return Height()-TopBorderAdj-
 BottomBorderAdj; }
 int ClientWidth() { return Width()-BorderAdj*2; }

 DFWindow *Parent() { return parent; }
 Bool isVisible() { return visible; }
 int Attribute() { return attrib; }
 void SetAttribute(int atr) { attrib = atr; AdjustBorders(); }
 void ClearAttribute(int atr) { attrib &= ~atr; AdjustBorders(); }
 WndType WindowType() { return windowtype; }
 // ----- Control Menu messages
 void CtlMenuMove();
 void CtlMenuSize();
 // -------- API messages
 virtual void OpenWindow();
 virtual void CloseWindow();
 virtual void Show();
 virtual void Hide();
 virtual Bool SetFocus();
 virtual void ResetFocus();
 virtual void EnterFocus(DFWindow *child) {}
 virtual void LeaveFocus(DFWindow *child) {}
 void CaptureFocus();
 void ReleaseFocus();
 virtual void Paint();
 virtual void Paint(Rect rc);
 virtual void Border();
 virtual void Shadow();
 virtual void Title();

 virtual void ClearWindow();
 virtual void ShiftChanged(int sk);
 virtual void Keyboard(int key);
 virtual void DoubleClick(int mx, int my);
 virtual void LeftButton(int mx, int my);
 virtual void ButtonReleased(int, int);
 virtual void MouseMoved(int, int) {}
 virtual void Move(int x, int y);
 virtual void Size(int x, int y);
 virtual void ParentSized(int, int) {}
 virtual void ClockTick() {}
 void Minimize();
 void Maximize();
 void Restore();
 WndState State() { return windowstate; }
 Rect &VisibleRect();
 Colors ClientFG() { return colors.fg; }
 Colors ClientBG() { return colors.bg; }
 Colors SelectedFG() { return colors.sfg; }
 Colors SelectedBG() { return colors.sbg; }
 Colors FrameFG() { return colors.ffg; }
 Colors FrameBG() { return colors.fbg; }
 Colors HighlightFG() { return colors.hfg; }
 Colors HighlightBG() { return colors.hbg; }
};
inline DFWindow *inWindow(int x, int y)
{
 int fg, bg;
 return inWindow(x, y, fg, bg);
}
inline void DFWindow::WriteClientString(String &ln,int x,int y,int fg,int bg)
{
 WriteString(ln,x+ClientLeft(),y+ClientTop(),ClientRect(),fg,bg);
}
inline void DFWindow::WriteWindowString(String &ln,int x,int y,int fg,int bg)
{
 WriteString(ln,x+Left(),y+Top(),ShadowedRect(),fg,bg);
}
inline void DFWindow::WriteWindowChar(int ch,int x,int y,int fg,int bg)
{
 WriteChar(ch, x+Left(), y+Top(),ShadowedRect(), fg, bg);
}
inline void DFWindow::WriteClientChar(int ch,int x,int y,int fg,int bg)
{
 WriteChar(ch, x+ClientLeft(),y+ClientTop(),ClientRect(),fg,bg);
}
#endif






[LISTING THREE]

// ------------ dfwindow.cpp

#include "dflatpp.h"
#include "frame.h"

#include "desktop.h"

// -------- common constructor initialization code
void DFWindow::InitWindow(int lf, int tp,int ht, int wd, DFWindow *par)
{
 windowtype = DFlatWindow;
 if (lf == -1)
 lf = (desktop.screen().Width()-wd)/2;
 if (tp == -1)
 tp = (desktop.screen().Height()-ht)/2;
 if (ht == -1)
 ht = desktop.screen().Height();
 if (wd == -1)
 wd = desktop.screen().Width();
 attrib = restored_attrib = 0;
 title = NULL;
 ctlmenu = NULL;
 videosave = NULL;
 visible = False;
 clipoverride = False;
 first = NULL;
 last = NULL;
 next = NULL;
 prev = NULL;
 prevcapture = NULL;
 parent = par;
 BorderAdj = TopBorderAdj = BottomBorderAdj = 0;
 Rect rcc(lf, tp, lf+wd-1, tp+ht-1);
 restored_rc = rect = rcc;
 SetColors();
 clearch = ' ';
 DblBorder = True;
 Enqueue();
 if (parent == NULL)
 SetAttribute(SAVESELF);
 windowstate = ISRESTORED;
}
void DFWindow::InitWindow(char *ttl, int lf, int tp,
 int ht, int wd, DFWindow *par)
{
 InitWindow(lf, tp, ht, wd, par);
 SetAttribute(TITLEBAR);
 title = new String(ttl);
}
void DFWindow::OpenWindow()
{
 if (windowstate == CLOSED)
 InitWindow(*title, Left(), Top(),Height(), Width(), parent);
}
void DFWindow::CloseWindow()
{
 windowstate = ISCLOSING;
 Hide();
 // ------- close window's children
 DFWindow *Wnd = first;
 while (Wnd != NULL) {
 Wnd->CloseWindow();
 Wnd = Wnd->next;
 }

 // ------ delete this window's memory
 if (title != NULL)
 delete title;
 if (videosave != NULL)
 delete [] videosave;
 DeleteCtlMenu();
 if (this == desktop.InFocus()) {
 if (desktop.FocusCapture() == this)
 ReleaseFocus();
 else if (parent == NULL ||
 parent->windowstate == ISCLOSING)
 desktop.SetFocus(parent ? parent : NULL);
 else {
 NextSiblingFocus();
 if (this == desktop.InFocus())
 if (!parent->SetFocus())
 desktop.SetFocus(NULL);
 }
 }
 Dequeue();
 windowstate = CLOSED;
}
// -------- set the fg/bg colors for the window
void DFWindow::SetColors()
{
 colors.fg =
 colors.sfg =
 colors.ffg =
 colors.hfg = WHITE;
 colors.bg =
 colors.sbg =
 colors.fbg =
 colors.hbg = BLACK;
}
// ---------- display the window
void DFWindow::Show()
{
 if (attrib & SAVESELF) {
 Rect rc = ShadowedRect();
 if (videosave == NULL) {
 int sz = rc.Height() * rc.Width() * 2;
 videosave = new char[sz];
 }
 if (!visible)
 desktop.screen().GetBuffer(rc, videosave);
 }
 visible = True;
 clipoverride = True;
 Paint();
 Border();
 Shadow();
 clipoverride = False;
 if (windowstate != ISMINIMIZED) {
 // --- show the children of this window
 DFWindow *Wnd = first;
 while (Wnd != NULL) {
 if (Wnd->windowstate != ISCLOSING)
 Wnd->Show();
 Wnd = Wnd->next;

 }
 }
}
Rect DFWindow::ShadowedRect()
{
 Rect rc = rect;
 if (attrib & SHADOW) {
 rc.Right()++;
 rc.Bottom()++;
 }
 return rc;
}
void DFWindow::Hide()
{
 if (visible) {
 Rect rc = rect;
 Bool HasShadow = (Bool)((attrib & SHADOW) != 0);
 if (HasShadow) {
 rc.Bottom()++;
 rc.Right()++;
 }
 visible = False;
 // ----- hide the children
 DFWindow *Wnd = first;
 while (Wnd != NULL) {
 Wnd->Hide();
 Wnd = Wnd->next;
 }
 if (videosave != NULL) {
 desktop.screen().PutBuffer(rc, videosave);
 delete [] videosave;
 videosave = NULL;
 }
 else if (parent != NULL) {
 if (parent->isVisible()) {
 parent->Paint(ShadowedRect());
 PaintOverLappers();
 }
 }
 }
}
void DFWindow::Keyboard(int key)
{
 switch (key) {
 case CTRL_F4:
 CloseWindow();
 break;
 case ALT_HYPHEN:
 OpenCtlMenu();
 break;
 default:
 // --- send all unprocessed keystrokes
 // to the parent window
 if (parent != NULL)
 parent->Keyboard(key);
 break;
 }
}
void DFWindow::ShiftChanged(int sk)

{
 if (parent != NULL)
 parent->ShiftChanged(sk);
}
void DFWindow::DoubleClick(int mx, int my)
{
 if (HitControlBox(mx, my))
 CloseWindow();
}
void DFWindow::LeftButton(int mx, int my)
{
 if (my == Top()) {
 // ----- hit the top border
 int x = mx-Left();
 int wd = Width();
 if (x == wd-2) {
 // ---- hit the restore or maximize box
 if (windowstate != ISRESTORED)
 Restore();
 else if (attrib & MAXBOX)
 Maximize();
 }
 else if (x == wd-3) {
 // ----- hit the minimize box
 if (windowstate != ISMINIMIZED &&
 (attrib & MINBOX))
 Minimize();
 }
 else if (HitControlBox(mx, my) &&
 (attrib & CONTROLBOX))
 // ------- hit the control box
 OpenCtlMenu();
 else if ((attrib & MOVEABLE) &&
 windowstate != ISMAXIMIZED)
 // ---- none of the above, move the window
 new Frame(this, mx);
 }
 else if ((attrib & SIZEABLE) && windowstate == ISRESTORED)
 if (mx == Right() && my == Bottom())
 // --- hit the lower right corner, size the window
 new Frame(this);
 prevmouseline = my;
 prevmousecol = mx;
}
void DFWindow::ButtonReleased(int, int)
{
 prevmouseline = -1;
 prevmousecol = -1;
}
Rect DFWindow::ClientRect()
{
 Rect rc(ClientLeft(), ClientTop(),
 ClientRight(), ClientBottom());
 return rc;
}
// ------------ move a window
void DFWindow::Move(int x, int y)
{
 int xdif = x - Left();

 int ydif = y - Top();
 if (xdif == 0 && ydif == 0)
 return;
 Bool wasVisible = visible;
 if (wasVisible)
 Hide();
 int ht = Height();
 int wd = Width();
 rect.Left() = x;
 rect.Top() = y;
 rect.Right() = Left()+wd-1;
 rect.Bottom() = Top()+ht-1;
 if (windowstate == ISRESTORED)
 restored_rc = rect;
 DFWindow *Wnd = first;
 while (Wnd != NULL) {
 Wnd->Move(Wnd->Left()+xdif, Wnd->Top()+ydif);
 Wnd = Wnd->next;
 }
 if (wasVisible)
 Show();
}
// ------------ size a window
void DFWindow::Size(int x, int y)
{
 int xdif = x - Right();
 int ydif = y - Bottom();
 if (xdif == 0 && ydif == 0)
 return;
 Bool wasVisible = visible;
 if (wasVisible)
 Hide();
 rect.Right() = x;
 rect.Bottom() = y;
 if (windowstate == ISRESTORED)
 restored_rc = rect;
 DFWindow *Wnd = first;
 while (Wnd != NULL) {
 Wnd->ParentSized(xdif, ydif);
 Wnd = Wnd->next;
 }
 if (wasVisible)
 Show();
}
void DFWindow::Minimize()
{
 if (windowstate == ISRESTORED)
 restored_rc = rect;
 Rect rc = PositionIcon();
 Hide();
 windowstate = ISMINIMIZED;
 Move(rc.Left(), rc.Top());
 Size(rc.Right(), rc.Bottom());
 Show();
}
void DFWindow::Maximize()
{
 restored_rc = rect;
 Rect rc(0, 0, desktop.screen().Width()-1,desktop.screen().Height()-1);

 if (parent != NULL)
 rc = parent->ClientRect();
 Hide();
 windowstate = ISMAXIMIZED;
 Move(rc.Left(), rc.Top());
 Size(rc.Right(), rc.Bottom());
 Show();
}
void DFWindow::Restore()
{
 Hide();
 Move(restored_rc.Left(), restored_rc.Top());
 Size(restored_rc.Right(), restored_rc.Bottom());
 windowstate = ISRESTORED;
 if (this == desktop.InFocus())
 Show();
 else
 SetFocus();
}
// ---- compute lower right icon space in a rectangle
static Rect LowerRight(Rect &prc)
{
 Rect rc(prc.Right()-IconWidth,prc.Bottom()-IconHeight,0,0);
 rc.Right() = rc.Left() + IconWidth-1;
 rc.Bottom() = rc.Top() + IconHeight-1;
 return rc;
}
// ----- compute a position for a minimized window icon
Rect DFWindow::PositionIcon()
{
 Rect rc(desktop.screen().Width()-IconWidth,
 desktop.screen().Height()-IconHeight,
 desktop.screen().Width()-1,
 desktop.screen().Height()-1);
 if (parent != NULL) {
 Rect prc = parent->rect;
 rc = LowerRight(prc);
 // --- search for icon available location
 DFWindow *Wnd = parent->first;
 while (Wnd != NULL) {
 if (Wnd->windowstate == ISMINIMIZED) {
 Rect rc1= Wnd->rect;
 if (rc1.Left() == rc.Left() &&
 rc1.Top() == rc.Top()) {
 rc.Left() -= IconWidth;
 rc.Right() -= IconWidth;
 if (rc.Left() < prc.Left()+1) {
 rc.Left() =
 prc.Right()-IconWidth;
 rc.Right() =
 rc.Left()+IconWidth-1;
 rc.Top() -= IconHeight;
 rc.Bottom() -= IconHeight;
 if (rc.Top() < prc.Top()+1)
 return LowerRight(prc);
 }
 break;
 }
 }

 Wnd = Wnd->next;
 }
 }
 return rc;
}

























































January, 1993
STRUCTURED PROGRAMMING


Seventh Son




Jeff Duntemann KG7JF


Last week the Magic Van flipped over 100,000 on the way to Safeway for some
ice and a paper, and almost immediately I started hearing creaks and groans
I'm sure I never heard before. But I sympathize; having turned 40 myself this
past summer, I've begun to know what creaks are. (Groans I figure I won't
encounter for another 20 years.) And just as youth is wasted on the young,
new-car smell is wasted on a pile of metal just out of the factory that may
or may not be a lemon and self-destruct in your driveway. I wasn't sure I was
going to keep the Magic Van until it roared past 50,000, carrying the normal
American load of dog-puke stains, gravel dings, and melted Jolly Joes in the
upholstery. Now, I'm quite sure I'll keep it forever.
I kind of hope Carol feels the same way about me.
I was forced to face another milestone this week, one that seemed a little unreal
somehow: Turbo Pascal is now nine years old. Could it be? Then again, I still
had most of my hair when Version 1.0 and I met, and I was still a Cobol slave
for Xerox. Time passes.
On Turbo Pascal's ninth birthday (or pretty close to it) Philippe Kahn
released his seventh son upon the world. It took me all of, oh, ten minutes to
become a believer. There has been a time or two (or maybe three) when I
thought I might trade in Turbo Pascal for something else. But having come this
far, I suspect I'll keep it forever.
Let me take a little time to tell you why.


The Long and the Short Of It


With Version 7, Borland has brought Turbo Pascal into line with its
longstanding strategy for C++ products: a low-end "Turbo" version and a
high-end "Borland" version. The idea has been to have a sleek, uncomplicated
compiler for hobbyist programmers, and a soup-to-nuts development package for
people who do it for a living. This has worked well in the C++ world; Borland
has made a lot of money upgrading people from Turbo C++ to Borland C++. I see
no reason why it won't work just as well in the Pascal marketplace.
So the big Pascal picture from Borland looks like this: Borland Pascal with
Objects 7.0 (BP7 for short) may be hosted on and develops code for both
Windows 3.x and DOS. DOS support includes both real mode and DPMI (DOS
protected-mode interface). It provides everything Turbo Pascal 6.0 and Turbo
Pascal for Windows 1.5 provide, and then some. (I'll get to the "and then
some" in a bit.) On the low end, Turbo Pascal 7.0 replaces Turbo Pascal 6.0,
and is hosted on and develops code for DOS real mode only. Turbo Pascal for
Windows 1.5 remains unchanged and on the market, and is hosted on and develops
code for Windows 3.x only.
BP7 includes both Turbo Vision and Object Windows Library, plus the source for
both frameworks and the runtime libraries generally. There's a standard passel
of new utilities (most of them for the Windows platform) and 3800 pages of
documentation in 11 manuals. (This is yet another "hernia-pack" compiler--big
as a carry-on travel bag and heavier than Mr. Byte, who readily admits to
having had a few too many Milk Bones in his long life.)
The additions to Turbo Pascal 7.0 are more modest, but still significant. The
single most irritating problem with Turbo Pascal is now history: The IDE and
the compiler can now use all available extended memory on 286/386/486
machines. No more "Out of Memory" message boxes, ever again! With memory
running about $25.00 a megabyte, you might as well pour 8 or 12 megs into the
box and be done with it. Ariel, my new home machine, has 16 Mbytes now, and
it's kind of like moving from an army cot to a king-sized waterbed. The
comfort factor is simply beyond quantification, especially now that it extends
to my favorite compiler. (Keep in mind that for Turbo Pascal, this applies
only to the IDE. The .EXE files generated by Turbo Pascal 7.0 do not make use
of extended memory! For that you need the high-end Borland Pascal with Objects
7.0 package, as I'll explain below.) The IDE will still operate in real mode
on pre-286 CPUs.
New optimizations, an enhanced Turbo Vision, and a syntax-cognizant editor
with undo and redo are the other obvious additions to both products. We'll
come back to these and the other minor enhancements, some of which, while
minor, are interesting indeed.


The Big, Big Win


But beyond anything else in the release of BP7, the big, big win is DPMI
support. Over the past two or three years, protected-mode capability has
become the dividing line between the Big Guys and the Little Guys in
compiler-land. We've beaten some cracks in the 640K memory barrier over the
years and finessed it with overlays and EMS support, but this is the real
thing: True, unadulterated and uncompromised access to as much as 16 Mbytes of
RAM. Pascal is in with the big guys now.
Alas, along with power comes the inevitable responsibility, and also a need to
recognize that things are just not going to be simple anymore. (A minicomputer
bigot in my acquaintance said not long ago that the pre-Windows 3.0 decade of
1981-1991 was "the longest-running fools' paradise in the history of
computing." He was right--though he'll gag before admitting that we fools
changed the shape of history and now, paradise having been lost, are still
running the asylum.) Programming for DPMI is not fundamentally difficult, but
it is by no means as simple as Borland's BP7 documentation implies--and
contrary to their claims, it broke nearly every one of the substantial Turbo
Pascal real-mode programs I threw at it. My early experience suggests that if
you're going to grab for those 16 Mbytes worth of gold rings, you'd best
understand the engine that drives the carousel.
DPMI is, at the bottom of things, just an API. It is an interface spec that
allows programs to allocate, use, and release extended memory in protected
mode. A DPMI implementation (as opposed to the spec) is a collection of
callable routines that gather all the low-level management of protected-mode
memory into one place. Rather than the word "implementation," you'll hear talk
of something called a DPMI server. As the word "server" implies, there is but
one block of code in your system at any given time that has the power to
control extended memory in protected mode. There can't be two. But to use the
DPMI spec, there definitely has to be at least one.
This poses some problems. Windows 3.x, for example, incorporates a DPMI
server. Other utilities and environments do as well. But if you're not using
Windows or anything that includes a DPMI server, your application is going to
have to load and run the one that Borland provides. Your protected-mode BP7
programs have to query the system to see if a DPMI server is already active
before beginning execution. (Fortunately, this is done automatically by the
RTL.) If no DPMI server is active, the program will load and run Borland's
server before running the code that you've written.
It's actually more complex than even that. Borland has created a layer atop
the "naked" DPMI server to broaden the capabilities of the DPMI idea in a
Pascalish direction. This layer is called a run-time manager (RTM), and it is
a separate executable program that comes in once a DPMI server has been
detected or installed. The RTM allows the smooth creation of a Pascal megaheap
that can make full use of protected-mode RAM, and also replaces the
traditional Turbo Pascal overlay manager for executing overlaid applications
in protected mode.
Most of the time, and with most protected-mode applications, all this
grimbling of servers and layers and managers happens automatically, behind the
scenes. You have to be sure that if no other DPMI server is active, your
application can find and launch the two files that Borland supplies to provide
DPMI services. That's easy enough.
The real trick to DPMI programming is being sure that your code doesn't commit
any protected-mode no-nos.


Protected-mode Correctness


Compared to the fast-and-loose way the CPU lets us pulp-and-bale memory and
registers in real mode, in protected mode the CPU is crankier than a Berkeley
liberal with hemorrhoids. Certain things Just Aren't Done, and the bulk of the
problems you'll have moving relatively simple applications to protected mode
involve making sure all the rules are followed to the letter.
In general, your problems will fall into three major categories: specifying
absolute addresses, below-the-belt pointer manipulation, and misuse of segment
registers from assembly code.
The first category is where most of my own problems lay. Most of us do a fair
amount of peeking at and poking to video memory, usually by using the MEM
array. (Some of us do it by building a pointer to video RAM with the Ptr
function, which belongs to the second category.) MEM requires a segment and an
offset address, separated by a colon, and acting as an index into real-mode
memory. Specifying offsets into segments is still legit, but specifying
segments directly is verboten.
The problem is that in protected mode, segments aren't really segments
anymore. What we on the application side of the machine might from real-mode
habit want to call a "segment" is actually a selector, which is used by the
DPMI server to identify an actual segment value that is used to construct the
physical address. Setting up segments is no longer the proper job of the
application. The app asks the DPMI server to set up a group of segments, and
then the server returns a selector for each segment. The selector is just a
way to identify a given segment without actually using the segment's address
as its name, as we do in real mode. The app never knows what the true,
physical segment address for its selector really is; that's a secret that
helps the server keep out-of-control programs from trashing all of memory. It
also allows the operating system or some other control program to implement
true virtual memory and do many other minicomputer-like things.
If you're in protected mode and use an expression like MEM[$B800:0], the CPU
will assume that $B800 is supposed to be a selector. But if no selector has
the value $B800, the CPU will issue a general-protection (GP) fault, which
within Turbo Pascal's context is considered a runtime error and will halt your
application.
So what do you do? Borland has anticipated this problem, and provides a number
of predefined segment values that you can use in place of literal segment
addresses. These variables have names like Seg0040, SegA000, SegB000, and
SegB800. Their actual values are valid selectors returned by the DPMI server,
but their names indicate what they point to. In other words, you would replace
$B800 in your MEM statement with SegB800, and everything will be fine. MEM
will return the exact same data byte, with the blessing of the DPMI server. If
you need to use MEM to access other areas of memory not covered by any of the
predefined selector variables, you'll have to learn how to ask the DPMI server
nicely for a selector. It's not what I'd call obvious, but having licked Turbo
Vision, I'd be loath to call it impossible.
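For instance, the fix looks something like this (a minimal sketch; the variable name Ch is mine, and it assumes a color text display so that SegB800 is the right selector):

```pascal
VAR
  Ch : Byte;
BEGIN
  { Real mode only -- under DPMI, $B800 is not a valid }
  { selector, so this line draws a GP fault:           }
  { Ch := Mem[$B800:0]; }

  { Protected mode -- SegB800 holds a valid selector   }
  { for the same video RAM:                            }
  Ch := Mem[SegB800:0];
END.
```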
I've never been a heavy user of the ABSOLUTE reserved word, and I doubt I'm
going to use it much anymore. ABSOLUTE manipulates addresses, and it will
crash you with a GP fault (just like MEM) if you try to use it with a physical
address rather than selectors.
The second category of problems is direct manipulation of the interior
components of pointers. If you try to create a pointer using the Ptr routine,
you must, as with MEM and ABSOLUTE, be sure that the segment portion of the
pointer that you're building is a valid selector. (Using the predefined
selector variables like SegB800 is one safe way to do this.) Nor can you do
"pointer arithmetic" anymore by acting directly on the segment. (Manipulating
the offset is still OK.) Remember, the segment portion of a pointer isn't an
address anymore; it's a selector, which is just a name or a handle, and
incrementing or decrementing the selector just changes the name of the
selector to one that probably doesn't exist.
Dereferencing a NIL pointer will also cause a GP fault, as will dereferencing
a pointer that has been disposed of. This has the slightly weird effect of
forcing the CPU to do some of your debugging for you--for instance, an
application that used to work most of the time (that is, until dereferencing a
dead pointer did something ugly) will now not work at all! After the initial
embarrassment wears off, I suspect most of us will come to depend heavily on
this little perk.
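A few lines sketch the effect (P and its use are made up for illustration):

```pascal
VAR
  P : ^Integer;
BEGIN
  New(P);
  P^ := 42;
  Dispose(P);
  { Real mode: reads whatever garbage sits at the stale }
  { address. Protected mode: liable to draw a GP fault  }
  { and halt with a runtime error right here:           }
  Writeln(P^);
END.
```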
Finally, the gobble-uns'll getcha if you try to use any of the segment
registers as just more buckets, even if, like ES, those registers have no
dedicated higher purpose. The temptation to use ES as a scratch register in
assembly routines is strong, especially in the middle of tight loops where
going to memory can be very expensive in performance terms. But once you're in
protected mode, no segment register may contain anything but a valid selector.
If while in protected mode you stuff a value into ES that doesn't happen to
correspond to a valid selector, you'll get a GP fault. That's part of what the
"protected" in protected mode is about. Scan your ASM routines or your INLINE
code before you try compiling for DPMI. You may be amazed at the segment
register abuse that you uncover. Lord knows I was.
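The sort of fragment to grep for looks like this (a made-up example, not from any shipping code):

```pascal
PROCEDURE Scratch; ASSEMBLER;
ASM
  MOV  AX,1234h
  MOV  ES,AX    { harmless in real mode; in protected mode }
                { 1234h is not a valid selector, and this  }
                { MOV itself raises the GP fault           }
END;
```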
Now, what does this protected stuff buy you? Even in a machine with a paltry 4
Mbytes of RAM, I was able to create a heap with 1,168,848 bytes. And here at
home on Ariel's 16 Mbytes, the heap comes to a thoroughly gaspable 4,143,888
bytes. One can do real things with that much heap! My only remaining question
is why the heap tops out so soon. Why not an 8-Mbyte heap, or even a 12-Mbyte
heap? Whether or not I disable my 4-Mbyte RAM drive, the most I can find
amidst my 16 Mbytes is that 4,143,888 bytes. Something is topping the heap out
at 4 Mbytes, and when I figure out what it is I will certainly report. (Keep
in mind, it's now Thursday and I only received this critter on Monday!)


Break and Continue



Compared to the changes to the programming infrastructure, Borland's mods to
the Pascal language definition itself are minor. But hey, what does Turbo
Pascal really lack? They got it almost bang-on in 1983, and they've had nine
years since to think it through. Two of the items that I've been demanding
since the beginning are finally here: Break/Cycle and conformant arrays. And
wouldn't you know it? They changed the names of both.
Academics are fond of saying that it is possible to exit any loop at any point
in a structured manner by the proper use of flags and tests. That's true. And
when I was a kid I would sit tailor-style in the schoolyard for what seemed
like hours, picking knots and tangles out of kite string so I wouldn't lose
any of it. This was dumb, although when I was penniless and didn't know any
better, it made a certain frugal sense.
I used to pick my early way out of loops like an ant following a strand
through a wad of tangled kite string. Then I got tired of the game and started
using GOTO like a knife to cut the nonsense and get where I needed to go at
once. It felt good. Now it's unnecessary. Both TP7 and BP7 support two new
predefined procedures, Break and Continue. The Break procedure allows you to
bust out of a loop at any time and resume execution at the first statement
following the end of the loop. Continue is what I always knew as Cycle, which
has been supported in Microsoft Pascal (not QuickPascal!) since the beginning.
It cuts the current pass through the loop short, and resumes execution at the
top of the loop with the next iterator if there is one. This applies to all
loops: FOR..DO, WHILE..DO, and REPEAT..UNTIL.
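A quick sketch of both (the loop and its bounds are arbitrary):

```pascal
VAR
  I : Integer;
BEGIN
  FOR I := 1 TO 100 DO
    BEGIN
      IF Odd(I) THEN
        Continue;    { cut this pass short, go on to the next I }
      IF I > 10 THEN
        Break;       { leave the loop entirely }
      Write(I,' ');  { prints 2 4 6 8 10 }
    END;
  Writeln;
END.
```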
Borland created Break and Continue as procedures, not reserved words, which I
think was a good idea. Adding reserved words to a language necessarily (and
irreversibly) breaks code, and should be done with the utmost hesitation. I
haven't yet tried to trace the code to see precisely how they were
implemented, but I expect they amount to something like the setjmp/longjmp
pair that C hackers cherish. At the top and bottom of every loop, the code
generator probably makes a call to setjmp; in a sense, to "mark the spot."
Later, while inside the loop, the code can call Break or Continue to perform
the longjmp to the respective target, either at the top of the loop (for
Continue) or immediately after it (for Break).
Regardless of how it works under the sheets, with the addition of Break and
Continue, there's very little reason left to use GOTO. Nonetheless, I speak of
GOTO as the gun freaks speak of their guns: I hope never to have to use it,
but when I need it I want it to be there.


Open Array Parameters


At Pascal's very inception, Niklaus Wirth knew that Pascal's rather strong
typing could get seriously in the way of certain kinds of useful structures,
including procedures that act upon arrays incorporating the same base type but
having different bounds. He added the notion of conformant arrays to Pascal,
which allows arrays of different bounds values to be passed as actual
parameters within the identical formal parameter. The formal parameter is said
to conform to the bounds of the actual array parameter passed in it, hence the
name.
In TP/BP 7, a formal array parameter in a procedure or a function can be
defined without any bounds at all, as shown in Listing One. The array can be
passed either by value or by reference; it doesn't matter. The trick is that
within the procedure, you can query the bounds of the actual parameter by
using two predefined functions, Low and High. Low returns the low bound of the
actual array parameter passed as its argument, and High returns the high
bound. Once you obtain the bounds of the conformant array, you can manipulate
it just as you would any array. Be careful not to create functions of your own
named Low or High. In one of my test programs I used a unit that redefined the
two bounds-query functions, and it took some head scratching to discover why
conformant arrays suddenly refused to work.
Borland renamed this idea "open array parameters." Lord knows why Borland
didn't just call them "conformant arrays," but I'd have kept them even if they
were named Al Gore. Conformant arrays will be handy in this new protected-mode
world, where it's no longer kosher to play indiscriminate pointer games to
accomplish the same things.
And alas, as much as I had hoped otherwise, arrays are still limited to 64K in
size, even in protected mode. Keep in mind, moving beyond 64K for individual
Pascal data structures generally requires moving beyond 64K segments; the
final release from 64K segments does not come until you move fully to "flat
address" 32-bit mode, in which a segment may be any size up to 2^32 bytes.
Maybe next version.


Keep it Forever


There you have it. For my money, this is the most significant upgrade of the
product since Turbo Pascal 4.0; more significant, in fact, than the
introduction of objects in 1989. Data structures (including objects) are
intriguing, but you can do anything without objects that you can do with
objects, albeit not always easily. Infrastructure improvements like DPMI
access, on the other hand, can be go/no-go propositions. Certainly
applications that simply could not be written before in Turbo Pascal, using
any paradigm, can be written now.
There is another major infrastructure improvement that I haven't had time to
test yet: DOS DLLs. I'll try to spend a little time on them next column. In
the meantime, sheesh, I have some learning to do.

_STRUCTURED PROGRAMMING COLUMN_
by Jeff Duntemann


[LISTING ONE]

PROGRAM ConformTest;

{ Demo program for Turbo Pascal 7.0 conformant arrays }
{ By Jeff Duntemann; From DDJ for January 1993 }

USES Crt;

VAR
 I : Integer;
 Manny : ARRAY[36..72] OF Integer;
 Moe : ARRAY[0..137] OF Integer;
 Jack : ARRAY[1..40] OF Integer;

{ Note that neither RandomFill nor ArrayMax specify the bounds }
{ of their integer array parameter. The High and Low functions }
{ are predefined but are *not* reserved words. }

PROCEDURE RandomFill(VAR Target : ARRAY OF Integer);

VAR
 I : Integer;
BEGIN
 Randomize;
 FOR I := Low(Target) TO High(Target) DO
 Target[I] := Random(100);
END;

FUNCTION ArrayMax(Target : ARRAY OF Integer) : Integer;


VAR
 I,Temp : Integer;
BEGIN
 Temp := 0;
 FOR I := Low(Target) TO High(Target) DO
 IF Target[I] > Temp THEN Temp := Target[I];
 ArrayMax := Temp;
END;

BEGIN
 ClrScr;
 RandomFill(Manny);
 RandomFill(Moe);
 RandomFill(Jack);
 Writeln('The largest element of Manny is ',ArrayMax(Manny));
 Writeln('The largest element of Moe is ',ArrayMax(Moe));
 Writeln('The largest element of Jack is ',ArrayMax(Jack));
 Readln;
END.











































January, 1993
GRAPHICS PROGRAMMING


Yet Another Animation Method


 This article contains the following executables: XSHRP21.ZIP


Michael Abrash


As documented last month, we brought our pets with us when we moved out here
to Seattle. At about the same time, our Golden Retriever, Sam, observed his
third birthday. Sam is relatively intelligent, in the sense that he is clearly
smarter than a Banana Slug, although if he were in the same room with Jeff
Duntemann's dogs Mr. Byte and Chewy, there's a reasonable chance that he would
mistake them for something edible (a category that includes rocks, socks, and
a surprising number of things too disgusting to mention), and Jeff would have
to find a new source of openings for his column.
But that's not important now. What is important is that--and I am not making
this up--this morning I managed to find the one pair of socks Sam hadn't
chewed holes in. And what's even more important is that after we moved and Sam
turned three, he calmed down amazingly. We had been waiting for this magic
transformation since Sam turned one, the age at which most puppies turn into
normal dogs who lie around a lot, waking up to eat their Science Diet (motto,
"The dog food that costs more than the average neurosurgeon makes in a year")
before licking themselves and going back to sleep. When Sam turned one and
remained hopelessly out of control we said, "Goldens take two years to calm
down," as if we had a clue. When he turned two and remained undeniably Sam we
said, "Any day now." By the time he turned three, we were reduced to figuring
that it was only about seven more years until he expired, at which point we
might be able to take all the fur he had shed in his lifetime and weave
ourselves some clothes without holes in them, or quite possibly a house.
But miracle of miracles, we moved, and Sam instantly turned into the dog we
thought we'd gotten when we forked over $500--calm, sweet, and obedient. Weeks
went by, and Sam was, if anything, better than ever. Clearly, the change was
permanent.
And then we took Sam to the vet for his annual check-up and found that he had
an ear infection. Thanks to the wonders of modern animal medicine, a $5 bottle
of liquid restored his health in just two days. And with his health, we got,
as a bonus, the old Sam. You see, Sam hadn't changed. He was just tired from
being sick. Now he once again joyously knocks down any stranger who makes the
mistake of glancing in his direction, and will, quite possibly, be booked any
day now on suspicion of homicide by licking.
Okay, you give up. What exactly does this have to do with graphics? I'm glad
you asked. The lesson to be learned from Sam The Dog With A Brain The Size Of
A Walnut is that while things may look like they've changed, in fact they
often haven't. Take VGA performance. If you buy a 486 with a Super-VGA, you'll
get performance that knocks your socks off, especially if you run Windows.
Things are liable to be so fast that you'll figure the Super-VGA has to
deserve some of the credit. Well, maybe it does if it's a local-bus VGA. But
maybe it doesn't, even if it is local bus--and it certainly doesn't if it's an
ISA-bus VGA, because no ISA-bus VGA can run faster than about 300 nanoseconds
per access, and VGAs capable of that speed have been common for at least a
couple of years now. Your 486 VGA system is fast almost entirely because it
has a 486 in it. (486 systems with accelerators such as the ATI Ultra or
Diamond Stealth are another story altogether.) Underneath it all, the VGA is
still painfully slow--and if you have an old VGA or IBM's original PS/2
motherboard VGA, it's incredibly slow. The fastest ISA-bus VGA around is two
to twenty times slower than system memory, and the slowest VGA around is as
much as 100 times slower. In the old days, the rule was, "Display memory is
slow, and should be avoided." Nowadays, the rule is, "Display memory is not
quite so slow, but should still be avoided."
So, as I say, sometimes things don't change. Of course, sometimes they do
change. For example, in just 49 dog years, I fully expect to own at least one
pair of underwear without a single hole in it. Which brings us, deus ex
machina and the creek don't rise, to yet another animation method:
dirty-rectangle animation.


VGA Access Times


Actually, before we get to dirty rectangles, I'd like to take you through a
quick refresher on VGA memory and I/O access times. I want to do this partly
because the slow access times of the VGA make dirty-rectangle animation
particularly attractive, and partly as a public service, because even I was
shocked by the results of some I/O performance tests I recently ran.
Table 1 shows the results of the aforementioned I/O performance tests, as run
on two 486/33 Super-VGA systems under the Phar Lap 386|DOS-Extender. (The
systems and VGAs are unnamed because this is a not-very-scientific spot test,
and I don't want to unfairly malign, say, a VGA whose only sin is being
plugged into a lousy motherboard, or vice versa). Under Phar Lap, 32-bit
protected-mode apps run with full I/O privileges, meaning that the OUTs I
measured had the best official cycle times possible on the 486: 10 cycles. OUT
takes 16 cycles in real mode on a 486, and a mind-boggling 30 cycles in
protected mode if running without full I/O privileges (as is normally the case
for protected-mode applications). Basically, I/O is just plain slow on a 486.
Table 1: Results of I/O performance tests run under the Phar Lap
386|DOS-Extender.

                        OUT Time in Microseconds and Cycles
                      Official Time  486 #1/16-bit VGA #1  486 #2/16-bit VGA #2
 ------------------------------------------------------------------------------
 OUT DX,AL repeated
  1000 times nonstop     0.300 us       2.546 us              0.813 us
  (maximum byte         10 cycles      84 cycles             27 cycles
  access)

 OUT DX,AX repeated
  1000 times nonstop     0.300 us       3.820 us              1.066 us
  (maximum word         10 cycles     126 cycles             35 cycles
  access)

 OUT DX,AL repeated
  1000 times, but
  interspersed with      0.300 us       1.610 us              0.780 us
  MULs (random byte     10 cycles      53 cycles             26 cycles
  access)

 OUT DX,AX repeated
  1000 times, but
  interspersed with      0.300 us       2.830 us              1.010 us
  MULs (random word     10 cycles      93 cycles             33 cycles
  access)

Slow as 30 or even 10 cycles for an OUT is, one could only wish that VGA I/O
was actually that fast. The fastest OUT in Table 1 is 26 cycles, and the
slowest is 126--this for an operation that's supposed to take 10 cycles. To
put this in context, MUL takes only 13 to 42 cycles, and a normal MOV to or
from system memory takes exactly one cycle on the 486. In short, OUTs to VGAs
are as much as 100 times slower than normal memory accesses, and are generally
two to four times slower than display memory accesses, although there are
exceptions.
Of course, VGA display memory has its own performance problems. The fastest
ISA-bus VGA can, at best, support sustained write times of about 10 cycles per
word-sized write; 15 or 20 cycles is more common, even for relatively fast
Super-VGAs; the worst case I've seen is 65 cycles per byte. However,
intermittent writes, mixed with a lot of register- and cache-only code, can
effectively execute in one cycle because the VGA and the 486 coprocess.
Display memory reads tend to take longer, because coprocessing isn't
possible--one microsecond is a reasonable rule of thumb for VGA reads,
although there's considerable variation. So VGA memory tends not to be as bad
as VGA I/O, but Lord knows it isn't good.
In conclusion, OUTs, in general, are lousy on the 486 (and to think they only
took three cycles on the 286!). OUTs to VGAs are particularly lousy. Display
memory performance is pretty poor, especially for reads. The conclusions are
obvious, I would hope. Structure your graphics code, and, in general, all 486
code, to avoid OUTs. For graphics, this especially means using write mode 3
rather than the bit-mask register. When you must use the bit mask, arrange
drawing so that you can set the bit mask once, then do a lot of drawing with
that mask. For example, draw a whole edge at once, then the middle, then the
other edge, rather than setting the bit mask several times on each scan line
to draw the edge and middle bytes together. Don't read from display memory if
you don't have to. Write each pixel once and only once.
It is indeed a strange concept: The key to fast graphics is staying away from
the graphics adapter as much as possible.


Dirty-rectangle Animation


The relative slowness of VGA hardware is part of the appeal of the technique
that I call "dirty-rectangle" animation, in which a complete copy of the
contents of display memory is maintained in offscreen system (nondisplay)
memory. All drawing is done to this system buffer. As offscreen drawing is
done, a list is maintained of the bounding rectangles for the drawn-to areas;
these are the "dirty" rectangles, dirty in the sense that they do not match
the contents of the screen. After all drawing for a frame is completed, all
the dirty rectangles for that frame are copied to the screen in a burst, and
then the cycle of off-screen drawing begins again.
Why, exactly, would we want to go through all this complication, rather than
simply drawing to the screen in the first place? The reason is visual quality.
If we were to do all our drawing directly to the screen, there'd be a lot of
flicker as objects were erased and then redrawn. Similarly, overlapped drawing
done with the painter's algorithm (in which farther objects are drawn first,
so that nearer objects obscure them) would flicker as farther objects were
visible for short periods. With dirty-rectangle animation, only the finished
pixels for any given frame ever appear on the screen; intermediate results are
never visible. Figure 1 illustrates the visual problems associated with
drawing directly to the screen; Figure 2 shows how dirty-rectangle animation
solves these problems.
Well, then, if we want good visual quality, why not use page flipping? For one
thing, not all adapters and modes support page flipping. The CGA and MCGA
don't, and neither do the VGA's 640x480 16-color or 320x200 256-color modes,
or many Super-VGA modes. In contrast, all adapters support dirty-rectangle
animation. Another advantage of dirty-rectangle animation is that it's
generally faster. While it may seem strange that it would be faster to draw
off screen and then copy the result to the screen, that is often the case,
because dirty-rectangle animation usually reduces the number of times the
VGA's hardware needs to be touched, especially in 256-color modes. This
reduction comes about because when dirty rectangles are erased, it's done in
system memory, not in display memory, and since most objects move a good deal
less than their full width (that is, the new and old positions overlap),
display memory is written to fewer times than with page flipping. (In 16-color
modes, this is not necessarily the case, because of the parallelism obtained
from the VGA's planar hardware.) Also, read/modify/write operations are
performed in fast system memory rather than slow display memory, so display
memory rarely needs to be read. This is particularly good because display
memory is generally even slower for reads than for writes.
Also, page flipping wastes a good deal of time waiting for the page to flip at
the end of the frame. Dirty-rectangle animation never needs to wait for
anything because partially drawn images are never present in display memory.
Actually, in one sense, partially drawn images are sometimes present because
it's possible for a rectangle to be partially drawn when the scanning raster
beam reaches that part of the screen. This causes the rectangle to appear
partially drawn for one frame, producing a phenomenon I call "shearing."
Fortunately, shearing tends not to be particularly distracting, especially for
fairly small images, but it can be a problem when copying large areas. This is
one area in which dirty-rectangle animation falls short of page flipping,
because page flipping has perfect display quality, never showing anything
other than a completely finished frame. Similarly, dirty-rectangle copying may
take two or more frame times to finish, so even if shearing doesn't happen,
it's still possible to have the images in the various dirty rectangles show up
nonsimultaneously. In my experience, this latter phenomenon is not a serious
problem, but do be aware of it.


Dirty Rectangles in Action


Listing One (page 140) demonstrates dirty-rectangle animation. This is a very
simple implementation, in several respects. For one thing, it's written
entirely in C, and animation fairly cries out for assembly language. For
another thing, it uses far pointers, which C often handles with less than
optimal efficiency, especially because I haven't used library functions to
copy and fill memory. (I did this so the code would work in any memory model.)
Also, Listing One doesn't attempt to coalesce rectangles so as to perform a
minimum number of display-memory accesses; instead, it copies each dirty
rectangle to the screen, even if it overlaps with another rectangle, so some
pixels get copied multiple times. Listing One runs pretty well, considering
all of its failings; on my 486/33, ten 11x11 images animate at a very
respectable clip.
One point I'd like to make is that although the system-memory buffer in
Listing One has exactly the same dimensions as the screen bitmap, that's not a
requirement, and there are some good reasons not to make the two the same
size. For example, if the system buffer is bigger than the screen, it's
possible to pan the visible area around the system buffer. Or, alternatively,
the system buffer can be just the size of a desired window, representing a
window into a larger, virtual buffer. We could then draw the desired portion
of the virtual bitmap into the system-memory buffer, copy the buffer to the
screen, and the effect would be of having panned the window to the new
location.
Another argument in favor of a small viewing window is that it restricts the
amount of display memory actually drawn to. Restricting the display memory
used for animation reduces the total number of display-memory accesses, which
in turn boosts overall performance; it also improves the performance and
appearance of panning, in which the whole window has to be redrawn or copied.
If you keep a close watch, you'll notice that many high-performance animation
games similarly restrict their full-featured animation area to a relatively
small region. Often, it's hard to tell that this is the case, because the
animation region is surrounded by flashy digitized graphics and by items such
as scoreboards and status screens, but look closely and see if the animation
region in your favorite game isn't smaller than you thought.
Next month, I'll put the important parts of dirty-rectangle animation into
assembler, and I'll coalesce dirty rectangles to minimize display-memory
accesses--and maybe, just maybe, I'll do some panning. Then we'll see what
kind of stuff dirty-rectangle animation is really made of.


3-D Reading


As anyone who's been following this column for a while knows, I'm keenly
interested in 3-D graphics. Thus, it is with considerable pleasure that I'm
able to report that Programming in 3 Dimensions: 3-D Graphics, Ray Tracing,
and Animation by Christopher D. Watkins and Larry Sharp (M&T Books, 1992) is
good stuff. There's a fair amount of theory, and lots of 3-D implementation,
from modeling and scenes to ray tracing and finally, animation. The animation
is the precomputed, playback kind, of the Autodesk Animator sort, and while it
lacks the on-the-fly flexibility of the real-time animation we've developed in
this column, my oh my, it does look good. If you get this book, I strongly
suggest you get the disk as well; in which case, run ANIMATE.EXE, with BOUNCE
as the input file, and marvel that you now have, in source form, all the
software needed to implement that animation. Ten years ago, I'll bet you
couldn't have produced this level of fully rendered, real-time playback
animation for less than $50,000 in hardware and software; now, a couple of
thousand will easily do the trick. What a great time this is to be a
programmer! Recommended.

_GRAPHICS PROGRAMMING COLUMN_
by Michael Abrash

[LISTING ONE]

/* Sample simple dirty-rectangle animation program. Doesn't attempt to
   coalesce rectangles to minimize display memory accesses. Not even vaguely
   optimized! Tested with Borland C++ 3.0 in the small model. */

#include <stdio.h>
#include <stdlib.h>
#include <conio.h>
#include <alloc.h>
#include <memory.h>
#include <dos.h>

#define SCREEN_WIDTH 320
#define SCREEN_HEIGHT 200
#define SCREEN_SEGMENT 0xA000

/* Describes a rectangle */
typedef struct {
 int Top;
 int Left;
 int Right;
 int Bottom;
} Rectangle;


/* Describes an animated object */
typedef struct {
 int X; /* upper left corner in virtual bitmap */
 int Y;
 int XDirection; /* direction and distance of movement */
 int YDirection;
} Entity;

/* Storage used for dirty rectangles */
#define MAX_DIRTY_RECTANGLES 100
int NumDirtyRectangles;
Rectangle DirtyRectangles[MAX_DIRTY_RECTANGLES];

/* If set to 1, ignore dirty rectangle list and copy the whole screen. */
int DrawWholeScreen = 0;

/* Pixels for image we'll animate */
#define IMAGE_WIDTH 11
#define IMAGE_HEIGHT 11
char ImagePixels[] = {
 15,15,15, 9, 9, 9, 9, 9,15,15,15,
 15,15, 9, 9, 9, 9, 9, 9, 9,15,15,
 15, 9, 9,14,14,14,14,14, 9, 9,15,
 9, 9,14,14,14,14,14,14,14, 9, 9,
 9, 9,14,14,14,14,14,14,14, 9, 9,
 9, 9,14,14,14,14,14,14,14, 9, 9,
 9, 9,14,14,14,14,14,14,14, 9, 9,
 9, 9,14,14,14,14,14,14,14, 9, 9,
 15, 9, 9,14,14,14,14,14, 9, 9,15,
 15,15, 9, 9, 9, 9, 9, 9, 9,15,15,
 15,15,15, 9, 9, 9, 9, 9,15,15,15,
};
/* Animated entities */
#define NUM_ENTITIES 10
Entity Entities[NUM_ENTITIES];

/* Pointer to system buffer into which we'll draw */
char far *SystemBufferPtr;

/* Pointer to screen */
char far *ScreenPtr;

void EraseEntities(void);
void CopyDirtyRectanglesToScreen(void);
void DrawEntities(void);

void main()
{
 int i, XTemp, YTemp;
 unsigned int TempCount;
 char far *TempPtr;
 union REGS regs;
 /* Allocate memory for the system buffer into which we'll draw */
 if (!(SystemBufferPtr = farmalloc((unsigned int)SCREEN_WIDTH*
 SCREEN_HEIGHT))) {
 printf("Couldn't get memory\n");
 exit(1);
 }

 /* Clear the system buffer */
 TempPtr = SystemBufferPtr;
 for (TempCount = ((unsigned)SCREEN_WIDTH*SCREEN_HEIGHT); TempCount--; ) {
 *TempPtr++ = 0;
 }
 /* Point to the screen */
 ScreenPtr = MK_FP(SCREEN_SEGMENT, 0);

 /* Set up the entities we'll animate, at random locations */
 randomize();
 for (i = 0; i < NUM_ENTITIES; i++) {
 Entities[i].X = random(SCREEN_WIDTH - IMAGE_WIDTH);
 Entities[i].Y = random(SCREEN_HEIGHT - IMAGE_HEIGHT);
 Entities[i].XDirection = 1;
 Entities[i].YDirection = -1;
 }
 /* Set 320x200 256-color graphics mode */
 regs.x.ax = 0x0013;
 int86(0x10, &regs, &regs);

 /* Loop and draw until a key is pressed */
 do {
 /* Draw the entities to the system buffer at their current locations,
 updating the dirty rectangle list */
 DrawEntities();

 /* Draw the dirty rectangles, or the whole system buffer if
 appropriate */
 CopyDirtyRectanglesToScreen();

 /* Reset the dirty rectangle list to empty */
 NumDirtyRectangles = 0;

 /* Erase the entities in the system buffer at their old locations,
 updating the dirty rectangle list */
 EraseEntities();

 /* Move the entities, bouncing off the edges of the screen */
 for (i = 0; i < NUM_ENTITIES; i++) {
 XTemp = Entities[i].X + Entities[i].XDirection;
 YTemp = Entities[i].Y + Entities[i].YDirection;
 if ((XTemp < 0) || ((XTemp + IMAGE_WIDTH) > SCREEN_WIDTH)) {
 Entities[i].XDirection = -Entities[i].XDirection;
 XTemp = Entities[i].X + Entities[i].XDirection;
 }
 if ((YTemp < 0) || ((YTemp + IMAGE_HEIGHT) > SCREEN_HEIGHT)) {
 Entities[i].YDirection = -Entities[i].YDirection;
 YTemp = Entities[i].Y + Entities[i].YDirection;
 }
 Entities[i].X = XTemp;
 Entities[i].Y = YTemp;
 }

 } while (!kbhit());
 getch(); /* clear the keypress */
 /* Back to text mode */
 regs.x.ax = 0x0003;
 int86(0x10, &regs, &regs);
}

/* Draw entities at current locations, updating dirty rectangle list. */
void DrawEntities()
{
 int i, j, k;
 char far *RowPtrBuffer;
 char far *TempPtrBuffer;
 char far *TempPtrImage;
 for (i = 0; i < NUM_ENTITIES; i++) {
 /* Remember the dirty rectangle info for this entity */
 if (NumDirtyRectangles >= MAX_DIRTY_RECTANGLES) {
 /* Too many dirty rectangles; just redraw the whole screen */
 DrawWholeScreen = 1;
 } else {
 /* Remember this dirty rectangle */
 DirtyRectangles[NumDirtyRectangles].Left = Entities[i].X;
 DirtyRectangles[NumDirtyRectangles].Top = Entities[i].Y;
 DirtyRectangles[NumDirtyRectangles].Right =
 Entities[i].X + IMAGE_WIDTH;
 DirtyRectangles[NumDirtyRectangles++].Bottom =
 Entities[i].Y + IMAGE_HEIGHT;
 }
 /* Point to the destination in the system buffer */
 RowPtrBuffer = SystemBufferPtr + (Entities[i].Y * SCREEN_WIDTH) +
 Entities[i].X;
 /* Point to the image to draw */
 TempPtrImage = ImagePixels;
 /* Copy the image to the system buffer */
 for (j = 0; j < IMAGE_HEIGHT; j++) {
 /* Copy a row */
 for (k = 0, TempPtrBuffer = RowPtrBuffer; k < IMAGE_WIDTH; k++) {
 *TempPtrBuffer++ = *TempPtrImage++;
 }
 /* Point to the next system buffer row */
 RowPtrBuffer += SCREEN_WIDTH;
 }
 }
}
/* Copy the dirty rectangles, or the whole system buffer if appropriate,
 to the screen. */
void CopyDirtyRectanglesToScreen()
{
 int i, j, k, RectWidth, RectHeight;
 unsigned int TempCount;
 unsigned int Offset;
 char far *TempPtrScreen;
 char far *TempPtrBuffer;

 if (DrawWholeScreen) {
 /* Just copy the whole buffer to the screen */
 DrawWholeScreen = 0;
 TempPtrScreen = ScreenPtr;
 TempPtrBuffer = SystemBufferPtr;
 for (TempCount = ((unsigned)SCREEN_WIDTH*SCREEN_HEIGHT); TempCount--; ) {
 *TempPtrScreen++ = *TempPtrBuffer++;
 }
 } else {
 /* Copy only the dirty rectangles */
 for (i = 0; i < NumDirtyRectangles; i++) {
 /* Offset in both system buffer and screen of image */

 Offset = (unsigned int) (DirtyRectangles[i].Top * SCREEN_WIDTH) +
 DirtyRectangles[i].Left;
 /* Dimensions of dirty rectangle */
 RectWidth = DirtyRectangles[i].Right - DirtyRectangles[i].Left;
 RectHeight = DirtyRectangles[i].Bottom - DirtyRectangles[i].Top;
 /* Copy a dirty rectangle */
 for (j = 0; j < RectHeight; j++) {

 /* Point to the start of row on screen */
 TempPtrScreen = ScreenPtr + Offset;

 /* Point to the start of row in system buffer */
 TempPtrBuffer = SystemBufferPtr + Offset;

 /* Copy a row */
 for (k = 0; k < RectWidth; k++) {
 *TempPtrScreen++ = *TempPtrBuffer++;
 }
 /* Point to the next row */
 Offset += SCREEN_WIDTH;
 }
 }
 }
}
/* Erase the entities in the system buffer at their current locations,
 updating the dirty rectangle list. */
void EraseEntities()
{
 int i, j, k;
 char far *RowPtr;
 char far *TempPtr;

 for (i = 0; i < NUM_ENTITIES; i++) {
 /* Remember the dirty rectangle info for this entity */
 if (NumDirtyRectangles >= MAX_DIRTY_RECTANGLES) {
 /* Too many dirty rectangles; just redraw the whole screen */
 DrawWholeScreen = 1;
 } else {
 /* Remember this dirty rectangle */
 DirtyRectangles[NumDirtyRectangles].Left = Entities[i].X;
 DirtyRectangles[NumDirtyRectangles].Top = Entities[i].Y;
 DirtyRectangles[NumDirtyRectangles].Right =
 Entities[i].X + IMAGE_WIDTH;
 DirtyRectangles[NumDirtyRectangles++].Bottom =
 Entities[i].Y + IMAGE_HEIGHT;
 }
 /* Point to the destination in the system buffer */
 RowPtr = SystemBufferPtr + (Entities[i].Y*SCREEN_WIDTH) + Entities[i].X;

 /* Clear the entity's rectangle */
 for (j = 0; j < IMAGE_HEIGHT; j++) {
 /* Clear a row */
 for (k = 0, TempPtr = RowPtr; k < IMAGE_WIDTH; k++) {
 *TempPtr++ = 0;
 }
 /* Point to the next row */
 RowPtr += SCREEN_WIDTH;
 }
 }

}





























































January, 1993
PROGRAMMER'S BOOKSHELF


Portrait of an Artist as a Programmer




Al Stevens


The book Aaron's Code: Meta-Art, Artificial Intelligence, and the Work of
Harold Cohen, by Pamela McCorduck, is about an artist and his computer
program. The artist who wrote the program is Harold Cohen; the program, which
creates art, is named Aaron.
The cover is lavishly decorated with a many-colored picture that suggests two
human figures in a lush field of vegetation. You learn from the inside cover
that the picture is "Meeting on Gauguin's Beach," and that it was drawn by
Aaron, the program, and hand-colored by Cohen, the human. That description
would compel any computer programmer to start reading. Once you start, you
will not want to stop. Besides reading about the wonder that is Aaron and the
remarkable man who created it, you'll find yourself engrossed in a finely
crafted work, thoughtful and thought-provoking, beautifully written, and going
straight to the heart of its reader as well as its subject.
Subthemes run throughout the book: art itself; its social significance;
philosophy; artificial intelligence. But to me the book is mostly about Cohen,
a programmer; Aaron, a computer program that creates drawings; and the
consequence of Aaron's existence. I say, "to me" because Aaron's Code exhibits
the characteristics of the art that it discusses. The reader finds--discovers,
actually--its meaning, which may not coincide with what the author means to
convey. Cohen holds that art is a meaning generator rather than a meaning
communicator, and this book stands as testimony to that idea. Given that
readers are interested in the notion that a computer can produce an original
work of art, each reader will find in Aaron's Code what he or she is inclined
to find, and not everyone will draw the same conclusions. I'm interested in
computer programs and programming; this book is, therefore, about them. I'm
interested in programmers, and the book is about one. I'm interested in the
social consequences of technology; the book is about that, too.
It is important to identify right away what Aaron is not. Aaron is not a
typical image generator of what has come to be known as "computer art." Aaron
does not generate geometric forms, certainly interesting, but infinitely
repeatable. Aaron does not produce fractals, beautiful and random, but not
representative of the items that comprise the world. Aaron is not a tool for
painters, designers, draftsmen, or animators to be used as a medium to express
the creative ideas of the human user. Instead, Aaron is a computer program
with a software interface to a hardware drawing device that creates original
pictures, each picture different from the others and each one
indistinguishable by the uninformed observer from the work of a human artist.
As such, Aaron is significant to the computer scientist as well as to the
artist, because it uses artificial intelligence to encapsulate and replicate
much of the behavior that the artist unconsciously employs to create art, an
understanding of which has eluded people ever since they first tried to
understand themselves.
The programmer Harold Cohen was a successful artist, first in his native
England and then in the United States. At age 39, unsatisfied or unfulfilled
by his work, he took a one-year leave from the art world of London, a
sabbatical that he extended until it became permanent. Cohen relocated to
California. In 1968 he met Jef Raskin, a computer programmer and former Dr.
Dobb's editor. Raskin is known now for his later role in the development of
the Macintosh. Then a programmer associated with a university computer center,
Raskin exposed Cohen to a CDC computer and to Fortran and taught him the
fundamentals of programming. As happens with so many, Cohen was taken with the
power and potential of programming, and he pursued it, apparently as a hobby,
although he started several years before computer hobbyists were commonplace.
There were no personal computers then, and computer time on mainframe and
minicomputers was generally inaccessible.
Cohen's interest in computer programming complemented his quest to understand
the process by which the artist uses what he knows in order to paint. He
assumed that if he could use a computer to model that behavior, he could then
come to understand it. The first version of Aaron was the result, and it
simulated the artist's creative behavior by storing and applying certain rules
of construction and representation as specified by Cohen. Aaron was the artist
and Cohen the meta-artist, and Aaron's development progressed from the first
stage, which could draw simple shapes without apparent meaning, perspective,
or spatial relationships, to the current version, which creates figurative,
three-dimensional drawings of things that the observer recognizes, such as
people, rocks, and plants. The book traces this progression and intersperses
many relevant discussions of the underlying philosophies and consequences of
such a project.
The programming reader will find much with which to relate in Aaron's Code. We
will ask questions, too, about issues that McCorduck does not address--or, at
least, not adequately for us. Cohen wrote the first version of Aaron in
Fortran on the CDC 3200. It is amusing to read of his discovery that batch
debugging by passing card decks through the window to an unseen computer was
less than productive. He solved the problem by getting hands-on access to a
Data General Nova. Subsequent machines included the PDP-11 and VAX, with the
current implementation a Micro-VAX. Sometime during those ports he switched
from Fortran to the C language, which McCorduck calls a "trifle obsolescent."
Today, all new development is proceeding in LISP on a donated LISP machine,
reflecting the program's roots in artificial intelligence. All these ports
suggest a revealing but unrevealed study in portability. We can assume that
Cohen ran into obstacles moving the program among languages and computers. We
would like to hear about some of them, but this book does not go in that
direction, being more concerned about Aaron the result than the problems of
its creation.
It is not clear how extensively Cohen studied programming as a discipline
beyond what he needed to learn to develop Aaron. He seems to have
independently discovered certain established tenets of artificial
intelligence, learning later that disciplines already exist that have covered
those bases comprehensively. Nonetheless, a passage in the chapter titled
"Art and Science" implies that someone, either McCorduck or Cohen, understands
the basis of object-oriented technology at a conceptual level at least. Here
is what McCorduck says about the search in artificial intelligence for the
principles of behavior:
Among the first such principles proposed is that intelligence lies in the
ability to process symbols. For science, definitions must be precise; a symbol
is an entity capable of carrying denotation and connotation; a symbol system
is a collection of patterns and processes, the processes capable of producing,
destroying, and modifying the symbols. The most important property of patterns
is that they are able to designate objects, processes, or other patterns. When
patterns designate processes, they can be interpreted. Interpretation implies
carrying out the designated processes--in short, action. All this is made
manifest in programming a computer.
Subsequent discussions on the hierarchical nature of Aaron's code reveal a
programming model that resembles structured code. It might not really be
structured--we don't get to see any of the code to verify it--but the
hierarchy that Cohen describes suggests structure, whether he knew about and
conscientiously applied the discipline, or simply "discovered" again the
programming model that best fit his requirements.
Aaron never draws the same picture twice. The book tells us that one of the
variables that influences Aaron's decisions about what to draw is its starting
position on the page, which, I assume, the operator specifies. Surely that
can't be all. There are a finite number of starting points. We are left to
wonder about the other seeds that influence the random picture generator. We
can guess, or at least speculate on how we'd do it, but I'd prefer to know
what Cohen chose and how he chose it.
Cohen's drawing hardware holds some fascination for the technical reader--as
much for what it is as for what the author leaves out in discussing it. The
first device was a mechanical turtle that wheeled around the floor on a
mural-sized sheet of paper, raising and lowering a pen. The book suggests that
Cohen built the turtle himself. He abandoned it for what appears to be a
flat-bed plotter because the turtle, being cute, drew attention during
exhibitions from the artwork it was drawing. Some pictures are said to be
produced on a laser printer. The book says that the plotter, which it does not
identify as such, is a "homebrew" device, suggesting that Cohen built it
specifically for Aaron, although why he did not use a commercial plotter is
unclear.
Aaron's Code contains more than just that which attracts the programming
reader. The author explores the significance of a program that has the
potential to be in many places simultaneously generating unique works of art.
Does its ability to mass produce lessen the value of its creations? What is
the test of quality? Aaron cannot critique or reject its own output. The
program has no archival storage of past works. Its performance does not change
due to experience, criticism, or acceptance. It does not repeat qualities that
sell well and reject those that do not. It is the perfect producer of art in
the sense that it performs without regard to ego or sensitivity to the
reaction of its peers (of which there are none) or its patrons (of which there
are many). What are the consequences of works of art that the
artist--Aaron--creates after the meta-artist--Cohen--dies? Who owns the
creative rights to the work? If a pirated copy of Aaron creates a picture, is
the picture a part of the pirate's contraband? By the way, the author
misunderstands digital technology when she suggests that pirated copies of
Aaron would lack nuances that are lost in the pirate's disk copier--as if
computer programs were analog recordings subject to signal-to-noise ratio loss
in the copying process.
There is always the temptation to compare Cohen's paintings with Aaron's
drawings, and several observers are reported to have done so. Their
conclusions are naturally suspect because they were familiar with Aaron's
origins. One story, however, which may be apocryphal, describes how a person
of some stature in the art world sees an Aaron drawing, does not know its
story, and remarks that the picture reminds him of Cohen's work. The observer
then proceeds to wonder what has become of Cohen.
There's much to learn from Aaron's Code. But, better still, the book, by
allowing us to find its meaning from within ourselves, pushes us to think
about the issues involved in a project like Aaron. As programmers we are
uniquely equipped to do that. The book discusses art and its history and
philosophy and how art is essential to a thinking civilization. It makes us
think about the cognitive process and thought, how they work, what their
results are, how feedback modifies subsequent thought. It contrasts human
spontaneity with a computer's random processes, suggesting that the two cannot
always be distinguished. Finally, the book tells us how the artist gave his
own rules and knowledge to a computer program, how the programmer saw to it
that the program learned and executed them properly, and how the man, in so
doing, learned for himself what they were.



































January, 1993
OF INTEREST





DynaTek Automation Systems is distributing RAIDmark, a tool for benchmarking
storage subsystems. RAIDmark is a RAID- and network-aware benchmarking tool
that uses an application-level performance-measurement approach, which
facilitates universal comparison of storage subsystems.
RAIDmark simulates real-life environments, allowing it to handle all
read/write random-access devices and to predict performance in a multithreaded
environment. RAIDmark is user configurable and can be used for optimizing
network storage subsystems and disk arrays. A whitepaper detailing the
algorithms used in RAIDmark is included with the software.
RAIDmark is available free of charge. Reader service no. 21.
DynaTek Automation Systems Inc. 15 Tangiers Road Toronto, Ontario Canada M3J
2B1 416-636-3000
A Windows GUI toolkit named application::ctor (application constructor) from
Compass Point Software includes an object-oriented view editor, a
user-interface class library (with over 100 classes), and a C++ class browser.
Windows objects built with application::ctor can have all the functionality of
Windows dialog boxes, and the controls in a window can be related to each
other and to the window itself in such a way that the window layout
automatically rearranges or resizes the controls when the window is resized.
The window's surface can also be divided into object-oriented, user-definable
regions.
DDJ spoke with Jim Broomfield, a software engineer with Benson Douglas and
Associates (Cary, North Carolina), who is using application::ctor to develop a
Windows-based financial accounting package that has a relatively complicated
UI. For this reason he needed more control over the layout design than he
could have gotten with competing products, and so he chose what he termed
application::ctor's "sophisticated look and feel." Jim also commented that,
"The fact that the objects you draw are represented by C++ code from the class
object library lets you build a real stand-alone executable without a runtime
module."
The $99.00 package requires Windows 3.1 and a C++ compiler. Reader service no.
24.
Compass Point Software 332A Hungerford Drive Rockville, MD 20850 301-738-9109
Shamus Software has released a new version of MIRACL, their portable
multiprecision C library for implementing public-key cryptography systems. The
library lets you use multiprecision integer and fractional data types in your
programs. New to this version is a truncated-fraction alternative to
floating-point arithmetic, which is illustrated by a scientific calculator
program included in the package. Also included is PC-SCRAMBLER, which allows
secure communication between two remote PCs.
All routines in the library are written in standard C and source code is
included. Available via The Austin Code Works on a variety of platforms,
including the PC, Macintosh, SPARC, and VAX. The price is $90.00. Reader
service no. 23.
The Austin Code Works 11100 Leafwood Lane Austin, TX 78750-3587 512-258-0785
CTOOLS960 4.0 from Intel is a C compiler for the i960 RISC processor. What's
particularly interesting about the compiler is that it incorporates
profile-driven optimization technology. That is, it uses information about the
application program collected during dynamic execution to drive the compiler's
optimization process. The compiler recognizes the program's behavior as it
runs so that the data can be collected and used for further optimization.
Profile-driven compiler optimization differs from local and global
optimization in that it is based on known runtime characteristics of the
program. Thus, the compiler optimizes beyond conventional optimizations and is
much more application specific. The profile information does the following:
allocates additional resources to program parts that execute most often;
transforms loops to separate paths that execute frequently from those that
don't, thus reducing interference between code segments; places paths
frequently taken through the program code in sequence, thus increasing
instruction cache-hit rates; places heavily used global variables in faster
memory such as on-chip SRAM; and performs function inlining across modules,
targeting the most heavily executed call sites.
CTOOLS960 costs $2000.00 for DOS, $3500.00 for the HP9000, and $4300.00 for
the SUN4 or IBM RS/6000. Reader service no. 25.
Intel Corp. P.O. Box 58065 Santa Clara, CA 95052-8121 408-765-8080
C-SPY is a high-level language debugger for embedded applications from
Archimedes. It features a windowed interface that allows you to watch both C
and assembler source code on screen; memory-based design for maximum speed;
the ability to debug banked code; a watch window to monitor variables and
expressions; and full type recognition of the variable definitions allowed in
C. There is a log file option for logging commands and automated debug
sessions; full support for auto and global variables; and a built-in
assembler/disassembler to manipulate code in runtime debugging. All the basic
debugging commands are included, as well as optional getchar/putchar emulation
and a macro language to create complex breakpoints and simulate I/Os and
interrupts.
C-SPY gets mixed reviews from its users. Guy Turley, who writes
vending-machine software for Lizard Electronic Development Ltd. in Penryn,
England, told DDJ, "You have to type all the commands in, and although it does
the job, it's not as fast or efficient as a [competitor]."
On the other hand, Peter Bate, technical director at Cardinal Ltd., developers
of smart-card terminals (Hemel Hempstead, England), found C-SPY ideal for cross
development. "You can single-step at the C source level instead of debugging
in assembler," he said, "and another plus is that it runs on more than one PC,
reducing tooling-up costs."
C-SPY prices start at $995.00. Reader service no. 26.
Archimedes Software Inc. 2159 Union Street San Francisco, CA 94123
415-567-4010
Power Programming: The IBM XGA, by graphics expert Jake Richter is available
from MIS Press. The book includes full documentation and explanation of IBM's
XGA standard and information on the VESA XGA standard, as well as the
differences between the 8514/A and XGA APIs, complete register listings, and
sample programs.
The price is $29.95; with source-code disk, $59.95. Reader service no. 27.
MIS Press 115 W. 18th Street New York, NY 10011 212-886-9210
VisualWorks is ParcPlace's development environment for creating graphical,
client/server applications that can be instantly ported across PCs,
Macintoshes, and UNIX machines without recompilation. VisualWorks comprises
several key components, the first of which is its graphical user-interface
builder. The GUI builder uses a point-and-click palette and canvas with a
variety of layout tools. Once the GUI is built, the ChameleonView feature lets
you preview the new interface in the native-application look of either
Windows, Motif, OS/2 Presentation Manager, Macintosh, or OPEN LOOK.
VisualWorks provides direct database access for Oracle and Sybase and access
to other databases via the EDA/SQL gateway. The VisualWorks External Database
Interface (EDI) handles database interaction. It makes the database interface
an object, letting you apply object-oriented programming to managing
relational data. The EDI allows developers to map data to objects, allowing
access to information regardless of the type of database.
VisualWorks' reusable application framework generates code to tie the
interface and application together. It provides the structure for the
application and includes tools that prompt for scripting of actions.
The price of VisualWorks is $2995.00 for Windows, OS/2, and Macintosh, and
$4995.00 for UNIX. The database drivers cost $495.00 for Oracle or Sybase, and
$995.00 for EDA/SQL. Reader service no. 28.
ParcPlace Systems Inc. 999 E. Arques Avenue Sunnyvale, CA 94086 408-481-9090
Scientific Endeavors has announced GraphiC/Win, a C graphics library for
Windows designed for creating scientific graphics. GraphiC/Win has routines
that create linear, log, semilog, 3-D, contour, polar, triangle, and 2-D and
3-D bar plots, as well as a variety of charts. It supports a full set of 3-D
routines for curves and surfaces, including hidden-line removal and projection
of 2-D planes and stacked contour plots into 3-D space. Mathematical functions
are included to help plot your data. Splines are available for 2-D and 3-D
curves to interpolate and smooth data. Data that is not on a uniform grid can
be used to draw 3-D surfaces and contour plots.
Source code for all of the library routines is available. GraphiC/Win costs
$495.00. Reader service no. 29.
Scientific Endeavors Corp. 508 N. Kentucky Street Kingston, TN 37763
800-998-1571 or 615-376-4146
The Codewright's Toolworks has released Alloc-GC, a garbage-collecting
replacement for malloc. When malloc runs out of memory, Alloc-GC garbage
collects, finding all unreferenced blocks, which are then reclaimed and
reused.
The design of C++ classes and the code for constructors and destructors are
often complex because objects must be deallocated. Garbage collection removes
most of these complexities, rendering destructors almost unnecessary. In
addition, Alloc-GC simplifies initialization and termination, and allocation
and deallocation code in C. Also, unlike some other garbage collectors,
Alloc-GC can use normal C pointers because it constructs bit tables so that it
can quickly check that the values it uses as pointers address legal memory.
For more on garbage collection, see "Garbage Collection for C Programs," by
Giuliano Carlini and Susan Rendina in the November 1992 DDJ.
The price is $30.00 for a personal, noncommercial license, $130.00 for a DOS
commercial license, and $300.00 for a SPARC commercial license. Reader service
no. 32.
Codewright's Toolworks Box 990 San Pedro, CA 90733-0990 310-514-3151
Now available from Quarterdeck is the DOS Protected Mode Interface (DPMI)
Host, which accompanies QEMM-386, Quarterdeck's memory-management software.
The DPMI Host is compatible with Microsoft C/C++ 7, Borland C++, and Intel's
Code Builder Kit and supports virtual memory.
The DPMI Host fully implements version 0.9 of the DPMI specification, which
defines a standard software interface for allocating extended memory to 16-
and 32-bit protected-mode programs for the 286, 386, and 486.
DPMI Host costs $30.00 and is free to registered Quarterdeck users. Reader
service no. 31.
Quarterdeck Office Systems Inc. 150 Pico Boulevard Santa Monica, CA 90405
213-392-9851
BOUNDS-CHECKER 2.0, a DOS memory-protection tool, is now shipping from
Nu-Mega. Prior to this version, both BOUNDS-CHECKER and a heap checking
library were necessary in order to locate memory problems having to do with
use of nonallocated memory. The new BOUNDS-CHECKER not only detects problems
in a program's heap, stack, or data segment, but also handles array-overrun
detection and finds illegal memory accesses outside a program, including
automatic overwrites of code. The new Smart Mode feature uses a built-in
knowledge base of predefined heuristics to automatically decide the legitimacy
of a memory access, thus obviating the need for system-level knowledge.
BOUNDS-CHECKER supports Microsoft C/C++ 7 and Borland 3.1, as well as
third-party memory managers. The price is $199.00. Reader service no. 30.
Nu-Mega Technologies Inc. P.O. Box 7780 Nashua, NH 03060-7780 603-889-2386











January, 1993
SWAINE'S FLAMES


Did You Hear the One About...?




Michael Swaine


I just finished reading Gates: How Microsoft's Mogul Reinvented an
Industry--and Made Himself the Richest Man in America. The whole book, not
just the title, although getting through the title is an accomplishment.
Actually, what I read was the uncorrected author's proof of the book, the
book-publishing equivalent of beta software. The book is due for release early
in 1993 from Doubleday. It's written by PC/Computing columnist Stephen Manes
and Seattle Times reporter Paul Andrews.
From the title you might think that this was a how-to-get-rich book, and I
suppose it does implicitly offer a success formula. Let me see if I can
distill it. 1. Be smart. 2. Focus. But you don't need to read a 500-page book
to learn that.
What fills the 500 pages are anecdotes. Lots of them.
My favorites are the coding-crunch stories. How Paul Allen wrote and
hand-assembled the loader for Altair Basic on the flight to Albuquerque in
1975. Microsoft programmers knocking out the ceiling tiles to get some
ventilation in the windowless, "secure" room IBM insisted on during the
development of DOS. Richard Brodie's rapid programming on the original version
of Microsoft Word, under some daunting constraints (write to Charles Simonyi's
virtual-machine p-code, hedge the bet on the mouse by supplying a second
complete, keyboard-only interface).
Most of the coding stories are about Bill Gates, though. Adolescent Gates
sneaking out of his parents' house for late-night programming binges at a
local time-share facility. Teenaged Gates skipping baths and sleep and eating
Tang with his fingers one summer while testing software for the Bonneville
Power Administration. Microsoft President Gates snatching away the text editor
for the TRS-80 Model 100 and rewriting it overnight. (This was apparently
Gates's last program to ship as a Microsoft product, and I dug out my Model
100 to marvel again at how much the Microsoft team, including its chairman,
managed to pack into that 32K ROM.) Chairman Gates winning a 1986
Microsoft-sponsored programming contest, beating Jeff Duntemann and Ray
Duncan, among others.
For my taste, there are too few of these coding stories in the book, and too
many car stories. The authors appear to have catalogued every automobile and
every ticket Bill Gates ever acquired. But that's a quibble.
Mostly the book is about Microsoft, because that's mostly what Bill Gates is
about. I had already heard most of these stories while researching Fire in the
Valley, my book on personal-computer history. In fact, replaying some tapes
from 1982, I learned just how practiced Bill Gates had become in telling some
of these stories. But there were surprises in the book even for as jaded a
reader as myself.
I learned, for example, that if it hadn't been for Roland Hansen's insistence,
Windows would have gone out the door under the name Interface Manager. It was
Hansen who was responsible for what I have long considered an extremely
effective ploy: the company-name-plus-generic naming convention for Microsoft
products. Microsoft Chart, Microsoft Word.
And of course the story of Altair Basic is here. And how Microsoft got the
contract to develop the operating system for the IBM PC, and why Digital
Research didn't, and where the operating system actually came from, and how
early IBM was making its plans to dump its software partner. And the
always-tricky relationship with Apple, from the scuttling of MacBasic to the
lawsuit over Windows. The story of the development of Windows weaves through
half the book and, reading it, I got a clearer understanding of why it took so
long and why the early versions were so lame.
Also present in more than half the book is the influence of IBM. Microsoft
profited immensely from its relationship with IBM, but the book gives an idea
of the cost, too. The essence of that relationship is perhaps captured in the
acronym Steve Ballmer would like to forget: BOGUS. I'll let Manes and Andrews
tell you what it stands for.
There are a lot of specific details in the book about system software and
application development at Microsoft; enough to inform one's reading of the
recent allegations that Microsoft hasn't kept a secure wall between the two
activities.
Ross Perot appears in the book, too; with Gates and Perot both recounting
Perot's attempt to buy Microsoft. The quotes from Perot evoke the man nicely.
In fact, the style of the book evokes the style of Bill Gates rather well.
That, stylistically, is its strength and its weakness. Bill Gates is very
bright and very knowledgeable about his company, the industry, and the
technology, but he's not deep and he's not eloquent. Like many of the overage
adolescents who surround him, he speaks a tiresome language of gratuitous
exaggeration, superficiality, and jargon. When it's recounting anecdotes, the
book holds closely to this style.
If, as I do, you regard this chiefly as a book of entertaining anecdotes,
that's a weakness.


































February, 1993
EDITORIAL


Tinker, Tailor, Librarian, Spy




Jonathan Erickson


John Gilmore didn't think he'd have to go to jail just for checking a book out
of the library. After all, his library card was up to date and he didn't owe
any overdue-book fines. Still, with the cloud of ten years in the slammer
looming over his head and the U.S. Department of Justice breathing down his
neck, Gilmore was understandably uneasy, especially when the Feds dusted off a
1950s espionage law to threaten him with. It seems that the National Security
Agency had classified as "secret" a couple of the books he'd checked out, and
because Gilmore wouldn't turn them over to the Feds (they weren't his to give)
or tell which public library lent them, he saw a prison library, not a public
one, in his future.
There's no question that the books Gilmore borrowed (written in 1939 and 1941
by NSA founder William Friedman) were once classified. In 1975, however, they
were declassified and put on public library shelves. Then Reagan's 1982
Executive Order 12356 reclassified reams of information which, if disclosed,
"reasonably could be expected to cause damage to the national security,"
including Friedman's encryption studies, which are still used as standard
texts in military classes.
But no one told the librarians.
The NSA believed that if Gilmore went public with the information in the
50-year-old books (he was planning on making 20 to 30 copies of the material
for distribution to other libraries), our national security would be in
jeopardy because foreign countries would know that we know how they
implemented their encryption schemes. (Some countries, it seems, still use
encryption schemes based on Friedman's work, leading to the question of how
they gained access to our classified information in the first place.)
Presumably, these countries would then change their codes, requiring the NSA
to come up with new cracking techniques.
That the secret documents existed was no secret. In fact, Gilmore had
previously used the Freedom of Information Act to request from the NSA copies
of the books--and was denied. He subsequently sued the agency for the release
of three documents, including the two he later found in the library.
Just before last Thanksgiving, the NSA abruptly declassified the two books
Gilmore had in hand, and the Justice Department dropped its threat. No reasons
were given, but the NSA must have discovered that legal precedent exists
whereby once secret documents have been made public, they can't be taken back
into the secret sector. Or the agency might have decided it couldn't retrieve
all public copies, or maybe that the information was harmless. Since no
mention was made of the third book Gilmore had requested, he's still pursuing
his lawsuit for access to that one.
This isn't the first time that, under the guise of security, the Feds have
tried to stifle information. I've referred before to the American Library
Association's Less Access to Less Information by and about the U.S.
Government: A 1988-1991 Chronology, a 230-page litany of attempts by all
branches of government to stem information exchange. It makes for interesting,
if bewildering, reading.
Nor is Gilmore's saga the first instance of the government's focusing its
attention on nonmilitary encryption applications. Back in 1991, if you
remember, the FBI was among the backers of the failed Senate Bill 266, an
onerous proposal that would have required voice-mail and network vendors to
allow government agencies a back door into encryption engines.
Even though SB 266 didn't make it into law, the government hasn't given up.
One of the more hotly debated topics on Capitol Hill this past year has been
the FBI's "Digital Telephony Proposal," which would require that all
communications and computer systems be designed to enable Justice Department
interception of private messages on a concurrent and remote basis before they
can come to market. Putting Constitutional issues aside, the proposal would
greatly increase design and manufacturing costs for system vendors, putting
them at a disadvantage with foreign competitors. The penalty for not complying
would be a fine of $10,000 a day.
With data and voice wireless communication becoming the norm at an
astronomical rate, privacy is at the forefront of user concern, and encryption
is the obvious solution. For the first time in history, there's a huge
nonmilitary mass market for encryption technology, and Gilmore's intent was to
jump-start software and hardware products for these emerging markets, not to
simply divulge secrets for the sake of doing so. If publicly funded, relevant
information is already available, it only makes good sense to use it.
To entrepreneurs, this spells opportunity, but to the government, it means
loss of control.





































February, 1993
LETTERS







Fill-style Error


Dear DDJ,
There is a small but annoying error in the *.BGI files distributed with
various Borland programming languages. This concerns the fill style
LtBkSlashFill, which is completely wrong--it should be the reverse of
LtSlashFill, but instead seems to be random.
The fix may interest some readers. Table 1 shows each file with the offset of
the eight consecutive bytes defining LtBkSlashFill. The eight consecutive
bytes (hex) starting at the offset are:
 A5 D2 69 B4 5A 2D 96 4B
Use Norton's Disk Editor, for example, to search for this string, then replace
it with the following:
 80 40 20 10 08 04 02 01
which will give the desired result. (The bytes of LtSlashFill are 01 02 04 08
10 20 40 80.) To check the location, note that the next eight bytes are those
of HatchFill:
 FF 88 88 88 FF 88 88 88
Table 1

EGAVGA.BGI Offset 0950H
CGA.BGI Offset 0B80H
HERC.BGI Offset 0B00H
ATT.BGI Offset 0B90H
IBM8514.BGI Offset 01270H
PC3270.BGI Offset 0AD0H

To test your work, before and after, compile and run the sample program
BGIDEMO.PAS in Turbo Pascal 6.0 (or earlier). For anyone interested in
modifying other fill styles, I mention that the 12 style strings of eight
bytes are in this order: LineFill, LtSlashFill, SlashFill, BkSlashFill,
LtBkSlashFill, HatchFill, XHatchFill, InterleaveFill (sic), WideDotFill,
CloseDotFill, SolidFill, EmptyFill.
I have written to Borland's Pascal Technical Support about this problem with
each version of Turbo Pascal since 3. The response has always been to deny
that any problem exists, with no indication that anyone has actually looked at
it.
I wonder if Borland is going to abandon its languages eventually. While they
are upgraded in various ways, in some ways they are relics. For instance, the
manuals for the current flagship language, Borland C++, still go on and on
about how to program the CGA, and offer nothing on the Super-VGA. Nor does C++
or Turbo Pascal give you a mouse unit. C++ does, but Turbo Pascal does not,
give options for using extended memory. Neither C++ nor Turbo Pascal has
compiler options for using any processor other than the 286. While I am happy
to receive packages like Turbo Vision, the Help Compiler, and the Whitewater
Resource Toolkit, I must belong to the vanishing race of those who need to
program for themselves rather than plug into prepackaged kits, and program on
modern equipment.
I would like to program a mouse on the graphics screen in Turbo Pascal, to get
results like the Norton Utilities user interface. No help from the Turbo
Pascal or Borland C++ manuals. So I bought the Microsoft Mouse Programmer's
Reference, 2/e. Its Appendix E, "Making Calls from Borland Turbo Pascal
Programs," attracted me. It contains one Pascal program, printed in such a way
that it appears to be actual code. It was annoying, first, that throughout the
book mouse calls have four integer parameters, but in this program there are
five; let that pass. However, when I reached "If... begin... end; else..."
on page 316, lines 5 up through 4 up, I lost all confidence and want my money
back.
Harley Flanders
Ann Arbor, Michigan


Occupational Aberrations


Dear DDJ,
In the September 1992 "Letters" column, Charles Manning noted that in 1980
Fortran, the function MOD(x,y) does not return a value in the range 0..n-1,
"as any sane person would expect" (and as the mathematical function does; see
Knuth, vol. 2, p. 38). As a newcomer to the world of computer science, I had
been wondering whether the common, erroneous mod function found in languages
for the IBM PC was an aberration of those languages alone or of computer
science in general; it appears that it is the latter. Can any old hands give
an explanation for this peculiar implementation of a standard mathematical
function?
Regardless of the origin of the problem, it seems to me that the best solution
is not to try to patch the existing function but to write a new one that works
properly; for example, in C the code:
 while (x >= y) x -= y;
 while (x < 0) x += y;
will set x to the proper value of the mathematical function x mod y, where y
is required to be positive (as is usual in mathematics). This function is
slightly slower (about 10 percent) than the one given by Mr. Manning or the
more common one (x = x%y; if (x<0) x += y;), but still is almost
instantaneous. On a 16-MHz 386, 20,000 random repetitions require less than
one second, which
should be fast enough for most purposes.
Knuth extends the usual mathematical mod function by setting x mod 0 = x and
allowing y to be negative, with x mod (-y) being the negative of (-x) mod y.
Obviously, these could be incorporated in the above function if desired. The
computer-science version of the mod function does not allow y=0 and does not
agree with Knuth in any instance where x and y have opposite signs.
B.J. Ball
Austin, Texas


My Turn


Dear DDJ,
John McKnight (June 1992 "Letters") seems to have gone out of his way to miss
the point of my letter which you printed in the March 1992 Dr. Dobb's Journal.
To begin with, he objects to my referring to Clipper as a Nantucket language.
Perhaps he would have preferred something like "the Clipper implementation of
the dBase language," but I avoided that terminology for two reasons: 1.
Clipper embodies both a subset and a superset of the dBase language; and 2. my
remarks were concerned with the implementation of the language, rather than
its grammar and syntax. There seems to be no misunderstanding when we speak,
for example, of differences between VAX Fortran and Microsoft Fortran; I'd
have thought that it was obvious that it was in this sense that I referred to
Clipper as a Nantucket product. I fail to see what bearing Borland's recent
acquisition of Ashton-Tate, while newsworthy in its own right, has on the
present discussion.
Second, nothing in my letter denigrated Clipper's grammar or syntax. I
explicitly recognized Clipper's screen manager as a valuable tool. I am well
aware that Clipper was designed to work with databases, but I see no reason
why I should be denied the right to use Clipper's excellent screen manager for
whatever purpose I choose. I do, however, object to having to carry the
overhead of those routines which I choose not to use. Third, compile/link
speeds obviously depend on the compiler, the linker, and the machine on which
they run. (I used Optlink, which is the fastest of the three linkers on my
rather slow machine.) What is perhaps less obvious is that linking speed also
depends on the organization of the run-time library. My point was precisely
that linking is, as John observes, the slow step in any Clipper job. Clipper
is by no
means the only language which suffers from this fault.

I am well aware that a faster machine would reduce turnaround times, but
I prefer to do my development (or at least the final stages of testing) on a
minimal machine. The reason for this will be obvious to anyone who has ever
lost a potential sale because the client was either unwilling or financially
incapable of upgrading his or her machine to take advantage of the latest
technology.
An obvious alternative to Clipper is Turbo Pascal plus Software Science's
Topaz package. The executable for Turbo's null program is just over 1 Kbyte.
If you force Turbo to include the SAYGET4 module, which implements the screen
manager, this will add about 12 Kbytes. The DBF4 module, which communicates
with the database, adds about 20K; and using both of these together yields an
.EXE
file of about 25K. Compile plus link times on programs that are of moderate
size seem to be about three times as fast as those with Clipper plus Optlink,
and the executables are typically about 30 percent smaller. Obviously Borland
has a better idea of how to build a run-time library. Is there any excuse for
the voracious libraries with which we are routinely plagued?
Yes, John, let's choose the right tools for the job whenever we can. I've
programmed on a dozen or so machines in at least 31 dialects of 17 languages.
I believe that my 35 years of experience in system development, 12 of them
spent as manager of technical systems for a Fortune 500 chemical company, have
given me some insight into what factors contribute to productivity in a
programming shop. I believe that I'm qualified to judge what constitutes an
effective tool for a given application. In the DOS world, I'm still searching.
Arpad Elo Jr.
St. Johnsbury, Vermont


Undocumented Debate


Dear DDJ,
I should like to comment that the issue of undocumented features is not so
much technical as it is social. There is a kind of intrinsic tension to the
relationship between the operating-system developer (OSD) and the
application-program developer (APD). In order not to forfeit the role of
managing computer resources the OSD must add functionality in response to the
APD's needs. But it is always in the APD's immediate interest to make
frivolous requests by way of offloading his own work onto the OSD. If the OSD
treats every request or suggestion of an APD as a command, the operating system
will come to contain essentially all of the logic of every application as its
size balloons into impossibility. Somehow the OSD has to assess the
credibility of each APD and react accordingly.
This situation is not confined to operating systems and applications programs.
It also exists between the developers of operating systems and processors, and,
as Tracy Kidder has shown, between the microcode and hardware-design sections
of a processor-design team. In cases where the antagonists know each other
personally, the situation is moderated by social amenities and by each
participant's desire to maintain his reputation for technical skill. When they
are strangers with a full range of ultimately conflicting interests, however,
all parties are likely to grab everything they can.
I suggest that undocumentation, as practiced by Microsoft and Apple, has to be
seen in this context. The unspoken rule seems to be that the OSD will maintain
a fair assortment of undocumented features out there, and if you, the APD, are
prepared to go to the trouble of hacking the system to find the undocumented
features, you can use them, provided you don't use them lightly or for no very
good reason, and that, when and if the OSD really does have to change them,
you are prepared to fix your program to conform and to make it all right with
your customers. Additionally, information can be leaked to the APDs of known
probity, who can be trusted not to simply gang up and spike their code with
undocumented features in an attempt to present the OSD with a fait accompli.
Similarly, their requests for new features can be taken more seriously.
The problem with this is that in the first place, it is undoubtedly an
"old-boy network," with Microsoft's APDs as charter members, and as such
liable to the antitrust laws. Secondly, someone like Mr. Schulman presents
problems for this system. Obviously he believes that if a feature is there, it
is meant to be used--and is intent on forcing the entire corpus of
undocumented features into the official API. Together, I think these imply
that undocumentation must be replaced by something else.
A possible alternative would be a system whereby APDs may commission new
features in the operating system, subject to some kind of published price
schedule, with orderly procedures for the notification of interested parties,
etc.
Andrew D. Todd
Philadelphia, Pennsylvania


High-resolution Timing


Dear DDJ,
I've also added code to compensate for some hardware platforms that don't keep
the proper count vs. output consistency needed for accurate timing in all
circumstances.
Finally, since the VTD can return erroneous values (decrementing time), I've
added code to ensure that decrementing values are never returned.
Tom Roden
Irvine, California


































February, 1993
WHAT IS COGNITIVE COMPUTING?


Intelligence from nature conquers tough programming problems




Jonathan Erickson


R. Colin Johnson is Advanced Technology Editor for EE Times, author of the
book Cognizers -- Neural Networks and Machines That Think (John Wiley & Sons,
1988), editor of the monthly newsletter Cognizer Report, and author of the
Cognizer Almanac, a yearly report. Colin can be contacted at 10075 Barbur
Blvd., #405, Portland, OR 97219.


Cognitive computing denotes an emerging family of problem-solving methods that
mimic the intelligence found in nature. These methods draw from diverse areas
of scientific research and embrace various yet-to-be-proven theories about
naturally occurring phenomena. The common goal of these methods is to crack
tough problems that have resisted straightforward analytic solutions, such as
intractable problems caused by combinatorial explosions.
Recognition problems of all sorts--handwriting, speech, object--involve the
real-time processing of many related inputs. Just tracking objects from one
video frame to the next involves thousands of computations performed on
thousands of picture elements (pixels). Control problems of all sorts also
involve juggling many simultaneous constraints. Robotics controllers present
particularly thorny problems, since they must compute hundreds of
transcendental functions just to lift a finger accurately. Forecasting and
other statistical analysis methods also involve hundreds or thousands of
simultaneous equations. For instance, using an exhaustive search to find the
optimal route for a traveling salesman visiting a dozen cities takes more
computer time than it is worth. How is it that naturally intelligent entities
routinely solve such massively combinatorial problems without straining? Even
the little honeybee can plot complex routes that would challenge a
supercomputer to match. By copying the way that nature solves these problems,
cognitive computing instills that intelligence into your algorithms. For many
programming problems, cognitive computing can result in smaller,
faster-running programs that don't take as long to write and debug.


Modeling Nature


Natural intelligence abounds almost everywhere you look in nature. Neural
networks extract principles from brain science to model recognition, learning,
and planning processes. Fuzzy logic more closely resembles the way humans
reason with approximate rules-of-thumb than does traditional logic, according
to its inventor, electronic engineer Lotfi Zadeh. Genetic algorithms draw
directly from insights gathered in genetics research--modeling Darwin's
principles of natural selection. It seems anywhere nature exhibits intelligent
solutions, some programmer somewhere is copying them. Fractals, for instance,
vaguely resemble the manner in which living systems repeat patterns while
growing. Biologist Aristid Lindenmayer invented a specialized
variation--called "L-systems" after their author--which mimics the fractal-like
mechanisms of plant growth. Chaos theory uses mechanisms derived from
mathematical simulations of temporally repeating processes that are never
exactly the same twice--from dripping faucets to the weather. Cellular
automata model a two-dimensional universe of "living" cells and a simple set
of ways they may interact. And the list goes on. Most of these methods mimic
solutions found in nature by creating a software simulation of that process
and plugging the parameters of real-world problems into it. The accuracy with
which the simulation models nature is usually secondary to solving those
real-world problems. Accordingly, there are a wide variety of development
tools to help build such simulations; see Table 1. Experienced programmers
sometimes begin with a specialized development tool with the option of
generating source code in a traditional language, say C. After studying the
code for a while, they catch on to encoding neural learning methods, fuzzy
logic, or genetic algorithms directly into their own programs.
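The string-rewriting core of Lindenmayer's L-systems is easy to sketch. The following is a minimal illustration using the classic two-symbol "algae" rules, which are an assumption for demonstration purposes rather than a system taken from this article:

```python
# Minimal L-system: each generation rewrites every symbol in parallel
# according to its production rule. The rules below are Lindenmayer's
# classic "algae" system -- an illustrative choice, not from the article.
def l_system(axiom, rules, generations):
    s = axiom
    for _ in range(generations):
        s = "".join(rules.get(c, c) for c in s)
    return s

rules = {"A": "AB", "B": "A"}
for n in range(5):
    print(l_system("A", rules, n))
# Successive strings grow like plant segments:
# A, AB, ABA, ABAAB, ABAABABA -- lengths follow the Fibonacci numbers.
```

Fractal-like structure emerges because every symbol is rewritten simultaneously, the same parallel-growth mechanism the article attributes to plants.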


Intelligent Interpolation


Cognitive computing substitutes more-intelligent processing techniques for the
long, complex, and often brittle programming techniques with which we are all
familiar. Models based on cognitive computing can often be as accurate as
closed-form analytic equations, but apply to a wider range of situations since
they are not limited, for instance, to being linear. However, one disadvantage
of cognitive computing techniques is that they often cannot be proven stable
by anything other than extensive field testing. Traditional analytic proofs of
stability are impossible to obtain for an approach that does not use an
analytic methodology.
Instead of an ordinary top-down, divide-and-conquer approach, cognitive
computing techniques seemingly jump to conclusions without going through all
the intermediate steps. The common element here is the notion of intelligent
interpolation. All three core cognitive computing technologies--neural-,
fuzzy-, and genetic-based--derive their generality from interpolating the
solutions to problems with which they have not previously been faced from the
solutions to ones with which they are familiar. The intelligence with which
they perform this interpolation--sometimes called "generalization"--is key to
their success.
To be sure, if a reasonable number of closed-form equations can accurately
solve a problem, then there is no need to resort to cognitive computing
techniques. But if not, then cognitive computing offers more economical
alternatives to brute-force approaches like exhaustive searches through the
space of all possible solutions. The solution space for problems that resist
closed-form expression is usually too large to search in a reasonable amount
of time. But cognitive computing techniques sidestep that problem by
converging on an optimal solution from relatively few working examples.
("Near optimal" may be more appropriate here, since again, these solutions
cannot be proven to be optimal for the same reason they can't be proven
stable--no analytic method is available.)
Intelligent interpolation in a neural network, for instance, employs all
neurons simultaneously, each comparing its results to its neighbors in order
to combine them into an intelligent solution by consensus. A fuzzy system
performs a similar operation by evaluating all applicable rules in parallel
and intelligently combining their results. Likewise, genetic algorithms create
increasingly better solutions by mutating and splicing together the
best-so-far solutions until an optimum is reached.
When simulated on a standard computer, of course, you have to perform
supposedly parallel operations sequentially and keep track of intermediate
results. However, while sequential simulations of parallel processes may
squander hardware resources, according to those building parallel analog
microchips, their operation appears to be no less effective than truly
parallel hardware (as long as your computer and/or accelerator board is fast
enough for a given application).


Applications


Deployed applications of cognitive computing abound--from signal processing to
pattern recognition, feature extraction, industrial inspection, business
forecasting, credit rating, securities picking, medical diagnosis, speech
processing, natural-language understanding, constraint satisfaction, robotics
control, and adaptive process control. The single most widely used technology
from cognitive computing is the adaptive filter, first named the "adaptive
neuron" by its inventor, electronic engineer Bernard Widrow. The adaptive
filter changes its characteristics in response to changing noise patterns in
the signals it receives. All modern high-speed communications rely on such
adaptive filters to cancel echo. [Editor's note: For more information on
adaptive filtering, see "Finding Significance in Noisy Data," by Roy E.
Kimbrell, DDJ, June 1992.]
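The idea behind Widrow's adaptive filter can be sketched in a few lines: slide a window of tap weights over the input and nudge each weight toward whatever reduces the current error. The filter length, step size, and test signals below are illustrative assumptions, not values from any product mentioned here:

```python
import math

# Minimal LMS (least-mean-squares) adaptive filter in the spirit of
# Widrow's "adaptive neuron": the tap weights w adjust on every sample
# so the filter output tracks a desired signal d. The filter length and
# step size mu are illustrative choices.
def lms_filter(x, d, n_taps=4, mu=0.05):
    w = [0.0] * n_taps
    errors = []
    for n in range(n_taps - 1, len(x)):
        window = x[n - n_taps + 1:n + 1][::-1]  # x[n], x[n-1], ...
        y = sum(wi * xi for wi, xi in zip(w, window))  # filter output
        e = d[n] - y                                   # estimation error
        w = [wi + mu * e * xi for wi, xi in zip(w, window)]  # LMS update
        errors.append(e)
    return w, errors

# Demo: learn to mimic an unknown 2-tap filter from its input/output pair.
x = [math.sin(0.7 * i) + 0.5 * math.sin(2.3 * i) for i in range(500)]
d = [0.0] + [0.6 * x[n] - 0.3 * x[n - 1] for n in range(1, len(x))]
w, errors = lms_filter(x, d)
# The error shrinks as the weights adapt toward the unknown system's taps.
```

In echo cancellation the "unknown system" is the echo path, and the same update lets the weights keep chasing it as line conditions drift.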
Pattern recognition is at the heart of many neural-network applications. The
neural network learns the feature set from historical data. Such
classification systems work best when there is a large database of typical
examples from which to learn. Software simulations of neural networks have
learned to recognize everything from the underwater calls of whales to the
radio waves of distant galaxies. For instance, Nestor (Providence, Rhode
Island), the oldest public neural-network company, got the banking industry
started with neural-based systems back in 1987. Now the whole banking industry
uses neural networks, according to BancTec (Dallas, Texas), to read the
numbers on bank checks and credit-card charge slips. Just last year (1992),
Synaptics (San Jose, California) announced the first microchip-based neural
network to read the numbers on checks and credit-card slips. Its I-1000
electronic retina is a parallel analog implementation that emulates the neural
networks of the eye-brain system (marketed to banks by VeriFone, Redwood City,
California). General-purpose neural and fuzzy microchips are already available
from Adaptive Solutions (Beaverton, Oregon), American NeuraLogix (Sanford,
Florida), Inform (Aachen, Germany), Intel (Santa Clara, California), Omron
(Schaumburg, Illinois), and Togai InfraLogic (Irvine, California), with
Motorola (Austin, Texas), NEC (Kawasaki, Japan), National Semiconductor (Santa
Clara, California), and Ricoh (Tokyo, Japan) planning or releasing microchips
in 1993.
Adaptive Solutions, Inform, NeuraLogix, and Omron wrote their own software for
programming their respective chips, but Intel has turned to independent
developers--BrainMaker from California Scientific Software (Nevada City,
California) and DynaMind from NeuroDynamX (Boulder, Colorado). Intel is also
working with Nestor on the development of an all-digital neural microchip
using a second-generation neural model based on radial-basis functions (to be
announced in 1993). The most successful neural-network startup company--HNC
(San Diego, California)--has had enough success with its pattern-recognition
capabilities for optical character recognition (OCR) to spin off that
operation to Mitek Systems (San Diego, California) last year. Neural
approaches to OCR have been successful enough at tough recognition tasks to
attract even the mainstream vendors. The first was Calera Recognition Systems
(Sunnyvale, California). The company has kept its use of neural technology
quiet, but last year (1992) it came out of the closet with its FaxGrabber
product for automatically converting incoming faxes to text. Calera's major
competitor in OCR, Caere (Los Gatos, California), announced it was using
neural networks with its AnyFax technology, prompting Calera's admission that
it had already been using neural networks.
Table 1
Table 1: Cognitive computing resource guide vendors. Source: The Cognizer
Almanac, Cognizer Co., Portland, Oregon.

 AbTech 804-977-0686
 AIM, Automated Intelligent Modeler Machine-learning software using
 proprietary polynomial network technology. $1495

 Adaptive Solutions 503-690-1236
 BuildNet A comprehensive X-Window-Motif environment for conducting
 neural-network experiments and developing finished applications.
 $5000
 CNAPS-1064 General-purpose, all-digital microchip with 64 processing
 nodes; all popular neural-network algorithms can be programmed
 into this device.
 CodeNet Consists of software-development tools, including an assembler, a
 graphical-interface builder, a debugger, and libraries of common
 neural-network algorithms for CNAPS architecture. $15,000

 CNAPS-C C-language compiler extended for parallel processing and
 fixed-point math targeted at the CNAPS architecture. $3000

 AI Ware 216-421-2380
 N-NET EX Neural Network Development System Neural-net package for
 developing applications based on AI Ware's patented functional-link
 architecture. $1995
 N-NET 500 Series Neural Net Development System Same as above, for Sun
 Microsystems workstations. $2995

 American NeuraLogix 407-322-5608
 Fuzzy Pattern Comparator, NLX110 Fuzzy-data comparator using 32-bit
 precision with automatic configuration circuitry.
 Neural Processing Slice, NLX420 800 MIPS neural-slice architecture
 running at 300 million connections per second (CPS).
 FPC Applications System, ADS110 Fuzzy-pattern comparator applications
 system based on FDC device on PC card with menu-driven software. $395
 NPS Development System NPS application system with PC card containing
 one or four NPS devices. $595 (ADS420-1), $1495 (ADS420-4)
 Fuzzy MicroController, NLX230 Eight-input fuzzy microcontroller with
 on-chip storage for 64 fuzzy rules sharing 16 membership functions.
 FMC Development System, ADS230 Fuzzy-logic controller development system
 based around FMC devices on PC card with menu-driven software. $395

 Applied Intelligent Systems 313-995-2035
 NetRead Software using neural-network learning method for reading
 semiconductor-wafer codes. $5000

 Aptronix (Marketed by Motorola) 408-428-1888, 408-428-1883 FuzzyNet BBS
 FIDE, Fuzzy Inference Development Environment Software-development
 environment for embedding fuzzy-logic-based solutions in applications.
 $1495

 Axcelis 206-624-2446
 Evolver Genetic-learning add-on for Wingz spreadsheet program. $345

 California Scientific Software 916-478-9040
 BrainMaker High-speed software simulation of back-propagation-of-errors
 learning method. Does not use floating-point operations. $195
 BrainMaker Professional Expanded version of the BrainMaker software
 simulator with hooks for calling a trained network from another
 program. $795
 NT5000 Neural Control System BrainMaker real-time data collection and
 process-control interface. $7950
 Intel 80170NX Neural Network Development System Integrated-circuit
 neural-network control systems, with complete development environment
 allowing BrainMaker neural net to be created and trained on user's PC,
 then downloaded to a chip. 80170NX: $940 each; iNNTS: $11,800

 Epic Systems Group 818-355-2988
 Neuralyst A neural-network add-on for Microsoft's Excel spreadsheet.
 $165 (PC or Macintosh versions)
 Run-time Library Runtime library for the Macintosh or the PC. $495

 HyperLogic 619-746-2765
 CubiCalc User-programmable fuzzy-logic expert-system shell for Microsoft
 Windows; optional runtime compiler and library. $495 up
 CubiCalc RTC CubiCalc with runtime compiler and C-language object
 library allows programs to incorporate fuzzy logic into applications.

 $795
 CubiCard Hardware input-output board for connecting fuzzy spreadsheet to
 real-time data sources. $1495
 The OWL Neural Network Library C library of 20 neural-network
 simulations for PC with Microsoft or Turbo C or Macintosh with Think C.
 $499 up

 Inductive Solutions 212-945-0630
 NNetSheet 1.2 Neural-network algorithms implemented as spreadsheet
 formulas. The runtime system is the spreadsheet. $495
 Induce-It A case-based reasoning tool for building expert systems. $995
 NNetsheet-C A fast C implementation of many popular neural-network
 learning algorithms. $895
 GenSheet Uses Microsoft Excel as the user interface in implementing
 genetic algorithms as fast C-coded dynamic link libraries for
 IBM-compatible computers, and as code resources for Macintosh
 computers. $895

 Inform +49 2408-6094
 Fuzzy 166 chip Fuzzy microchip built for Inform by Siemens AG (Munich,
 Germany). $100
 FuzzyTECH Fuzzy-logic development tools, including a graphical input
 system, precompiler, compiler, and online development tool. Price
 ranges from $1600 stand-alone to $9200 fully configured.
 FuzzyTECH Shell Rule editor addition to fuzzyTECH that allows engineers
 to graphically adjust the values of running fuzzy systems. $1900
 FuzzyTECH NeuroFuzzy Module Adds the capability of permitting a neural
 network or other learning method such as a genetic algorithm.

 Intel 408-765-9235
 80170NX Neural-network chip using an all-analog circuitry plus EEPROM
 for synapses. $940
 iNNTS, Intel's Neural Network Training System Development system
 including two 80170s, interconnection board, and all software. $11,800
 EMB Board for users wishing to quickly prototype multichip setups. The
 multichip board can house up to eight 80170s (comes with two). $9750

 Mind's Eye 314-921-8433
 GA-lib C source code genetic-learning library.

 Nestor 401-331-9640
 Nestor Development System (NDS 1000) Neural-network development system
 for DOS-based PCs and Sun Microsystems workstations.

 NeuralWare 412-787-8222
 NeuralWorks Professional II-Sun (Sun-3, Sun-4, Sun-386i) Neural-network
 development system for professional engineers and researchers. $2995
 ($1495 PC, $4995 Silicon Graphics)
 Explorer-PC Introductory program for learning to use neural networks.
 $299
 NeuralWorks Designer Pack Converts networks developed under Professional
 II into C source code. (IBM PC or Sun)
 NeuralWorks User Defined Neurodynamics Allows customers to write their
 own learning rules and functions for Professional II. $995 (IBM PC),
 $1999 (Sun)

 NeuroDynamX 303-442-3539, 800-747-3531
 DynaMind A diagnostic tool for creating, training, and implementing
 neural networks. $145

 DynaMind Developer Bundles DynaMind with NeuroLink, a library of C-code
 routines for embedding runtime neural networks. $495
 iDynaMind Software trains networks on the Intel 80170NX ETANN.

 Norrad 603-434-0047
 NET-Link+ Genetic-learning Bridge Bridges the gap between Nexpert Object,
 from Neuron Data, and several other packages, including: NeuralWare's
 neural-network simulator, NeuralWorks; AIM, the abductive-reasoning tool
 from AbTech Inc.; TILShell, the fuzzy-logic development system from
 Togai InfraLogic, of Irvine, California; and GA-Lib from Mind's Eye.
 $295--$595 (depending on configuration)

 Omron 708-843-7900
 FP 3000 Digital Fuzzy Processor Fuzzy-logic microchip designed for
 control applications.
 FS 10AT Digital Inference Software Fuzzy-logic development software using
 a fill-in-the-blank method to generate FP 3000 object code.

 Promised Land Technologies 203-562-7335
 Braincel Add-on neural network for the Excel spreadsheet. $249

 Synaptics 408-434-0110
 I-1000 Neural OCR microchip Analog neural-network used in the VeriFone
 Onyx check reader. Performs sensing and recognition of E13B at 1000
 CPS.

 Ward Systems Group 301-662-7950
 NeuroShell A production neural-network shell program designed to function
 in the domain of expert systems. $195
 NeuroSheet A NeuroShell option that imports spreadsheets, learns from
 them, and exports results back to spreadsheets. $99
 Database Option for NeuroShell Option for NeuroShell that processes
 dBase III files directly without conversion. $69
 NeuroWindows Attaches neural capabilities to programs written in
 Microsoft's Visual Basic, C, or other languages. $369

 Togai InfraLogic 714-975-8522
 Fuzzy C Development System Toolkit for developing fuzzy-logic expert
 systems.
 Generates C source code. $2490-$3900
 MicroFPL Development System Software toolkit for developing fuzzy-logic
 expert utilizing a runtime kernel for the target CPU. $2000-$17,000
 TILShell A computer-aided software engineering (CASE) tool for building
 fuzzy expert systems. $2300-$3300
 TILGen Automated rule-base generation tool. Uses neural-network
 learning to analyze inputs and generate its output. $975-$1100
 FC110 Development System Converts fuzzy-logic knowledge base to FC110
 DFP-based products. $750-$900

Calera's OCR engine learned from a huge training set, resulting in over a
million different recognized variations of the standard alphanumeric
characters. Since its first product shipped in 1986, it has continually
updated its neural technology and is currently relying on an advanced learning
model derived from radial-basis functions. Caere won't reveal the exact neural
methodology used with its AnyFax technology, but did admit that it took over
100,000 examples of various types of fax documents from which to learn. Audre
Recognition Systems (San Diego, California) does not confine its OCR to
standard alphanumerics, but teaches its neural networks to read the
specialized symbols on engineering drawings, too. The company uses a learning
method that is a proprietary modification of the popular
back-propagation-of-errors approach. After learning a specific set of
symbols--say those used to build airplanes--the system will automatically
convert whole drawings to a standard computer-aided design (CAD) description
language and create a custom PostScript laser font for its special symbols.
Neural networks are also helping recognize patterns on the shop floor. For
instance, Applied Intelligent Systems (Ann Arbor, Michigan) built
neural-recognition capabilities into its vision computers for quality control
in factories. Other companies are also adding neural-recognition capabilities
to their industrial inspection systems. For instance, AI Ware (Cleveland,
Ohio) has enhanced traditional infrared spectroscopy verification systems with
neural networks for its customers. Ordinarily skilled experts read the plots
produced by spectroscopy, but AI Ware has trained a neural network to
automatically validate product quality on-the-fly. The system trains on known
good and bad parts after which it can spot the difference without human
intervention.
The National Science Foundation (Washington, D.C.) sponsored one of the more
unusual pattern-recognition applications, developed using BrainMaker. This
system uses a nondestructive testing method that can assess damage to bridges
and other steel and concrete structures, such as those found on highways.
waves transmitted through concrete. In this "impact-echo method," hardened
steel spheres are dropped on the surface of the concrete. The impact propagates a
wave through the material that reflects off cracks and voids in the structure.
A transducer measures displacement at the surface caused by the reflected
waves, and the neural network compares them with known solid-structure
reflections which it has already learned.
Neural networks recognize many other patterns, too. For instance, HNC's
software is being used to classify EKG signals in the presence of noise, as
well as to detect suspect pap smears at Neuromedical Systems (Suffern, New
York).
Fuzzy logic is seldom used for pattern recognition, but for process-control
applications it is being adopted by every major industrial controller vendor.
Startups are also popping up that combine cognitive computing techniques. For
instance, Pavilion Technologies (Austin, Texas)--a spin-off of the
Microelectronics and Computer Consortium (MCC, Austin, Texas)--has created
adaptive software for optimization and control in the materials-processing
industry. Called Process Insight, it combines neural networks, fuzzy logic,
and chaos theory to optimize processes and adjust their parameters for minimal
waste and highest yields. AI Ware also serves the process-control industry
with a neural-based technology that optimizes cost and product performance
while minimizing trial-and-error experimentation. Its machine-health
diagnostic system takes real-time measurements from machines at a plant and
correlates them with product-quality measurements. After learning these
correlations, the system can predict imminent failures and cue an embedded
expert system to schedule machine maintenance.
AI Ware also has a custom formulation system that optimizes processes for
producers of rubber, plastics, chemicals, metals, and products based on these
materials. After learning from in-house formulations, the system becomes an
automated expert chemist. Just specify a new formulation and the system
automatically determines the optimal set of ingredients and cookbook-style
processing steps. The ability of cognitive computing techniques to deal with
unforeseen circumstances makes them amenable to all sorts of forecasting
applications. The financial industry was one of the earliest converts.
NeuralWare (Pittsburgh, Pennsylvania) has its NeuralWorks software installed
throughout the financial industry, as do many other vendors. Besides just
predicting which securities will rise or fall in value, financial forecasting
also includes automated mortgage- and credit-approval systems, demand and
sales forecasting, and many industry-specific forecasting applications. The
diversity of these forecasting needs prompted HNC to create its DataBase
Mining Workstation (DMW). The DMW learns interdependent relationships and
logical sequences in any database, thereafter forecasting future contingencies
and answering "what-if" queries.
HNC has also crafted several vertical applications packages with the DMW. For
instance: SkuPLAN is a demand forecasting system that predicts sales for
chainwide stores; Automated Real Estate Appraisal System (AREAS) forecasts
property valuation based on multidimensional modeling; Automated Mortgage
Processing Systems (AMPS) forecasts good and bad risks for underwriters. Fuzzy
logic is also entering the financial markets. In 1991, "fuzzy" was the "word
of the year" in Japan, followed by a flood of consumer items using it--from
cameras and video recorders to washing machines and vacuum cleaners. Now fuzzy
logic has penetrated the Japanese business arena, too, with several portfolios
being successfully managed by fuzzy- and neural-based systems.
In the U.S., every major consumer-electronics and industrial-controller
maker is integrating fuzzy logic into its wares. Business users are also
building fuzzy logic into their software base, but few are sharing their
secret. One company, FuziWare (Knoxville, Tennessee), is repackaging some of
the software it has written under contract for specific business users. Its
FuziQuote, for instance, is a bid-automation package that provides quick
turnaround quotes for complex jobs. Its FuziQuery aims at accurately finding
sales prospects. Its FuziCost is a modeling tool that accepts intuitive
knowledge about cost, profit, and strategic performance to provide what-if
scenarios. And its FuziCell is an area-management system for flexible factory
managers.
HyperLogic (Escondido, California) has created a fuzzy-logic expert-system
shell it calls CubiCalc. Like a spreadsheet, you can configure CubiCalc for
anything from business analysis to industrial control. Other companies are
offering add-ons for spreadsheets that incorporate cognitive computing
techniques. For instance, Inductive Solutions (New York, New York) offers
spreadsheet add-ons for genetic algorithms (GenSheet), neural networks
(NNetSheet), and case-based reasoning (Induce-It). Axcelis (Seattle, Washington)
adds genetic algorithms to spreadsheets with its Evolver. Epic Systems Group
(Pasadena, California) adds neural capabilities to spreadsheets with its
Neuralyst. Promised Land Technologies (New Haven, Connecticut) adds neural
capabilities to spreadsheets with its Braincel. Ward Systems Group (Frederick,
Maryland) can read spreadsheets, perform neural-based learning with them, and
return the learned values transparently via its NeuroSheet bridge to its
expert-system-like NeuroShell.
For a cheap neural-network subroutine library, try a somewhat outdated book
called Parallel Distributed Processing (MIT Press, 1986) which comes in a
version with a free disk. For an even cheaper fuzzy-logic library, try
Motorola's free one (download from 512-891-3733). The only genetic-algorithm
subroutine library of which I am aware is GA-lib from Mind's Eye (Florissant,
Missouri). If you want to glue together some of these technologies in a hurry,
you might check out NET-Link+ from Norrad (Nashua, New Hampshire). NET-Link+
hacks together the expert-system building environment Nexpert Object, from
Neuron Data (Palo Alto, California), with several other packages including
NeuralWorks, GA-lib, AIM--the abductive-reasoning tool from AbTech, and the
TILShell, a fuzzy-logic development system from Togai InfraLogic.































































February, 1993
GENETIC ALGORITHMS


Nature's way to search for the best




Richard Spillman


Richard is a professor of computer science and can be contacted at the
Department of Computer Science and Engineering, Pacific Lutheran University,
Tacoma, WA 98467, 206-535-7406, or BITNET: SPILLMAN_R@PLU.


You've left your keys somewhere in the house, but can't remember where. To
find them, you could systematically search every room starting at the top
floor and working your way down. Or, you could randomly pick a room, search
it, and, if you don't find the keys, randomly select another. Either approach
would work fine for a small house, but what if you lived in a 110-room
mansion? Both approaches would result in a lot of wasted time (unless you're
lucky, but then if you were that lucky, you wouldn't have lost your keys in
the first place).
Another approach is to sit down and assign a value to each room in the house,
giving each room five points if you have been in the room since the last time
you saw your keys. Then begin to randomly search those rooms with high-point
totals. If this is your approach, then you know in a small but important way
how a genetic algorithm works.


How a Genetic Algorithm Works


Genetic algorithms (GAs), a way to randomly search for the best answers to
tough problems, were first suggested by John Holland in his book Adaptation in
Natural and Artificial Systems (University of Michigan Press, 1975). Over the
last 20 years, they've been used to solve a wide range of search,
optimization, and machine-learning problems. As their name indicates, genetic
algorithms attempt to solve problems in a fashion similar to the way in which
human genetic processes seem to operate (at least at a simple level). A good
survey of the nature and use of genetic algorithms can be found in David
Goldberg's Genetic Algorithms in Search, Optimization, and Machine Learning
(Addison-Wesley, 1989) and G.E. Liepins and M.R. Hilliard's paper "Genetic
Algorithms: Foundations and Applications" in Annals of Operations Research
(1989). You might also refer to Mike Morrow's "Genetic Algorithms" article in
DDJ (April, 1991).
The easiest way to consider a GA is in the context of a function-optimization
problem. The goal is to find an integer value between two fixed integers that
produces the largest result when substituted into a given function. A genetic
algorithm begins with a randomly selected population of function inputs
represented by strings of bits. The GA uses the current population of strings
to create a new population such that the strings in the new population are, on
average, "better" than those in the current population. For example, a
population of strings may be generated to search for a maximum value for the
function x{2}-x+1, where x is an integer between 0 and 31, as shown in Figure
1. The third element in this sample population represents x = 10 and produces
the best function output of the current population of three binary strings.
The idea is to use the best elements from the current population to help form
the new population. If this is done correctly, then the new population will,
on average, be "better" than the old population.
Figure 1: String population.

 Population   Function value   New population
 --------------------------------------------
    0001             1         high function
    0110            31         value terms
    1010            91

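In C, the decoding and evaluation behind Figure 1 might look like the following sketch (the helper names are mine, not taken from the article's listing):

```c
#include <assert.h>

/* Decode a bit string (most significant bit first) into an integer. */
long decode(const int *bits, int n)
{
    long x = 0;
    for (int i = 0; i < n; i++)
        x = (x << 1) | bits[i];
    return x;
}

/* The sample function from Figure 1: f(x) = x^2 - x + 1. */
long f(long x)
{
    return x * x - x + 1;
}
```

Decoding the three strings of Figure 1 gives x = 1, 6, and 10, with f(x) = 1, 31, and 91, matching the figure.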
Three processes--selection, mating, and mutation--are used to make the
transition from one population generation to the next. The basic
genetic-algorithm cycle based on these is shown in Figure 2.
The first step is the selection process. This determines which strings in the
current generation will be used to create the next generation. This is done by
using a biased random-selection methodology. That is, parents are randomly
selected from the current population in such a way that the "best" strings in
the population have the greatest chance of being selected. In Figure 1, the
string 1010 has the greatest chance of being a parent, while 0001 has the
least chance. By using the "best" points to determine the next population, the
algorithm seems to move in the most promising direction in its overall search.
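A minimal C sketch of this biased selection (often called roulette-wheel selection), assuming nonnegative fitness values; the function name is my own:

```c
#include <assert.h>
#include <stdlib.h>

/* Pick an index with probability proportional to its fitness.
   Seeding of rand() is left to the caller. */
int select_parent(const double *fit, int popsize)
{
    double total = 0.0, run = 0.0, spin;
    for (int i = 0; i < popsize; i++)
        total += fit[i];
    spin = ((double)rand() / RAND_MAX) * total;  /* point on the wheel */
    for (int i = 0; i < popsize; i++) {
        run += fit[i];
        if (spin < run)
            return i;
    }
    return popsize - 1;  /* guard against floating-point rounding */
}
```

With the Figure 1 values, the string 1010 occupies most of the wheel and so is chosen as a parent most often.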
The second step is the mating process, which determines the actual form of the
strings in the next generation. At this point, two of the selected parents are
paired. If the length of each string is r, then a random number between 1
and r is selected, say s. The mating process is one of swapping bits s+1
through r of the first parent with bits s+1 through r of the second parent. In
this way, two new strings are created, as in Figure 3.
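In C, this one-point crossover might be sketched as follows (0-based indexing, so bits s..r-1 are the swapped tails; the function is my own illustration, not the article's Pascal CROSSOVER):

```c
#include <assert.h>

/* One-point crossover: each child copies its own parent up to the cut
   point s, then takes the remaining bits from the other parent. */
void crossover(const int *p1, const int *p2, int r, int s,
               int *c1, int *c2)
{
    for (int j = 0; j < r; j++) {
        if (j < s) {
            c1[j] = p1[j];
            c2[j] = p2[j];
        } else {
            c1[j] = p2[j];
            c2[j] = p1[j];
        }
    }
}
```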
The final step is mutation. A fixed, small mutation probability is set at the
start of the algorithm. Bits in all the new strings are then subject to change
based on this mutation probability. In Figure 4, bits 6 and 10 are mutated.
(Bit 6 goes from dark to light, and bit 10 goes from light to dark.) The
result is a new generation of strings.
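Bitwise mutation reduces to a single pass over the string, flipping each bit with the chosen probability; a sketch in C (my own helper, with rand() seeding left out):

```c
#include <assert.h>
#include <stdlib.h>

/* Flip each of the r bits independently with probability pm. */
void mutate(int *bits, int r, double pm)
{
    for (int j = 0; j < r; j++)
        if ((double)rand() / RAND_MAX < pm)
            bits[j] = 1 - bits[j];
}
```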
These three steps are repeated to create each new generation. The algorithm
continues in this fashion until some stopping condition is reached (such as a
maximum number of generations).
This algorithm has proven to be very effective in solving some tough search
problems. While it may seem to be a random search, in fact, the improvement in
each generation indicates that the algorithm provides an effective directed
search technique.


Genetic Algorithm Code


Listing One (page 90) is a Turbo Pascal version of a genetic algorithm that
searches for the maximum integer value of a function. It consists of 17
procedures, two functions, and a main program. Some of the procedures, such as
SetDisplay, GENOUT, clockoff, clockon, Screen_Win, FileOut, and DATAIN are
designed as part of the user interface and are not necessary for the genetic
algorithm; I won't discuss their function or operation. The OptFunc function
is the code that specifies the function to be maximized. If you want to run
the code as is, it will search for a maximum value for the function
-x{3}+25x{2}-2x+55. It is a simple matter to rewrite this function so that you
may insert any maximization problem into the code.
When you first run this code, you'll be asked to provide all the necessary
parameters for a genetic algorithm. The DATAIN procedure asks for the
population size, that is, the number of chromosomes in the population. I
recommend at least 12; the program is limited to a population size of no more
than 50. (You can change the value of the MaxChrom constant at the beginning
of the program if you want a larger population, but 50 is usually more than
enough). The next parameter is the number of bits in a gene. This value is
determined by the largest integer in your search range. For example, if your
search spans the integers from 0 to 1023, then you need 10 bits. If the search
is from 0 to 2047, then you need 11 bits, and so on. The program allows up to
200 bits, which represents a large integer-search range. (This value may also
be changed by changing the MaxN constant.) The third requested parameter is
the number of generations. This is the number of new populations you wish to
create during your search. Your search will stop at the number of generations
you specify. The next parameter requests a probability value. You may try
different mutation probabilities to observe their effect on the success of the
search. I recommend beginning with small mutation probabilities such as 0.1 or
0.2. The rest of the parameters are self-explanatory--except for the last one.
When you are requested to enter the display update, enter a number between 1
and the total number of allowed generations. If you enter 5, your search will
pause after every fifth generation to display the current results. This gives
you a chance to see how the GA is doing.
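The bits-per-gene rule amounts to finding the smallest n such that 2^n - 1 covers the largest integer in the search range; a small C helper (my own, not part of the listing) makes the rule concrete:

```c
#include <assert.h>

/* Smallest number of bits whose maximum value (2^n - 1) reaches maxval. */
int bits_needed(long maxval)
{
    int n = 1;
    while ((1L << n) - 1 < maxval)
        n++;
    return n;
}
```

For example, bits_needed(1023) is 10 and bits_needed(2047) is 11.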
The relevant function and procedures for a genetic algorithm are PARENT,
MUTATE, and CROSSOVER. Each performs exactly as required by a classical
genetic algorithm. The function PARENT selects a random parent from the
current population. It is called twice by the main program to find two parents
to send to the CROSSOVER procedure. The CROSSOVER procedure selects a random
point and swaps the bits between the parents. It then calls the MUTATE
procedure which randomly mutates bits in the two offspring. The result is a
new generation of points; they are evaluated, and the best one is saved in
BEST.


A Sample Run


If you choose to run the program to find the maximum, use these parameters:
Population Size: 12
Number of Bits: 6
Number of Generations: 5
Mutation Probability: 0.1
Random Seed: 1234
Best Guess for the Max: 2337
Minimum Interval Value: 0
Maximum Interval Value: 50
Display Update: 1
Your initial population is going to look like this:
010010, 001100, 001011, 001011, 001000, 001000, 000101, 000100, 100011,
101010, 111000, 100100
The actual integer max occurs at 17. The random generation of 12 integers in
the range 0 to 50 did not do too badly since one of the random points is 18,
which is close to the maximum. Of course, with such a small range it is not
unusual for the algorithm to do this well. For problems in which the number of
possible answers is much greater than 50, it's unlikely that the initial
population will have such a good fit. After one generation, the program will
find the number 16. By generation 4, it will find 17, the actual integer
maximum. On a simple problem like this, it is more interesting to watch the
overall improvement in the search. The fitness of each member of a generation
is determined by dividing the value of the function at that point by the known
maximum value. Hence, in the positive range of the function, this fitness
value is between 0 and 1. Of course, for a real problem the maximum would not
be known at the beginning, so such a fitness value would not be available. For
this problem, however, it is instructive to determine the fitness of each
member of a given generation. If you run the program as is, you will notice
that the average fitness of each generation increases. The initial random
population has an average fitness of 0.39 while the fifth-generation average
fitness is 0.58. The point is that the genetic algorithm is finding "better"
points as it moves from generation to generation.


Improvements


Nothing in the classical approach to mutation or crossover requires that only
those methods be used in a genetic algorithm. You could try any procedure you
can think of to mix bits in a population to create a new population. Some of
your ideas may not produce a new population better than the previous one for a
specific application. On the other hand, some of your ideas may work very
well. For example, in place of a random mutation in which a bit is changed
from 0 to 1 or 1 to 0 with a low probability, why not swap two bits in the
current string? Code in the SWAP procedure does just that. While the current
version of the main program does not use SWAP, you may want to add it to the
system and observe the results. The best way to add this procedure is to
replace the call to MUTATE with a call to SWAP. Perhaps you may want to be
more daring and swap bits half the time and mutate them the other half. This
could be done by replacing the calls to MUTATE in the CROSSOVER procedure with
the code shown in Listing Two (page 93).
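The half-and-half idea can be sketched in C as well (this version swaps two randomly chosen bit positions, whereas the article's Pascal SWAP exchanges a bit with a neighbor; the helper name is my own):

```c
#include <assert.h>
#include <stdlib.h>

/* Half the time swap two randomly chosen bit positions; the other
   half, do an ordinary probabilistic bit-flip pass. */
void swap_or_mutate(int *bits, int r, double pm)
{
    if (rand() % 2 == 0) {
        int a = rand() % r, b = rand() % r;
        int t = bits[a];
        bits[a] = bits[b];
        bits[b] = t;
    } else {
        for (int j = 0; j < r; j++)
            if ((double)rand() / RAND_MAX < pm)
                bits[j] = 1 - bits[j];
    }
}
```

Note that the swap branch preserves the number of 1 bits in the string, while the mutate branch does not; that difference is one reason the two operators explore the search space differently.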
Another possibility, which I call "inversion," inverts the order of bits in a
string between two random points. If you begin with a string 01110101100011
and note two random points in the string with (), such as 011(101011)00011,
then the inversion operation will reverse the order of the bits between the
two random points to produce the string 011(110101)00011. A procedure for
inversion is also provided in the code of Listing One. It is up to you to add
it to the crossover procedure and observe its effects.
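Inversion is just an in-place reversal of the chosen segment; in C (0-based positions, helper name my own):

```c
#include <assert.h>

/* Reverse the bits between positions s1 and s2, inclusive. */
void invert_segment(int *bits, int s1, int s2)
{
    while (s1 < s2) {
        int t = bits[s1];
        bits[s1] = bits[s2];
        bits[s2] = t;
        s1++;
        s2--;
    }
}
```

Reversing positions 3 through 8 of 01110101100011 yields 01111010100011, the example given above.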
These are only a couple of examples of other ways in which the mutation
operation could be implemented. The crossover operation could also be changed.
The field is wide open. Anything you can think of is a valid possibility.

_GENETIC ALGORITHMS_
by Richard Spillman


[LISTING ONE]

{****************************************************************************}
{* GENERAL GENETIC ALGORITHM--a simple implementation of a genetic algorithm*}
{* for function optimization. Dr. Richard Spillman, Dept.of Computer Science*}
{* Pacific Lutheran Uni. Tacoma WA 98447 206-535-7406 BITNET: SPILLMAN_R@PLU*}
{****************************************************************************}
program GGA;
uses Crt,Dos;
CONST
 MaxChrom = 50; {Maximum population size is 50}
 MaxN = 200; {Maximum bit size is 200}
TYPE
 ChromType = RECORD
 Chr : ARRAY[1..MaxN] of byte;
 Fit : real;
 Val : longint;
 CFit : real;
 END;
 ChromPool = ARRAY[1..MaxChrom] of ChromType;
VAR
 I,J,N,Seed,Dis : integer;
 PopSize,NumGen,Flip,K : integer;
 Par1,Par2,MaxI,MinI : integer;
 Pm,AveFit,Pi,start,stop : real;
 MAXS : real;
 A : char;
 Best : ChromType;
 Pool : ARRAY[0..1] of ChromPool;
 OutFile : text;
 Dupl : BOOLEAN;
 h,m,s,s100,DL : word;
{* DATAIN-Reads in (from keyboard) basic data for the genetic algorithm *}
PROCEDURE DATAIN;
VAR
 I : integer;
BEGIN
 ClrScr;

 writeln;
 writeln;
 writeln(' ** Genetic Parameters **');
 writeln;
 write(' ENTER POPULATION SIZE: ');readln(PopSize);
 write(' ENTER THE NUMBER OF BITS IN A GENE: ');readln(N);
 write(' ENTER NUMBER OF GENERATIONS: ');readln(NumGen);
 write(' ENTER MUTATION PROBABILITY: ');readln(Pm);
 write(' ENTER RANDOM SEED: ');readln(Seed);
 write(' ENTER YOUR BEST GUESS FOR THE MAX: ');readln(MAXS);
 write(' ENTER THE MINIMUM INTERVAL VALUE: ');readln(MinI);
 write(' ENTER THE MAXIMUM INTERVAL VALUE: ');readln(MaxI);
 write(' ENTER THE DISPLAY UPDATE: ');readln(Dis);
 IF (PopSize Mod 2) = 1 THEN PopSize := PopSize + 1;
END;
{** POOLSORT - Sort the pool of chromosomes in order of fitness **}
PROCEDURE PoolSort(var Chrom:ChromPool);
VAR
 I,J,M1 : integer;
 Max : real;
 TmpChrom : ChromPool;
 ST : array[1..MaxChrom] of integer;
BEGIN
 For I:=1 to PopSize Do ST[I] := 0;
 For I:=1 to PopSize DO
 BEGIN
 Max := 0.0;
 For J := 1 to PopSize DO
 BEGIN
 IF (Chrom[J].Fit > Max) and (ST[J] = 0) THEN
 BEGIN
 MAX := Chrom[J].Fit;
 M1 := J;
 END;
 END;
 ST[M1] := 1;
 TmpChrom[I] := Chrom[M1];
 END;
 For I:=1 to PopSize DO
 Chrom[I] := TmpChrom[I];
END; {of POOLSORT}
{** BitsToDec - Converts a binary number to decimal **}
PROCEDURE BitsToDec(Chrom:ChromType; var Number:longint);
VAR
 i : integer;
 power : longint;
BEGIN
 Number := 0;
 power := 1;
 For i := N downto 1 do
 BEGIN
 IF Chrom.Chr[i] = 1 then Number := Number + power;
 power := power * 2;
 END;
END;
{** OPTFUNC-Evaluates function to be optimized. Insert your function here **}
FUNCTION OptFunc(Value:longint) : longint;
VAR
 Temp : longint;

BEGIN
 Temp := Value * Value * Value;
 OptFunc := -Temp + 25*Value*Value - 2*Value + 55;
END;
{** INITIALIZE - Generate initial Chromosome Pool **}
PROCEDURE INITIALIZE(var Chrom:ChromPool; IntStart:integer);
VAR
 I,J : integer;
 S,K : longint;
BEGIN
 Randomize; RandSeed := IntStart;
 FOR I:=1 to PopSize DO
 BEGIN
 REPEAT
 FOR J:=1 to N DO
 BEGIN
 S:=random(10);
 IF S > 5 THEN Chrom[I].Chr[J] := 1
 ELSE Chrom[I].Chr[J] := 0;
 END;
 BitsToDec(Chrom[I],K);
 Chrom[I].Val := OptFunc(K);
 UNTIL (K>=MinI) AND (K<=MaxI);
 END;
 FOR J:=1 to N DO Best.Chr[J] := 0;
 Best.Fit := 0.0;
END;
{** FITNESS - Determines the fitness of a chromosome pool **}
PROCEDURE Fitness(var Chrom:ChromPool);
VAR
 i : integer;
 Value : longint;
 CF,Run : real;
BEGIN
 CF := 0.0;
 For i:=1 to PopSize DO
 BEGIN
 BitsToDec(Chrom[i],Value);
 Chrom[i].Val := OptFunc(Value);
 Chrom[i].Fit := Abs(Chrom[i].Val/MAXS);
 If Chrom[i].Fit > 1.0 then Chrom[i].Fit := 0.01;
 CF := CF + Chrom[i].Fit;
 IF Chrom[i].Fit > Best.Fit THEN
 Best := Chrom[i];
 END;
 AveFit := CF/PopSize;
 {Build the cumulative normalized fitness used by PARENT's roulette selection}
 Run := 0.0;
 For i:=1 to PopSize DO
 BEGIN
 Run := Run + Chrom[i].Fit/CF;
 Chrom[i].CFit := Run;
 END;
 Chrom[PopSize].CFit := 1.0; {guard against rounding error}
END; {of FITNESS}
{** PARENT - Selects a random parent from the pool **}
FUNCTION PARENT(Chrom:ChromPool):integer;
VAR
 Rnd : real;
 I : integer;
BEGIN
 Rnd := random;
 I := 1;
 WHILE Rnd > Chrom[I].CFit DO I := I + 1;
 PARENT := I;
END;
{** MUTATE - randomly complements a bit in the chromosome **}

PROCEDURE MUTATE(var Chrom:ChromType);
VAR
 I : integer;
BEGIN
 FOR I:=1 to N DO
 IF Pm > random THEN Chrom.Chr[I] := 1 - Chrom.Chr[I];
END;
{** SWAP - Randomly swap a bit with its neighbor; half the time with its **}
{** upper neighbor, half the time with its lower neighbor. **}
PROCEDURE SWAP(var Chrom:ChromType);
VAR
 TMP,SP : integer;
BEGIN
 SP:=random(N);
 IF SP<2 THEN SP := 2;
 IF SP>(N-1) THEN SP := N-1;
 IF random > 0.5 THEN
 BEGIN
 TMP := Chrom.Chr[SP];
 Chrom.Chr[SP] := Chrom.Chr[SP-1];
 Chrom.Chr[SP-1] := TMP;
 END
 ELSE
 BEGIN
 TMP := Chrom.Chr[SP];
 Chrom.Chr[SP] := Chrom.Chr[SP+1];
 Chrom.Chr[SP+1] := TMP;
 END;
END;
{** CROSSOVER-Standard Crossover function. Selects one random point and **}
{** switches the tails of the two parents **}
PROCEDURE CROSSOVER(P1,P2:ChromType;Var C1,C2:ChromType);
VAR
 I,J :integer;
BEGIN
 I:=random(N);
 IF I=0 THEN I:=1;
 FOR J:=1 to N DO
 IF J < I THEN
 BEGIN
 C1.Chr[J] := P1.Chr[J];
 C2.Chr[J] := P2.Chr[J];
 END
 ELSE
 BEGIN
 C1.Chr[J] := P2.Chr[J];
 C2.Chr[J] := P1.Chr[J];
 END;
 {Apply mutation to both offspring, as described in the article}
 MUTATE(C1);
 MUTATE(C2);
END;
{** MERGE - Combine two pools to create a single pool with the best of both **}
PROCEDURE Merge(old:ChromPool;var new:chromPool);
VAR
 I,J,K : integer;
 Tmp1,Tmp2 : ChromType;
BEGIN
 K:=1;
 J:=1;
 WHILE J < PopSize+1 DO
 BEGIN

 IF old[K].Fit > new[J].Fit THEN
 BEGIN
 Tmp1 := new[J];
 new[J] := old[K];
 K:=K+1;
 IF K > 5 THEN J := PopSize+1;
 FOR I := J+1 to PopSize DO
 BEGIN
 Tmp2:=new[I];
 new[I] := Tmp1;
 Tmp1:=Tmp2;
 END;
 END
 ELSE
 J:=J+1;
 END;
 END;
{** DUPLICATE DETECTION/REPLACEMENT-Finds duplicate elements in pool and **}
{** replaces them with new random elements **}
PROCEDURE DupReplace(var Chrom:ChromPool);
VAR
 I,J,K,S : integer;
 DP : Boolean;
BEGIN
 I := 1;
 K := 1;
 DP := false;
 WHILE K <= PopSize DO
 BEGIN
 IF DP THEN DP := false ELSE K := I+1;
 DP := false;
 J := 1;
 WHILE (J <= N) and (Chrom[I].Chr[J] = Chrom[K].Chr[J]) DO J := J+1;
 IF J > N THEN DP := true;
 IF DP THEN
 BEGIN
 Dupl := true;
 FOR J := 1 to N DO
 BEGIN
 S := random(10);
 IF S > 5 THEN Chrom[K].Chr[J] := 1
 ELSE Chrom[K].Chr[J] := 0;
 END;
 K:=K+1;
 END
 ELSE I := K;
 END;
END;
{** FILE OUTPUT - outputs each generation to a file **}
PROCEDURE FileOut(Chrom:ChromPool; I:integer);
VAR
 J,K : integer;
BEGIN
 writeln(OutFile,'***************** GENERATION ',I:3,'*******************');
 writeln(OutFile);
 write(OutFile,'BEST FIT: ');
 FOR J:=1 to N DO write(OutFile,Best.Chr[J]);
 writeln(OutFile,' Fitness: ',Best.Fit);
 writeln(OutFile,' Average Fit for this generation: ',AveFit);

 writeln(OutFile);
 writeln(OutFile,' Elapsed Time: ',stop);
 FOR J:=1 to PopSize DO
 BEGIN
 FOR K:=1 to N DO
 write(OutFile,Chrom[J].Chr[K]);
 writeln(OutFile,' ',Chrom[J].Fit:4:3);
 END;
 writeln(OutFile);
 writeln(OutFile,'********************************************************');
 writeln(OutFile);
END; {of FILE OUTPUT}
{** SCREEN_WIN - Screen_Win will accept two colors and a row **}
{** location to set up a window for input/output **}
procedure Screen_Win(x1,x2,y1,y2,fg,bg:integer);
var i,j :byte;
begin
 TextColor(fg);
 TextBackground(bg);
 for i:=x1 to x2 do
 begin
 GotoXY(i,y1);
 write(#205);
 GotoXY(i,y2);
 write(#205)
 end;
 for i:=(y1+1) to (y2-1) do
 begin
 GotoXY(x1,i);
 write(#186);
 GotoXY(x2,i);
 write(#186)
 end;
 GotoXY(x1,y1);
 write(#201);
 GotoXY(x1,y2);
 write(#200);
 GotoXY(x2,y1);
 write(#187);
 GotoXY(x2,y2);
 write(#188);
 for i:=y1+1 to y2-1 do
 for j:=x1+1 to x2-1 do
 begin
 GotoXY(j,i);
 Write(' ')
 end;
end;
{** Start the timer **}
procedure clockon(var startclock : real;var DL:word);
var
 Y,M,DW : word;
Begin
 gettime(h,m,s,s100);
 startclock := (h*3600) + (m*60) + s + (s100/100);
 getdate(Y,M,DL,DW);
end;
{** Stop the timer **}
procedure clockoff(var stopclock:real; startclock : real; DL:word);

var
 i:integer;
 Y1,M,D,DW:word;
BEGIN
 gettime(h,m,s,s100);
 stopclock:=(h*3600) + (m*60) + s + (s100/100);
 getdate(Y1,M,D,DW);
 if D = DL then
 stopclock:=stopclock - startclock
 else
 stopclock:=(D - DL - 1)*86400 + 86400 - startclock + stopclock;
END;
{** GENERATION OUTPUT-Prints out the current generation and summary data **}
PROCEDURE GENOUT(Chrom : ChromPool;I:integer);
VAR
 J,K,S : integer;
BEGIN
 clockoff(stop,start,DL);
 GoToXY(6,5); write(stop:5:3);
 GoToXY(19,5); write(I:4);
 GoToXY(36,5); write(Best.Fit:5:3);
 GoToXY(50,5); write(AveFit:5:3);
 GoToXY(62,5); write(MAXS:7);
 S := N;
 IF N > 64 THEN S:=64;
 GoToXY(6,9); FOR J:=1 to S DO write(Best.chr[J]);
 IF N > 64 THEN
 BEGIN
 S:=N;
 IF N > 128 THEN S := 128;
 GoToXY(6,10);
 FOR J:=65 to S DO write(Best.chr[J]);
 IF N > 128 THEN
 BEGIN
 S:=N;
 IF N > 192 THEN S:=192;
 GoToXY(6,11);
 For J:=129 to S DO write(Best.chr[J]);
 END;
 END;
 GoToXY(72,9);
 write(Best.Val:6);
 GoToXY(9,16);
 write('Chrom Value FIT Chrom Value FIT');
 FOR J:=1 to 6 DO
 BEGIN
 GoToXY(9,16+J);
 write(J:2);
 GoToXY(17,16+J);
 write(Chrom[J].val:6);
 GoToXY(27,16+J);
 write(Chrom[J].Fit:4:3);
 GoToXY(37,16+J);
 write((J+6):2);
 GoToXY(43,16+J);
 write(Chrom[J+6].val:6);
 GoToXY(53,16+J);
 write(Chrom[J+6].Fit:4:3);
 END;

 IF (I mod Dis) = 0 THEN
 BEGIN
 GoToXY(56,2); write('PAUSE');
 GoToXY(56,3); write('HIT RETURN . . .'); readln;
 GoToXY(56,2); write(' ');
 GoToXY(56,3); write(' ');
 END;
END;
{** DISPLAY - Sets up the standard screen display **}
Procedure SetDisplay;
 BEGIN
 Textbackground(blue);
 Textcolor(white);
 clrscr;
 writeln;
 writeln(' GENETIC SEARCH');
 writeln;
 Screen_Win(4,14,4,6,yellow,blue);
 GotoXY(6,4);write('TIME');
 Screen_Win(18,31,4,6,yellow,blue);
 GotoXY(20,4); write('Generation');
 Screen_Win(34,46,4,6,yellow,blue);
 GotoXY(36,4); write('Best Fit');
 Screen_Win(48,59,4,6,yellow,blue);
 GotoXY(50,4); write('Ave Fit');
 Screen_Win(61,75,4,6,yellow,blue);
 GoToXY(63,4);write('Target');
 Screen_Win(4,78,8,13,yellow,blue);
 GotoXY(6,8);write('Current Best Point');
 Screen_Win(6,76,15,25,yellow,blue);
 GotoXY(8,15);write('Population Sample');
 END;
{** M A I N P R O G R A M **}
BEGIN
 DATAIN;
 assign(OutFile,'results.txt'); rewrite(OutFile);
 writeln(OutFile,' GENETIC RUN');
 writeln(OutFile,' ********** PARAMETERS FOR THIS RUN ************');
 writeln(OutFile,' Number of Generations: ',NumGen:3);
 writeln(OutFile,' Size of Chromosome Pool: ',PopSize:3);
 writeln(OutFile,' Mutation Probability: ',Pm:4:3);
 writeln(OutFile,' Random Start Number: ',Seed:5);
 writeln(OutFile,' ****************************************************');
 writeln(OutFile);
 clockon(start,DL);
 INITIALIZE(Pool[0],Seed);
 FITNESS(Pool[0]);
 PoolSort(Pool[0]);
 SetDisplay;
 GENOUT(Pool[0],0);
 I:=0;
 WHILE (I < NumGen) DO
 BEGIN
 FLIP := I Mod 2;
 K:=1;
 FOR J:=1 to (PopSize div 2) DO
 BEGIN
 Par1 := Parent(Pool[Flip]);
 Par2 := Parent(Pool[Flip]);

 CROSSOVER(Pool[Flip,Par1],Pool[Flip,Par2],
 Pool[1-Flip,K],Pool[1-Flip,K+1]);
 K:=K+2;
 END;
 FITNESS(Pool[1-Flip]);
 PoolSort(Pool[1-Flip]);
 Merge(Pool[Flip],Pool[1-Flip]);
 Dupl := false;
 DupReplace(Pool[1-Flip]);
 IF Dupl THEN
 BEGIN
 FITNESS(Pool[1-Flip]);
 PoolSort(Pool[1-Flip]);
 END;
 FileOut(Pool[1-Flip],I+1);
 GENOUT(Pool[1-Flip],I+1);
 I := I+1;
 END;
 GoToXY(56,2); write('DONE');
 GoToXY(56,3); write('HIT RETURN . . .'); readln;
 close(OutFile);
END.






[LISTING TWO]

{** INVERSION - Change the order of bits between two random **}
{** points - 100(10110)0110 becomes 100(01101)0110 **}
PROCEDURE INVERSION(var Chrom:ChromType);
VAR
 I,K,S1,S2,Tmp : integer;
BEGIN
 IF random < Pi THEN
 BEGIN
 S1 := random(N-1);
 IF S1=0 THEN S1:=1;
 S2 := random(N);
 WHILE S2 = S1 DO
 S2 := random(N);
 IF S2=0 THEN S2:=1;
 IF S1 > S2 THEN
 BEGIN
 Tmp := S2;
 S2 := S1;
 S1 := Tmp;
 END;
 K := (S2 - S1 + 1) DIV 2;
 FOR I:=1 TO K DO
 BEGIN
 Tmp := Chrom.Chr[I+S1-1];
 Chrom.Chr[I+S1-1] := Chrom.Chr[S2-I+1];
 Chrom.Chr[S2+1-I] := Tmp;
 END;
 END;
END;































































February, 1993
CELLULAR AUTOMATA FOR SOLVING MAZES


Finding your way from problems to solutions




Basem A. Nayfeh


Basem is a graduate student in Electrical Engineering at Stanford University
and can be contacted via e-mail at bnayfeh@leland.stanford.edu.


Many of us were introduced to cellular automata (CA for short) through
Conway's game of Life, in which "creatures" living in a grid of cells are
born, live, and die depending on their current state and the states of
creatures in neighboring cells. Each creature is subject to a rule that
defines its existence in terms of that of its neighbors, and the "community"
at large is defined by the collective states of the individual creatures. The
implication is that to model the "community" or a collective system, it may be
sufficient to regard it as a collection of similar simple systems interacting
only locally.
This concept is powerful when applied to many physical systems. For example,
in studying particle simulations, we can discretize the system into grid cells
that may or may not contain an individual particle. In grid cells containing a
particle, the particle's momentum (or in a simpler case, direction) is
adjusted due to collisions with any other particles in immediate neighboring
cells. Thus, only local interactions are taken into account and, by looking at
the momentum of the individual particles, the properties of the system as a
whole can be determined. This provides an attractive alternative to solving
sets of complex differential equations. In fact, CA simulations scale very
well on parallel systems.
In addition to a wide variety of applications in fluid dynamics, astrophysics,
neural networks, and other fields, a potentially interesting application for
CAs is in the solution of mazes. Maze-solving algorithms have long been of
interest to computer scientists and engineers because of their use in routing
problems. This article presents a CA-based algorithm for solving mazes that
requires no more memory than that needed to store the original maze, provides
all possible solutions at the same time, and can determine whether a solution
exists.


Conventional Wisdom


Conventional maze-solving algorithms have revolved around the "rat in the
maze" approach in which a "rat" effectively runs through the maze according to
set rules, backtracking only when reaching a dead end. The implication is that
the rat has only a local knowledge of the maze relative to its position since
it cannot "see" around corners. These algorithms are inherently recursive in
nature since they require that, upon encountering an intersection of paths in
the maze, one path is selected at a time while "remembering" the intersection.
If the selected path leads to a dead end, the rat returns to the last
intersection and follows another path until it reaches either a dead end or
the end goal. This is essentially a depth-first search on a tree with maximal
degree N, where N is the number of adjacent locations the rat can move to next
from any given location. For example, in a two-dimensional maze where only
north, east, west, and south (NEWS) moves are valid, N = 4. If diagonal moves
are also allowed, N = 8. These can easily be extended to higher dimensions
where, for a three-dimensional maze, N = 6 for NEWS and up and down moves, and
N = 26 if all diagonal moves are allowed as well. For a large maze, the memory
requirements to store such a potentially large tree and/or a stack used in
traversing it in a depth-first manner can be substantial.
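For comparison, the rat-in-the-maze depth-first search might be sketched in C as follows (a minimal version with a fixed 5x5 grid and NEWS moves; the names and grid size are my own):

```c
#include <assert.h>

#define XS 5
#define YS 5

int maze[YS][XS];      /* 1 = wall, 0 = free */
int visited[YS][XS];   /* cells already tried */

/* Depth-first search from (x,y) to the goal (gx,gy), backtracking
   out of dead ends.  Returns 1 if a path exists. */
int solve(int x, int y, int gx, int gy)
{
    if (x < 0 || x >= XS || y < 0 || y >= YS) return 0;
    if (maze[y][x] == 1 || visited[y][x])     return 0;
    visited[y][x] = 1;
    if (x == gx && y == gy) return 1;
    return solve(x, y - 1, gx, gy)    /* north */
        || solve(x + 1, y, gx, gy)    /* east  */
        || solve(x - 1, y, gx, gy)    /* west  */
        || solve(x, y + 1, gx, gy);   /* south */
}
```

Each recursive call is one step of the rat; the visited array is the memory of where backtracking has already failed, and the call stack itself holds the intersections still to be revisited.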


The CA Approach


It may seem natural at this point to abandon the concept of a single "rat"
with only a local knowledge of the maze and its corresponding depth-first
search, and consider instead the implications of a global knowledge of the
maze due to simultaneous local interactions throughout the maze--being able to
"see" the whole maze at one time.
Let's assume that we want to solve a two-dimensional maze with only NEWS moves
allowed, although the algorithm can be extended to higher-dimensioned mazes
with other allowable moves in a straightforward manner. The maze is
represented by a two-dimensional array or grid Maze[Xsize][Ysize], where
Maze[x][y] == 1 denotes a wall at location (x,y) and Maze[x][y] == 0 indicates
that location (x,y) is free (0 <= x < Xsize, 0 <= y < Ysize). A location that is "free" at
a particular time during the algorithm's execution means that it can
potentially be part of the solution path or paths, with all free paths being
one unit wide. Likewise, a "wall" location cannot be part of the solution
path. In addition, the maze is bounded by walls except for the start and end
positions, which lie on the boundary and are considered to be permanent free
locations; thus, only locations (x,y) with 0<x<Xsize - 1 and 0<y<Ysize - 1
need to be evaluated.
The heart of the algorithm is really quite simple. First, a two-dimensional
array of cells is defined with each cell representing a location in the maze.
A cell can change its "state" during each iteration according to a state
transition rule which depends upon its current state and the current state of
its NEWS adjacent cells. This is, in effect, the definition of a CA.
Next, two different states are defined for each cell in the CA. A cell in the
FREE state, or a "free cell," corresponds to a free location, as described
previously. Likewise, a cell in the WALL state (or "wall cell") corresponds to
a wall location.
In determining the state transition rule, the following observation can be
made: A dead-end condition may be characterized by a free cell surrounded by
at least N - 1 wall cells, where again N denotes the number of allowable
adjacent moves. In our case, N = 4, so a dead-end condition occurs if a free
cell is surrounded by either three or four wall cells in the NEWS directions.
The case where a cell is surrounded by N wall cells is trivial since it
implies that because the cell is completely surrounded by walls, it could
never be accessed. The important point to note is that a free cell in a
dead-end condition can never be part of the solution path since it has one
entry point and no exit points. Thus, the free cell can be changed to a wall
cell. Changing the free cell to a wall cell will in turn cause free cells that
are not on a solution path and that immediately precede the former free cell to
exhibit the dead-end condition themselves. These cells are subsequently changed
to wall cells, and so on down the line. As the process
continues iteratively, dead-end paths become blocked off, and a steady-state
condition is reached when no free cells change into wall cells during an
iteration. At this point, the free cells will denote the solution path or paths
through the maze. In the case where all the internal cells have become wall
cells, then no path exists from the start to end of the maze.
The rule applied to each cell in the CA during each iteration can then be
expressed as follows:
A free cell surrounded by three or four NEWS wall cells becomes a wall cell.
A wall cell always remains a wall cell.
A free cell surrounded by fewer than three NEWS wall cells remains a free cell
during the iteration.
The fact that FREE-to-WALL state transitions are allowed and WALL-to-FREE
state transitions are not shows that the algorithm can never diverge, since
oscillations cannot take place, and that the CA converges toward the solution
or solutions of the maze with every iteration. In other words, stopping the
algorithm at a certain point results in a partially solved maze whose solution
space is always decreased in the next iteration unless a steady-state
condition is reached. At that point, the solution space is the solution or
solutions of the maze.
A cell needs to store only its current state. Thus, no additional memory
beyond the original cell (maze) array is required if the original maze can be
over-written. This implies that the algorithm may be stopped and restarted
after any iteration using only the current states in the cell array. This is
unlike other recursive algorithms, which would require saving the entire
search stack in order to restart.
The C code in Example 1 illustrates how this might be implemented. Although
the code demonstrates how the algorithm may be implemented on a serial
machine, CAs are easily adaptable to parallel programming on distributed and
massively parallel architectures. This results from the single-rule and
nearest-neighbor data-dependency characteristics of CAs, which enable them to
map easily onto SIMD (single-instruction multiple-data) architectures.
Example 1: Implementing the cellular-automata algorithm.

 #define FALSE 0
 #define TRUE 1
 #define FREE 0
 #define WALL 1
 . . .
 do {
 steadystate = TRUE;
 /* scan the entire CA */
 for (x=1;x<Xsize-1;x++) {
 for (y=1;y<Ysize-1;y++) {
 if (cell[x][y] == FREE) {
 /* addition can be used here to determine if */
 /* a cell is surrounded by 3 or more walls */
 if ((cell[x+1][y] + cell[x-1][y]
 + cell[x][y+1] + cell[x][y-1]) >= 3) {
 cell[x][y] = WALL;
 steadystate = FALSE;
 }
 }
 }
 }
 /* keep scanning the CA until */
 /* a steady state condition is reached */
 } while(!steadystate);

 /* the cell array now contains the correct solution(s) */
 /* denoted by the remaining free cells */

 /* no solution if all cells are now wall cells */
 . . .
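As a sketch of how the nearest-neighbor rule lends itself to data-parallel execution, the serial loop above can be recast with a second buffer (a Jacobi-style update): every cell reads only the previous generation, so all cells could in principle be updated simultaneously. The grid size, function name, and double-buffering scheme here are illustrative, not from the article.

```c
#include <string.h>

#define FREE  0
#define WALL  1
#define XSIZE 5
#define YSIZE 5

/* One double-buffered iteration of the maze rule. Because each cell
   reads only the previous generation (cur) and writes only the next
   one, the cell updates are independent of one another -- the
   property that lets the rule map onto SIMD hardware.
   Returns 1 if any cell changed (steady state not yet reached). */
int ca_step(int cur[XSIZE][YSIZE], int next[XSIZE][YSIZE])
{
    int changed = 0;
    memcpy(next, cur, sizeof(int) * XSIZE * YSIZE);
    for (int x = 1; x < XSIZE - 1; x++)
        for (int y = 1; y < YSIZE - 1; y++)
            if (cur[x][y] == FREE &&
                cur[x+1][y] + cur[x-1][y] +
                cur[x][y+1] + cur[x][y-1] >= 3) {
                next[x][y] = WALL;
                changed = 1;
            }
    return changed;
}
```

The in-place version in Example 1 may converge in fewer passes, since later cells in a sweep see earlier updates; because walls are only ever added, both versions reach the same steady state.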



Conclusion


CAs have come a long way from the game of Life and are considered valuable
tools for simulating various complex systems. Parallel systems provide an
almost ideal platform for CA simulations. In turn, CAs may prove to be the
best method of simulation on massively parallel systems. As research in
parallel computing continues and as new algorithms for CAs are introduced,
cellular automata will play an important role in computing in years to come.


_CELLULAR AUTOMATA FOR SOLVING MAZES_
by Basem A. Nayfeh

[EXAMPLE ONE]

#define FALSE 0
#define TRUE 1
#define FREE 0
#define WALL 1

 do {
 steadystate = TRUE;
 /* scan the entire CA */
 for (x=1;x<Xsize-1;x++) {
 for (y=1;y<Ysize-1;y++) {
 if (cell[x][y] == FREE) {
 /* addition can be used here to determine if */
 /* a cell is surrounded by 3 or more walls */
 if ((cell[x+1][y] + cell[x-1][y]
 + cell[x][y+1] + cell[x][y-1]) >= 3) {
 cell[x][y] = WALL;
 steadystate = FALSE;
 }
 }
 }
 }
 /* keep scanning the CA until */
 /* a steady state condition is reached */
 } while(!steadystate);

 /* the cell array now contains the correct solution(s) */
 /* denoted by the remaining free cells */


 /* no solution if all cells are now wall cells */



February, 1993
FUZZY LOGIC IN C


Creating a fuzzy-based inference engine




Greg Viot


Greg is a member of the Motorola technical ladder and is currently merging
fuzzy logic with microcontrollers. He has an MSEE from National Technological
University and a BSEE from the University of Texas at Austin. Greg can be
contacted at Motorola Advanced Microcontroller Division, 6501 William Cannon
Drive West, Austin, Texas 78735-8598.


Fuzzy logic is a powerful, yet straightforward, problem-solving technique with
widespread applicability, especially in the areas of control and decision
making. In general, it is most useful in handling problems not easily
definable by practical mathematical models. For instance, fuzzy logic has been
employed in such tasks as managing stock-market portfolios and controlling
subway systems.
Fuzzy derives much of its power from its ability to draw conclusions and
generate responses based on vague, ambiguous, qualitative, incomplete, or
imprecise information. In this respect, fuzzy-based systems have a reasoning
ability similar to that of humans. In fact, the behavior of a fuzzy system is
represented in a very simple and natural way. This allows quick construction
of understandable, maintainable, and robust systems. In addition, a fuzzy
approach generally requires much less memory and computing power than
conventional methods, thereby permitting smaller and less expensive systems.
Lotfi Zadeh, a professor at the University of California at Berkeley, is the
person most widely associated with fuzzy logic. In 1965 he presented the
original paper formally defining fuzzy-set theory, from which fuzzy logic
emerged. Zadeh extended traditional theory to resolve the paradoxes sometimes
generated from the "nothing-or-all" classifications of Aristotelian logic.
Traditionally, a logic premise has two extremes: either completely true or
completely false. However, in the fuzzy world, a premise ranges in degree of
truth from 0 to 100 percent, which allows it to be partially true and
partially false.
By incorporating this "degree of truth" concept, fuzzy logic extends
traditional logic in two ways. First, sets are labeled qualitatively (using
linguistic terms such as "tall," "warm," "active," "nearby," and so on), and
the elements of these sets are assigned varying degrees of membership. For
instance, a 5'11" man and a 6'4" man may both be members of a set of "tall"
men, although the 6'4" man has a higher degree of membership. Secondly, any
action or output resulting from a premise being true executes to a strength
reflecting the degree to which that premise is true.
As an example, imagine a fan motor, the speed of which is a function of
temperature, as shown in Table 1. The current supplied to the fan motor is
regulated by sets of temperature: cold, cool, warm, and hot. In this system,
as the temperature gradually moves from warm to cool, the current gradually
moves from 50 to 15. Because the system continuously tracks its inputs,
outputs avoid abrupt changes even as inputs cross set boundaries: fuzzy-based
systems are constructed so that generated outputs change in a smooth and
continuous manner.
Table 1: Fan-speed control.

 Temperature Fan Speed Relative Motor Current
 --------------------------------------------

 Cold Off 0
 Cool Slow 15
 Warm Medium 50
 Hot Fast 100



Organization of a Fuzzy System


Figure 1 illustrates the flow of data through a fuzzy system. System inputs
undergo three transformations to become system outputs. First, a fuzzification
process that uses predefined membership functions maps each system input into
one or more degrees of membership. Then, the rules in the rule base (also
predefined) are evaluated by combining degrees of membership to form output
strengths. And lastly, the defuzzification process computes system outputs
based on strengths and membership functions.
Fuzzification of Inputs. Fuzzification is the process of assigning or
calculating a value to represent an input's degree of membership in one or
more qualitative groupings, called "fuzzy sets." Figure 2 shows a system
input, temperature, with fuzzy sets cold, cool, warm, and hot. Each
temperature value has a degree of membership in each of these sets. The degree
of membership is determined by a membership function, which is defined based
on experience or intuition. Figure 9 illustrates the degree of membership
calculation for a trapezoidal membership function. In practice, expect the
membership functions to change several times as the system is tuned to achieve
the desired responses to given inputs.
Generally, once the system is in operation, the membership functions do not
change. Simple shapes such as trapezoids and triangles are often used to
define membership in fuzzy sets, but any suitable function can be used. In
addition, you must decide upon the number of fuzzy sets per system input.
In Figure 2, a fuzzy set labeled comfortable could be inserted between cool
and warm. The number of fuzzy-set membership functions and the shapes you
choose depend on such things as required accuracy, responsiveness and
stability of the system, ease of implementation, manipulation, and
maintenance, and so on. The trapezoidal and triangular membership functions
are most common and have proven to be good compromises between effectiveness
and efficiency. The fuzzy sets must span the X-axis covering the entire range,
or universe of discourse, for a system input. Mapping to the Y-axis ranges
from 0 to 1 and represents the degree to which an input value is a member of
that particular fuzzy set. Overlapping between set boundaries is desirable and
key to the smooth operation of the system. It permits membership in
multiple--even seemingly contradictory--sets. In Figure 2, 63 degrees can be
both cool and warm, but it is cool to a greater degree. An overlap of 25
percent between adjacent fuzzy sets is a general rule of thumb.
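To make the trapezoid arithmetic concrete, here is a minimal floating-point sketch of the membership calculation just described (the article's integer version appears in Listing Three). The set boundaries and slopes used in the example below are hypothetical, not taken from Figure 2.

```c
/* Degree of membership (0.0..1.0) of input x in a trapezoidal set.
   left and right are the feet of the trapezoid; slope_l and slope_r
   are the rising and falling edge slopes. */
double trapezoid_membership(double x, double left, double right,
                            double slope_l, double slope_r)
{
    double d1 = x - left;        /* distance in from the left foot  */
    double d2 = right - x;       /* distance in from the right foot */
    if (d1 <= 0.0 || d2 <= 0.0)  /* outside the set entirely        */
        return 0.0;
    double m  = slope_l * d1;    /* height along the rising edge    */
    double m2 = slope_r * d2;    /* height along the falling edge   */
    if (m2 < m)
        m = m2;
    return (m > 1.0) ? 1.0 : m;  /* on the plateau: full membership */
}
```

For a hypothetical "warm" set with feet at 55 and 90 degrees and slopes of 0.1 on both edges, an input of 63 degrees yields a membership of 0.8.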
The fuzzification process permits a binding to take place between linguistic
terms (cold, nearby, active, large, and so on) and membership functions,
making the terms meaningful to a computer. As a result, a designer can express
or modify the behavior of a system using such natural language, thus enhancing
the possibility of clear and concise descriptions of complex tasks.
Evaluation of Rules. To govern the system's behavior, the designer develops a
set of rules that have the form of If-Then statements. The If side of a rule
contains one or more conditions, called "antecedents"; the Then side contains
one or more actions, called "consequences." The antecedents of rules
correspond directly to degrees of membership calculated during the
fuzzification process.
For example, consider a potential rule from the stock-market system shown in
Figure 3: If share price is decreasing And trading volume is heavy, Then order
is sell. The two conditions "share price is decreasing" and "trading volume is
heavy" are the rule's antecedents. Each antecedent has a degree-of-truth
(membership) value assigned to it as a result of fuzzification. The action of
the rule (or "fuzzy output") is to sell shares. During rule evaluation,
strengths are computed based on antecedent values and then assigned to the
rules' fuzzy outputs. Generally, a minimum function is used so that the
strength of a rule is assigned the value of its weakest or least true
antecedent. Other methods to compute rule strength can be used, such as
multiplying antecedent values together. The action of selling shares is
carried out to a degree that reflects the rule's strength. In other words, the
amount of shares sold is based on the degree to which share price is
decreasing and trading volume is heavy. Often, more than one rule applies to
the same specific action, in which case the common practice is to use the
strongest or most true rule; see Figure 4.
Figure 4: Rule-evaluation computation.

 Rule 1: if A & B then Z & X
 Rule 2: if C & D then Z & Y

 Strength of Rule 1 = min (A,B)
 Strength of Rule 2 = min (C,D)

 X = Strength of Rule 1
 Y = Strength of Rule 2

 Z = max (Strength of Rule 1
 Strength of Rule 2)
 = max (min(A,B), min(C,D))
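The computation in Figure 4 is small enough to express directly in C; the following sketch uses the 0-to-255 degree scale from the listings (the function names are illustrative):

```c
/* AND antecedents together with min(); resolve competing
   assignments to the same fuzzy output with max(), as in Figure 4. */
int rule_and(int a, int b)         { return (a < b) ? a : b; }
int resolve_output(int s1, int s2) { return (s1 > s2) ? s1 : s2; }

/* Z = max(min(A,B), min(C,D)) */
int evaluate_z(int A, int B, int C, int D)
{
    return resolve_output(rule_and(A, B), rule_and(C, D));
}
```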


Defuzzification of Outputs. Even though the rule-evaluation process assigns
strengths to each specific action, further processing, or "defuzzification,"
is required for two reasons. The first is to decipher the meaning of vague
(fuzzy) actions, such as "order is sell," using membership functions. The
second is to resolve conflicts between competing actions such as "order is
sell" and "order is hold," which may have been triggered by certain conditions
during rule evaluation. Defuzzification employs compromising techniques to
resolve both the vagueness and conflict issues.
One common defuzzification technique, the "center-of-gravity method," consists
of several steps. Initially, a centroid point on the X-axis is determined for
each output membership function. Then, the membership functions are limited in
height by the applied rule strength, and the areas of the membership functions
are computed. Finally, the defuzzified output is derived by a weighted average
of the X-axis centroid points and the computed areas, with the areas serving
as the weights. The center-of-gravity method is illustrated in Figure 5.
Sometimes, "singletons" are used to simplify the defuzzification process; see
Figure 6. A singleton is an output membership function represented by a single
vertical line. Since a singleton intersects the X-axis at only one point, the
center-of-gravity calculation reduces to just a weighted average calculation
of X-axis points and rule strengths, with the rule strengths used as weights.
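The singleton shortcut can be sketched as follows; the positions and strengths are illustrative values on the 0-to-255 scale, and the function name is hypothetical:

```c
/* Defuzzification with singleton output membership functions: the
   center-of-gravity computation collapses to a weighted average of
   the singletons' x-axis positions, weighted by rule strength. */
int defuzzify_singletons(const int pos[], const int strength[], int n)
{
    long num = 0, den = 0;
    for (int i = 0; i < n; i++) {
        num += (long)strength[i] * pos[i];
        den += strength[i];
    }
    return den ? (int)(num / den) : 0;  /* no rule fired: report 0 */
}
```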


Fuzzy Data Structures


To implement a fuzzy system in C, the following types of data must be
accommodated:
System inputs.
Input membership functions.
Antecedent values.
Rules.
Rule-output strengths.
Output membership functions.
System outputs.
Figure 7 illustrates an overall linked-list arrangement of system-input and
membership-function nodes. The details of these structures are shown in Figure
8. The system-input node is straightforward and contains an input name, a
membership-function pointer, and a next-input pointer. More interesting is the
membership-function structure, which contains two X-axis points and two slope
values that describe a trapezoidal membership function. This information is
used to calculate antecedent values (degrees of membership), as shown in
Figure 9 and Listing Three (page 94). The resulting antecedent value is stored
in the "value" field of the membership-function structure. Rules can be
represented by two sets of pointers; see Figure 10. The first set indicates
which antecedent values are used to determine the rule's strength, and the
second set points to output locations where the strength is to be applied.
Finally, a data arrangement similar to the input-data structure handles
outputs and output membership functions; see Figure 11. Listing One (page 94)
includes the C-code definition of these data structures. Articles by James M.
Sibigtroth (see "References") explain the implementation of fuzzy systems at
the assembly language level.


Inverted-pendulum Example


Figure 12 shows a classic two-dimensional control problem known as the
"inverted pendulum." The idea is to keep a pole vertically balanced. The pole
is weighted at the top and attached at the bottom by a movable base. If the
pole falls to the right or left, the base moves in the same direction to
compensate. By monitoring the angle and angular velocity of the pendulum, a
fuzzy system can determine the proper force to apply at the base to keep it
balanced. Figure 13 shows the fuzzy sets associated with the system inputs and
output. The exact set of rules depends on the dynamics of the physical
components, required robustness, and range of operating conditions.
Theoretically, the rule base in Figure 14 is sufficient to balance the
pendulum, but other solutions exist. A general-purpose fuzzy inference engine
like that in Listings One through Four can be applied to many applications.
Listing One provides the header and data structures, Listings Two and Three
(page 94) present the major fuzzy processes, and Listing Four (page 94) lists
the math-support functions. The input-configuration files describing system
input/output, the membership functions, and the rule base differ from
application to application. Figures 14 and 15 contain the necessary
information to implement the inverted-pendulum problem.
Figure 14: Inverted pendulum rule base.

 Rule 1: IF (angle is NL) AND (velocity is ZE) THEN (force is PL)
 Rule 2: IF (angle is ZE) AND (velocity is NL) THEN (force is PL)
 Rule 3: IF (angle is NM) AND (velocity is ZE) THEN (force is PM)
 Rule 4: IF (angle is ZE) AND (velocity is NM) THEN (force is PM)
 Rule 5: IF (angle is NS) AND (velocity is ZE) THEN (force is PS)
 Rule 6: IF (angle is ZE) AND (velocity is NS) THEN (force is PS)
 Rule 7: IF (angle is NS) AND (velocity is PS) THEN (force is PS)
 Rule 8: IF (angle is ZE) AND (velocity is ZE) THEN (force is ZE)
 Rule 9: IF (angle is ZE) AND (velocity is PS) THEN (force is NS)
 Rule 10: IF (angle is PS) AND (velocity is ZE) THEN (force is NS)
 Rule 11: IF (angle is PS) AND (velocity is NS) THEN (force is NS)
 Rule 12: IF (angle is ZE) AND (velocity is PM) THEN (force is NM)
 Rule 13: IF (angle is PM) AND (velocity is ZE) THEN (force is NM)
 Rule 14: IF (angle is ZE) AND (velocity is PL) THEN (force is NL)
 Rule 15: IF (angle is PL) AND (velocity is ZE) THEN (force is NL)

Figure 15: Sample membership function input file.

 input: angle input: velocity output: force
 NL: 0 31 31 63 NL: 0 31 31 63 NL: 0 31 31 63
 NM: 31 63 63 95 NM: 31 63 63 95 NM: 31 63 63 95
 NS: 63 95 95 127 NS: 63 95 95 127 NS: 63 95 95 127
 ZE: 95 127 127 159 ZE: 95 127 127 159 ZE: 95 127 127 159
 PS: 127 159 159 191 PS: 127 159 159 191 PS: 127 159 159 191
 PM: 159 191 191 223 PM: 159 191 191 223 PM: 159 191 191 223
 PL: 191 223 223 255 PL: 191 223 223 255 PL: 191 223 223 255

Figure 15 repeats the input/output and membership information shown in Figure 13
in a format that can be easily parsed by an initialization routine. Such an
initialization routine (not shown in the listings) sets up the required data
structures, converting the four points describing a membership function into
two points and two slopes; see Figure 16.
Generally, four points describe a trapezoid, but a triangle can be formed by
making the two midpoints identical, as in Figure 16.
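The conversion itself is simple arithmetic. The following sketch shows one plausible way the initialization routine might derive the two slopes; since the routine is not shown in the listings, details such as the treatment of vertical edges are assumptions.

```c
#define UPPER_LIMIT 255  /* full degree of membership */

/* Convert the four x-axis points of a membership function (left
   foot, left shoulder, right shoulder, right foot) into the
   two-point/two-slope form held in mf_type. A triangle is the case
   where the two shoulders coincide; a vertical edge (foot equal to
   shoulder) is approximated with the steepest integer slope. */
void points_to_slopes(const int p[4], int *point1, int *point2,
                      int *slope1, int *slope2)
{
    *point1 = p[0];
    *point2 = p[3];
    *slope1 = (p[1] > p[0]) ? UPPER_LIMIT / (p[1] - p[0]) : UPPER_LIMIT;
    *slope2 = (p[3] > p[2]) ? UPPER_LIMIT / (p[3] - p[2]) : UPPER_LIMIT;
}
```

For the ZE set of Figure 15 (95 127 127 159), this gives point1 = 95, point2 = 159, and integer slopes of 7 on both sides.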


Closing Remarks



The emergence of fuzzy logic is exciting because it is readily applicable to
many problems too awkward to solve with conventional techniques. Any
programmer can easily write code to implement a fuzzy inference engine like
the one presented here. However, excellent fuzzy development tools exist which
allow the designer to focus more on the application and behavior of the system
and less on the implementation. These tools provide user-friendly, graphical
interfaces with a rich set of support functions for analyzing, debugging, and
simulating the system. Examples of such tools are: FIDE from Aptronix (San
Jose, CA), CubiCalc from Hyperlogic (Escondido, CA), and TILShell from Togai
InfraLogic (Irvine, CA). In addition, Motorola distributes free fuzzy
development tools through its electronic BBS, "Freeware Data Services," at
512-891-3733 (in the subdirectory amcu/amcull).
Implementing the fuzzy engine manually, however, affords the ability to
understand, optimize, or customize the fuzzy inference engine. This is
especially important when experimenting with new fuzzy paradigms, such as
rulebase hierarchies and adaptive or hybrid systems.


References


Brubaker, David I. Introduction to Fuzzy Logic Systems. Menlo Park, CA: The
Huntington Group, 1991.
Kosko, Bart. Neural Networks and Fuzzy Systems. Englewood Cliffs, NJ:
Prentice-Hall, 1990.
Self, Kevin. "Designing with Fuzzy Logic." IEEE Spectrum (November, 1990).
Sibigtroth, James M. "Creating Fuzzy Micros." Embedded Systems Programming
(December, 1991).
Sibigtroth, James M. "Implementing Fuzzy Expert Rules." AI Expert (April,
1992).
Williams, Tom. "Fuzzy Logic is Anything but Fuzzy." Computer Design (April,
1992).

_FUZZY LOGIC IN C_
by Greg Viot


[LISTING ONE]

/* General-purpose fuzzy inference engine supporting any number of system
inputs and outputs, membership functions, and rules. Membership functions can
be any shape definable by 2 points and 2 slopes--trapezoids, triangles,
rectangles, etc. Rules can have any number of antecedents and outputs, and can
vary from rule to rule. "Min" method is used to compute rule strength, "Max"
for applying rule strengths, "Center-of-Gravity" for defuzzification. This
implementation of the Inverted Pendulum control problem has: System Inputs, 2
(pendulum angle and velocity); System Outputs, 1 (force supplied to base of
pendulum); Membership Functions, 7 per system input/output; Rules, 15 (each
with 2 antecedents & 1 output). If more precision is required, integers can
be changed to real numbers.*/

#include <stdio.h>
#define MAXNAME 10 /* max number of characters in names */
#define UPPER_LIMIT 255 /* max number assigned as degree of membership */

/* io_type structure builds a list of system inputs and a list of system
outputs. After initialization, these lists are fixed, except for value field
which is updated on every inference pass. */
struct io_type{
 char name[MAXNAME]; /* name of system input/output */
 int value; /* value of system input/output */
 struct mf_type /* list of membership functions for */
 *membership_functions; /* this system input/output */
 struct io_type *next; /* pointer to next input/output */
 };
/* Membership functions are associated with each system input and output. */
struct mf_type{
 char name[MAXNAME]; /* name of membership function (fuzzy set) */
 int value; /* degree of membership or output strength */
 int point1; /* leftmost x-axis point of mem. function */
 int point2; /* rightmost x-axis point of mem. function */
 int slope1; /* slope of left side of membership function */
 int slope2; /* slope of right side of membership function */
 struct mf_type *next; /* pointer to next membership function */
 };
/* Each rule has an if side and a then side. Elements making up if side are
pointers to antecedent values inside mf_type structure. Elements making up then
side of rule are pointers to output strength values, also inside mf_type
structure. Each rule structure contains a pointer to next rule in rule base.
*/
struct rule_element_type{
 int *value; /* pointer to antecedent/output strength value */
 struct rule_element_type *next; /* next antecedent/output element in rule */
 };
struct rule_type{
 struct rule_element_type *if_side; /* list of antecedents in rule */
 struct rule_element_type *then_side; /* list of outputs in rule */
 struct rule_type *next; /* next rule in rule base */
 };
struct rule_type *Rule_Base; /* list of all rules in rule base */







[LISTING TWO]

main()
{
 initialize_system();
 while(1){
 get_system_inputs();
 fuzzification();
 rule_evaluation();
 defuzzification();
 put_system_outputs();
 }
}






[LISTING THREE]

/* Fuzzification--Degree of membership value is calculated for each membership
function of each system input. Values correspond to antecedents in rules. */
fuzzification()
{
 struct io_type *si; /* system input pointer */
 struct mf_type *mf; /* membership function pointer */
for(si=System_Inputs; si != NULL; si=si->next)
 for(mf=si->membership_functions; mf != NULL; mf=mf->next)
 compute_degree_of_membership(mf,si->value);
}
/* Rule Evaluation--Each rule consists of a list of pointers to antecedents
(if side), list of pointers to outputs (then side), and pointer to next rule
in rule base. When a rule is evaluated, its antecedents are ANDed together,
using a minimum function, to form strength of rule. Then strength is applied
to each of listed rule outputs. If an output has already been assigned a rule
strength, during current inference pass, a maximum function is used to
determine which strength should apply. */
rule_evaluation()
{

 struct rule_type *rule;
 struct rule_element_type *ip; /* pointer of antecedents (if-parts) */
 struct rule_element_type *tp; /* pointer to consequences (then-parts) */
 int strength; /* strength of rule currently being evaluated */
 for(rule=Rule_Base; rule != NULL; rule=rule->next){
 strength = UPPER_LIMIT; /* max rule strength allowed */
 /* process if-side of rule to determine strength */
 for(ip=rule->if_side; ip != NULL; ip=ip->next)
 strength = min(strength,*(ip->value));
 /* process then-side of rule to apply strength */
 for(tp=rule->then_side; tp != NULL; tp=tp->next)
 *(tp->value) = max(strength,*(tp->value));
 }
}
/* Defuzzification */
defuzzification()
{
 struct io_type *so; /* system output pointer */
 struct mf_type *mf; /* output membership function pointer */
 int sum_of_products; /* sum of products of area & centroid */
 int sum_of_areas; /* sum of shortened trapezoid areas */
 int area;
 int centroid;
 /* compute a defuzzified value for each system output */
for(so=System_Outputs; so != NULL; so=so->next){
 sum_of_products = 0;
 sum_of_areas = 0;
 for(mf=so->membership_functions; mf != NULL; mf=mf->next){
 area = compute_area_of_trapezoid(mf);
 centroid = mf->point1 + (mf->point2 - mf->point1)/2;
 sum_of_products += area * centroid;
 sum_of_areas += area;
 }
 so->value = sum_of_products/sum_of_areas; /* weighted average */
 }
}





[LISTING FOUR]

/* Compute Degree of Membership--Degree to which input is a member of mf is
calculated as follows: 1. Compute delta terms to determine if input is inside
or outside membership function. 2. If outside, then degree of membership is 0.
Otherwise, smaller of delta_1 * slope1 and delta_2 * slope2 applies.
3. Enforce upper limit. */
compute_degree_of_membership(mf,input)
struct mf_type *mf;
int input;
{
 int delta_1;
 int delta_2;
 delta_1 = input - mf->point1;
 delta_2 = mf->point2 - input;
 if ((delta_1 <= 0) || (delta_2 <= 0)) /* input outside mem. function ? */
 mf->value = 0; /* then degree of membership is 0 */
 else
 mf->value = min( (mf->slope1*delta_1),(mf->slope2*delta_2) );
 mf->value = min(mf->value,UPPER_LIMIT); /* enforce upper limit */
}
/* Compute Area of Trapezoid--Each inference pass produces a new set of output
strengths which affect the areas of trapezoidal membership functions used in
center-of-gravity defuzzification. Area values must be recalculated with each
pass. Area of trapezoid is h*(a+b)/2 where h=height=output_strength=mf->value
b=base=mf->point2-mf->point1 a=top= must be derived from h,b, and slopes1&2 */
compute_area_of_trapezoid(mf)
struct mf_type *mf;
{
 int run_1;
 int run_2;
 int base;
 int top;
 int area;
 base = mf->point2 - mf->point1;
 run_1 = mf->value/mf->slope1;
 run_2 = mf->value/mf->slope2;
 top = base - run_1 - run_2;
 area = mf->value * ( base + top)/2;
 return(area);
}



February, 1993
A NEURAL-NETWORK AUDIO SYNTHESIZER


Generating natural and space-age sounds in hardware




Mark Thorson, Forrest Warthman, and Mark Holler


Mark Thorson designed and implemented the synthesizer's primary hardware. He
received his AB in neurobiology and is associate editor of Microprocessor
Report. Forrest Warthman conceived the synthesizer project, designed the user
interface, and maintains the project's momentum. He is president of Warthman
Associates, Palo Alto, California. Mark Holler is Intel's program manager for
neural-network products. The authors can be contacted at 240 Hamilton Ave.,
Palo Alto, CA 94301.


Although neural networks got their start in software, the computation power
required by applications such as control systems makes hardware
implementations of neural nets a natural evolution. In fact, within the next
decade we'll likely see more neural nets in hardware than software--and
Intel's 80170NX Electrically Trainable Analog Neural Network (ETANN) chip is
the first widely available silicon implementation of the technology.
This article describes an 80170NX-based musical instrument of unique design.
The instrument, which synthesizes analog audio signals, evolved from a project
begun in 1989 with David Tudor, a pioneering electronic-music composer and a
musician with the Merce Cunningham Dance Company in New York. Tudor and his
colleague, Takehisa Kosugi, introduced the synthesizer in a series of
performances by the Merce Cunningham Dance Company at the Paris Opera House in
November 1992.
The synthesizer can generate a remarkable range of audio effects, from unique
space-age and science-fiction sounds to passages that sound like heart beats,
drums, gongs, porpoises, birds, engines, and musical instruments such as
violas and flutes.
Sounds are generated internally by the synthesizer, without external inputs,
using the neural-network chip's 64 artificial neurons. The neurons are
connected on-chip in loops, using programmable synaptic weights, or off-chip,
using patch cables and feedback circuits. Oscillations occur as a result of
delay in the feedback paths. The sounds are generally rich because of the
complexity of the circuitry. External inputs such as voice, music, or random
sounds can be used to enrich or control the internally generated sounds.
In this article, we present the design and implementation of the
synthesizer--from circuits to firmware--as an example of a typical,
hardware-based, neural-net embedded system. For background on neural networks,
see the sources listed at the end of this article and "Untangling Neural Nets"
by Jeannette Lawrence (DDJ, April 1990).


Synthesizer Architecture


The synthesizer's console housing has dozens of audio jacks buffered to the
analog inputs and outputs of the neural-network chip; see Figure 1(a). Patch
cables are routed to and from the jacks to feed chip outputs back to chip
inputs; to connect external inputs to chip inputs; and to connect chip outputs
to external amplifiers, recorders, or display devices. Some of the chip
outputs have multiple console jacks so that a single neuron on the chip can
drive several destinations.
The 80170NX is at the heart of the synthesizer; see Figure 1(b). The chip
contains 64 artificial neurons, each with 128 analog inputs. Artificial
"synapses" connect each neuron with the 128 inputs. Each synapse in an
artificial neuron consists of a multiplier and a non-volatile weight; see
Figure 2. The function of a neuron is to sum the products of all inputs x
weights (the "inner product" or "dot product" of the vectors) and output a
result that is a sigmoid function of the inner product. The sigmoid function
has a nonlinear threshold shape, like a stretched-out letter "S."
There are two 64x64 arrays of synapses--an input array and a feedback array.
The input array is programmed with weights, and the results produced by the 64
neurons can optionally be fed back on-chip to the feedback array. This allows
any neuron to be connected to any other neuron by programming weights at the
appropriate synapses in the feedback array.
Chip outputs can also be fed back to inputs externally, through feedback
circuits in the synthesizer (Figure 3), or they can be brought to the
synthesizer's console to drive multiple audio and/or oscilloscope channels.
The music synthesizer is unique in that it relies heavily on feedback and the
dynamics of the analog circuitry, rather than just the feed-forward
computations of the artificial neurons. The behavior of the synthesizer can
only be described by a set of coupled, nonlinear differential equations. It's
not feasible to simulate circuits of this complexity on today's digital
computers. Only by building the synthesizer could its behavior be discovered.


Synthesizer Circuitry


The synthesizer circuit (Figure 4) has simple buffer structures on all inputs
and outputs of the neural-network chip. These buffers consist primarily of
LM324 op-amps in a unity-gain configuration, wired as analog buffers. Their
main purpose is to protect the expensive neural-network chip against damage
from high-voltage signals and short-circuit loads. Each audio input has a
0.27muF capacitor in series, which strips any DC component from the input
signal. A 100K resistor connected to the unity-gain op-amp supplies the DC
offset of the signal. For protection against extreme inputs, the input section
has heavy rectifier diodes to clamp signals more than a diode drop (0.7V)
above Vcc or below ground. At worst, an inexpensive quad op-amp chip or a
diode needs to be replaced if an errant signal appears.
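The article doesn't state the resulting corner frequency, but treating the series capacitor and the 100K bias resistor as a first-order high-pass filter, f_c = 1/(2*pi*R*C) works out to roughly 5.9 Hz -- comfortably below the audio band:

```c
/* Corner (-3 dB) frequency of a first-order RC high-pass filter. */
#define PI 3.14159265358979323846

double highpass_corner_hz(double r_ohms, double c_farads)
{
    return 1.0 / (2.0 * PI * r_ohms * c_farads);
}
```

With R = 100K and C = 0.27muF, highpass_corner_hz(100e3, 0.27e-6) gives about 5.9 Hz, so the coupling network strips DC while passing essentially the entire audio range.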
The input section is also equipped with a passive network for adjusting the
center voltage (the DC offset) of the analog inputs. The inputs are AC-coupled
to nodes that are weakly coupled to a static voltage defined by a
potentiometer on the front panel. We considered this feature important because
the neuron amplifiers are only linear in a small range of input voltages. We
feared that operation outside this range would cause distortion of external
audio sources fed into the network. The potentiometer for controlling DC
offset allows external signals to be centered on the "sweet spot" (the linear
region) of the gain function.
Originally, there were four front-panel potentiometers for defining static
voltage levels: In addition to the DC-offset control, there were three
special-purpose controls for inputs to the neural-network chip. These
controlled the gain of the neuron amplifiers, the input reference level (the
zero level), and the output range. After some experience with the unit, the
latter two signals were tied to static voltage levels (1.5V). The
potentiometers are implemented as simple voltage dividers between Vcc and
ground, with a capacitor for filtering out noise.
At first, we tied DC-offset control directly to the analog inputs through an
array of 100K resistors. Later, we wired spare op-amps in as unity-gain
buffers between the voltage divider and the resistors to prevent
cross-coupling (leakage of signals between audio-input channels).
Certain inputs were dedicated as control inputs, our intent being that one
audio source could modulate another. However, there does not seem to be a
simple way to make the neural-network chip do true modulation, because it only
performs multiplication by constant coefficients stored on-chip. It can almost
modulate one signal by another when the modulating signal is fed into the gain
input--and we tried that--but there is significant feedthrough of the
modulating signal to the output. Besides, this technique would only provide
one modulation channel.
Instead, we performed a sort of on/off modulation in which control signals
connected to the neurons by large synaptic weights are used to blot out the
audio source by driving the neurons into saturation. Our first step was to
implement a simple switcher, in which audio inputs could be routed to audio
outputs under the control of inputs (control signals) that were themselves
other audio sources. To get this to work nicely, we made three modifications
to the input circuits for the control signals: The gain of the op-amps had to
be increased from unity to about 1000; the signals had to be rectified; and
heavy low-pass filtering had to be added. Depending on the potentiometer
settings, the time constant of the low-pass filtering was about 0.1 to 0.5
seconds, chosen to correspond roughly to a spoken syllable or a musical note.
One of our first experiments allowed us to modulate music using a tape of a
lecture. It resulted in the odd experience of hearing word-sized snatches of
music with the cadence of speech--something like hearing a strange foreign
language.
After we implemented the basic functions, we began exploring the capability of
the system to generate sounds using feedback networks. During this time, the
flexible construction of the unit proved invaluable. For example, the
high amplification factor on the control inputs was undesirable for
oscillation experiments, but it could easily be changed by swapping a
socketed resistor pack. We added DIP switches to allow the heavy filtering and
rectification on the control inputs to be temporarily removed.
Noise was an early problem. Although the 50 µV peak-to-peak noise on the
summing lines was small, it was large enough to be annoying when the
synthesizer was used to process external audio signals. We surmised that the
cause was thermal noise on the neuron summing lines; see Figure 2(b). The
amplification factor between these nodes and the neural-network outputs is
about 1000. Our solution was simple and worked quite well: Because
we were using a relatively small number of chip inputs, we could afford to run
each audio signal into several input pins in parallel. With the same synapse
weighting on each chip input for a given audio signal, the strength of the
signal at the summing nodes is increased, while the noise level is unchanged.
By using nine parallel chip inputs for each audio signal, the signal-to-noise
ratio was improved by a factor of nine.
Noise is not always bad. It is useful during synthesis for adding randomness
to the sounds. The neuron gain is set high to maximize amplification of the
noise, and then feedback attenuation is adjusted until the network is just at
the edge of oscillation. The noise intermittently stimulates oscillation of
the network.


Firmware


The synthesizer's firmware consists of signed weights that represent the
strength of connections between inputs and neurons. The weights are downloaded
to the neural-network chip with the Intel Neural Network Training System
(iNNTS), a software/hardware kit used to train the 80170NX. After
downloading, the weights are analogous to the strength of synaptic
connections in a biological neural network. Unlike typical neural networks,
which use input/output pattern pairs and a learning algorithm to derive a set
of weights, the synthesizer's weights are manually set to be unique for each
neuron. Since our first goal was to synthesize original sounds, we did not use
existing examples from which to learn.
Figure 5 shows an early version of this firmware. Here, six synthesizer inputs
are shown as the rows, and the 14 neurons are shown as the columns. The
weights are at the matrix intersections. The sigmoidal neuron amplifiers
appear as triangles at the top of the matrix diagram. These represent the
synthesizer outputs that can be routed back to inputs, to amplifier-speaker
channels, and/or to oscilloscope channels.
In a later version of the firmware, two additional chip inputs, virtually all
of the 64 neurons, and a large number of the chip's synapse connections were
used to achieve greater complexity of sound.
Although the weights on the neural-network chip can be set with at least 6-bit
precision, only values of +2.5V and -2.5V were used for the earliest version.
The weights could have been changeable under the training system's control
during audio synthesis if the synthesizer had been built on one of Intel's
multi-chip prototyping boards. This approach, though more costly, would have
facilitated easy reconfiguration and reduced the number of potentiometers and
patch cables.


Inputs and Outputs



The first version of the synthesizer had seven inputs (four audio inputs and
three control inputs), as shown in Figure 4. An input-bias adjust circuit on
the audio inputs generates a DC voltage used to bias the inputs to the neural
network in the middle of their operating range. The synapse multipliers are
most linear when the inputs are near V[REFi], an input reference voltage
supplied to the chip. V[REFi] is supplied by another voltage divider shown on
the right side of Figure 4.
The three high-gain control inputs are switchable between an audio source,
shown in Figure 4 as an audio connector, and the +5V power supply. When a
control input is connected to the power supply by the switch, the
potentiometer associated with that input sets the control input to a static
level. This ability to set individual input levels allows biasing of
individual neurons at different operating points, some at high-gain and others
saturated high or low.
A sigmoid-gain adjust is provided for all audio outputs as a group. This
circuit adjusts the slope of the neuron output's threshold function. Finally,
each audio output has a unity-gain op-amp for short-circuit protection and a
decoupling capacitor.


Feedback


Two types of feedback are used to generate two different types of
oscillations. The first type of feedback, used to synthesize sinusoidal
oscillations, is generated by the phase-shifting bandpass-filter feedback
circuit shown at the left side of Figure 4. This circuit can be patch-cabled
between any audio output and any audio input. A potentiometer associated with
the feedback circuit allows attenuation of the feedback signal; the more
feedback, the larger the oscillations and the higher the frequency of the
oscillations. The lower cut-off frequency of the bandpass filter is
proportional to 1/RC, and the upper cut-off frequency is proportional to R/L.
The dominant R and C in the feedback path are actually the 100K resistor and
the 0.27 µF capacitor in the audio-input buffer circuitry.
The second type of feedback produces relaxation oscillations. It is
accomplished by directly connecting audio outputs to audio inputs. The 100K
resistor and the 0.27 µF decoupling capacitors again are the dominant elements
in this oscillation circuit. The oscillations generated by this type of
feedback are abrupt switching transitions followed by an RC decay back toward
a switch point. The abrupt transition has the sound of a pop. Figure 6 shows
the type of waveforms that can be produced. The waveforms generated are often
similar to those of the action potentials or spikes in biological neurons.


Synthesizer Operation


The synthesizer is operated by configuring the cables (inputs, feedback loops,
and outputs) and setting the potentiometers. First, the input-bias adjust
potentiometer is set to bias the chip's neurons in the narrow region where
they amplify linearly. The correct bias is detectable by listening for maximum
noise output. Next, the sigmoid gain is increased and feedback attenuation is
reduced until the network breaks into oscillations. Changes in the relative
gain of the various feedback paths or the network architecture produce
different sounds.
In some configurations, the synthesizer generates predictable or
semi-predictable rhythms, the periodicity and complexity of which can be
varied. Some of these responses suggest the biological analogy of the
circuit--the firing of neurons in an organic, not-quite predictable sequence.
In other configurations, the synthesizer generates remarkably complex and
unique sounds that cannot be repeated predictably due to the high sensitivity
of the oscillations to small changes in the feedback gain when the synthesizer
is set just at the threshold of oscillation. At this bias point, very random
behavior is often observed, much like the random ticks of a Geiger counter.
This behavior is due to the thermal noise on the summing lines stimulating the
network to oscillate for a few cycles, then dying out.


Summary


The synthesizer provides unique insights into the dynamics of neural networks
and complex nonlinear systems in general. These insights are novel because
they're experienced in terms of audio and visual (oscilloscope) responses.
Potential near-term applications include musical instruments and controls for
audio/visual entertainment performances. Long-range applications are an open
frontier. Further development could result in products that respond to the
unique pitch and volume of audio inputs with specific synthesized sounds.
Development would likely include experimentation with larger networks and more
complex feedback circuits. By making connections with weights on the
neural-network chip rather than with patch cables, network architecture could
be reconfigured in milliseconds under computer control. This approach would
facilitate the modification of network architecture rhythmically, during
synthesis.


References


80170NX Electrically Trainable Analog Neural Network (ETANN) Data Sheet. Santa
Clara, CA: Intel Corp., 1991. Literature orders: 800-548-4725.
Hopfield J.J. and D.W. Tank. "Computing with Neural Circuits: A Model."
Science (August, 1986).
Kandel, E.R. and J.H. Schwartz. Principles of Neural Science, second edition.
New York, NY: Elsevier, 1985.
Mead, C. Analog VLSI and Neural Systems. Reading, MA: Addison-Wesley, 1989.
Rumelhart, D.E. and J.L. McClelland. Parallel Distributed Processing:
Explorations in the Microstructure of Cognition, volumes 1 through 3.
Cambridge, MA: MIT Press, 1988.
Todd, P.M. and D.G. Loy, eds. Music and Connectionism. Cambridge, MA: MIT
Press, 1992.





February, 1993
UNTANGLING THE WINDOWS SOCKETS API


A standardized interface for network development




Mike Calbaum, Frank Porcaro, Mark Ruegsegger, Bruce Backman


The authors are engineers at Frontier Technologies Corp. and can be contacted at
10201 North Port Washington Road, Mequon WI 53092 or at tcp@frontiertech.com.


The Windows Sockets API is an open, standard programming interface for
developing TCP/IP network applications for Microsoft Windows. As a
standardized programming interface, the API allows you to develop one
application that will run unmodified over any TCP/IP network stack with a
Windows Sockets-compliant API. Before the standard, you needed to develop a
version of your application for every implementation of TCP/IP on which you
wished to run the application. The Windows Sockets API is implemented as a
Windows dynamic link library (DLL) with standardized function interfaces.
The Windows Sockets API idea was conceived at a Birds Of A Feather session
held at Interop '91. After numerous drafts and revisions, the final
specification was released for implementation by TCP/IP vendors. The API has
its origins in the socket interface first distributed with 4.1cBSD UNIX for
the VAX in 1982 and updated with the 4.3BSD VAX release in 1986. The original
BSD sockets interface supported UNIX domain sockets for interprocess
communication on a single host, TCP/IP domain sockets, and Xerox XNS domain
sockets. The Windows Sockets API currently only supports TCP/IP networks.
The sockets interface removes from you the responsibility of knowing the
details of the underlying network and bundles the details of network
programming into a handy abstraction called a socket. A socket is one endpoint
in a network-communication path. Sockets are differentiated from each other by
names that consist of the local-host network address and a port number. You
can either select the port number or let the sockets interface select one. The
available port numbers are 1-1023, which are reserved for standard TCP/IP
server applications (FTP, rwho, finger, and the like); 1024-5000, which can be
assigned to a socket by the system; and 5001 and above, which are available
for you to assign to your sockets.
The sockets interface provides a means for both reliable, connection-oriented
stream communications and unreliable, connectionless datagram-oriented
communications. Reliable, connection-oriented communication is provided by
SOCK_STREAM-type sockets. Applications communicating through stream sockets
have a socket in each application that is "connected" to the socket in the
other application. When a socket is connected to another socket, it can only
send data to the socket on the other endpoint of the connection and can only
receive data from that socket. Stream-type sockets use a reliable
data-transmission protocol (usually TCP) to provide reliable communications
with a peer application. A reliable transmission protocol guarantees that data
will arrive at its destination uncorrupted and in the same order it was sent.
If data is lost or corrupted, it will be retransmitted until a successful
reception of the data is acknowledged by the receiving host. A stream socket
provides bidirectional communication with its peer socket, which means that
data can be sent and received on the same socket. The data in a stream socket
is sent and received as a continuous stream of bytes with no record
delimiters. If an application sends different types of data through a single
socket, it must provide record delimiters or closely coordinate the order of
the data transmitted between the two applications.
Most distributed applications use a client/server model, in which one
application acts as a server, accepting requests for services from client
applications, processing the requests, and then returning the results. This
model is particularly well suited to stream sockets, where one application
must be listening for incoming connections and the other must attempt to
connect. A typical client/server interaction using stream sockets follows the
line of execution shown in Figure 1.
The sockets interface can also provide connectionless datagram-oriented
communications through SOCK_DGRAM-type sockets. Datagram sockets do not need
to be connected to another socket and can therefore send data to and receive
data from multiple sockets. Datagram sockets send and receive their data as
self-contained packages with one package of data sent or received at a time,
rather than as a continuous stream of data bytes. Datagram sockets use an
unreliable transmission protocol (usually UDP) to transmit their data. Since
the protocol is unreliable, lost data will not be retransmitted, and data can
arrive in a different order than it was transmitted. Datagram sockets can
broadcast messages across the local network to be picked up by any application
receiving datagrams on the port to which it was broadcast.
The sockets interface also provides functions to associate user-readable host
names with network addresses, network addresses with host names, server names
with port numbers, and the like. These functions--gethostbyname(),
gethostbyaddr(), getservbyport(), and so on--are collectively known as the
getXbyY functions and may be implemented such that they search a local
database file for the information, request it from a server on the same
network, or both.
The Berkeley sockets interface has some major shortcomings when implemented
for Windows. Potentially time-consuming operations (like waiting for an
incoming connection or doing a host-name lookup) will block the execution of
the application until they're finished. In Windows, this will cause the
application's user interface to be unresponsive until the operation is
completed, preventing the user from aborting any pending operations. Also, the
select() function, which can be used to test a socket for outstanding
incoming connections, readability, writability, and the like, must be called
repeatedly in a polling fashion until the desired condition occurs.
To overcome these shortcomings, the authors of the Windows Sockets
Specification extended the Berkeley definitions to include asynchronous
versions of the select() and getXbyY functions and a function to abort any
blocking operation. Rather than block execution until finished, the
asynchronous functions return immediately and post messages to an application
window when the select function detects the desired condition or when the
getXbyY request has been resolved.


Developing a Windows Sockets Application


To illustrate the use of the Windows Sockets API to develop a network
application, we'll build a program that implements some of the standard UNIX
network utilities. We'll develop an application that implements the finger
utility as an example of a stream-oriented client application, and the fingerd
server as an example of a stream-oriented server application. Listing One
(page 96) lists sock.c, Listing Two (page 99) is dlg.h, and Listing Three
(page 99) is socket.rc. All Windows Sockets applications must include the file
winsock.h. All the related data structures are defined in this file, and all
the Windows Sockets functions have prototypes in this file.
After doing the standard Windows startup tasks, registering our window class,
creating the window, and so on, the application must first call the Windows
Sockets function WSAStartup() and pass it two parameters: a WORD whose
low-order byte is the major version number and whose high-order byte is the
minor version number of the Windows Sockets specification needed to support
this application, and a pointer to a WSADATA structure. The WSAStartup()
function will return 0 if the Windows Sockets DLL can support the version of
the specification that the application requires and initialization succeeded;
otherwise it will return an error code. If a 0 is returned, the WSADATA
structure is filled with information about the Windows Sockets DLL. If
WSAStartup() succeeds, we start executing the application message loop.
When we exit the application message loop, we know that the application is
shutting down so we close any open sockets and call the WSACleanup() function
to advise the Windows Sockets DLL that we are going away and it can clean up
any memory allocations that support the application.
The WinMain() function is fairly straightforward: We do our standard Windows
initializations, call WSAStartup() to register our application with the
Windows Sockets DLL, perform the application message loop, close any open
sockets, and call WSACleanup() before exiting.
The main window procedure does all of the standard window-procedure
processing: handling keyboard input, creating windows, updating and responding
to menu selections, and responding to close messages.
The application window has one menu, the Sockets menu, which has five menu
items: Finger Client, Finger Server, Cancel Operation, About, and Exit. About
and Exit are standard items for a Windows application and are handled in the
standard way. About displays a message box containing information about the
application; and Exit posts a close message to the window to end execution of
the application. The remaining three menu items use the Windows Sockets API to
implement the UNIX network utilities finger and fingerd.


A Sample Client Program


The finger client utility connects to a remote host and sends it the user name
of a person on that system; the finger server accepts the connection, reads
the user name from the socket, looks up information about that user, and sends
the information back to the client.
The first step is to develop the client finger application. If you have a
network that contains a UNIX host, you'll be able to use the client
application to get information about users on that host; otherwise, you'll
need to develop the finger-server portion of the application and run it on
another machine to test your application.
When the user selects the Finger Client menu item, the program prompts for the
name of the remote host to query. If the user enters a host name and presses
the OK button, the Windows Sockets function gethostbyname() is called, passing
the host name as a parameter. (See lines 184-194 in Listing One.) If
gethostbyname() successfully locates information about a host with this name,
it will return a pointer to a hostent structure containing the information. If
it cannot find information about this host, it will return NULL. If
gethostbyname() returns a non-NULL pointer, we copy the host's network address
from the h_addr field of the hostent structure into the sin_addr.s_addr field
of a sockaddr_in structure. For TCP/IP, the address will always be four bytes
long, but to be safe, convention dictates that we examine the h_length field
of the hostent structure and copy that many bytes into the sockaddr_in
structure. The pointers returned by the getXbyY functions point to static
memory areas that may be reused by the Windows Sockets DLL for other
information; therefore, all the information we need from the structure should
be copied before any subsequent Windows Sockets functions are called. If
gethostbyname() returns NULL, we display a message box telling the user that
the host is unknown.
If the network address of the remote host was located successfully, the
program prompts the user for the name of the user about whom they would like
information. Entering a blank user name is not an error; the finger server
considers a blank user name to be a request for information on all users
currently logged in to the remote host, as in lines 195-197 of Listing One.
If we have not previously looked up the port number used by the finger server,
we now obtain it using getservbyname() (see Listing One, lines 198-209) which
takes two strings as parameters, the server name and the protocol name. We
pass in "finger" and "tcp" as the parameters. getservbyname() returns a
pointer to a servent structure if successful and NULL if unsuccessful. If we
receive a non-NULL pointer, we copy the port number from the servent structure
into a global variable. TCP/IP networks represent all numbers in a byte order
opposite to that used by the Intel and DEC VAX processors. When providing numbers
to the sockets interface or getting them from the sockets interface, we need
to convert the byte order. Sockets provides utility functions to convert the
byte order of the numbers. ntohs is used to convert a short from network to
host byte order; htons converts a short from host to network byte order. ntohl
and htonl serve the same purpose for longs. Even if the host for which your
application is targeted uses the same byte ordering as the network, you should
use these functions for the sake of portability. If we receive a NULL pointer
from getservbyname(), we alert the user with a message box and exit the switch
statement.
If everything has been successful so far, we next create a socket by calling
socket(), as in lines 210-228 of Listing One. If the socket function fails to
allocate a socket for us, it will return INVALID_SOCKET; otherwise it will
return a socket descriptor. If we get a valid socket descriptor, we finish
filling in the sockaddr_in structure with the address of the socket to which
we wish to connect. We have already filled in the host address, so we now fill
in the address family, which is always PF_INET for TCP/IP, and the port number
in network byte order. We then call the connect function to connect to the
finger-server socket. The connect function will return 0 on success and
nonzero on failure. The return value of all Windows Sockets functions should
be checked to detect any failures. If any Windows Sockets call fails, we can
call the WSAGetLastError() function to determine a specific error code. The
sockaddr_in structure holding the server's address is cast to type struct
sockaddr when passed to most sockets calls. This is a holdover from the
Berkeley interface that supported both UNIX domain sockets and TCP/IP sockets.
If we successfully connect to the finger server, we concatenate a carriage
return/line feed onto the user name (refer to lines 231-235) and send the
whole string to the finger server using the send function. The carriage
return/line feed has no special meaning for the sockets DLL, but is used by
the server to detect the end of the user name.
When the finger server receives the user-name string, it looks up information
on the specified user and sends it to the finger client on the same socket. We
read the data from the socket into a global buffer and invalidate the window
rectangle so the application will receive a WM_PAINT message and update the
window with the finger data; see lines 237-250 in Listing One.
We then close the socket with the closesocket() function and return to the
application message loop.


A Sample Server Program


The finger server creates a socket and binds it to a well-known port number.
It then waits for clients to connect to the socket, and when a connection is
made, it reads a carriage return/line feed terminated string from the socket.
The string is the name of a user about whom the client would like information.
The server looks up information about the user and sends it to the client on
the socket created when the connection was accepted.
The Finger Server menu selection is implemented as a server that can be
toggled off and on. When the server is active and accepting connections, a
check mark is placed next to the menu item.
To create a server using sockets, we first create a socket with socket() and
then bind a name to it. We bind a name to the socket by filling in a
sockaddr_in structure with the desired name and calling the bind() function;
see lines 272-295 in Listing One. The sockaddr_in structure is filled in the
same manner as it was for the finger client with one notable exception: We do
not need to supply the network address of the server's host. If we fill the
sin_addr.s_addr field of the socket address with the constant INADDR_ANY, the
sockets DLL will determine and fill in the local-host network address for us.
For the finger server, we are binding the socket to a port number found using
getservbyname(). If we did not care what port number was assigned to this
socket, we could use 0 as the port number and have the sockets DLL supply us
with an available port number. If a port number is specified and another
socket is using this port number, bind() will fail.
The next step to creating a server is to call the listen() function, as in
lines 296-308 of Listing One. This function puts the socket into a state in
which it will accept incoming connections from clients. The listen() function
takes as parameters the socket descriptor and an integer indicating the number
of backlogged connections the sockets DLL should allow to accumulate. The
sockets DLL will allow at most a backlog of five connections. Even if you
request a larger backlog, you will only get five. If the listen call succeeds,
we call WSAAsyncSelect() to notify the sockets DLL that we wish to receive a
window message whenever a connection is ready to be accepted for this socket.
The WSAAsyncSelect() function takes as parameters a socket descriptor, a
handle to a window to which the messages should be posted, the message to
post, and the conditions in which the message should be posted. If
WSAAsyncSelect() succeeds, we place a check mark next to the Finger Server
menu item to indicate that the server is functioning and then return to the
application message loop and wait for a connection message.

Once we have told the sockets DLL that we want to be notified of incoming
connections for our server, we wait and process window messages until a
message is received notifying us that a connection has been established and
can be accepted by our application. The message we'll receive is the message
that we passed as a parameter to WSAAsyncSelect(). In this case we have
defined a constant named CONN_MSG and asked for this message to be posted to
our application window when a connection is ready. When we receive a CONN_MSG
message, we use the WSAGETSELECTERROR() macro (lines 331-340 in Listing One)
to extract any error code from lParam. If there was no error, we call accept()
to accept the connection. If accept() succeeds, it will return a new socket
descriptor, which we will use to communicate with the client. A new socket
descriptor is returned so that the original socket, which we are using to
listen for connections, can remain in a listening state, accepting connections
from clients. If the accept function fails, it will return the constant
INVALID_SOCKET.
If we receive a valid socket descriptor from the accept function, we first
call recv() (Listing One, lines 343-354) to read the user name string from the
client. In our server we ignore the string, but we do check the return value
from recv() to detect any errors. If there was no error, we call send() to
send the client the string entered by the user as their real name. We then
close the received socket and return to the application message loop to wait
for more connections.
If we already have a finger server listening for clients when the user selects
the Finger Server menu item, we close the finger-server socket and remove the
check mark from the menu.
Alternatively, instead of calling recv() immediately, we can call
WSAAsyncSelect() on the new socket returned by accept() to ask the sockets
DLL to inform us when data arrives for this socket. We then process window
messages until one arrives indicating that data is ready, and only then call
recv() (Listing One, lines 346-353) to read the user-name string from the
client.


Summary


Together, the finger client and finger server implement the UNIX finger
utility and provide a fairly complete example of network application
development using the Windows Sockets API. Developers new to the sockets
interface should read the appropriate chapters in UNIX Network Programming, by
W. Richard Stevens (Prentice-Hall, 1990). Anyone using the Windows Sockets API
should get the Windows Sockets Specification by Martin Hall, Mark Towfiq,
Geoff Arnold, David Treadwell, and Henry Sanders available through anonymous
ftp from vax.ftp.com and ftp.uu.net or downloaded from CompuServe from the
Microsoft Software Libraries (GO MSL). Any comments or questions about this
article can be sent to tcp@frontiertech.com.

_UNTANGLING THE WINDOWS SOCKETS API_
by Mike Calbaum, Frank Porcaro, Mark Ruegsegger, and Bruce Backman



[LISTING ONE]

/*
 * Sock.c -- Windows sockets sample application
 *
 * Implements pseudo finger client and server.
 * Should work with any UNIX finger daemon
 *
 */



#include "windows.h"
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include <winsock.h>
#include "dlg.h"


#define PROMPT_STRING 130
#define PROMPT_TEXT 131

#define CONN_MSG (WM_USER+25)
#define FNGR_READ (WM_USER+26)

extern void errormsg(LPSTR cp);
extern int PASCAL WinMain(HANDLE, HANDLE, LPSTR, int);
extern BOOL InitApplication(HANDLE);
extern BOOL InitInstance(HANDLE, int);
extern long FAR PASCAL MainWndProc(HWND, unsigned, WORD, LONG);
extern BOOL FAR PASCAL About(HWND, unsigned, WORD, LONG);
extern long FAR PASCAL SockWndProc(HWND hWnd, unsigned message, WORD wParam,
LONG lParam);
extern char *prompt(HWND hWnd, HANDLE hInst, char *message, char *buf);


/* Global buffers */
char buff[4096];
char msg[80], name[125];

HANDLE hInst;
HWND hWnd;
struct sockaddr_in anaddr;

int whoport, fngrport; /* Well known port numbers */
int app_closing;

/* Sockets */
SOCKET fclient = INVALID_SOCKET;
SOCKET fserver = INVALID_SOCKET;
SOCKET fconnect = INVALID_SOCKET;


/*
 * standard windows initialization routines
 */
BOOL InitApplication(HANDLE hInstance)
 {
 WNDCLASS wc;

 buff[0] = '\0';

 wc.style = NULL;
 wc.lpfnWndProc = SockWndProc;

 wc.cbClsExtra = 0;
 wc.cbWndExtra = 0;
 wc.hInstance = hInstance;
 wc.hIcon = LoadIcon(NULL, IDI_APPLICATION);
 wc.hCursor = LoadCursor(NULL, IDC_ARROW);
 wc.hbrBackground = GetStockObject(WHITE_BRUSH);
 wc.lpszMenuName = "SockMenu";
 wc.lpszClassName = "SockWClass";

 return(RegisterClass(&wc));
 }

BOOL InitInstance(HANDLE hInstance, int nCmdShow)
 {

 hInst = hInstance;

 /* Create a main window for this application instance. */

 hWnd = CreateWindow("SockWClass", "Sockets Sample Application",
 WS_OVERLAPPEDWINDOW, CW_USEDEFAULT,
 CW_USEDEFAULT, CW_USEDEFAULT,
 CW_USEDEFAULT, NULL, NULL, hInstance, NULL);

 if (!hWnd)
 return (FALSE);

 ShowWindow(hWnd, nCmdShow);
 UpdateWindow(hWnd);
 return (TRUE);
 }


int PASCAL
WinMain(HANDLE hInstance, HANDLE hPrevInstance, LPSTR lpCmdLine, int nCmdShow)
 {
 MSG msg;
 FARPROC lpfn;

 WSADATA wskData;

 if (!hPrevInstance)
 if (!InitApplication(hInstance))
 return(FALSE);

 if (!InitInstance(hInstance, nCmdShow))
 return(FALSE);

 /*
 * register application with windows sockets dll
 */

 if(!WSAStartup(1, &wskData) && (wskData.wVersion == 1))
 {

 while (GetMessage(&msg, NULL, NULL, NULL))
 {
 TranslateMessage(&msg);
 DispatchMessage(&msg);
 if (app_closing)
 {
 /* The application has been requested to shut down. Continue processing
 * the close request by calling DestroyWindow */
 DestroyWindow(hWnd);
 }
 }

 /* Close any open sockets */
 if(fclient != INVALID_SOCKET)
 closesocket(fclient);
 if(fserver != INVALID_SOCKET)
 closesocket(fserver);
 if(fconnect != INVALID_SOCKET)
 closesocket(fconnect);

 /* Let windows sockets know we're done */
 WSACleanup();
 return(msg.wParam);
 }

 return(0);
 }


BOOL FAR PASCAL About(HWND hDlg, unsigned message, WORD wParam, LONG lParam)
 {
 switch (message)
 {
 case WM_INITDIALOG:
 return (TRUE);

 case WM_COMMAND:
 if (wParam == IDOK || wParam == IDCANCEL)
 {
 EndDialog(hDlg, TRUE);
 return (TRUE);
 }
 break;

 }
 return (FALSE);
 }


long FAR PASCAL
SockWndProc(HWND hWnd, unsigned message, WORD wParam, LONG lParam)
 {
 int len, cnt;
 RECT wRect;
 PAINTSTRUCT ps;
 HDC hdc;
 struct hostent FAR *hp;
 static HANDLE arrow, hrGlass, hMenu;

 switch (message)
 {
 case WM_CREATE:
 hMenu = GetMenu(hWnd);
 arrow = LoadCursor(NULL, IDC_ARROW);
 hrGlass = LoadCursor(NULL, IDC_WAIT);
 SetCursor(arrow);
 break;

 /*
 * If user presses control-c, interpret as an interrupt request.
 * Could also provide a menu item or a dialogbox with a "cancel"
 * button.
 *
 */
 case WM_CHAR:
 if (wParam == 3/* ctrl-c */)
 {
 WSACancelBlockingCall();
 strcpy(buff, "Call cancelled!!");
 InvalidateRect(hWnd, NULL, 1);
 }
 else
 return(DefWindowProc(hWnd, message, wParam, lParam));
 break;

 /* Update the window */
 case WM_PAINT:
 hdc = BeginPaint(hWnd, &ps);

 /* Use fixed font and expand tabs so columns align */
 SelectObject(hdc, GetStockObject(SYSTEM_FIXED_FONT));
 GetClientRect(hWnd, &wRect);
 DrawText(hdc, buff, -1, &wRect, DT_LEFT | DT_EXPANDTABS);
 EndPaint(hWnd, &ps);
 break;

 /* Menu item selected, wParam == menu item ID */
 case WM_COMMAND:
 switch (wParam)
 {
 /* Display About Box */
 case IDM_ABOUT:
 {

 FARPROC lpfn;

 lpfn = MakeProcInstance(About, hInst);
 DialogBox(hInst, "AboutBox", hWnd, lpfn);
 FreeProcInstance(lpfn);
 break;
 }

 /* Finger client utility */
 case FINGER:
 /* Get user name */
 if(prompt(hWnd, hInst, "Hostname:", msg) == NULL)
 break;

 /* Look up the host address */
 if((hp = gethostbyname(msg)) == NULL)
 {
 errormsg("Unknown host");
 break;
 }
 _fmemcpy((char FAR *) &anaddr.sin_addr.s_addr, hp->h_addr, hp->h_length);

 /* Get username to finger */
 if(!prompt(hWnd, hInst, "Finger User:", msg))
 strcpy(msg, "");

 /* Look up port number */
 if(fngrport == 0)
 {
 struct servent FAR *sp;

 if((sp = getservbyname("finger", "tcp")) == NULL)
 {
 errormsg("Cannot determine port number for finger daemon.");
 break;
 }
 fngrport = htons(sp->s_port);
 }

 /* create socket */
 if((fclient = socket(PF_INET, SOCK_STREAM, 0)) == INVALID_SOCKET)
 {
 errormsg("Unable to create a socket");
 }
 else
 {
 /* Fill in address to which we will connect */
 anaddr.sin_family = PF_INET;
 anaddr.sin_port = htons(fngrport);

 /* try to connect */
 if(connect(fclient, (struct sockaddr FAR *) &anaddr, sizeof(struct
sockaddr_in)))
 {
 errormsg("Unable to connect to finger daemon.");
 closesocket(fclient);
 fclient = INVALID_SOCKET;
 }
 else
 {

 strcat(msg, "\r\n");
 if(send(fclient, msg, strlen(msg), 0) < (int) strlen(msg))
 {
 errormsg("Error sending string to finger daemon");
 }
 else
 if((cnt = recv(fclient, buff, sizeof(buff), 0)) == SOCKET_ERROR)
 {
 errormsg("Error reading data from daemon");
 }
 else
 if(cnt == 0)
 errormsg("No data received from finger daemon");
 else
 {
 buff[cnt] = '\0';
 InvalidateRect(hWnd, NULL, 1);
 }
 }
 closesocket(fclient);
 fclient = INVALID_SOCKET;
 }
 break;

 case FINGER_SRV:
 /* Look up port number */
 if(fngrport == 0)
 {
 struct servent FAR *sp;

 if((sp = getservbyname("finger", "tcp")) == NULL)
 {
 errormsg("Cannot determine port number for finger daemon.");
 break;
 }
 fngrport = htons(sp->s_port);
 }

 /* if not acting as server, start */
 if(fserver == INVALID_SOCKET)
 {
 /* Get user name */
 if(prompt(hWnd, hInst, "Real Name:", msg) == NULL)
 break;
 else
 wsprintf(name, "In Real Life: %s\r\n", (LPSTR) msg);

 /* allocate socket */
 if((fserver = socket(PF_INET, SOCK_STREAM, 0)) == INVALID_SOCKET)
 {
 errormsg("Cannot allocate socket for finger server");
 break;
 }

 /* Bind our socket to the finger port number */
 anaddr.sin_port = htons(fngrport);
 anaddr.sin_addr.s_addr = INADDR_ANY;
 anaddr.sin_family = PF_INET;
 if(bind(fserver, (struct sockaddr FAR *) &anaddr, sizeof(anaddr)))
 {
 errormsg("Error binding to finger daemon port");
 closesocket(fserver);
 fserver = INVALID_SOCKET;
 break;
 }

 /* listen for connections and ask for a message when they come in */
 if(listen(fserver, 5) || WSAAsyncSelect(fserver, hWnd, CONN_MSG, FD_ACCEPT))
 {
 errormsg("Error trying to listen on finger server socket");
 closesocket(fserver);
 fserver = INVALID_SOCKET;
 }
 else
 CheckMenuItem(hMenu, FINGER_SRV, MF_BYCOMMAND | MF_CHECKED);
 }
 else /* Already acting as server, stop */
 {
 closesocket(fserver);
 fserver = INVALID_SOCKET;
 CheckMenuItem(hMenu, FINGER_SRV, MF_BYCOMMAND | MF_UNCHECKED);
 }
 break;

 case CANCEL: /* Cancel any outstanding blocking call (send or recv) */
 WSACancelBlockingCall();
 strcpy(buff, "Call cancelled!!");
 InvalidateRect(hWnd, NULL, 1);
 break;

 case QUIT:
 SendMessage(hWnd, WM_CLOSE, 0, 0L);
 break;

 default:
 return (DefWindowProc(hWnd, message, wParam, lParam));
 }
 break;

 case CONN_MSG: /* Connection on finger server */
 if(WSAGETSELECTERROR(lParam))
 errormsg("Error listening for finger connections");
 else
 {
 /* accept connection */
 len = sizeof(struct sockaddr_in);
 fconnect = accept(fserver, (struct sockaddr FAR *) &anaddr, &len);
 if(fconnect == INVALID_SOCKET)
 errormsg("Error accepting connection to finger server");
 else
 WSAAsyncSelect(fconnect, hWnd, FNGR_READ, FD_READ);
 }
 break;

 case FNGR_READ: /* Data available for reading */
 {
 /* The finger client will send us a string but we will ignore it */
 cnt = recv(fconnect, buff, sizeof(buff), 0);

 buff[0] = '\0';
 if(cnt == SOCKET_ERROR)
 errormsg("Error reading from finger client");
 else
 if(send(fconnect, name, strlen(name), 0) < (int) strlen(name))
 errormsg("Error sending data to finger client");

 /* Close up the accepted connection */
 closesocket(fconnect);
 fconnect = INVALID_SOCKET;
 }
 break;

 case WM_CLOSE:
 /* Application's main window is closing. Set flag to notify main loop
 * The main function is responsible for handling the close
 */
 app_closing = 1;
 break;

 case WM_DESTROY:
 PostQuitMessage(0);
 break;

 default:
 return (DefWindowProc(hWnd, message, wParam, lParam));
 }
 return (NULL);
 }



char PromptText[41]; /* prompt message text for prompt dlg */
char PromptString[81]; /* user's input from prompt dlg */

/*
 * PromptDlgProc - Processes the "prompt" dialog box
 *
 * globals: PromptText - prompt message text
 * PromptString - user's input
 * return: IDOK or IDCANCEL
 */
BOOL FAR PASCAL
PromptDlgProc(HWND hDlg, unsigned message, WORD wParam, LONG lParam)
 {
 switch (message)
 {
 case WM_COMMAND:
 if (wParam == IDOK)
 {
 GetDlgItemText(hDlg, PROMPT_STRING, PromptString, 80);
 EndDialog(hDlg, IDOK);
 return TRUE;
 }
 else if (wParam == IDCANCEL)
 {
 EndDialog(hDlg, IDCANCEL);
 return TRUE;
 }

 return FALSE;
 break;

 case WM_INITDIALOG:
 SetDlgItemText(hDlg, PROMPT_TEXT, PromptText);
 return(TRUE);
 break;

 default:
 return FALSE;
 }
 }

/*
 * prompt - displays message, copies user response into buf
 *
 * return: NULL for cancel, or FAR * to buf
 */
char *prompt(HWND hWnd, HANDLE hInst, char *message, char *buf)
 {
 int rc;
 FARPROC lpDlgProc;

 lstrcpy(PromptText, message);

 lpDlgProc = MakeProcInstance(PromptDlgProc, hInst);
 rc = DialogBox(hInst, "PROMPT", hWnd, lpDlgProc);
 FreeProcInstance(lpDlgProc);

 if (rc == IDOK)
 lstrcpy(buf, PromptString);
 else
 buf = NULL;
 return buf;
 }

/*
 * errormsg - displays an error message and the last error
 *
 */
void errormsg(LPSTR cp)
{
 char buf[128];

 wsprintf(buf, "%s (%d)", (LPSTR)cp, WSAGetLastError());
 MessageBox(hWnd, buf, NULL, MB_ICONHAND);
}






[LISTING TWO]

#define FINGER 112
#define FINGER_SRV 113
#define CANCEL 114
#define IDM_ABOUT 115

#define QUIT 116

#define PROMPT_STRING 130
#define PROMPT_TEXT 131





[LISTING THREE]

#include "windows.h"
#include "dlg.h"
#include "sock.dlg"

SockMenu MENU
BEGIN
 POPUP "&Sockets"
 BEGIN
 MENUITEM "Finger &Client", FINGER
 MENUITEM "&Finger Server...", FINGER_SRV
 MENUITEM "&Cancel Operation", CANCEL
 MENUITEM "A&bout...", IDM_ABOUT
 MENUITEM "&Quit", QUIT
 END
END




February, 1993
INSIDE THE WINDOWS MESSAGING SYSTEM


Opening up the heart of Windows




Matt Pietrek


Matt, who works for a California programming-tools vendor, specializes in
debuggers and file-format programming. This article contains material that
will appear in greater detail in Matt's upcoming book, Windows Internals
(Addison-Wesley, 1993). He can be contacted through the DDJ offices.


The Windows messaging system is like a heart: It pumps the life-giving message
stream on which all Windows apps depend. Windows messages signal when the
mouse moves, a menu item is selected, and a window is created. Dialogs, menus,
and other controls rely on messages to communicate with each other; messages
also serve as a form of interprocess communication. Even the KERNEL module,
which is supposed to lie below the level of the messaging system (implemented
in USER.EXE), uses messages to indicate changes in the global heap. Truly
understanding Windows means becoming familiar with the inner workings of its
messaging system.
This article provides a detailed look at this complex, not fully documented
area of Windows 3.1 and presents pseudocode for key routines such as
GetMessage(), DispatchMessage(), PeekMessage(), and SendMessage(). I also
cover internal functions in Windows that even Undocumented Windows
(Addison-Wesley, 1992) does not discuss; these are presented using their real
names, which I obtained by examining the symbolic information in the debugging
versions of the Windows DLLs.


The Five Kinds of Messages


There are five ways that messages enter the message stream. I used
GetQueueStatus(), newly documented and improved in Windows 3.1, to look at
return values (QS_*), which are defined in WINDOWS.H. The five categories are:
Input messages (values of QS_KEY, QS_MOUSEMOVE, and QS_MOUSEBUTTON). Although
GetQueueStatus() assigns different QS values, you can consider them all to be
input messages generated by hardware devices, which get stored in the shared
system message queue.
Posted messages (QS_POSTMESSAGE). These messages are placed in the application
message queue via PostMessage() or PostAppMessage(). There's one application
message queue per program.
Paint messages (QS_PAINT). Like QS_TIMER messages, paint messages don't wait
in a queue, but are generated as needed when an application requests a
message. The Windows window manager is responsible for knowing if a particular
window needs updating. When a window region is invalidated, the messaging
system is informed that a repaint is necessary (the QS_PAINT flag is set).
Then, when an application asks for a message, a WM_PAINT message is composed.
Timer messages (QS_TIMER). These are similar to QS_PAINT; both are generated
on-the-fly when an application calls GetMessage() or PeekMessage(), instead of
being stored in message queues, and thus they do not fill up the queues.
Sent messages (QS_SENDMESSAGE). SendMessage() sends a message to any window
and guarantees that the receiving window will reply before anything else
occurs. Sending messages between two windows of the same application is not
hard; sending messages between two different tasks is more difficult. Because
each window procedure must operate in its normal task context, the Windows
scheduler must come into play. Accomplishing this correctly involves
synchronization between the two tasks.
These distinctions are not based on the message number (such as 0x000F), but
on how the message came into existence. For instance, the WM_PAINT message is
normally synthesized when your application calls GetMessage(). Your program
doesn't have to care how the WM_PAINT message was created. On the other hand,
it's perfectly legal for an application to use SendMessage() to send a
WM_PAINT message to another window. This message will be seen in the queue as
QS_SENDMESSAGE rather than as a QS_PAINT message. Likewise, you can do a
PostMessage() of a WM_PAINT message, which results in a QS_POSTMESSAGE-type
message. The message numbers aren't important for this discussion; it is
important that there are multiple ways to introduce messages into the system.
(Incidentally, you wouldn't want to send or post WM_PAINT messages; I'm only
using this example because this message can be generated three different
ways.)


The Application Message Queue


Every window in the system is associated with a particular application message
queue. In reviewing the fields in a WND data structure (described in
Undocumented Windows), note the one that contains a message-queue handle. When
a message is posted, this field determines to which queue the message will be
added. Even the desktop window has a message queue associated with it.
But the application queue is much more than a holding area for posted
messages. Because it contains most of the data used by the Windows messaging
system, think of the queue as a sort of command center linking a window handle
to a particular task, and serving as the keeper of the status bits vital to
GetMessage()/PeekMessage(). The application message queue is closely tied to
the application's task database (TDB). Message-queue fields contain the
selector of the associated TDB, and vice versa.
At startup, a program's message queue is created by the InitApp() routine.
Memory for the application message queue comes from the global heap. You can
obtain a handle to the current message queue via the undocumented
GetTaskQueue() (USER.35), whose prototype is HANDLE FAR PASCAL
GetTaskQueue(HANDLE hTask). If you pass it an hTask value of 0, you'll get the
current task's queue.
Messages are placed in the application's queue via PostMessage(). Some
internal Windows functions will also call PostMessage() behind the
scenes--DefWindowProc() for instance.
The default size for an application message queue is eight messages, usually
enough to contain all the messages actually posted to an application.
Typically, more messages are sent directly to the window via SendMessage().
You can alter the size of the application queue with SetMessageQueue(). Call
this function before any windows are created, because the old message queue
gets deleted and a new one created, and this causes confusion if the original
message queue is already in use. An alternative to using SetMessageQueue() is
to modify the DefaultQueueSize setting in WIN.INI. This is an undocumented
key, so you may have to add it if it's not present.
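For reference, enlarging the queue via WIN.INI looks something like the
fragment below. The key itself is undocumented, so treat the section name as
an assumption on my part:

```ini
; WIN.INI -- enlarge the default application message queue
; DefaultQueueSize is undocumented; placing it under [windows] is an assumption
[windows]
DefaultQueueSize=32
```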
The Windows 3.1 application message-queue structure is in Listing One (page
100). The queue contains data for several purposes. One is to maintain a
circular queue of messages. This queue, similar in concept to the ROM-BIOS
keyboard buffer, contains read and write pointers which wrap back to the
beginning when past the end of the buffer and indicate where the next message
will be read from and written to.
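The idea of that circular buffer can be sketched in portable C. The struct and
field names below are invented for illustration (the real layout lives in
USER's global heap), with an eight-slot capacity matching the default
mentioned earlier:

```c
#define QUEUE_SLOTS 8          /* default application-queue capacity */

typedef struct {
    unsigned hwnd, message, wParam;
    long     lParam;
} MSGREC;

typedef struct {
    MSGREC slots[QUEUE_SLOTS];
    int    read, write, count; /* read/write indices wrap at QUEUE_SLOTS */
} MSGQUEUE;

/* Post a message; fails (returns 0) when the queue is full. */
int write_message(MSGQUEUE *q, const MSGREC *m)
{
    if (q->count == QUEUE_SLOTS)
        return 0;
    q->slots[q->write] = *m;
    q->write = (q->write + 1) % QUEUE_SLOTS;  /* wrap past the end */
    q->count++;
    return 1;
}

/* Retrieve the oldest message; returns 0 when the queue is empty. */
int read_message(MSGQUEUE *q, MSGREC *out)
{
    if (q->count == 0)
        return 0;
    *out = q->slots[q->read];
    q->read = (q->read + 1) % QUEUE_SLOTS;    /* same wrap on the read side */
    q->count--;
    return 1;
}
```

When the queue fills, further posts fail, which is exactly why a busy
application may need a larger queue than the default.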
The application message queue also supports SendMessage() between tasks by
storing the parameters, return values, and current state of the transaction.
The section used is not the one for posted messages because sent messages are
guaranteed to be processed immediately, ahead of other waiting messages.
To illustrate how to access the contents of the message queue, I wrote a
program, Queue.C, which is available electronically; see "Availability," page
5.


The System Message Queue


The system message queue is a kind of half-brother to the application message
queue. The system message queue's job is to hold all hardware-input messages.
This includes mouse, keyboard, and other input-device events.
In general, hardware events occur at a good clip. Moving your mouse across the
screen causes dozens of WM_MOUSEMOVE messages. In order not to lose any of
these messages, the system queue's capacity is larger than that of the
application queue, containing by default 120 messages. (You can change this by
modifying or adding the Type-Ahead entry in WIN.INI.)
The system queue is also allocated and initialized by USER.EXE. There's only
one system queue for Windows. The format of the system queue is the same as
the application queue, except for stored messages. But the only fields of the
system queue actually used are those that implement the circular message
buffer.
There's no API to obtain the handle of the system queue, but you can get its
handle via a sneaky hack. The first WORD in the segment 0x2C of USER contains
the system queue's handle. (In Windows 3.0, it's the WORD at offset 2 of
segment 0x2B.) The GlobalEntryModule() function in ToolHelp provides a way to
obtain a segment's selector handle, given its ordinal number in the module.
Messages in the system queue are not destined for a particular window because
the processing of one system message can affect which window/task subsequent
messages go to. For instance, a WM_LBUTTONDOWN message can cause a change of
focus. Subsequent messages in the queue must then go to the new focus window
rather than the previous one.
On the other hand, the system queue can be locked by a task, ensuring that no
other task reads system queue messages until the locking task is done. For
example, a double-click message is synthesized out of a series of button
up/down messages. One task shouldn't steal messages in the middle of the
process. The system queue is unlocked when no messages are left for a task, or
when another task's message is found.
How do events get into the system queue? In USER.EXE, EnableInput() calls the
mouse and keyboard drivers enable functions (ordinal entry #2). Their
parameters are the addresses of the exported USER functions mouse_event() and
keybd_event(), respectively; mouse_event() and keybd_event() are essentially
interrupt-level functions. When the mouse is moved or a key is struck, a
hardware interrupt is generated. The DOS-extender subsystem in Windows vectors
control to the appropriate interrupt-handler function in the mouse or keyboard
device driver (typically called MOUSE.DRV and KEYBOARD.DRV). The mouse and
keyboard drivers then call mouse_event() and keybd_event() via the function
pointers passed during the enablement process. Processing occurs inside
mouse_event() and keybd_event() to place appropriate values in registers
before calling SaveEvent().
SaveEvent() places the message in the system queue via a call to
WriteSysMsg(), then attempts to coalesce multiple WM_KEYDOWN messages that
result from autorepeating keys. Lastly, it calls WakeSomeone(), which
determines the best application candidate to receive the message. When an
application is found, flags are set in that app's message queue, and an event
is posted to its TDB. The application wakes up and receives the message.
Pseudocode for WakeSomeone() is in Listing Two, page 100.
In Listing Two, the test for hQCapture implements the Windows capture
mechanism. When your application calls SetCapture(), hQCapture is set to the
queue associated with the hwnd parameter to SetCapture(). If hQCapture is
nonnull inside WakeSomeone(), the hQCapture queue receives the QS_MOUSE event
instead of the queue which would ordinarily have received it. If Windows is in
a system modal state, the hQSysModal queue is highest in the pecking order,
ahead of the hQCapture queue.



WakeBits, WaitEvent, and the Scheduler


If no messages are waiting for processing inside GetMessage(), the system
allows other programs to retrieve pending messages. Before describing how this
happens, I'll define a few terms:
WakeBits. Bitfields located at offset 44 in the message queue that indicate
that a particular kind of message (QS_PAINT, QS_TIMER, and so on) is available
to the task. For instance, QS_PAINT means a paint message is waiting for the
application, but hasn't been retrieved. Only QS_POSTMESSAGE messages actually
exist in the application message queue; the other message types are
synthesized by the system.
WakeMask. This value, at offset 46 in the message queue, is a mask of the
QS_xxx message types that the application is actively waiting for. Typically,
GetMessage() is called with wMsgFilterMin and wMsgFilterMax set to 0. This
sets the WakeMask to include all the QS_xxx message types. If you specify an
actual range of messages in the GetMessage() call, then an appropriate set of
QS_xxx bits will be generated inside of GetMessage().
ChangeBits. This field, at offset 42 in the message queue, contains QS_xxx
bits that have changed since the last call to GetQueueStatus(), GetMessage(),
or PeekMessage().
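The interplay of these three fields is plain bit masking. A portable sketch,
using invented flag values rather than the real QS_xxx constants:

```c
/* Illustrative flag values -- not the actual QS_xxx constants */
#define QS_KEY         0x01
#define QS_MOUSEMOVE   0x02
#define QS_POSTMESSAGE 0x08
#define QS_PAINT       0x20
#define QS_SENDMESSAGE 0x40

typedef struct {
    unsigned wake_bits;   /* message kinds currently available */
    unsigned wake_mask;   /* kinds the task is actively waiting for */
    unsigned change_bits; /* kinds that changed since the last query */
} QUEUE_STATE;

/* Roughly the effect of setting a wake bit: record that a message kind
 * is available and report whether the sleeping task should be awakened. */
int set_wake_bit(QUEUE_STATE *q, unsigned bit)
{
    q->wake_bits   |= bit;
    q->change_bits |= bit;
    return (q->wake_bits & q->wake_mask) != 0; /* wake the task? */
}
```

A task waiting only for posted and paint messages sleeps through keyboard
events, but wakes as soon as a paint becomes available.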
Now look at Listing Three, page 100. GetMessage() calls SleepHq() to wait for
a message, but still yields to other tasks if they have messages. The
messaging system checks for sent messages in many places because these
messages must be processed immediately. SleepHq() really wants to wait for a
QS_POSTMESSAGE, or a QS_PAINT, or whatever; but if it sees a pending
QS_SENDMESSAGE flag, it calls ReceiveMessage() to deal with it immediately,
and then goes back to its normal business.
Because SendMessage() processing is dealt with inside SleepHq(), your
application does not have to do anything special to receive sent messages--it
comes free when you call GetMessage(). Your application cannot receive sent
messages at any arbitrary time, only inside of GetMessage()/PeekMessage(),
when you call SendMessage(), or when calling a function that uses
SendMessage() (such as a dialog-box function). So if your program is crunching
a long series of numbers, there's no worry that a sent message will
unexpectedly arrive and disrupt processing.
The event-count field, located at offset 6 of the TDB, is like a flag on a
mailbox. If it's up (contains a nonzero value), then there's a reason to
switch to the task because something is waiting for it, as signified by the
WakeBits in the message queue (see Listing Four, page 100). The scheduler
doesn't know why the task should be awakened, just that it's necessary.
WaitEvent() thus waits for the mailbox flag to pop up. SleepHq() is
responsible for checking the mailbox, and either waiting some more for a
desired QS_xxx letter, or returning when it finds what it wants. If it sees a
QS_SENDMESSAGE in the mailbox, SleepHq() takes it out, deals with it promptly,
and goes back to waiting for the desired QS_xxx letter. (For more information
on the event-count field, see my article, "Inside the Windows Scheduler," DDJ,
August 1992.)
Where do the QS_xxx bits come from? SetWakeBit2() is responsible for setting
the WakeBits in the application's message queue, as well as ensuring that the
program will be scheduled so that it can respond to the message. Pseudocode
for SetWakeBit2() is in Listing Four. SetWakeBit2() is heavily used, and
called by these USER routines:
WakeSomeone() sets the QS_MOUSE or QS_KEY bits; it's called by the
hardware-event handlers when a message has been added to the system queue.
IncPaintCount() sets the QS_PAINT bit; it's called when a window region is
invalidated.
SendMessage() sets the QS_SENDMESSAGE bits in the queue of the receiving task
during an intertask SendMessage() so that the task will wake up and process
the message.
ReceiveMessage() sets a bit not included in the previously defined QS_xxx bits
when the receiving task is done processing the message during an intertask
SendMessage() and needs to wake up the sending task to receive the result.
ScanTimers() sets the QS_TIMER bit if sufficient time has elapsed; it's called
by the timer interrupt service routine.
WriteMessage() sets the QS_POSTMESSAGE bit. PostMessage() and PostAppMessage()
call PostMessage2(), which uses WriteMessage() to put the message in the
application's queue.


Bringing it All Together


GetMessage() and PeekMessage() are really front ends for a call to
GetMessage2(), which does most of the actual work. The pseudocode for the
GetMessage()/PeekMessage() front ends and for the workhorse GetMessage2() is
in Listing Five, page 100. Listing Six (page 102) presents pseudocode for
CheckForNewInput().
Here's how each of the five types of messages are dealt with in
GetMessage()/PeekMessage():
QS_SENDMESSAGE. CheckForNewInput() is called several times in GetMessage2().
Its priority is checking for sent messages. If GetMessage2() ends up sleeping,
via SleepHq(), sent messages are checked for in SleepHq() code.
QS_POSTMESSAGE. ReadMessage() extracts the message from the application
message queue. The message fields are copied into the addresses specified in
the GetMessage()/PeekMessage() call.
QS_MOUSE and QS_KEY. ScanSysQueue() extracts the message from the system
message queue. The message fields are copied into the addresses specified in
the GetMessage()/PeekMessage() call.
QS_PAINT. DoPaint() synthesizes a WM_PAINT message on demand; paint messages
are never stored in a queue. The message fields are copied into the addresses
specified in the GetMessage()/PeekMessage() call.
QS_TIMER. DoTimer() writes the timer message into the application queue.
GetMessage2() then starts at the beginning, and finds the timer message as if
it were a normal PostMessage().
A couple of conclusions can be drawn from the code. First,
GetMessage()/PeekMessage() will not yield to other applications if messages
are waiting. Second, there's a definite pecking order of message priorities.
Messages sent via SendMessage() always have top priority. This is necessary
because the task that did the SendMessage() is cooling its heels, waiting for
the reply. Next in priority are messages posted via PostMessage(). Messages
from the input system (mouse and keyboard) come after that, and then WM_PAINT
messages. WM_PAINT messages are handled after other messages because
processing of other messages might generate additional paint operations.
Processed at the very end, just before GetMessage2() gives up, goes to sleep,
and yields to other tasks, are WM_TIMER messages.
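That pecking order amounts to a fixed-priority scan over the wake bits; a
minimal sketch in portable C, with stand-in flag values rather than the real
QS_xxx constants:

```c
/* Stand-in flag values for demonstration only */
#define QS_SENDMESSAGE 0x40
#define QS_POSTMESSAGE 0x08
#define QS_INPUT       0x03  /* mouse + key combined */
#define QS_PAINT       0x20
#define QS_TIMER       0x10

/* Return the highest-priority pending message class, mirroring the order
 * described above: sent, posted, input, paint, then timer. */
unsigned next_message_class(unsigned wake_bits)
{
    static const unsigned order[] = {
        QS_SENDMESSAGE, QS_POSTMESSAGE, QS_INPUT, QS_PAINT, QS_TIMER
    };
    for (int i = 0; i < 5; i++)
        if (wake_bits & order[i])
            return order[i];
    return 0; /* nothing pending: sleep and yield to other tasks */
}
```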


How DispatchMessage Works


Once your application has retrieved a message, you're expected to deal with
it--typically, by dispatching it to the appropriate window. Rather than
requiring you to determine the address of the window procedure and call it
directly, Windows provides DispatchMessage(); see Listing Seven, page 102.
DispatchMessage() is straightforward, except for a few things. At the start of
the code, there's special handling for WM_TIMER and WM_SYSTIMER messages. If
the lParam field of the message is nonzero, a user-supplied callback is called
instead of the standard window procedure. The SDK documentation for SetTimer()
describes how to use timers.
Also, DispatchMessage() handles "bad" programs that don't call BeginPaint() in
their WM_PAINT handler. Apparently, Microsoft feels that it's enough of a
problem that DispatchMessage() always checks if BeginPaint() was called by the
app's message handler. If the program didn't call BeginPaint(), Dispatch
Message() goes ahead and does some default painting to correct the situation
(and whine at you with a debug message if you're running the debug version of
Windows).
Lastly, you might notice that, before your program's window procedure is
called, DS is set to the hInstance of the application. This compensates for
applications that fail to export their callback functions. Under Windows 3.0,
this may result in a GP fault (due to an invalid DS) when your window
procedure gets called. With Windows 3.1, some people claim you no longer have
to export functions or call MakeProcInstance(). This may or may not be sound
advice, but Microsoft seems to feel that setting DS is a worthwhile activity
for DispatchMessage().


Anatomy of a SendMessage Call


SendMessage() is one of the most frequently used Windows functions, yet
perhaps the least understood. Many programmers mistakenly assume that
SendMessage() just calls the appropriate window procedure. They forget that
SendMessage() needs to operate in two different task contexts when one
application sends a message to another.
This situation can become rather complex. The receiver of a "sent message"
might need to send a message to another task before it can respond to the
original message, resulting in nested calls to SendMessage(). The processing
of an intertask SendMessage() is shown in Listing Eight (page 102). Listing
Nine presents pseudocode for ReceiveMessage(), and Listing Ten (page 103) is
ReplyMessage().
As you can see from the pseudocode, handling the case where an application
sends a message to itself is straightforward. The parameters are pushed on the
stack, and the window procedure is called. The bulk of the code in
SendMessage() is for handling situations in which the receiving window is in a
different task. Within the intertask SendMessage() code and in
ReceiveMessage() and ReplyMessage(), a large amount of code has to do with
handling nested SendMessage() calls. As these calls pile up on top of each
other, the system builds a linked list which specifies the message queues
waiting for SendMessage() to return. The most recent queue is at the head of
the list. As each message is replied to, the head of the list is removed, and
the list shrinks.
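That nested-call bookkeeping can be modeled as a push/pop on a singly linked
list. The node layout below is invented for illustration, not the actual USER
structure:

```c
#include <stdlib.h>

/* One queue waiting for an intertask SendMessage() to be replied to.
 * Hypothetical layout; the real record lives in USER's heap. */
typedef struct SendNode {
    unsigned         sender_queue;  /* handle of the waiting queue */
    struct SendNode *next;
} SendNode;

/* A new intertask SendMessage() pushes the sender at the head... */
SendNode *push_sender(SendNode *head, unsigned queue)
{
    SendNode *n = malloc(sizeof *n);
    n->sender_queue = queue;
    n->next = head;
    return n;
}

/* ...and each reply pops the most recent sender off the head. */
SendNode *pop_sender(SendNode *head, unsigned *queue_out)
{
    SendNode *next;
    if (!head) return NULL;
    *queue_out = head->sender_queue;
    next = head->next;
    free(head);
    return next;
}
```

The LIFO discipline is the point: the innermost (most recently nested)
SendMessage() is always the first one replied to.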
Although not normally done, your application program can call ReplyMessage()
(within a WH_CALLWNDPROC hook, for example) to prevent the window which
ordinarily would get the message from actually receiving it. It's also useful
to call ReplyMessage() when handling a message sent to you via SendMessage().
The sending program cannot execute until your program finishes processing the
message. When handling the message, if your program calls a Windows function
that yields control, such as MessageBox(), a potential deadlock situation can
arise. A call to ReplyMessage() before this will avoid the deadlock.
--M.P.



Why's it So Hard to Write a GUI Debugger?


The fatal flaw in the Windows input system is that it is "single threaded." If
your application fails to call GetMessage() or PeekMessage() in a timely
manner, the system locks up. You can still move the mouse, and background
processing in Enhanced-mode DOS boxes continues, but none of the Windows
applications can respond to mouse or keyboard input because they aren't given
a chance to run.
Say your database program gets a WM_COMMAND message, which it interprets to
mean, "Go sort this database of 300,000 records," and dutifully conducts this
45-minute operation; during that time, all apps are locked out until your next
call to GetMessage(). The polite thing is for your program to call
PeekMessage() occasionally, thus yielding to other applications.
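That polite-yielding pattern can be sketched in portable C; here yield_to_other_apps() is our own stand-in for a PeekMessage()/DispatchMessage() loop, and the per-record work is elided:

```c
#include <assert.h>

int yields = 0;                    /* counts how often we gave up control */

/* Stand-in for a PeekMessage() loop that lets other apps run. */
void yield_to_other_apps(void)
{
    yields++;
}

/* Process 'total' records, yielding after every 'chunk' of them, the way
 * a 45-minute sort should periodically call PeekMessage() under Windows. */
int process_records(int total, int chunk)
{
    int done;
    for (done = 0; done < total; done++) {
        /* ... sort one record here ... */
        if ((done + 1) % chunk == 0)
            yield_to_other_apps();
    }
    return done;
}
```

Choosing the chunk size is a trade-off: yield too often and the sort slows down; too rarely and the rest of the system feels frozen.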

A quirk in the messaging system rears its head when you try to write a
Windows-hosted debugger (also called a "GUI debugger"). A GUI debugger is a
debugger for Windows programs that itself uses the Windows display mechanisms.
What's the problem with that? Well, imagine the following scenario: A GUI
debugger places a breakpoint inside of a Window procedure. Eventually, the
debuggee program hits the breakpoint, and stops--and cannot call GetMessage()
to yield control to other tasks! That means no other tasks--including the GUI
debugger--can get their messages. The debugger can't even respond to the mouse
clicks that tell the debuggee to run again.
You may ask: But there are GUI debuggers available, so how do they deal with
this?
Unfortunately, the answer is, "not extremely well." When the debuggee hits the
breakpoint (or stops for any reason), the GUI debugger must take over the
duties of calling GetMessage() and DispatchMessage() for the debuggee. The
debugger must prevent any code in the debuggee process from running. To do
this, the debugger needs to somehow intercept all messages that would normally
go to the debuggee, and deal with them instead.
One way to accomplish this is by subclassing all of the debuggee's windows.
The question then arises: How do you deal with all the messages originally
intended for the debuggee? The debugger surely doesn't know how to paint the
debuggee's windows in response to a WM_PAINT message. Situations where message
ordering is critical, such as DDE transactions, are even harder to deal with.
Unfortunately, there's no perfect solution. GUI debugger designers deal with
this as best they can. This explains why both Borland's Turbo Debugger for
Windows and Microsoft's CodeView for Windows are text-mode debuggers. In
Win32, the input mechanism has been redesigned (although by the same person
who designed the Windows and OS/2 PM input systems). A major goal was to
eliminate the input-system problem described above. Consequently, Win32 uses a
separate input queue for each task. A thread in the Win32 subsystem
continually assigns messages to the appropriate application's queue as input
events occur. This lets programs deal with messages in their own sweet time,
without adversely affecting the responsiveness of the system as a whole.
Unfortunately, this improved functionality does not extend to Win32s
applications. Under Win32s, the Windows 3.1 USER.EXE module is still in charge
of the input system, thereby causing Win32s applications to be in the same
boat as regular Windows programs.
--M.P.


_INSIDE THE WINDOWS MESSAGING SYSTEM_
by Matt Pietrek


[LISTING ONE]

00h WORD Selector of next message queue, (implements linked list).
02h WORD hTask of task that owns this queue.
04h WORD Size of a message in this queue. (In Windows 3.1, this is 22).
06h WORD Number of messages waiting that have not been removed
 by a GetMessage() or PeekMessage(PM_REMOVE).
08h WORD Offset in the queue segment of next message to be retrieved.
0Ah WORD Offset in the queue segment where next message will be written.
0Ch WORD The length in bytes of the queue's segment.
0Eh DWORD DWORD value returned by GetMessageTime().
12h DWORD DWORD value returned by GetMessagePos().
16h WORD Unknown. Sometimes contains 1.
18h DWORD Information returned by GetMessageExtraInfo().
1Ch WORD Unknown.
1Eh DWORD Contains the LPARAM of a SendMessage() to another task.
22h WORD Contains the WPARAM of a SendMessage() to another task.
24h WORD Contains the MSG of a SendMessage() to another task.
26h WORD Contains the HWND of a SendMessage() to another task.
28h DWORD Contains the DWORD result from the SendMessage().
2Ch WORD PostQuitMessage() has been called by this program.
2Eh WORD PostQuitMessage() exit code.
30h WORD Flags of some sort.
32h DWORD Unknown.
36h WORD Expected Windows version, from NE file.
38h WORD Queue handle of application that is sending a message to this app.
3Ah WORD Used for an intertask SendMessage().
3Ch WORD Used for an intertask SendMessage().
3Eh WORD Number of "paints" needed by this application.
40h WORD Number of timer events waiting for this application.
42h WORD QS_xxx bits that have changed since the last call to
 GetMessage(), PeekMessage(), or GetQueueStatus().
44h WORD QS_xxx bits indicating the kind of messages that are waiting
 for the application.
46h WORD Contains the QS_xxx bits that an application is
 currently waiting for.
48h WORD Used for intertask SendMessages().

4Ah WORD Used for intertask SendMessages().
4Ch WORD Used for intertask SendMessages().
4Eh WORD Something having to do with hooks.
50h BYTE[1Eh] Unknown. Possibly having to do with hooks.
6Eh WORD Start of the posted message storage area. The
 memory from here, to the end of the segment, can
 be thought of as an array of messages, each message
 being 22 bytes in length.








[LISTING TWO]

// Global variables: hQCursor - The queue "associated" with the cursor
// hQActive - The queue of the "active" window that has focus
// hQCapture - The queue associated with the capture window
// hQSysModal - The queue associated with the system modal window
// Local variables: best_queue - contains the current "best guess" as to which
// queue should be woken up to receive the message
// wakebit - contains the QS_xxx message type (QS_MOUSEMOVE,
// QS_MOUSEBUTTON, or QS_KEY) that will be placed in the WakeBits
// of whatever queue is selected to receive the message.
 best_queue = hQCursor
 if ( message is not a key message )
 goto mouse_event
 wakebit = QS_KEY
 if ( hQActive != NULL )
 best_queue = hQActive
 goto system_modal_check
mouse_event:
 if ( message == WM_MOUSEMOVE )
 wakebit = QS_MOUSEMOVE
 else
 wakebit = QS_MOUSEBUTTON
 if ( hQCapture != NULL )
 best_queue = hQCapture
system_modal_check:
 if ( hQSysModal != NULL )
 best_queue = hQSysModal
 if ( best_queue != 0 )
 goto wake_em_up
 iterate through queue linked list
 {
 if ( queues WakeMask includes wakebit determined
 previously )
 {
 best_queue = current queue under examination

 goto wake_em_up
 }
 if ( at end of queues linked list )
 return
 }
wake_em_up:
 SetWakeBit2(); // Sets WakeBits, and posts event
 return
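Rendered as portable C, the selection logic above might look like the sketch below. The QS_xxx and WM_xxx values match WINDOWS.H, but the function signature and plain-unsigned queue handles are our own simplification, and the fallback scan through the queue list (for best_queue == 0) is omitted:

```c
#include <assert.h>

#define QS_KEY         0x0001   /* values as in WINDOWS.H */
#define QS_MOUSEMOVE   0x0002
#define QS_MOUSEBUTTON 0x0004

#define WM_MOUSEMOVE   0x0200
#define WM_KEYDOWN     0x0100

/* Simplified globals mirroring hQCursor, hQActive, hQCapture, hQSysModal. */
unsigned hQCursor, hQActive, hQCapture, hQSysModal;

/* Pick the queue to wake for 'message'; store the QS_xxx bit in *wakebit. */
unsigned route_input(unsigned message, int is_key, unsigned *wakebit)
{
    unsigned best_queue = hQCursor;

    if (is_key) {
        *wakebit = QS_KEY;
        if (hQActive)
            best_queue = hQActive;       /* keys go to the active window   */
    } else {
        *wakebit = (message == WM_MOUSEMOVE) ? QS_MOUSEMOVE : QS_MOUSEBUTTON;
        if (hQCapture)
            best_queue = hQCapture;      /* mouse capture overrides cursor */
    }
    if (hQSysModal)
        best_queue = hQSysModal;         /* a system-modal window wins     */
    return best_queue;
}
```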






[LISTING THREE]

// WakeMask contains QS_xxx flags OR'ed together. SleepHq() will not return
// until at least one of the QS_xxx bits in the WakeMask parameter has been
// set in the ChangeBits.

void SleepHq( unsigned WakeMask )
{
 HANDLE currQ
SleepHq_check_flags:
 currQ = Get_current_task_queue
 // If already have a message then go get it
 if ( WakeMask & currQ.ChangeBits )
 goto SleepHq_done
 // Check for SendMessages and deal with them
 if ( currQ.WakeBits & QS_SENDMESSAGE )
 goto SleepHq_have_SendMessage
 // Always check for SendMessages
 currQ.WakeMask = WakeMask | QS_SENDMESSAGE
 if ( WakeMask & currQ.ChangeBits )
 goto SleepHq_done
 WaitEvent() // Kernel routine that waits for an event
 goto SleepHq_check_flags
SleepHq_done:
 zero_out_currQ.WakeMask
 return
SleepHq_have_SendMessage:
 zero_out_currQ.WakeMask
 // Deal with the SendMessage(). Described in the section on SendMessage()
 ReceiveMessage()
 goto SleepHq_check_flags

}






[LISTING FOUR]

void SetWakeBit2(HANDLE hQueue, UINT WakeBit)
{
 hQueue.ChangeBits |= WakeBit // Turn on the QS_xxx flags
 hQueue.WakeBits |= WakeBit
 // If we're setting a QS_xxx bit that the queue is waiting
 // for, then force the scheduler to schedule the task
 if ( WakeBit & hQueue.WakeMask )
 {
 hQueue.WakeMask = 0
 PostEvent() to hQueue's task
 }
}
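The interplay of the ChangeBits, WakeBits, and WakeMask fields in Listings Three and Four can be simulated in a few lines of portable C (our own names, with a flag standing in for the kernel's PostEvent(); this is not the real USER code):

```c
#include <assert.h>

#define QS_KEY         0x0001   /* values as in WINDOWS.H */
#define QS_MOUSEBUTTON 0x0004

/* Simplified queue carrying the three bitmask fields from Listing One. */
typedef struct {
    unsigned ChangeBits;  /* QS_xxx bits changed since last Get/PeekMessage */
    unsigned WakeBits;    /* QS_xxx bits for messages currently waiting     */
    unsigned WakeMask;    /* QS_xxx bits the task is sleeping on            */
    int      awakened;    /* stand-in for PostEvent() to the queue's task   */
} Queue;

/* Rendition of Listing Four: record the new input type, and wake the
 * task only if it was waiting for that kind of input. */
void SetWakeBit2(Queue *q, unsigned WakeBit)
{
    q->ChangeBits |= WakeBit;
    q->WakeBits   |= WakeBit;
    if (WakeBit & q->WakeMask) {
        q->WakeMask = 0;
        q->awakened = 1;
    }
}
```

The masking step is what lets a task sleep through input it doesn't care about while still being awakened promptly for the input it asked for.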







[LISTING FIVE]

// "flags" are the "flags" parameter to PeekMessage(). "removeFlag" is a local

// indicating whether a message will be read from the queue. "WakeMask" is a
// local containing a QX_xxx mask of messages types GetMessage()/PeekMessage()
// are waiting for. "WakeBits" is a local containing the the QS_xxx bits that
// indicate which types of messages are waiting for this task.

PeekMessage:
 Is_GetMessage_call = 0
 goto GetMessage2
GetMessage:
 Is_GetMessage_call = 1
 Insert a flags WORD in the stack frame so that the stack
 frame for GetMessage() is the same as for PeekMessage().
 The flag is set to PM_REMOVE.
GetMessage2: // This is where GetMessage() and PeekMessage()
 // start sharing their code
 if ( current task is locked )
 set PM_NOYIELD in flags
 removeFlag = flags & PM_REMOVE
 Unlock the system queue if this task holds it.
 if ( (msgMin != 0) or (msgMax != 0) )
 Call function to set up WakeMask for the specified
 message range
 else
 WakeMask = QS_MOUSE | QS_KEY | QS_POSTMESSAGE |
 QS_TIMER | QS_PAINT

begin_looking_for_msgs:
 if ( !CheckForNewInput() )
 goto wait_for_input
 if ( system queue not locked )
 goto not_in_system_queue
 if ( system queue not locked by current queue )
 goto not_in_system_queue
 if ( (QS_MOUSE or QS_KEY) set in WakeBits and WakeMask )
 {
 if ( ScanSysQueue() )
 goto GetMessage_have_msg
 }
not_in_system_queue:
 if ( QS_POSTMESSAGE set in WakeBits and WakeMask )
 if ( ReadMessage() )
 goto GetMessage_have_msg
 if ( (QS_MOUSE or QS_KEY) set in WakeBits and WakeMask )
 if ( ScanSysQueue() )
 goto GetMessage_have_msg
 if ( !CheckForNewInput() )
 goto wait_for_input
 if ( QS_PAINT set in WakeBits and WakeMask )
 if ( DoPaint() )
 goto GetMessage_have_msg
 if ( PM_NOYIELD set in flags )
 goto check_for_timer_msg
 UserYield()
 if ( !CheckForNewInput() )
 goto wait_for_input
check_for_timer_msg:
 if ( QS_TIMER set in WakeBits and WakeMask )
 if ( DoTimer() )
 goto begin_looking_for_msgs

wait_for_input:
 if ( FSHRINKGDI )
 ShrinkGDIheap() ; Where is this defined???
 // If not in GetMessage, we must be in PeekMessage
 if ( Is_GetMessage_call == 0 )
 goto PeekMessage_exit
 SleepHq(wakemask)
 goto begin_looking_for_msgs

GetMessage_have_msg:
 if ( a WH_GETMESSAGE hook is installed )
 call the hook function
 // If not in GetMessage, we must be in PeekMessage
 if ( Is_GetMessage_call == 0 )
 return 1
 if ( returning msg == WM_QUIT )
 return 0
 else
 return 1
PeekMessage_exit:
 if ( ! PM_NOYIELD )
 UserYield() // Yield to any higher priority app
 return 0






[LISTING SIX]

// Returns Zero Flag set if no desired input flag is set. WakeMask & WakeBits
// are in registers, and are same as WakeMask and WakeBits in GetMessage2().
top:
 Get handle of current queue
 if ( QS_SENDMESSAGE set in the queues wakebits )
 {
 ReceiveMessage()
 goto top
 }
 // AND instruction sets the Zero flag if any bits match
 AND WakeMask, WakeBits together
 Return






[LISTING SEVEN]

 LPMSG lpMsg // ptr to passed-in message, used as scratch variable.
 if ( (msg.msg != WM_TIMER) && (msg.msg != WM_SYSTIMER) )
 goto handle_normally
 if ( msg.lParam == 0 )
 goto handle_normally
 GetTickCount()
 push msg parameters on stack
 lpMsg = msg.lParam // Timer function callback address

 AX = SS // Something with MakeProcInstance thunk???
 goto call_function
handle_normally:
 if ( msg.hwnd == 0 )
 return;
 push msg parameters on stack
 if ( msg.msg == WM_PAINT )
 set "paint" flag in WND structure
 lpMsg = Window proc address // stored in WND data structure;
 // pointed to by msg.hwnd
 AX = hInstance from WND structure // For use by MakeProcInstance() thunks
call_function:
 ES = DS = SS // Set all segment registers to hInstance of application
 call [lpMsg] // Call the window procedure (or timer callback fn).
 // lpMsg is now used to store the address of window
 // function (or timer callback function) to be called
 if ( msg.msg != WM_PAINT )
 goto DispatchMessage_done
 // Check for destroyed window
 if ( ! IsWindow(msg.hwnd) )
 goto DispatchMessage_done
 if ( "paint" flag in wnd structure still set )
 goto No_BeginPaint
DispatchMessage_done:
 return
No_BeginPaint:
 Display debugging message "Missing BeginPaint..."
 Call DoSyncPaint() to handle the painting correctly
 goto DispatchMessage_done






[LISTING EIGHT]

 if ( receiving HWnd == -1 )
 goto BroadcastMessage // Not included here
 Verify sending app has a message queue
 Get receiving apps queue from receiving hWnd
 // Are the sending and receiving queues different?
 Intertask = ( receivingHQueue != sendingHQueue )
 Call any installed WH_CALLWNDPROC hooks
 if ( Intertask )
 goto InterTaskSend
 // Next section deals with calling a window procedure within same program
 // This is the simple case and is much easier than calling between two
 // different programs (below)
 Push address of the wndproc of the receiving WND structure on stack
 Push SendMessage params on stack
 Put hInstance into AX
 Load DS & ES from the SS register
 Call through the wndproc address in the window structure
SendMessage_done:
 Return to caller
SendMessage_error: // Common JMP location when errors occur
 Put 0 in DX:AX
 Goto SendMessage_done

 // SendMessage()'s that go between different tasks come here.
 // This is where the code gets complex.
InterTaskSend:
 if ( A task is locked )
 {
 display a diagnostic in debugging version
 Goto SendMessage_error
 }
 if ( sending task is terminating )
 {
 display a diagnostic in debugging version
 Goto SendMessage_error
 }
 if (SendMessage parameter area in sending app is already used)
 {
 display a diagnostic in debugging version
 Sleep until the parameter area is free // Uses SleepHq()
 }
 Grab parameter area in sending app
 Save the address where the result of the call will be stored
 Copy the SendMessage parameters off the stack into the sending hQueue
 Put the receiving queue at the head of the SendMessage() list
 // Set bits to wake up the receiving task

 SetWakeBit2( QS_SENDMESSAGE )
SendMessage_wakeup_receiving_task:
 if ( a previous SendMessage() has completed )
 goto got_reply
 Turn off "have result" flags in sending queue
 Call DirectedYield() to force the child task to run next
 // When the DirectedYield() returns, the receiving task should have awoken
 // and called ReceiveMessage() and ReplyMessage(). Described below.
 Sleep until result is back from child
 // Uses SleepHq(). Probably redundant, because there already should be a
 // result available when the prior DirectedYield() returned.
got_reply:
 Copy the return value to the "result" area on the stack
 Release parameter area in sending queue
 if ( Not replied to )
 goto SendMessage_wakeup_receiving_task
 goto SendMessage_done






[LISTING NINE]

 Make sure there is a SendMessage waiting for us.
 Remove sending queue from SendMessage() list of queues.
 Clear the QS_SENDMESSAGE bit if the list of queues is empty.
 Save copies of the sending hQueue and pointer to area
 where results should be saved in the sending task.
 Free the SMPARAMS area in the sending queue.
 Make sure target window is still valid.
 Copy the ExtraInfo data from sender to receiver.
 Call the target window proc.
 Call ReplyMessage.

 Return.





[LISTING TEN]

 // ReplyMessage() takes the value that should be returned to
 // the sender as a parameter. Here, it's called "return_value"
ReplyMessage_start:
 If ( message has already been replied to, or
 if there is no sending queue )
 return
 if ( QS_SENDMESSAGE bit set in receiving queue)
 {
 ReceiveMessage()
 Goto ReplyMessage_start

 }
 if ( result area in use )
 {
 OldYield()
 Goto ReplyMessage_start
 }
 Copy return_value into sending hQueue
 Restore pointer to result area on stack in the sending hQueue
 Set AlreadyRepliedFlag
 SetWakeBit2( QS_SMRESULT )
 DirectedYield(SendingTask)
 Return










//=================================
// LISTING 1: QUEUE.C
//
// QUEUE, by Matt Pietrek, 1992
//
//=================================

#include <windows.h>
#include <dos.h>
#include "winio.h"

// If your IMPORT.LIB or LIBW.LIB doesn't include
// GetTaskQueue(), you'll have to add it to the IMPORTS section
// of the .DEF file. The ordinal number is KERNEL.35

WORD FAR PASCAL GetTaskQueue(WORD hTask);

typedef struct

{
 DWORD extraInfo;
 HWND hwnd;
 WORD message;
 WORD wParam;
 DWORD lParam;
 DWORD time;
 POINT pt;
} QUEUEMSG;

typedef struct
{
 WORD NextQueue;
 WORD OwningTask;
 WORD MessageSize;
 WORD NumMessages;
 WORD ReadPtr;
 WORD WritePtr;
 WORD Size;
 LONG MessageTime;
 POINT MessagePoint;
 WORD Unknown1;
 DWORD ExtraInfo;
 WORD Unknown2;
 LONG SendMessageLParam;
 WORD SendMessageWParam;
 WORD SendMessageMessage;
 HWND SendMessageHWnd;
 DWORD SendMessageResult;
 WORD QuitFlag;
 int ExitCode;
 WORD flags;
 DWORD Unknown3;
 WORD ExpWinVersion;
 WORD SendingHQ;
 WORD sendmsg_helper1;
 WORD sendmsg_helper2;
 WORD PaintCount;
 WORD TimersCount;
 WORD ChangeBits;
 WORD WakeBits;
 WORD WakeMask;
 WORD SendMessageResult1;
 WORD SendMessageResult2;
 WORD SendMessageResult3;
 WORD Hook;
 BYTE Hooks2[30];
 BYTE MessageArrayStart;
} QUEUE;

//
// Dumps selected fields of a message queue
//

void DumpQueueContents(QUEUE far *queue)
{
 QUEUEMSG far *queuemsg;
 unsigned maxMessages, i;


 maxMessages =
 ( queue->Size - FP_OFF(&queue->MessageArrayStart))
 / sizeof(QUEUEMSG);

 queuemsg = (QUEUEMSG far *) &queue->MessageArrayStart;

 printf("Messages: %u ReadPtr: %04X WritePtr: %04X\n",
 queue->NumMessages, queue->ReadPtr, queue->WritePtr);

 printf("WakeBits: ");
 if ( queue->WakeBits & QS_KEY )
 printf("QS_KEY ");
 if ( queue->WakeBits & QS_MOUSE )
 printf("QS_MOUSE ");
 if ( queue->WakeBits & QS_POSTMESSAGE )
 printf("QS_POSTMESSAGE ");
 if ( queue->WakeBits & QS_TIMER )
 printf("QS_TIMER ");
 if ( queue->WakeBits & QS_PAINT )
 printf("QS_PAINT ");
 printf("\n");

 for ( i=0; i < maxMessages; i++ )
 {
 printf(
 "HWnd: %04X Msg: %04X WParam: %04X LParam: %08lX\n",
 queuemsg->hwnd, queuemsg->message,
 queuemsg->wParam, queuemsg->lParam );

 queuemsg++;
 }
 printf("\n");
}

//
// Gets a pointer to the application message queue, then puts
// some messages into the queue and retrieves them. We display
// the contents of the queue at each stage, so that we can see
// the principles involved.
//

void ExamineQueue(void)
{
 QUEUE far *queue;
 MSG msg;

 queue = MK_FP( GetTaskQueue(GetCurrentTask()), 0 );

 if ( !queue )
 {
 printf("Unable to find message queue\n");
 return;
 }

 printf("Here we have an empty queue:\n\n");
 DumpQueueContents(queue);

 printf(
 "We'll now call PostAppMessage() to put some messages in\n"

 "the queue. Note that the message count goes up, and that\n"
 "QS_POSTMESSAGE is now set:\n\n");

 PostAppMessage(GetCurrentTask(), 0x1234, 0x5678, 0x12345678L);
 PostAppMessage(GetCurrentTask(), 0x2345, 0x6789, 0x12345678L);
 PostAppMessage(GetCurrentTask(), 0x3456, 0x789A, 0x12345678L);
 PostAppMessage(GetCurrentTask(), 0x4567, 0x89AB, 0x12345678L);

 DumpQueueContents(queue);

 printf(
 "We'll now call GetMessage() to remove a message. The\n"
 "message still appears in the message array, but the Read\n"
 "pointer has been incremented. We also print out the\n"
 "contents of the retrieved message to show that it matches\n"
 "what was in the queue:\n\n");

 GetMessage(&msg, 0, 0, 0);
 DumpQueueContents(queue);

 printf(
 "The message retrieved into the MSG struct:\n"
 "HWnd: %04X Msg: %04X WParam: %04X LParam: %08lX\n\n",
 msg.hwnd, msg.message, msg.wParam, msg.lParam );

 printf(
 "We now call GetMessage 3 more times to get rid of the\n"
 "remaining messages. Note that the Read and Write ptrs are\n"
 "equal, the QS_POSTMESSAGE flag is no longer set, and the\n"
 "message count field shows 0. Thus, the queue is considered\n"
 "to be empty:\n\n");

 GetMessage(&msg, 0, 0, 0);
 GetMessage(&msg, 0, 0, 0);
 GetMessage(&msg, 0, 0, 0);
 DumpQueueContents(queue);
}

int main()
{
 // This program uses the message queue format for Windows
 // 3.1. Abort if running under any other version.

 if ( LOWORD(GetVersion()) != 0x0A03 )
 {
 winio_warn(FALSE, "QUEUE",
 "This program requires Windows 3.1");

 return 1;
 }

 // Turn off repaints. If we don't do this, the WINIO library
 // will attempt to use the queue while we're in the process of
 // examining it.

 winio_setbusy();
 winio_setpaint(winio_current(), FALSE);

 ExamineQueue();


 // Turn the repaints back on. This allows WINIO to refresh
 // the display with all the output that was created in
 // ExamineQueue().

 winio_setpaint(winio_current(), TRUE);
 winio_resetbusy();
 winio_home(winio_current());
 return 0;
}




















































February, 1993
NEURAL NETS FOR PREDICTING BEHAVIOR


Software that learns from its mistakes




James F. Farley and Peter D. Varhol


James is a project manager at Arm-tech Industries (Manchester, New Hampshire).
Peter is an assistant professor of computer science and mathematics at Rivier
College in Nashua, New Hampshire.


Neural networks will find use in a decision-making capacity only if the
decision-making process inherent in the network can be embedded in a larger
software or hardware system. The advantage over traditional embedded
approaches is that the network can be developed interactively using modern,
neural-net development tools, and the complex mathematical interactions
between network nodes can provide a better model of processes than more
traditional software approaches.
Two months of developing different neural-network models of a complex
nonlinear process demonstrated the effectiveness of this approach. We
investigated several approaches to developing an appropriate model of the
behavior of an electronic wind sensor.
We generated simulated data based on a behavioral model of a device described
in "Development of Subminiature Multi-Sensor Hot-Wire Probes" (see
"References"). The data we produced related sensory input with a given wind
speed and direction, so that the speed and angle of the wind could be known
with great accuracy. Based on these test data, we had to produce an
algorithm, suitable for an embedded microprocessor, that could predict wind
speed and direction from the sensory inputs.
We examined two conventional approaches to determining wind speed and
direction from the sensor readings. One was purely computational and used a
look-up table residing in memory. This was computationally fast, but the data
were too ambiguous to determine results in a straightforward manner. The
second approach was to fit the data to an appropriate curve, which yielded
some success. Wind speed was not difficult, but after extensive curve fitting
we could not get wind direction to any better than +-30 degrees, an
unacceptable error rate for any purpose.
Part of the problem was that some of the data were noisy and difficult to
interpret. Real-life sensors would have to be protected from flying debris
with a wire mesh or other covering, which would create currents and eddies
around the sensor interfaces. Furthermore, the sensors' readings would be less
accurate at some angles than at others, depending on the configuration of the
sensors themselves. (See the accompanying textbox, "Two Configurations for
Wind Sensors.") These limitations made for confusing and sometimes
contradictory data; in other words, a normal state of affairs in an actual
design project.


Investigating Neural-net Technology


Our approach to solving this problem was, to say the least, unconventional.
Neural networks are usually described as being useful as models for
classification, but at least some of the network structures can also be used
for prediction. We thought that it might be possible to construct and train an
appropriate network using our generated data, and to produce a model that
could take sensor input and produce the appropriate wind speed and direction
as output.
Using NeuralWare's NeuralWorks II Plus software, we built literally hundreds
of different neural networks, often working into the early hours of the
morning as the results of one run gave us ideas of new models or directions to
pursue. NeuralWorks' DOS-based graphical interface lets you start building
networks almost immediately, and the straightforward design and training
process makes it possible to quickly generate large numbers of increasingly
complex networks.
Early on, we settled on the backward-propagation model. This model is
considered to be a feed-forward model, in that the results of the processing
elements (PEs) of one level are not fed back to the PEs in the previous level.
Where the backward propagation comes in is with the resulting error. The net
determines the difference between the desired output and the actual output,
then propagates that error backward through the net so that the weights at
each processing element can be adjusted. In this way, the net "learns" from
its mistakes. This approach facilitates making predictions rather than
classifications.
Because of the nature of the problem and the curve-fitting techniques used to
try to solve it, we viewed a neural network as a mathematical model. We wanted
to input certain values and have nonlinear transformations occur on these
values to produce the appropriate outputs. In the world before neural
networks, we would have had to do it manually, deriving our own set of
differential equations, and testing various combinations of equations with the
available data. This could have occupied us full time for months or even
years.


Surprises Along the Way


Suspecting that determining wind speed might have a simple solution, we broke
this component out into a separate network. It had, in fact, a straightforward
nonlinear solution without a neural network. However, we let the network do
our work for us, and produced a solution with a single sensor input, four
processing elements in the hidden layer, and one output. This net delivered an
average error of 0.7 meters per second and a maximum error of 2.29 meters per
second.
In the process of designing a workable neural network for wind direction, we
had one surprise. Because we had generated data for two different sensor
designs, we were able to determine which design produced data that improved
the neural-network output. We were, in effect, using the neural net to make
hardware-design recommendations. This is possible only if systems designers
agree to incorporate a neural network into the final product, since the
hardware configuration is dependent upon the network to interpret its data.
However, this means that the neural-net approach to analyzing data can be used
as an integral part of the overall system-design process.
Our final solution for determining wind direction was unexpected. On the basis
of advice from manufacturing engineers who claimed that the design was in fact
realistic, we decided to combine the two data sets to emulate a configuration
that combined both types of sensor design. This produced by far the best
results to date, with an average error of 2.1 degrees and a maximum error of
7.17 degrees on one set of test data. This was our primary breakthrough.


A Filter for Noisy Data


We had yet another surprise in coming up with the best neural-network
configuration. The data for one sensor design was taken at angles of 5-degree
intervals, while the data for the second was generated at 7.5-degree
intervals. To quickly emulate a combined design, we simply combined the two
data sets, including data only for common angles. This resulted in data at
angles of 15-degree intervals, which, when used to train a network, began to
produce predictions of wind direction within reasonable error rates.
However, in addition to having more data for each angle, we also had fewer
angles, at a greater distance from one another, with which to train the net.
Was this at least partially responsible for our improved results? As it turns
out, the answer was yes. Going back to one of the original, uncombined data
sets, we filtered out excess data and trained a similar network at 15-degree
intervals, then tested it with data taken at 5-degree intervals. The results
were significantly better than those obtained by training the net with data at
5-degree intervals, although not as good as those found with the combined
data.
NeuralWare technical support expressed surprise at this result, indicating
that more data is almost always better than less. Together we speculated that
it might be due to the noisy nature of the data. We considered using a Kalman
filter on the data prior to running it through the neural net, but a Kalman
filter assumes white noise and an autoregressive-moving average (ARMA)
time-series model. Neither of these assumptions seemed to fit our problem.
Instead, we let the neural net itself act as a noise filter. In addition to
training the net at angle intervals that made it easier to learn the
differences between different wind speeds, our final network had three hidden
layers, the most permitted by NeuralWorks. Each hidden layer acted as a
filter, examining different combinations of inputs, determining which of them
were best contributing to an accurate solution, and passing only those
combinations along to the next layer.


The Final Network


The final network structure for determining wind direction is diagrammed in
Figure 1. Figure 2 shows the net for determining wind speed. Figure 1 shows 12
input-processing elements, which correspond to 12 sensor readings. At each
successive layer in the network, we reduced the number of processing elements
by three, and had one output-processing element. NeuralWare was once again
surprised to discover that more than one hidden layer made a difference, but
agreed that the noisy nature of the data was the likely culprit.
Code generation was simply a matter of selecting a menu item, naming a
function and source file, and letting the software do its work. The resulting
code was generic C, which we compiled without modification, using Turbo C++.
Listing One shows an example of code produced from a much simpler network, the
one used to determine wind speed alone.


Further Work is Needed



While the design described above produces results that largely meet the
specifications, more refinement is needed. Of some interest is fine tuning the
variables used to get incrementally better results. The learning momentum and
learning coefficients seem to make a difference, probably because lowering
these values dampens some of the noise in the data.
We also want to cut down the size of the network; currently, NeuralWorks
generates just over 10 Kbytes of C source code for the wind-direction network
and almost 2 Kbytes for the wind-speed network. NeuralWare claims that using
more than a single hidden layer rarely improves performance, but our tests
with a single-layer network produced solutions with up to four times the
average error rate of the best model. It seems that one is not enough, but
maybe the problem can be solved with a less complex network. For
embedded-system use, this is a critical consideration.
The input data also needs refining. Our 12 inputs include atmospheric pressure
and temperature from both designs' data sets, since the data were generated
under slightly different conditions. However, removing pressure and
temperature entirely from the input data produced less accurate results.
Ideally, we would like to prototype the design our network recommends and
generate new test data. With integrated inputs, we suspect that we can refine
our network to be both smaller and more accurate than it is now.
The sensor and neural-network technology explored has applications beyond
determining wind speed and direction. The same approach can be used on
aircraft and marine vessels to determine vessel speed and to help analyze the
speed and direction of sonar targets.


Many Features Left Unexplored


As for NeuralWorks Professional II Plus, our two months of exploring models
barely scratched the surface of its capabilities. It provides tools for
creating over two dozen different types of networks, as well as choosing
different learning rules, transfer functions, training schedules, and other
variables. It requires extended memory, but can handle networks with up to
8192 processing elements. Using an 80486-based PC with the integrated
coprocessor, it's possible to run 50,000 training trials in just a few
minutes.
Its sole weakness was the inability to analyze the results of trial networks
from within the package itself, so we turned to SPSS and Quattro Pro.
NeuralWare is remedying that weakness with the introduction of the
DataSculptor, a Windows-based graphical tool for preparing input data and
analyzing results. Since these activities can take up to 80 percent of the
time it takes to build a neural network, a good method of manipulating large
amounts of data is a necessity. Our early look at a beta copy of DataSculptor
indicates that it could easily have saved us some time.
Overall, we're confident in this approach's ability to quickly and accurately program
embedded systems. The caveat is that systems designers have to be committed to
integrating neural networks into the design process from the beginning of the
project. The combination of easily experimenting with very different
behavioral models and quickly generating C code makes neural networks and
NeuralWorks valuable additions to many design and development projects.


References


Westphal, Russell V., Phillip M. Ligrani, and Fred R. Lemos. "Development of
Subminiature Multi-Sensor Hot-Wire Probes." NASA Technical Memorandum 100052,
1988.
Nelson, Marilyn McCord and W.T. Illingworth. A Practical Guide to Neural Nets.
Reading, MA: Addison-Wesley, 1991.
Soucek, Branko, and the IRIS Group. Neural and Intelligent Systems
Integration. New York, NY: John Wiley and Sons, 1992.


Products Mentioned


NeuralWorks Professional II Plus, NeuralWare Inc., Penn Center West, Building
IV, Pittsburgh, PA 15276; 412-787-8222


Two Configurations for Wind Sensors


The sensors envisioned in this article and described in the cited reference
were based on hot-wire technology. A wire is electrically heated to maintain a
constant temperature. As air moves past the element, heat is drawn off,
requiring more current to maintain the constant temperature on the wire. The
increased voltage required to maintain the temperature is correlated to the
amount of heat removed. This technology is the subject of many textbooks on
fluid dynamics.
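The textbook relation for this voltage-to-speed correlation is King's law, E^2 = A + B*sqrt(v), where E is the bridge voltage and A and B are calibration constants. As a hedged sketch (the constants below are made up, and a real probe would be calibrated, not hard-coded), inverting it recovers a speed from a voltage:

```cpp
#include <cassert>
#include <cmath>

// Illustrative only: invert King's law, E^2 = A + B*sqrt(v), to recover
// wind speed v from the voltage E needed to hold the wire's temperature.
// A and B stand in for per-element calibration constants.
double windSpeedFromVoltage(double E, double A, double B) {
    double rootV = (E * E - A) / B;  // sqrt(v)
    return rootV * rootV;            // v
}
```

A single element gives only a speed along one axis; it is the network's job, as described in the article, to combine several such readings into speed and direction despite the confounding variables.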
Several atmospheric variables affect the amount of heat taken from the
elements. Wind speed and direction, the quantities we are trying to determine,
have an effect. This is the effect we hope to measure, but there are
confounding variables, such as wind currents and eddies swirling around the
sensors. Atmospheric pressure and temperature must also be factored in.
The Westphal, et al. monograph describes the design of the wire-sensor probe.
From this we generated two different sensor configurations. The first design
used three elements: Two are separated by a 90-degree angle, in an x,y
configuration, while the third is perpendicular to that plane in a z
direction. The second used two elements, running in parallel with one another.
It was by combining these two design approaches that we were able to obtain a
satisfactory neural-network model for wind direction.
--J.F.F. and P.D.V.


_NEURAL NETS FOR PREDICTING BEHAVIOR_
by James F. Farley and Peter D. Varhol


[LISTING ONE]

#if __STDC__
#define ARGS(x) x
#else
#define ARGS(x) ()
#endif /* __STDC__ */

/* --- External Routines --- */
extern double tanh ARGS((double));

#if __STDC__
int NN_Recall( void *NetPtr, float Yin[1], float Yout[1] )
#else
int NN_Recall( NetPtr, Yin, Yout )
void *NetPtr; /* Network Pointer (not used) */
float Yin[1], Yout[1]; /* Data */

#endif /* __STDC__ */
{
 float Xout[8]; /* work arrays */
 long ICmpT; /* temp for comparisons */

 /* Read and scale input into network */
 Xout[2] = Yin[0] * (1.4204544) + (-3.3721588);
LAB110:

 /* Generating code for PE 0 in layer 3 */
 Xout[3] = (float)(-0.079340167) + (float)(1.0320195) * Xout[2];
 Xout[3] = tanh( Xout[3] );

 /* Generating code for PE 1 in layer 3 */
 Xout[4] = (float)(3.1511722) + (float)(-3.2430165) * Xout[2];
 Xout[4] = tanh( Xout[4] );

 /* Generating code for PE 2 in layer 3 */
 Xout[5] = (float)(-0.095040455) + (float)(1.0205934) * Xout[2];
 Xout[5] = tanh( Xout[5] );

 /* Generating code for PE 3 in layer 3 */
 Xout[6] = (float)(-0.076126046) + (float)(1.0341249) * Xout[2];
 Xout[6] = tanh( Xout[6] );

 /* Generating code for PE 0 in layer 4 */
 Xout[7] = (float)(0.41889122) + (float)(0.31151304) * Xout[3] +
 (float)(-0.79348314) * Xout[4] + (float)(0.30291474) * Xout[5] +
 (float)(0.31325051) * Xout[6];
 Xout[7] = tanh( Xout[7] );

 /* De-scale and write output from network */
 Yout[0] = Xout[7] * (35.937499) + (31.25);
 return( 0 );
}



























February, 1993
PROGRAMMING PARADIGMS


Stephen Wolfram: Strong Opinions




Michael Swaine


This month we continue with the conversation Ray Valdes and I had with Stephen
Wolfram. Last month Wolfram talked about Mathematica and programming
paradigms. This month he talks about science, programming, business, and why
(some) mathematicians don't like him.
It's a fact that not everyone likes Stephen Wolfram. He can be exceedingly
impatient--he's never had the patience to complete any sort of formal academic
degree program, for example, although he was granted a well-deserved doctorate
by Caltech. And he can come across as more than a little arrogant. Whether
that's an accurate impression, though, is open to question. Perhaps he simply
knows how good his mind is and sees no point in false modesty. But when, in
the conversation that follows, he claims to know "a lot of the leading
scientists in a lot of different areas," don't think for a moment that he's
exaggerating.
DDJ: You're a scientist and a mathematician, which makes you a target user of
Mathematica, and you say you use it every day. But we've talked to various
mathematician friends and they express mistrust for most mathematical
programs, including Mathematica.
SW: I spend part of my time doing science and I know a lot of the leading
scientists in a lot of different areas. And in mathematics, for example, I
think it's fair to say that the people I know who are the best mathematicians
in their various areas use Mathematica, and they use it in many cases
extremely enthusiastically. Now there are two sociological phenomena that go
on in mathematics. One of them is this idea that computers are anathema to
mathematics. This is a mistake. It's a fundamental intellectual mistake. You
don't have to take that from me; just look at what's going on in the field,
look at the fact that the best mathematicians use them.
DDJ: And the other phenomenon?
SW: There is this very weird thing about mathematics that's different from
every other science. The way you make progress in mathematics is that you
think of a theorem and generate a proof for it. It's a purely theoretical
activity that proceeds in this theorem-proof mode. In every other field of
science, experiment is the thing that originally drives what goes on. People
don't just make models and theories and work out their consequences. But one of the
things that Mathematica is making possible in mathematics--and this is one of
the things that good mathematicians who use it often say about it--is that you
can actually do experiments. In a reasonably short amount of time you can do a
reasonably nontrivial experiment and find out what's likely to be true before
you go through the traditional mathematical approach.
DDJ: It's a highly respected approach.
SW: Which is, guess what's true and then try to prove it. From the point of
view of a sort of mental exercise, guess what's true and try to prove it is
way out there; it's one of the hardest things people can imagine doing. On the
other hand, that's not the way you're likely to make the most progress, by
doing the hardest thing people can imagine doing. You're likely to make more
progress by making things easier for yourself. If you look at every other
field of science, physics, anything else, there are a lot more
experimentalists than theoreticians.
DDJ: Then mathematicians have to become more like the experimentalists in
being aware of the limitations of their instruments for experiment. There was
an article in Mathematica Journal by a mathematician at Berkeley who was using
Mathematica to point out flaws in some other paper. The [original paper's]
conclusion was right but the process was wrong because of a floating-point
error, which Mathematica didn't have. Or maybe it had a different
floating-point error ...
SW: No.
DDJ: OK, but they need to be aware of their instruments, it seems.
SW: That's true, but most of the proofs that are published are probably wrong.
In detail, I mean. Checking a mathematical proof is at least as hard as
debugging a computer program perfectly. The only difference is that, with a
computer program, you can run it, so you can actually see what's happening,
whereas a proof just sits there in a journal and generations of graduate
students try to [understand] it. One thing is sure: People have to learn to
use these tools well. If they're all writing Fortran-like programs in
Mathematica, that's not going to get them that far. In terms of understanding
the limitations of these tools, yes, they have to have some concept of what's
going on. Being able to predict, if I do a calculation of such-and-such a
size, will I be likely to run into a memory chip that blows up or a bug in the
program--that's hard for anybody to assess, really. And it's certainly true in
doing experiments--I know it is when I do experiments--the most likely form of
error is human error. That your program has a bug in it is much more likely
than that Mathematica has a bug in it, which is again more likely than that
the CPU that you're using has a bug in its logic.
DDJ: Which does happen.
SW: In testing out Mathematica, we found things like that. If you look at the
history of computer programming, one of the extremely unexpected things that
happened was that there were bugs in programs. This was not anticipated. In
Turing's original paper on the universal Turing machine, the program was
riddled with bugs. From the very earliest computer program, programs had bugs.
That was not something that people expected. And it's an important conceptual
thing for people to realize, that human fallibility is at its most obvious in
writing computer programs. The mathematician who comes up to Mathematica and
types something in and gets something they don't expect and says, "That must
be the right answer, I'm going to write it in my paper," is a foolish person
indeed. Because the chances are, if they didn't expect it and can't understand
it and can't explain it, then probably there was a bug in the program they
wrote.
DDJ: But the idea of accepting the existence of bugs in programs--or
proofs--somehow doesn't sound like the kind of idea that would sit well with a
mathematician. How would you characterize the acceptance of Mathematica by
mathematicians?
SW: The mathematics community is a most puristic community. In a sense, I've
been pleasantly surprised with how easily Mathematica has been accepted in
that community. There's another thing, quite honestly, that that community has
a hard time with. They sort of hate one aspect of what I have done, which is
to take intellectual developments and make a company out of them and sell
things to people.
DDJ: Probably not surprising, if mathematicians are the most puristic of
scientists.
SW: My own view of that, which has hardened over the years, is, my god, that's
the right thing to do. If you look at what's happened with TeX, for example,
which went in the other direction ... well, Mathematica could not have been
brought to where it is today if it had not been done as a commercial effort.
The amount of money that has to be spent to do all the details of development,
you just can't support that in any other way than this unique American idea of
the entrepreneurial company. If you ask a mathematician why they don't like
Mathematica--it's more why they don't like me than why they don't like
Mathematica--that's it.
DDJ: There's a lot of work involved in bringing a product up to commercial
standards and making it something you can support.
SW: In the research I've been doing, one of the people who's been working with
me has developed a nice program that allows you to lay out networks on the
screen. It's a problem that I've wanted to have solved for ten years or so,
and he's got a fairly nice solution and a nice interactive program and all
that. I've talked to people about it, so people know that my company did this
program. The problem is, we didn't develop this program in a commercial way.
We developed it because I needed this thing for solving a particular problem.
Most of the things that have been developed in the technical computing
community have been developed in that kind of way, and I realized, knowing the
standards that one has for having a commercial product that is properly
supported, what incredible distance there is between this fairly nice piece of
code that does something fairly useful and something like Mathematica. And in
fact, I have a hard time even thinking of giving it to friends of mine because
I know they will expect that, since it was produced by somebody who works for
my company, it should be a thing like Mathematica. I suppose it's obvious, but
I think that many in the mathematics community don't realize the distance.
DDJ: We know you've given a lot of thought to programming paradigms. Do you
have any opinions on the dominant paradigms of today, and about which will
survive into the next decade?
SW: I think the transformational-rule paradigm is working fairly well. I think
the functional paradigm is largely working well. I think the procedural
paradigm sucks, basically. I think the fundamental problem with it is there's
much too much hidden state in the procedural paradigm. You have these weird
variables that are getting updated and things that are happening that you
can't see. I strongly believe that there is a way to do procedural programming
that does not use hidden states. For example, here's a thing that I'd love to
be able to do: make a kind of symbolic template of the execution history of a
program. The kind of thing that trace does--
DDJ: Right.
SW:--of taking a program that's executing and giving you back the symbolic
representation. What I'd like to be able to do is program by saying, here's
what I want--it's not quite a flowchart, it's something beyond a flowchart--my
program to be like, now I would actually do the things that make it do this.
That's kind of vague, and the reason it's vague is because I haven't figured
out how to do it. But I think one of the directions that could be very fruitful is
how you take these conceptual ideas about procedural programming and turn them
into something that's easier to look at once you have a program. I mean, the
idea of procedural programming, of loops and so on, people have no trouble
grasping. But once they've written their programs, they have a lot of trouble
grasping what the programs do. And if one could have a more explicit way of
representing these things, one would be in good shape, I think.
DDJ: Yeah, so one rationale behind procedural programming is that it's easy to
learn. But one rationale for a hidden state is an optimization of some sort.
SW: I don't think people need optimization any more.
DDJ: Oh, really?
SW: There are very few programs that are written for the first time where
execution speed is an issue. When you're running your word processor, you
don't want the scrolling to be slow, but that's a different point. If you look
at the history of programming-language design, almost every major screw-up is
a consequence of people pandering to some optimization, starting from Fortran
Hollerith-format statements. The trick is figuring out what you want and then
figuring out how to get there, rather than worrying all the time about how
you're going to get there. Another direction that I've thought about is
parallel processing and its relationship to languages. There was a language
called C* that I made the original design for, for the Connection Machine.
Unfortunately, what C* finally became was extremely far from what I had worked
on. That's one of the reasons I don't do consulting any more.
DDJ: Parallel processing and its relation to language? What's the question?
SW: One of the questions is, are there paradigms that are applicable to
parallel programming that aren't applicable to sequential programming, and
what are they? Functional programming, list-based programming, things like
that are readily applicable to parallel systems. In fact, they work very
nicely and elegantly in parallel systems. That is indeed the main algorithm
that is used in the various [parallel] Fortrans. There is a question,
particularly with respect to SIMD architectures, of [whether] there are other
fundamental kinds of programming-language ideas that we just don't have yet.
DDJ: Do you have an answer?
SW: Well, I've spent a lot of time thinking about them and I didn't come up
with them. It could be that there aren't any. One of the ways that you can get
a clue about this relates to the other side of my life, which is trying to do
science. The kinds of things I'm interested in are using fundamental ideas
from computation to understand more about scientific systems. And one of the
things one is led to there is what kind of simple computational systems really
capture the essence of what's going on in a biological system, in a growing
plant. Figuring out the answer to that question has been one of my big
projects. And what I've found is that so far, with one exception that I'm
still grappling with, all of the things that I have found to be useful as
fundamental models--whether they're things like Turing machines, or cellular
automata, or register machines, or graphs--turn out to be very simple to do in
our existing programming paradigms. So one question is, is there something out
there in nature that is working according to a different programming paradigm
that we should be able to learn from? If you look at the construction of
organisms, for instance, there are many segmented organisms:
biology discovered iteration. There are many branching organisms:
biology discovered recursion. There are a few others of this kind
that are a little less familiar but are still one line of Mathematica code,
and that are commonly used in biology. Does that mean we've really discovered
all the useful programming paradigms? And if nature presents us with something
we can't understand along those lines, that's a good clue that there's another
programming paradigm out there to be figured out. And as far as I can tell,
there isn't much else out there.
DDJ: In the second Artificial Life conference proceedings Doyne Farmer says
he's now of the view that partial differential equations is the most general
model ....
SW: He is completely and utterly, unquestionably, unequivocally, totally
wrong. It's interesting you would pick him as an example, because his
responses to things that I have been right about in the past have been as
wrong as they could possibly be. As a matter of fact, I always find it very
amusing--the equations of general relativity are partial differential
equations, as you probably know, and there is this fairly amusing thing that
is said about these equations, which is that there can be singularities, and
the laws of physics as we know them break down. Well, what does this mean?
This means the partial differential equations show singular behavior which can
no longer be described by the partial differential equations. Well, it turns
out that in the same sense physics breaks down around the space shuttle.
Because the equations of fluid dynamics have in them an approximation that
works just fine so long as there aren't certain kinds of strong shock waves
involved. When you get into hypersonic flow, which is what happens around the
shuttle as it enters the atmosphere, the shock front has a thickness that is
less than the mean free path that molecules go before they collide with other
molecules. So in this same sense, physics as we know it breaks down. Of course
it doesn't. What actually is going on is that partial differential equations
are an approximation that turn out to be not a very good approximation in the
case of hypersonic flow. It's actually an interesting historical thing that
I've been studying, how partial differential equations ended up being thought
by people to be the fundamental equations of physics. It's very bizarre,
because it isn't true, and not only is it not true, even the fact that atoms
exist makes it clear that it's not true. So why is it that people will [say]
that the fundamental equations of physics are partial differential equations?
What happened, I think, is that when these models were first developed, the
only method for figuring out what the consequences were was hand calculation.
Computers are a very recent phenomenon in the history of
science, and the fundamental models that exist in science have not yet adapted
to computation. And that's my next big thing.
[Editor's note: In retrospect, Ray Valdes feels he may have overstated
Farmer's position. Here's what Farmer actually said: "Connectionist models are
a useful tool for solving problems in learning and adaptation .... However,
connectionism represents a level of abstraction that is ultimately limited by
such factors as the need to specify connections explicitly, and the lack of
built-in spatial structure. Many problems in adaptive systems ultimately
require models such as partial differential equations or cellular automata
with spatial structure." ("A Rosetta Stone for Connectionism," in Emergent
Computation, MIT Press, 1991, p. 183.)]










February, 1993
C PROGRAMMING


Comparing D-Flat and D-Flat++


 This article contains the following executables: DFPP01.ARC


Al Stevens


The evolution of D-Flat++ continues to reveal some interesting surprises. The
library has grown to where it now has enough classes to build a simple
application with a menu and a few controls. The source code that implements
these features is smaller and easier to read than its D-Flat counterparts in
C. This month we look at the Application, Control, and TextBox window classes
and compare the source code for these modules with the D-Flat C source code
for the same features. My December 1991 column discussed D-Flat text boxes,
and the May 1992 column discussed the application window. You will find the C
source code for those modules in those issues.
The DF++ Application and TextBox classes derive from the DFWindow class, which
we discussed last month. Every DF++ application has an Application window
object, and most of the controls--menus, list boxes, edit boxes, buttons, and
so on--derive from the TextBox class, so the Application and TextBox classes
are, in effect, the foundation for the DF++ application. The Control class is
the base for all control windows, and it encapsulates the behavior that
control windows share.


The Application Class


Listing One, page 138, is applicat.h, the header file that describes the
Application window class. A significant difference exists between D-Flat and
D-Flat++ in the way that the two systems describe window classes, and
applicat.h illustrates the difference by its relative simplicity. D-Flat has
the data members for all window classes in a common window structure. The
advantage is that the code for every class does not need to de-reference
class-specific data-block extensions. The disadvantage is that every window
class bears the size overhead of all classes. DF++ uses the C++ inheritance
mechanism to derive window classes in a hierarchy. The result is simpler code
and data structures that are no bigger than they need to be. The DF++
Application class has only three data members beyond the ones it inherits: a
pointer to the MenuBar object, a pointer to the StatusBar object, and a switch
that its Show message uses to indicate that the Application object is taking
the focus.
Listing Two, page 138, is applicat.cpp. When you compare it to applicat.c from
last May, the first thing you notice is that the C++ version is a lot smaller.
There are several reasons for this. First, the C++ code does not need to
intercept and interpret message codes in order to receive and process
messages. DF++ messages are sent in the C++ tradition--by calls to class
member functions. Therefore, each message is called directly by the sender and
involves no message-passing logic like D-Flat uses. Second, applicat.cpp does
not support the Window menu for multiple-document interface. Third,
applicat.cpp does not support changing screen formats, shelling out to DOS, or
the usual Help menu commands. Some applications will not use these features,
and the ones that do will be better served by derived classes that include
them.
The Application class's OpenWindow function registers the object with the
DeskTop object by calling the SetApplication function, which allows the
desktop, and therefore any other objects, to send messages to the application.
The OpenWindow function declares the MenuBar and StatusBar objects. The
MenuBar object is declared only if the Application constructor includes a
pointer to an array of MenuBarItem objects, which define the menu bar and
pop-down menus. We'll spend more time on menus in a later column.
The SetFocus and Show functions cause the Application window to take the focus
differently from other windows, most of which use the routine functions
provided by the DFWindow base class.
Rather than allow itself to be repainted whenever the user clicks
on it, the Application window simply calls the Border function to let the
window frame reflect the in-focus condition. This strategy prevents the system
from repainting all the child windows every time the user clicks the
Application window. The assumption is that they are already visible and that
nothing in the Application window needs to be repainted just because the focus
has shifted.
The Application window receives any keystrokes that the in-focus child window
does not intercept. The DFWindow class sees to this by sending any unprocessed
Keyboard messages to the parent of the window that received them. The
Application window will use Ctrl+F4 and Alt+F4 to close the window. It sends
all other Keyboard messages to the MenuBar object. This strategy allows the
MenuBar to receive and process menu-bar shortcut keys and menu-command
accelerator keys.
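The dispatch path just described can be sketched in a few lines. The class and member names below are simplified stand-ins of ours, not the actual D-Flat++ declarations, and the Alt+F4 key code is a made-up placeholder:

```cpp
#include <cassert>

// Sketch of unhandled-keystroke forwarding: a child that does not
// consume a key hands it to its parent; the application closes on its
// own keys and forwards everything else to the menu bar.
struct Window {
    Window* parent = nullptr;
    virtual ~Window() = default;
    // Return true if the key was consumed; otherwise try the parent.
    virtual bool Keyboard(int key) {
        return parent ? parent->Keyboard(key) : false;
    }
};

struct MenuBarSketch : Window {
    int lastKey = 0;
    bool Keyboard(int key) override { lastKey = key; return true; }
};

struct ApplicationSketch : Window {
    MenuBarSketch* menuBar = nullptr;
    bool Keyboard(int key) override {
        if (key == 0x6b00)              // stand-in code for Alt+F4: close
            return true;
        return menuBar->Keyboard(key);  // everything else goes to the menu
    }
};
```

Because each class either consumes the key or delegates it, shortcut and accelerator keys reach the menu bar no matter which child window originally received them.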
The Application window passes all ClockTick and StatusMessage messages to the
StatusBar object to process.


The Control Class


Listings Three and Four, page 138, are control.h and control.cpp,
respectively, the header and implementation files that define the Control class. All control
window classes derive from this class. Its purpose is to define the behavior
of a control window regardless of its function. Control windows are, by
definition, user input devices that are child windows to another window. They
may be children of an application-specific document window, a dialog box, or
the Application window itself.
At present, the only behavior encapsulated by the Control class is the
management of an enabled/disabled state variable and the processing of some
default keystrokes. The Keyboard message intercepts the keys that move the
focus between the sibling controls of a parent window. As I develop the
controls, the dialog-box logic, and the help system, I'll no doubt add
features to this class.
The D-Flat C library builds the generic control logic as a window-processing
module in dialbox.c, which I described in my June 1992 column. It included
code sensitive to the class type of the control window. DF++ encapsulates
that code into individual control-window classes.


The TextBox Class


Listing Five, page 138, is textbox.h, the header file that defines the TextBox
class, which derives from the Control class. The TextBox is a base class for
many other controls that will display, scroll, and page text. Edit boxes, list
boxes, pop-down menus, pushbuttons, and others all derive directly or
indirectly from the TextBox class, which encapsulates the behavior associated
with scroll bars, text representation and display, and marked blocks of text.
TextBox is a catch-all class. Almost every kind of window has text as its
base.
A TextBox object consists of a window with a body of text and, optionally,
horizontal and vertical scroll bars. The text is represented by a single
String object named text, which contains newline characters to mark the ends
of lines and a null character to mark the end of the text. The bufflen data
member records the length of the string buffer, which is not always the same
as the length of the string itself. The wlines data member is a line count and
the TextPointers data member points to an array of integer text-line offsets.
Each member of the array is an offset to the first character of a text line.
This array supports efficient retrieval of specified lines of text. The
textlen data member is the length of the text string, and the textwidth data
member is the length of the longest line in the body of text. The wtop and
wleft data members are offsets that represent the line and column positions of
the text as it is currently displayed in the window. The BlkBeg... and
BlkEnd... data members specify the start and end points of a marked block of
text, if one exists.
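The TextPointers scheme can be sketched directly: scan the newline-delimited buffer once, record the offset of the first character of each line, and line retrieval becomes a table lookup plus a substring. The function names here are ours, not the D-Flat++ members:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Build an array of offsets to the first character of each text line,
// mirroring the TextPointers scheme described above.
std::vector<int> buildTextPointers(const std::string& text) {
    std::vector<int> offsets;
    offsets.push_back(0);                      // line 0 starts at offset 0
    for (std::size_t i = 0; i + 1 < text.size(); ++i)
        if (text[i] == '\n')
            offsets.push_back((int)(i + 1));   // next line starts after '\n'
    return offsets;
}

// Fetch line n without rescanning the whole buffer.
std::string textLine(const std::string& text,
                     const std::vector<int>& offsets, int n) {
    int start = offsets[n];
    std::size_t end = text.find('\n', start);
    return text.substr(start,
        end == std::string::npos ? std::string::npos : end - start);
}
```

Paging and scrolling then reduce to indexing this array from wtop, which is what makes the retrieval efficient.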
Listing Six, page 139, is textbox.cpp, the code that implements the TextBox
class. This module is representative of how control windows work. Most other
controls will derive from the TextBox class, adding their own behavior. The
TextBox class intercepts its Show message to add scroll bars to the window if
there are none and the scroll-bar attributes are set. The TextBox class has
unique messages of its own to initialize the text, append text, clear it,
change the buffer length, and extract a specified line of text. There are
internal messages, not available to users of the class, that display a line of
text with attention given to embedded shortcut characters. These functions
support the display of menu and button labels, which have shortcut keys that
must be displayed in a highlighting color.
The TextBox's Paint message paints the window by displaying the lines in the
window's field of view. The Keyboard message intercepts the paging and
scrolling keys to send paging and scrolling messages to the window, which in
turn page and scroll the window horizontally and vertically. The TextBox class
does not need to process mouse messages to page and scroll. Those messages are
received by the child scroll bars, which are window classes of their own. They
interpret the mouse messages and send paging and scrolling messages to the
TextBox window object. This approach is an improvement over the way D-Flat
manages paging and scrolling. The D-Flat text-box window intercepts and
processes the scroll-bar mouse messages itself, a procedure that binds scroll
bars to text boxes and does not permit them to be used for other purposes.
This is an example of how C++ lures you into building better code. By allowing
encapsulation, the language encourages you to organize the functions of an
object into a separate definition and to bind that object as loosely as
possible with other objects. You could always do that in C, and we always
tried, but the lure wasn't there.


Messages and Virtual Functions


The D-Flat C library resembles the Windows API in the way that it manages
messages. Any message can go to any window. If the window is not interested in
the message, it can pass the message on to its base class, which has the same
opportunity, and this proceeds all the way down to the lowest base class. The window
can process the message and/or pass it on as well. The window can reject the
message by not allowing it to proceed, or supersede the processing by any base
class by processing it and then not passing it on. These procedures are
managed by the custom window-processing modules assigned to an instance of a
window class and/or by the default window processing modules assigned to the
class in its definition. All this looks very much like the object-oriented
messages and polymorphism of C++, and it is similar, except that the program
itself makes the polymorphic decisions at run time instead of having them
defined in the class hierarchy. The main difference is that the C library's
SendMessage process can send any message to any window through its window
pointer, and the message will arrive, whether or not it gets processed.
D-Flat++ has no SendMessage function. Messages are member functions. How is
this different?
The DFWindow class has a number of virtual member functions that represent the
lowest common denominator of window processing. All window classes have these
messages in common. For example, any window might have a title, receive a
keystroke, or be moved. If a window works fine with the default DFWindow
behavior for those messages, it needs nothing more in its definition. If the
window wants to modify that behavior, it provides its own overriding member
functions, and the behavior is appropriately modified.
Messages specific to a particular derived class are not defined in the
DFWindow class. To illustrate, the PageUp message for the TextBox class is not
defined in DFWindow. What are the implications of that? First, you cannot send
the PageUp message to a window not derived from the TextBox class. Not that
you'd want to, but even if you did, you cannot do it. Second, you cannot send
the PageUp message to a TextBox window through a reference or pointer to one
of its base classes. There might be times when you'd want to do that, but you
cannot.
To fully emulate the message-passing paradigm of D-Flat or Windows, a C++
class library would need to include virtual member functions for every
possible message in the topmost base window class, which is DFWindow in
D-Flat++. What would be wrong with that? Two things. First, every time you
built a new window class to support your application, you'd have to add empty
virtual member functions to the DFWindow class. Second, every window class
would carry the overhead of every message. Why is that?
When you put virtual functions in a class, you add overhead. Each object of
the class contains a vptr pointer to a vtbl table of virtual function
pointers. If the virtual function is an empty function, code space is set
aside for a function return statement. There are vptrs for the object class
and each of its base classes. All objects of the same class point to a common
vtbl. Windows has about 130 messages. D-Flat has about 75. If a class library
encapsulated all those messages as empty virtual functions, there would be
tables for every class of window object that a program declared. D-Flat has
over 20 window classes. Windows has a lot more. If D-Flat++ had empty virtual
functions for every message, there would be over 1500 vtbl entries plus the
ones that would result as you added window classes to support your
application.
Various C++ class libraries that wrap around the Windows API solve the problem
in different ways. Borland's ObjectWindows Library extends the syntax of the
C++ language to define message functions. The Microsoft Foundation Class
library uses preprocessor macros to map message functions to a message map.
The Windows++ class library that I discussed in December 1992 builds virtual
functions for a few of the most commonly used messages and passes the others
through a default function that the window class must interpret to its own
purposes. D-Flat++ puts message functions in the class definitions at the
highest level in the hierarchy where the message is relevant and assumes that
the system and application will send messages only to objects of the window
classes that expect them.


"You Can't Go Home Again"



Thomas Wolfe was right. Looking at D-Flat from the perspective of the C++
programmer, and having now solved the same problem in both languages, I can
see many ways that the C code could have been better--more object oriented,
perhaps, and certainly better organized. It makes me want to rewrite the C
version, but that would be looking backward, and we don't want to do that.


RTFM: (Read the Friendly Manual--or README File)


About a year ago I reported a bug to Borland, which I thought they ignored. I
was a Windows 3.1 beta tester, and when I ran the Brief programmer's editor
followed by the Borland C++ 3.0 command-line MAKE utility program, my computer
crashed. I tested it on several computers and got the same result. The culprit
turned out to be the Windows 3.1 EMM386.EXE memory manager. By using the DOS
version of the memory manager I was able to circumvent the problem. Since
Windows 3.1 was in beta and the bug involved a Solution Systems product (Brief
3.1) and a Borland product, it looked like nobody bothered to fix it.
In the meantime, Borland upgraded their C++ to 3.1 and acquired the Brief
editor, and Windows 3.1 was released. The bug persisted. I still used the
workaround, but there came a new wrinkle. The problem returned with the beta
DOS 6.0's EMM386.EXE. This is a real concern because there are compelling
reasons to use the newer EMM386.EXE. It frees up more conventional memory than
the older one, and who knows what else in the new DOS depends on a
more-current memory manager to work properly. I got online to the Borland
sysops on CompuServe and asked if anyone had any comments.
If you are one of the many programmers who use DOS, Brief, and the BC++
command-line programs, and if you expect to use Windows or upgrade to DOS 6.0,
you need to know about this one. The sysops pointed me to a neglected (by me)
item tucked away in the BC++ 3.1 README file that says to install a patch in
their DPMI software to get around a condition related to the Windows 3.1 (and
now the DOS 6.0) memory manager. I'm not sure why it's a patch and not a
permanent part of the software, but I applied it, and now everything works fine.
Ironically, the discussion of the patch comes just before, but not within, a
section titled, "IMPORTANT INFORMATION."

_C PROGRAMMING COLUMN_
by Al Stevens


[LISTING ONE]

// -------------- applicat.h
#ifndef APPLICAT_H
#define APPLICAT_H
#include "menubar.h"
#include "statbar.h"
class Application : public DFWindow {
 MenuBar *menubar; // points to menu bar
 StatusBar *statusbar; // points to status bar
 Bool takingfocus; // true while taking focus
 virtual void SetColors();
protected:
 // ------------- client window coordinate adjustments
 virtual void AdjustBorders();
public:
 Application(char *ttl,int lf,int tp,int ht,int wd,MenuBarItem *Menu = NULL)
 : DFWindow(ttl, lf, tp, ht, wd, NULL)
 { OpenWindow(Menu); }
 Application(char *ttl, int ht, int wd, MenuBarItem *Menu = NULL)
 : DFWindow(ttl, ht, wd, NULL)
 { OpenWindow(Menu); }
 Application(int lf, int tp, int ht, int wd, MenuBarItem *Menu = NULL)
 : DFWindow(lf, tp, ht, wd, NULL)
 { OpenWindow(Menu); }
 Application(int ht, int wd, MenuBarItem *Menu = NULL)
 : DFWindow(ht, wd, NULL)
 { OpenWindow(Menu); }
 Application(char *ttl, MenuBarItem *Menu = NULL)
 : DFWindow(ttl)
 { OpenWindow(Menu); }
 virtual ~Application()
 { if (windowstate != CLOSED) CloseWindow(); }
 // -------- API messages
 virtual void OpenWindow() { OpenWindow(NULL); }
 void OpenWindow(MenuBarItem *menu);
 virtual void CloseWindow();
 virtual Bool SetFocus();
 virtual void Show();
 virtual void Keyboard(int key);
 virtual void ClockTick();
 void StatusMessage(String& Msg);
};
void DispatchEvents(Application *ApWnd);

void InitializeEvents(void);
#endif






[LISTING TWO]

// ------------ applicat.cpp
#include "dflatpp.h"
void Application::OpenWindow(MenuBarItem *menu)
{
 extern DeskTop desktop;
 desktop.SetApplication(this);
 windowtype = ApplicationWindow;
 if (windowstate == CLOSED)
 DFWindow::OpenWindow();
 SetAttribute(BORDER | SAVESELF | CONTROLBOX | STATUSBAR);
 SetColors();
 if (menu != NULL) {
 SetAttribute(MENUBAR);
 menubar = new MenuBar(menu, this);
 }
 else
 menubar = NULL;
 statusbar = new StatusBar(this);
 desktop.mouse().Show();
 DFWindow::SetFocus();
 takingfocus = False;
}
void Application::CloseWindow()
{
 if (menubar != NULL)
 delete menubar;
 if (statusbar != NULL)
 delete statusbar;
 desktop.SetApplication(NULL);
 DFWindow::CloseWindow();
}
// -------- set the fg/bg colors for the window
void Application::SetColors()
{
 colors.fg =
 colors.sfg =
 colors.ffg =
 colors.hfg = LIGHTGRAY;
 colors.bg =
 colors.sbg =
 colors.fbg =
 colors.hbg = BLUE;
}
void Application::AdjustBorders()
{
 DFWindow::AdjustBorders();
 if (attrib & MENUBAR)
 TopBorderAdj++;
 if (attrib & STATUSBAR)
 BottomBorderAdj = 1;
}
Bool Application::SetFocus()
{
 takingfocus = True;
 DFWindow::SetFocus();
 takingfocus = False;
 return True;
}
void Application::Show()
{
 if (!takingfocus || !isVisible())
 DFWindow::Show();
 else
 Border();
}
void Application::Keyboard(int key)
{
 switch (key) {
 case CTRL_F4:
 case ALT_F4:
 CloseWindow();
 break;
 default:
 // ---- forward unprocessed keys to the menubar
 if (menubar != NULL)
 menubar->Keyboard(key);
 break;
 }
}
void Application::ClockTick()
{
 if (statusbar != NULL)
 statusbar->ClockTick();
}
void Application::StatusMessage(String& Msg)
{
 if (statusbar != NULL)
 statusbar->StatusMessage(Msg);
}







[LISTING THREE]

// ---------- control.h
#ifndef CONTROL_H
#define CONTROL_H
#include "dfwindow.h"
class TextBox;
class Control : public DFWindow {
 Bool enabled; // true if control is enabled
public:
 Control(char *ttl,int lf,int tp,int ht,int wd,DFWindow *par)
 : DFWindow(ttl, lf, tp, ht, wd, par)
 { OpenControl(); }
 Control(char *ttl, int ht, int wd, DFWindow *par)
 : DFWindow(ttl, ht, wd, par)
 { OpenControl(); }
 Control(int lf, int tp, int ht, int wd, DFWindow *par)
 : DFWindow(lf, tp, ht, wd, par)
 { OpenControl(); }
 Control(int ht,int wd,DFWindow *par) : DFWindow(ht,wd,par)
 { OpenControl(); }
 Control(char *ttl) : DFWindow(ttl)
 { OpenControl(); }
 virtual ~Control() { /* null */ }
 virtual void Keyboard(int key);
 void OpenControl() { Enable(); }
 void Enable() { enabled = True; }
 void Disable() { enabled = False; }
 Bool isEnabled() { return enabled; }
};
#endif







[LISTING FOUR]

// ------------ control.cpp
#include "control.h"
#include "desktop.h"
void Control::Keyboard(int key)
{
 DFWindow *Wnd;
 switch (key) {
 case UP:
 Wnd = desktop.InFocus();
 do {
 PrevSiblingFocus();
 if (Wnd == desktop.InFocus())
 break;
 } while (desktop.InFocus()->WindowType() == MenubarWindow);
 break;
 case '\t':
 case DN:
 case ALT_F6:
 Wnd = desktop.InFocus();
 do {
 NextSiblingFocus();
 if (Wnd == desktop.InFocus())
 break;
 } while (desktop.InFocus()->WindowType() ==
 MenubarWindow);
 break;
 default:
 DFWindow::Keyboard(key);
 break;
 }
}








[LISTING FIVE]

// ------------- textbox.h
#ifndef TEXTBOX_H
#define TEXTBOX_H
#include "control.h"
const int SHORTCUTCHAR = '~';
const int InitialBufferSize = 1024;
class ScrollBar;
class TextBox : public Control {
 ScrollBar *hscrollbar; // horizontal scroll bar
 ScrollBar *vscrollbar; // vertical scroll bar
 unsigned *TextPointers; // -> list of line offsets
 Bool resetscrollbox;
protected:
 // ---- text buffer
 String *text; // window text
 unsigned int bufflen; // length of buffer
 int wlines; // number of lines of text
 unsigned int textlen; // text length
 int textwidth; // width of longest line in textbox
 // ---- text display
 int wtop; // text line on top of display
 int wleft; // left position in window viewport
 int BlkBegLine; // beginning line of marked block
 int BlkBegCol; // beginning column of marked block
 int BlkEndLine; // ending line of marked block
 int BlkEndCol; // ending column of marked block
 int shortcutfg; // shortcut key color
 char *TextLine(int line)
 { return (char *)(*text) + *(TextPointers+line); }
 int DisplayShortcutField(
 String sc, int x, int y, int fg, int bg);
 void WriteShortcutLine(int lno, int fg, int bg);
 void WriteTextLine(int lno, int fg, int bg);
 void BuildTextPointers();
 void SetScrollBoxes();
 virtual void SetColors();
public:
 TextBox(char *ttl,int lf,int tp,int ht,int wd,DFWindow *par)
 : Control(ttl, lf, tp, ht, wd, par)
 { OpenWindow(); }
 TextBox(char *ttl, int ht, int wd, DFWindow *par)
 : Control(ttl, ht, wd, par)
 { OpenWindow(); }
 TextBox(int lf, int tp, int ht, int wd, DFWindow *par)
 : Control(lf, tp, ht, wd, par)
 { OpenWindow(); }
 TextBox(int ht, int wd, DFWindow *par) : Control(ht,wd,par)
 { OpenWindow(); }
 TextBox(char *ttl) : Control(ttl)
 { OpenWindow(); }

 virtual ~TextBox()
 { if (windowstate != CLOSED) CloseWindow(); }
 // -------- textbox API messages
 virtual void ScrollUp();
 virtual void ScrollDown();
 virtual void ScrollRight();
 virtual void ScrollLeft();
 virtual void PageUp();
 virtual void PageDown();
 virtual void PageRight();
 virtual void PageLeft();
 virtual void Home();
 virtual void End();
 virtual void OpenWindow();
 virtual void CloseWindow();
 virtual void AddText(char *txt);
 virtual void SetText(char *txt);
 virtual void SetTextLength(unsigned int len);
 virtual void ClearText();
 virtual void Show();
 virtual void Paint();
 virtual void Keyboard(int key);
 String ExtractTextLine(int lno);
 void ClearTextBlock()
 { BlkBegLine=BlkEndLine=BlkBegCol=BlkEndCol=0; }
 void HorizontalPagePosition(int pct);
 void VerticalPagePosition(int pct);
};
#endif





[LISTING SIX]

// ------------- textbox.cpp
#include "desktop.h"
#include "textbox.h"
#include "scrolbar.h"
// ----------- common constructor code
void TextBox::OpenWindow()
{
 windowtype = TextboxWindow;
 if (windowstate == CLOSED)
 Control::OpenWindow();
 text = NULL;
 bufflen = InitialBufferSize;
 hscrollbar = vscrollbar = NULL;
 TextPointers = NULL;
 ClearText();
 SetColors();
}
void TextBox::CloseWindow()
{
 ClearText();
 if (hscrollbar != NULL)
 delete hscrollbar;
 if (vscrollbar != NULL)
 delete vscrollbar;
 Control::CloseWindow();
}
// ------ show the textbox
void TextBox::Show()
{
 if ((attrib & HSCROLLBAR) && hscrollbar == NULL) {
 hscrollbar = new ScrollBar(HORIZONTAL, this);
 hscrollbar->SetAttribute(FRAMEWND);
 }
 if ((attrib & VSCROLLBAR) && vscrollbar == NULL) {
 vscrollbar = new ScrollBar(VERTICAL, this);
 vscrollbar->SetAttribute(FRAMEWND);
 }
 Control::Show();
}
// ------------ build the text line pointers
void TextBox::BuildTextPointers()
{
 textwidth = wlines = 0;
 // ---- count the lines of text
 char *cp1, *cp = *text;
 while (*cp) {
 wlines++;
 while (*cp && *cp != '\n')
 cp++;
 if (*cp)
 cp++;
 }
 // ----- build the pointer array
 delete [] TextPointers; // allocated with new[]
 TextPointers = new unsigned int[wlines];
 unsigned int off;
 cp = *text;
 wlines = 0;
 while (*cp) {
 off = (unsigned int) (cp - *text);
 *(TextPointers + wlines++) = off;
 cp1 = cp;
 while (*cp && *cp != '\n')
 cp++;
 textwidth = max(textwidth, (unsigned int) (cp - cp1));
 if (*cp)
 cp++;
 }
}
// --------- add a line of text to the textbox
void TextBox::AddText(char *txt)
{
 String tx(txt);
 int len = tx.Strlen();
 if (tx[len-1] != '\n')
 textlen++;
 textlen += len;
 if (text == NULL)
 text = new String(bufflen);
 if (textlen > text->StrBufLen())
 text->ChangeLength(max(bufflen, textlen));
 *text += tx;

 if (tx[len-1] != '\n')
 *text += String("\n");
 BuildTextPointers();
}
// --------- set the textbox's text buffer to new text
void TextBox::SetText(char *txt)
{
 ClearText();
 AddText(txt);
}
// ------ set the length of the text buffer
void TextBox::SetTextLength(unsigned int len)
{
 if (text != NULL)
 text->ChangeLength(len);
 bufflen = len;
}
// --------- clear the text from the textbox
void TextBox::ClearText()
{
 if (text != NULL) {
 delete text;
 text = NULL; // AddText tests this pointer, so clear it after deleting
 }
 textlen = 0;
 wlines = 0;
 textwidth = 0;
 wtop = wleft = 0;
 ClearTextBlock();
 if (TextPointers != NULL) {
 delete [] TextPointers; // allocated with new[]
 TextPointers = NULL;
 }
}
// -------- set the fg/bg colors for the window
void TextBox::SetColors()
{
 colors.fg = BLACK;
 colors.bg = LIGHTGRAY;
 colors.sfg = LIGHTGRAY;
 colors.sbg = BLACK;
 colors.ffg = BLACK;
 colors.fbg = LIGHTGRAY;
 colors.hfg = BLACK;
 colors.hbg = LIGHTGRAY;
 shortcutfg = BLUE;
}
// ------- extract a text line
String TextBox::ExtractTextLine(int lno)
{
 char *lp = TextLine(lno);
 int offset = lp - (char *) *text;
 for (int len = 0; *(lp+len) && *(lp+len) != '\n'; len++)
 ;
 return text->mid(len, offset);
}
// ---- display a line with a shortcut key character
void TextBox::WriteShortcutLine(int lno, int fg, int bg)
{
 String sc = ExtractTextLine(lno);
 int x = sc.Strlen();
 int y = lno-wtop;
 x -= DisplayShortcutField(sc, 0, y, fg, bg);

 // --------- pad the line
 int wd = ClientWidth() - x;
 if (wd > 0)
 WriteClientString(String(wd, ' '), x, y, fg, bg);
}
// ---- display a shortcut field character
int TextBox::DisplayShortcutField(String sc, int x, int y, int fg, int bg)
{
 int scs = 0;
 int off = sc.FindChar(SHORTCUTCHAR);
 if (off != -1) {
 scs++;
 if (off != 0) {
 String ls = sc.left(off);
 WriteClientString(ls, x, y, fg, bg);
 }
 WriteClientChar(sc[off+1], x+off, y, shortcutfg, bg);
 int len = sc.Strlen()-off-2;
 if (len > 0) {
 String rs = sc.right(len);
 scs += DisplayShortcutField(rs, x+off+1, y, fg, bg);
 }
 }
 else
 WriteClientString(sc, x, y, fg, bg);
 return scs;
}
// ------- write a text line to the textbox
void TextBox::WriteTextLine(int lno, int fg, int bg)
{
 if (lno < wtop || lno >= wtop + ClientHeight())
 return;
 int wd = ClientWidth();
 String tl = ExtractTextLine(lno);
 String ln = tl.mid(wd, wleft);
 int dif = wd-ln.Strlen();
 if (dif > 0)
 ln = ln + String(dif, ' '); // pad the line with spaces
 // ----- display the line
 WriteClientString(ln, 0, lno-wtop, fg, bg);
}
// ---------- paint the textbox
void TextBox::Paint()
{
 if (text == NULL)
 Control::Paint();
 else {
 int ht = ClientHeight();
 int wd = ClientWidth();
 int fg = colors.fg;
 int bg = colors.bg;
 for (int i = 0; i < min(wlines-wtop,ht); i++)
 WriteTextLine(wtop+i, fg, bg);
 // ---- pad the bottom lines in the window
 String line(wd, ' ');
 while (i < ht)
 WriteClientString(line, 0, i++, fg, bg);
 if (resetscrollbox)
 SetScrollBoxes();

 resetscrollbox = False;
 }
}
// ------ process a textbox keystroke
void TextBox::Keyboard(int key)
{
 switch (key) {
 case UP:
 if (ClientTop() == ClientBottom())
 break;
 ScrollDown();
 return;
 case DN:
 if (ClientTop() == ClientBottom())
 break;
 ScrollUp();
 return;
 case FWD:
 ScrollLeft();
 return;
 case BS:
 ScrollRight();
 return;
 case PGUP:
 PageUp();
 return;
 case PGDN:
 PageDown();
 return;
 case CTRL_PGUP:
 PageLeft();
 return;
 case CTRL_PGDN:
 PageRight();
 return;
 case HOME:
 Home();
 return;
 case END:
 End();
 return;
 default:
 break;
 }
 Control::Keyboard(key);
}
// ------- scroll up one line
void TextBox::ScrollUp()
{
 if (wtop < wlines-1) {
 int fg = colors.fg;
 int bg = colors.bg;
 desktop.screen().Scroll(ClientRect(), 1, fg, bg);
 wtop++;
 int ln = wtop+ClientHeight()-1;
 if (ln < wlines)
 WriteTextLine(ln, fg, bg);
 SetScrollBoxes();
 }

}
// ------- scroll down one line
void TextBox::ScrollDown()
{
 if (wtop) {
 int fg = colors.fg;
 int bg = colors.bg;
 desktop.screen().Scroll(ClientRect(), 0, fg, bg);
 --wtop;
 WriteTextLine(wtop, fg, bg);
 SetScrollBoxes();
 }
}
// ------- scroll left one character
void TextBox::ScrollLeft()
{
 if (wleft < textwidth-1)
 wleft++;
 Paint();
}
// ------- scroll right one character
void TextBox::ScrollRight()
{
 if (wleft > 0)
 --wleft;
 Paint();
}
// ------- page up one screenfull
void TextBox::PageUp()
{
 if (wtop) {
 wtop -= ClientHeight();
 if (wtop < 0)
 wtop = 0;
 resetscrollbox = True;
 Paint();
 }
}
// ------- page down one screenfull
void TextBox::PageDown()
{
 if (wtop < wlines-1) {
 wtop += ClientHeight();
 if (wlines < wtop)
 wtop = wlines-1;
 resetscrollbox = True;
 Paint();
 }
}
// ------- page right one screenwidth
void TextBox::PageRight()
{
 if (wleft < textwidth-1) {
 wleft += ClientWidth();
 if (textwidth < wleft)
 wleft = textwidth-1;
 resetscrollbox = True;
 Paint();
 }

}
// ------- page left one screenwidth
void TextBox::PageLeft()
{
 if (wleft) {
 wleft -= ClientWidth();
 if (wleft < 0)
 wleft = 0;
 resetscrollbox = True;
 Paint();
 }
}
// ----- move to the first line of the textbox
void TextBox::Home()
{
 wtop = 0;
 Paint();
}
// ----- move to the last line of the textbox
void TextBox::End()
{
 wtop = wlines-ClientHeight();
 if (wtop < 0)
 wtop = 0;
 Paint();
}
// ----- position the scroll boxes
void TextBox::SetScrollBoxes()
{
 if (vscrollbar != NULL)
 vscrollbar->TextPosition(wlines ? (wtop*100)/wlines : 0);
 if (hscrollbar != NULL)
 hscrollbar->TextPosition(textwidth ? (wleft*100)/textwidth : 0);
}
// ---- compute the horizontal page position
void TextBox::HorizontalPagePosition(int pct)
{
 wleft = (textwidth * pct) / 100;
 Paint();
}
// ---- compute the vertical page position
void TextBox::VerticalPagePosition(int pct)
{
 wtop = (wlines * pct) / 100;
 Paint();
}
















February, 1993
STRUCTURED PROGRAMMING


Shoplifting in Reverse




Jeff Duntemann KG7JF


I don't know whether to confess this one or not: I've been reverse shoplifting
again, and I just can't stop myself. I go into CompUSA or Bizmart or someplace
like that with three or four copies of the magazine I publish under my arm. I
pretend to browse the computer magazines in the magazine section, and then
when nobody's looking I slip the three or four copies of PC Techniques I
brought in onto a prominent place on the magazine rack and then nonchalantly
make my escape.
It's not that I'm trying to sell magazines this way, or even get attention; a
dozen copies left anonymously on newsstands around Phoenix obviously isn't
going to do anything for my profile or my market share. I think what I'm
really doing is tormenting the people who maintain the computerized inventory
systems into adopting new modes of thought. See, when a customer takes one of
my gray-market (mauve-market? raw umber-market?) magazines to the front
counter, the clerk waves it over the laser scanner, and the cash-register
terminal protests that the number is not on file. No problem--the clerk pounds
the UPC number off the bar code into the terminal, collects the customer's
$4.95 (which is marked on the cover), and moves on to the next guy in line.
No, the good stuff happens at the end of the month, when the inventory boys
are trying to settle all their accounts. After updating their master inventory
database with the code numbers of everything that hadn't quite gotten logged
last time, they match the UPC against the ISSN index to find the name PC
Techniques, and start scratching their heads.
"Hey, Charlie, I can't figure this one. We didn't order any copies of this
magazine PC Techniques. The distributor didn't send us any. The distributor
doesn't even carry it."
"So?" Charlie asks. "What's the problem?"
"We sold four copies."
Charlie chews on his lip for a second. $19.80 in revenue has come in, and it
has to be credited to something the firm purchased wholesale to sell. But
there is no record in the inventory file, no record in the invoice file, no
record anywhere.
"So fix it," Charlie says, and walks away.


It's Inevitable


They say that whatever isn't impossible is inevitable. Bosh. Whatever isn't
against the physical laws is inevitable, and I'm real picky about what I
consider a physical law. It's a good idea to keep that in mind as you design
any system, but in particular a database-oriented application that not only
has to store data fed in from the outside world, but also make sense of it.
That's the essential difference between storing data in a file and storing it
in a database. The data in the database has to make sense; that is, there are
certain requirements about how items in one record relate to another record,
and so on. The more complex the data is, and the more separate places (files,
machines, networks, and so on) the data lives, the more likely you're going to
encounter the "impossible." And when the impossible happens, you know that
your boss is simply going to look at you and say, "Fix it." Sympathy isn't
part of the deal.
This real-world wisdom gets thoroughly lost in many discussions of database
design, especially in the academic world, where far too many instructors have
never implemented anything real in their lives. In the midst of arcane
discussions of referential integrity, many-to-many joins, and theta selects,
people forget that if somebody can reverse-shoplift, they will, and the system
you're building had better be able to handle it.
It's time to talk about intelligently managing data. The structured languages
(Pascal, Modula, Ada, and, well, OK, C) are particularly weak at data
management, which has always puzzled me, since the vast majority of all
programming in the commercial world is, at bottom, data-management
programming. If you know nothing about data management, this would be a good
time to start. If you do, hey, follow along and stop me if I say anything
marginal.


One-to-Many


One file does not a database make. If everything you're storing fits in one
file, you've got what they call a "flat file," and while some folks say
"flat-file database," that's kind of like saying "honest congressman." The two
halves of the term are mutually exclusive. To be a database, there must be at
least two separate groups of data records that relate to one another in some
well-defined way. Each group shares a common record structure.
Let's call each group of records a table, and to tie things more clearly to
the real world, let's assume for this discussion that a table is also a disk
file. A table is laid out very much like a Pascal file of records, and the
term "record" is conveniently used to describe one row of the table. Some
people call rows "tuples," which is an ugly and unnecessary piece of jargon.
In most people's estimation, a row is a record, and that's jargon (and
confusion) enough. The fields of a table line up vertically, in a way
reminiscent of a spreadsheet, and are not surprisingly called the table's
"columns." Figure 1 sums it up.
This is still just a file. To make it a database, there must be a second table
with some defined relationship to the first table. For example, if you've got
a table of contact names, you're going to want a table of contact addresses as
well. That's what I've shown in Figure 2.
To many experienced programmers, something like this is painfully obvious. But
I will admit, it took me a long time to realize that contact addresses should
not be wedged into the same file as contact names. Why not? Simply because a
contact may have more than one address. Okay--we'll have two separate address
areas for each contact in the contact file. Well then, what about my friend
George Ewing WA8WTE, who has four addresses?
How can a guy have four addresses?
I dunno. How could a guy reverse shoplift?
Just fix it. In this case, we put all the addresses out as a separate table.
To be useful, there has to be a well-defined relationship between the two
tables, or we'd never be able to associate an address with a contact. So what
can we say about the Addresses table? Just this: That for every record in the
Addresses table, there is an associated name in the Contacts table. To make
the relationship unambiguous, we make sure that there is a code field in each
one of the Addresses records that matches a code field in one of the Contacts
records.
Why not just use some sort of last-name/first-name appellation for the code?
Well, there are two Tom Campbells in my contacts database, and more Mike
Smiths than I care to think about. Give each person a unique ID code. You
don't have to tell them that they're a number and not a name. It'll be your
secret.
Figure 2 makes a number of notable points. One is that there can be any number
of Mike Smiths in the Contacts table, because each one has a unique code
number. Another is that a given Mike Smith's four addresses can be anywhere in
the Addresses table. They don't have to be adjacent to one another, nor in any
particular place. All that matters is that each one of Mike's addresses is
correctly tagged with Mike's ID code number. Finally, the ID code numbers are
arbitrary. There is no connection at all between a contact record's position
in the Contacts file and its ID code number. That is, our
Mike-Smith-with-four-addresses is not necessarily the 174th record in the
table.
The ID codes point up perhaps the most significant relationship between the
two tables. There may be any number of addresses belonging to contact ID 174
in the Addresses table. However, there may only be one contact ID 174 in the
Contacts table. This relationship is called a one-to-many relationship. That
is, for each one record in the Contacts table there may be many address
records in the Addresses table. In drawing relationships between tables on
paper, people use the crowfoot-like symbol I've included between the tables.
The crowfoot indicates the "many" end of the one-to-many relationship.


Multiple Relationships


It's worth stepping back for a moment and thinking about this concept in
design terms. The most blatant advantage to setting up a Contacts database
this way is that it eliminates the need for wasted empty address fields. Many
flat files have two or sometimes even three separate address areas in each
record. For people who have only one address, this means a significant number
of bytes in each record are just landfill.
Following from that is virtually unfettered flexibility. A contact can have
one, two, four, or seventeen addresses. Or none--suppose you once had a
contact but he's now dead. It might pay to remember him in some
circumstances--say, as the original holder of a patent you're researching--but
you sure don't want to send him any mail. Basically, a database like this can
handle any number of addresses. There is no "impossible." And that's always
important in the design phase of any system.
Things get interesting when you continue with the design and consider phone
numbers. It shouldn't surprise you that some people have more than one phone
number. What surprises me sometimes is how many phone numbers a single person
can have. A guy I know has a home phone, a home modem line, a work phone, two
work modem lines, a cellular phone, and a pager (and he's thinking of setting
up his own BBS...). We can add a third table to the database to contain phone
numbers. That's a no-brainer. As with addresses, each contact in the Contacts
table can have any number of phone numbers in the Phones table.
Alas, that's not enough. You gotta know where the phones are, because ol' Mike
Smith gets around. You may know he's at work, but which phone number in Phones
is at his work address? There's a missing relationship here. As shown in
Figure 3, it's another one-to-many relationship. There can be multiple phones
at any given address, but each phone can have only one address. It's also true
that each phone number in Phones can belong to only one contact in Contacts.
So Phones has a relationship with Addresses, and an entirely separate (if
similar) relationship with Contacts.


To ID or Not to ID?



You've probably guessed by now that anytime there's a one-to-many
relationship, the "one" end of the relationship has to be unique in its table.
Addresses are usually unique within a limited geographical area, but are not
necessarily unique--and the broader you range, the more likely you are to hit
a duplicate. (How many #17 Maple Streets do you suspect there are in the
country?) More to the point, an address is a biggish item compared to an ID
code. To be unique in most (but not all!) cases, you have to consider the
location, address, city, state, and zip code fields as a single aggregate
field. If you're going to be doing a lot of searches or sorts, the
computational burden of examining, comparing, and swapping around whole
addresses can be brutal.
The rule of thumb is pretty simple: Anytime that you must access a record in a
table uniquely, assign the record a unique key that you control. Do not assume
that any given data that comes from outside your system will be unique.
A lot of naive designers have been stung on this one: Surely the federally
assigned Social Security number is unique for each American citizen! True,
I've never heard of duplicates occurring. But there's a real-world snag: No
one is legally required to give you their Social Security number unless you're
some sort of government agency--with the legal corollary that you have no
recourse against a person who hands you a made-up number. And a made-up number
is not necessarily unique. If someone can zoom you, they will.
Good design maxim: If you have to ID uniquely, assign the ID within the
system.
There is a term for a field in a record that uniquely identifies that record
within its table: primary key. In Figure 3, the primary-key fields are shaded.
The Contacts table and the Addresses table each have a primary-key field. The
Phones table does not.
Well, why should it?
Keep straight in your mind what primary keys do: They allow us to stake down
the "one" portion of a one-to-many relationship. The Contacts table is on the
"one" end of two such relationships so far, and if this is any significant
business system, you can bet there will be a lot more. The Addresses table is
on the "one" end of a one-to-many relationship with Phones: We have to be able
to relate each phone number uniquely to one address. Is there any one-to-many
relationship keyed to phone numbers? That is, can you think of any class of
whatevers that we must relate to one phone number? A phone number can have a
set of attributes attached to it (type of service, listed/unlisted, perhaps a
baud rate), but each phone number will have only one such set. There's no
"many" related to phone numbers, and hence no need to give the Phones table a
primary key.


The Notion of "Countables"


Once you've gotten a fair grasp of the idea of a one-to-many relationship,
it's time to consider the notion of what I call countables. Any reasonable
database system contains numerous countables. In any system, the countables
are those entities that are independently enumerable. There is no absolutely
reliable one-to-one correspondence between distinct countables. In other
words, nothing guarantees that for each name in your Contacts table there will
be only one or two addresses or phone numbers. So people, addresses, and phone
numbers are each countables.
Design maxim: Each countable entity must reside in its own table.
Identifying the countables in your system is something that should be done
fairly early in the design stage. I've always done it using a variation on the
"stepwise-refinement" method used to design procedures and short, simple
programs. It works this way: Look at your system spec from a height and
identify the large countables within it. Then look carefully at each of those
large countables and try to identify any smaller countables inside it.
Let's take as an example a simple system for handling mail-order book
selling. It's easy to identify several large countables in such a system: the
customers, the inventory, the orders, and the moneystuff. Let's look for the
smaller countables in each.
We've already broken down the notion of a customer about as far as it needs to
go. A customer table, an address table, and a phone-number table are about all
we need there.
Inventory is stuff to be sold, so you obviously need a table with a catalog
number (that you generate) as its primary key. But the books come from
somewhere, so you'll have as another countable the vendors from whom you buy
the books. And inside that countable, just as with customers, you'll have
distinct countables for vendor addresses and phone numbers.
Is that far enough for inventory? It may be for now. However, before too long
you might want to get a little fancier in tracking your sales and what drives
them. This might imply that the prices you charge for your books become
another countable, since a single book could have a cover price, a
preferred-customer discount price, a summer-readers' sale price, and a blowout
inventory-clearance price. The Prices table would contain information such as
when the price became valid, when it expires, and the like.
Orders are countables that should have a table of their own, with an order
number that you generate as a primary key. Each order has at least one but
probably more items, which is another countable.
"Moneystuff" is the collective term I use for invoices and payments. When a
customer owes you money, you send a series of increasingly obnoxious invoices,
each of which should have its own record in some permanently retained table.
In most cases, the customer eventually sends you some sort of payment, or
multiple payments, that are applied against the customer's account balance.
Invoices and payments are thus separate countables, each with a table of its
own.


Deciding What Goes in a Table


Identifying your countables should be done well before you sit down and try to
enumerate the fields that need to be stored in a given record. You need to peg
the big database picture before you can make any informed decisions on what
goes where. Once you feel confident that you have all your countables counted
out and identified, you need to think about precisely what information goes in
each table, and how that information should be divided into fields.
This has always been kind of a painful process for me, because I'm a pack
rat. I have milk jugs full of ceramic tube sockets out in the garage, some of
which have been with me since I was a teenager. (I haven't been a teenager
since 1971.) I feel the same way about data. Why fail to store something when
you can store it...just in case you need it someday?
Maybe it's better to have it and not need it than need it and not have it.
You'll have to decide for your own circumstances. It is true that it's always
better to be aware of your choices than to realize down the road that you just
never considered the possibility of having to remember that some customer is
allergic to mink musk. That means you should consider every possible item of
relevant data and make a keep/throw decision on each one.
So brainstorm. Sit down with a pad (paper or virtual) and blurt out everything
that might ever go into each of your tables. For a moment at least, be truly
paranoid about forgetting anything, in the interest of getting it all down.
After you're through brainstorming, walk around the block, sober up a little
bit, have a couple of Triscuits, review your constraints (Is this system going
to have to run on bottom-feeder machines with minimal memory and hard-disk
capacity? Is the intelligence or dedication of the users not exactly off the
top of the charts? Do I have to deliver this thing by next weekend?) and start
scratching off the unnecessary.
It may take a few passes through your brainstorming list. Once the level of
agony in deciding on each item becomes unbearable, you've probably got a
pretty reasonable list of keepable fields.


Breaking Up Huge Tables


In an ideal world, there'd be no one-to-one relationships between tables. If
there's only one record in A for each record in B, A and B are really the same
table. Then again, when you consider your constraints, you may find that some
of your big tables exceed a maximum record size for the data manager, or are
just too ungainly to keep in one piece. When that happens, you have to
consider the frequency with which some data is going to be needed. Are you
going to have to look up the customer's hat size all that often? Stack-rank
the fields in an oversized table by how often you're going to have to read or
write them. Percolate the most frequently accessed fields to the top of the
stack, and then pick a reasonable place to break the table in two, leaving the
often-used stuff in one table and the rarely used stuff in the other table.
Don't forget that you'll need a primary key of some sort to pin down the
connection between records in the two tables.
If your files are going to be kept on a network server, it helps to remember
that networking is still slow compared to the speed of a fast hard disk, and
if you're hauling a couple of thousand bytes of near-trivia in from the server
every time you want to look up a customer's credit balance, you're wasting
precious time and cable bandwidth.


The Essence of Database Design


In summary, high-level database design goes pretty much like this:
Identify your countables. This includes the smaller countables within the
larger countables. In general, each countable becomes a database table.
Define the relationships between the countables. Most database relationships
are one-to-many. Decide what logic links two tables, and what sort of primary
key should carry that logic. Any key that must be unique should be generated
within the system. Beware of making assumptions about data that you do not
control!
Brainstorm all possible fields for each identified table.
Reconsider your constraints and eliminate all unnecessary fields from the
brainstorm list for each table.
Decide if any single table is too large to leave in one piece. For any that
are, stack-rank all fields by frequency of access and migrate all frequently
accessed fields to one table, with the rarely accessed fields going to another
table. Link the two parts of the broken table by a unique primary key.
And never ever forget, at any level of the design, that anything that can
happen, no matter how weird, is likely to happen someday. Design accordingly.
It's a crazier world than you think.











February, 1993
GRAPHICS PROGRAMMING


More Dirty (Dirtier?) Rectangles


 This article contains the following executables: XSHRP21.ZIP


Michael Abrash


Programming is, by and large, a linear process. One statement or instruction
follows another, in predictable sequences, with tiny building blocks strung
together to make a custom state machine. As programmers, we grow adept at this
sort of idealized linear thinking, which is, of course, A Good Thing. Still,
it's important to keep in mind there's a large chunk of the human mind that
doesn't work in a linear fashion.
I've written elsewhere about the virtues of
nonlinear/right-brain/lateral/what-have-you thinking in solving tough
programming problems, such as debugging or optimization, but it bears
repeating. The mind can be an awesome pattern-matching and extrapolation tool,
if you let it. For example, the other day I was grinding my way through a
particularly difficult bit of debugging. The code had been written by someone
else, and, to my mind, there's nothing worse than debugging someone else's
code; there's always the nasty feeling that you don't quite know what's going
on. The overall operation of this code wouldn't come clear in my head, no
matter how long I stared at it, leaving me with a rising sense of frustration
and a determination not to quit until I got this bug.
In the midst of this, a coworker poked his head through the door and told me
he had something I had to listen to. Reluctantly, I went to his office,
whereupon he played a tape of what is surely one of the most bizarre 911 calls
in history. No doubt some of you have heard this tape, which I will briefly
describe as involving a deer destroying the interior of a car and biting a man
in the neck. Perhaps you found it funny, perhaps not--but as for me, it hit me
exactly right. I started laughing helplessly, tears rolling down my face. When
I went back to work--presto!--the pieces of the debugging puzzle had come
together in my head, and the work went quickly and easily.
Obviously, my mind needed a break from linear, left-brain, push-it-out
thinking, so it could do the sort of integrating work it does so well--but
that it's rarely willing to do under conscious control. It was exactly this
sort of thinking I had in mind when I titled my book Zen of Assembly Language.
(Although I must admit that few people seem to have gotten the connection, and
I've had to field a lot of questions about whether I'm a Zen disciple. I'm
not--more of a Dave Barry disciple. If you don't know who Dave Barry is, you
should; he's good for your right brain.) Give your mind a break once in a
while, and I'll bet you'll find you're more productive.
We're strange thinking machines, but we're the best ones yet invented, and
it's worth learning how to tap our full potential. And with that, it's back to
dirty-rectangle animation.


Dirty-Rectangle Animation, Continued


Last month, we got our feet wet with dirty-rectangle animation. This technique
is an alternative to page flipping that's capable of producing animation of
very high visual quality, without any help at all from video hardware, and
without the need for any extra, nondisplayed video memory. This makes
dirty-rectangle animation more widely usable than page flipping, because many
adapters don't support page flipping. Dirty-rectangle animation also tends to
be simpler to implement than page flipping, because there's only one bitmap to
keep track of. A final advantage of dirty-rectangle animation is that it's
potentially somewhat faster than page flipping, because display-memory
accesses can theoretically be reduced to exactly one access for each pixel
that changes from one frame to the next.
The speed advantage of dirty-rectangle animation was entirely theoretical last
month, because the implementation was completely in C, and because no attempt
was made to minimize display memory accesses. The visual quality of last
month's animation was also less than ideal, for reasons we'll explore shortly.
The code in Listings One (page 142) and Two (page 144) addresses the
shortcomings of last month's code.
Listing Two implements the low-level drawing routines in assembly language,
which boosts performance a good deal. For maximum performance, it would be
worthwhile to convert more of Listing One into assembler, so a call isn't
required for each animated image, and overall performance could be improved by
streamlining the C code, but Listing Two goes a long way toward boosting
animation speed. This program now supports snappy animation of 15 images (as
opposed to 10 last month), and the images are two pixels wider this month.
That level of performance is all the more impressive considering that this
month I've converted the code from using rectangular images to using masked
images.


Masked Images


Masked images are rendered by drawing an object's pixels through a mask;
pixels are actually drawn only where the mask specifies that drawing is
allowed. This makes it possible to draw nonrectangular objects that don't
improperly interfere with each other when they overlap. Masked images also
make it possible to have transparent areas (windows) within objects. Masked
images produce far more realistic animation than do rectangular images, and
therefore are more desirable. Unfortunately, masked images are also
considerably slower to draw; however, a good assembly language implementation
can go a long way toward making masked images draw rapidly enough, as
illustrated by this month's code. (Masked images are also known as "sprites";
some video hardware supports sprites directly, but on the IBM PC it's
necessary to handle sprites in software.)
Masked images make it possible to render scenes so a given image convincingly
appears to be in front of or behind other images; that is, so images are
displayed in z-order (by distance). By consistently drawing images that are
supposed to be farther away before drawing nearer images, the nearer images
will appear in front of the other images, and because masked images draw only
precisely the correct pixels (as opposed to blank pixels in the bounding
rectangle), there's no interference between overlapping images to destroy the
illusion.


Internal Animation


I've added another feature essential to producing convincing animation:
internal animation, the process of changing the appearance of a given object
over time, as distinguished from changing the location of a given object.
Internal animation makes images look active and alive. I've implemented the
simplest possible form of internal animation in Listing One--alternation
between two images--but even this level of internal animation greatly improves
the feel of the overall animation. You could easily increase the number of
images cycled through, simply by increasing InternalAnimateMax for a given
entity. You could also implement more complex image-selection logic to produce
more interesting and less predictable internal-animation effects, such as
jumping, ducking, running, and the like.


Dirty-rectangle Management


As mentioned above, dirty-rectangle animation makes it possible to access
display memory a minimum number of times. Last month's code didn't do any of
that; instead, it copied every dirty rectangle to the screen, regardless of
overlap between rectangles. This month's code goes to the other extreme,
taking great pains never to draw overlapped portions of rectangles more than
once. This is accomplished by checking for overlap whenever a rectangle is to
be added to the dirty list. When overlap with an existing rectangle is
detected, the new rectangle is reduced to between zero and four nonoverlapping
rectangles. Those rectangles are then again considered for addition to the
dirty list, and may again be reduced, if additional overlap is detected.
A good deal of code is required to generate a fully nonoverlapped dirty list.
Is it worth it? It certainly can be, but in Listing One, it probably isn't.
For one thing, you'd need bigger, heavily overlapped objects for this approach
to pay off big. Besides, this program is mostly in C, and spends a lot of time
doing things other than actually accessing display memory. It also takes a
fair amount of time just to generate the nonoverlapped list; the overhead of
all the looping, intersecting, and calling required to generate the list eats
up a lot of the benefits of accessing display memory less often. Nonetheless,
fully nonoverlapped drawing can be useful under the right circumstances, and
I've implemented it in Listing One so you'll have something to refer to should
you decide to try this route.
There are a couple of additional techniques you might try if you want to wring
maximum performance out of dirty-rectangle animation. You could try coalescing
rectangles as you generate the dirty-rectangle list. That is, you could detect
pairs of rectangles that can be joined together into larger rectangles, so
that fewer, larger rectangles would have to be copied. This would boost the
efficiency of the low-level copying code, albeit at the cost of some cycles in
the dirty-list management code.
You might also try taking advantage of the natural coherence of animated
graphics screens. In particular, because the rectangle used to erase an image
at its old location often overlaps the rectangle within which the image
resides at its new location, you could simply directly generate the two or
three nonoverlapped rectangles required to copy both the erase rectangle and
the new-image rectangle for any single moving image. The calculation of these
rectangles could be very efficient, given that you know in advance the
direction of motion of your images. Handling this particular overlap case
would eliminate most overlapped drawing, at a minimal cost; you might then
decide to ignore overlapped drawing between different images, which tends to
be both less common and more expensive to identify and handle.


Drawing Order and Visual Quality


A final note on dirty-rectangle animation concerns the quality of the
displayed screen image. Last month, we simply stuffed dirty rectangles in a
list in the order they became dirty, and then copied all of the rectangles in
that same order. Unfortunately, this caused all of the erase rectangles to be
copied first, followed by all of the rectangles of the images at their new
locations. Consequently, there was a significant delay between the appearance
of the erase rectangle for a given image and the appearance of the new
rectangle. A byproduct was the fact that a partially complete--part old, part
new--image was visible long enough to be noticed. In short, although the
pixels ended up correct, they were in an intermediate, incorrect state for a
sufficient period of time to make the animation look wrong.
This violated a fundamental rule of animation: No pixel should ever be
perceptible in an incorrect state. To correct the problem, this month I've
sorted the dirty rectangles by Y coordinate, and secondarily by X coordinate.
This means the screen updates from the top down, and from left to right, so
the several nonoverlapping rectangles copied to draw a given image should be
drawn nearly simultaneously. Run last month's code and then this month's;
you'll see quite a difference in appearance.
Avoid the trap of thinking animation is merely a matter of drawing the right
pixels, one after another. Animation is the art of drawing the right pixels at
the right times so the eye and brain see what you want them to see. Animation
is a lot more challenging than merely cranking out pixels, and it sure as heck
isn't a purely linear process.



Until We Meet Again


It's been two years since my first graphics column for DDJ. In that time, I've
learned a great deal about graphics, partly from my research for the column,
but mostly from those of you around the world. For instance, I got two letters
from the Ukraine last week. (One was accompanied by a manuscript--in Russian.
Two important tips: Send your manuscripts to the DDJ editorial offices, not to
me, and write them in English.) You are a remarkably inquisitive and sharing
lot, and it's hard to imagine how I could have enjoyed writing this column any
more than I have.
Unfortunately, this will be my last column, for now at least. Other
responsibilities and challenges beckon, and I've covered much of what I set
out to share two years ago (six years ago, if you count my graphics columns in
the late Programmer's Journal). I hope the world is a little better because of
the interchange of ideas, information, and even a bit of humor that went on in
this space. I know I'm richer for having written and corresponded with many of
you.
I'm not vanishing off the face of the Earth, of course; we'll meet in these
pages again. Until then, thanks for your support and sharing. In particular,
thanks for your support of the Careware effort; you've helped change many
lives for the better. I'd also like to thank the DDJ staff, especially Monica
Berg and Tami Zemel, for their unfailing support and good humor.
Au revoir.

_GRAPHICS PROGRAMMING COLUMN_
by Michael Abrash


[LISTING ONE]

/* Sample simple dirty-rectangle animation program, partially optimized and
featuring internal animation, masked images (sprites), and nonoverlapping
dirty rectangle copying. Tested with Borland C++ 3.0 in the small model. */

#include <stdio.h>
#include <stdlib.h>
#include <conio.h>
#include <alloc.h>
#include <memory.h>
#include <dos.h>

/* Comment out to disable overlap elimination in the dirty rectangle list. */
#define CHECK_OVERLAP 1
#define SCREEN_WIDTH 320
#define SCREEN_HEIGHT 200
#define SCREEN_SEGMENT 0xA000

/* Describes a dirty rectangle */
typedef struct {
 void *Next; /* pointer to next node in linked dirty rect list */
 int Top;
 int Left;
 int Right;
 int Bottom;
} DirtyRectangle;
/* Describes an animated object */
typedef struct {
 int X; /* upper left corner in virtual bitmap */
 int Y;
 int XDirection; /* direction and distance of movement */
 int YDirection;
 int InternalAnimateCount; /* tracking internal animation state */
 int InternalAnimateMax; /* maximum internal animation state */
} Entity;
/* Storage used for dirty rectangles */
#define MAX_DIRTY_RECTANGLES 100
int NumDirtyRectangles;
DirtyRectangle DirtyRectangles[MAX_DIRTY_RECTANGLES];
/* Head/tail of dirty rectangle list */
DirtyRectangle DirtyHead;
/* If set to 1, ignore dirty rectangle list and copy the whole screen. */
int DrawWholeScreen = 0;
/* Pixels and masks for the two internally animated versions of the image
 we'll animate */
#define IMAGE_WIDTH 13
#define IMAGE_HEIGHT 11

char ImagePixels0[] = {
 0, 0, 0, 9, 9, 9, 9, 9, 0, 0, 0, 0, 0,
 0, 0, 9, 9, 9, 9, 9, 9, 9, 0, 0, 0, 0,
 0, 9, 9, 0, 0,14,14,14, 9, 9, 0, 0, 0,
 9, 9, 0, 0, 0, 0,14,14,14, 9, 9, 0, 0,
 9, 9, 0, 0, 0, 0,14,14,14, 9, 9, 0, 0,
 9, 9,14, 0, 0,14,14,14,14, 9, 9, 0, 0,
 9, 9,14,14,14,14,14,14,14, 9, 9, 0, 0,
 9, 9,14,14,14,14,14,14,14, 9, 9, 0, 0,
 0, 9, 9,14,14,14,14,14, 9, 9, 0, 0, 0,
 0, 0, 9, 9, 9, 9, 9, 9, 9, 0, 0, 0, 0,
 0, 0, 0, 9, 9, 9, 9, 9, 0, 0, 0, 0, 0,
};
char ImageMask0[] = {
 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0,
 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0,
 0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0,
 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0,
 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0,
 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0,
 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0,
 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0,
 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0,
 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0,
 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0,
};
char ImagePixels1[] = {
 0, 0, 0, 9, 9, 9, 9, 9, 0, 0, 0, 0, 9,
 0, 0, 9, 9, 9, 9, 9, 9, 9, 0, 9, 9, 9,
 0, 9, 9, 0, 0,14,14,14, 9, 9, 9, 9, 0,
 9, 9, 0, 0, 0, 0,14,14,14, 0, 0, 0, 0,
 9, 9, 0, 0, 0, 0,14,14, 0, 0, 0, 0, 0,
 9, 9,14, 0, 0,14,14,14, 0, 0, 0, 0, 0,
 9, 9,14,14,14,14,14,14, 0, 0, 0, 0, 0,
 9, 9,14,14,14,14,14,14,14, 0, 0, 0, 0,
 0, 9, 9,14,14,14,14,14, 9, 9, 9, 9, 0,
 0, 0, 9, 9, 9, 9, 9, 9, 9, 0, 9, 9, 9,
 0, 0, 0, 9, 9, 9, 9, 9, 0, 0, 0, 9, 9,
};
char ImageMask1[] = {
 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1,
 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1,
 0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0,
 1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0,
 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0,
 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0,
 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0,
 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0,
 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0,
 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1,
 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1,
};
/* Pointers to pixel and mask data for various internally animated
 versions of our animated image. */
char * ImagePixelArray[] = {ImagePixels0, ImagePixels1};
char * ImageMaskArray[] = {ImageMask0, ImageMask1};
/* Animated entities */
#define NUM_ENTITIES 15
Entity Entities[NUM_ENTITIES];

/* Pointer to system buffer into which we'll draw */
char far *SystemBufferPtr;
/* Pointer to screen */
char far *ScreenPtr;
void EraseEntities(void);
void CopyDirtyRectanglesToScreen(void);
void DrawEntities(void);
void AddDirtyRect(Entity *, int, int);
void DrawMasked(char far *, char *, char *, int, int, int);
void FillRect(char far *, int, int, int, int);
void CopyRect(char far *, char far *, int, int, int, int);
void main()
{
 int i, XTemp, YTemp;
 unsigned int TempCount;
 char far *TempPtr;
 union REGS regs;
 /* Allocate memory for the system buffer into which we'll draw */
 if (!(SystemBufferPtr = farmalloc((unsigned int)SCREEN_WIDTH*
 SCREEN_HEIGHT))) {
 printf("Couldn't get memory\n");
 exit(1);
 }
 /* Clear the system buffer */
 TempPtr = SystemBufferPtr;
 for (TempCount = ((unsigned)SCREEN_WIDTH*SCREEN_HEIGHT); TempCount--; ) {
 *TempPtr++ = 0;
 }
 /* Point to the screen */
 ScreenPtr = MK_FP(SCREEN_SEGMENT, 0);
 /* Set up the entities we'll animate, at random locations */
 randomize();
 for (i = 0; i < NUM_ENTITIES; i++) {
 Entities[i].X = random(SCREEN_WIDTH - IMAGE_WIDTH);
 Entities[i].Y = random(SCREEN_HEIGHT - IMAGE_HEIGHT);
 Entities[i].XDirection = 1;
 Entities[i].YDirection = -1;
 Entities[i].InternalAnimateCount = i & 1;
 Entities[i].InternalAnimateMax = 2;
 }
 /* Set the dirty rectangle list to empty, and set up the head/tail node
 as a sentinel */
 NumDirtyRectangles = 0;
 DirtyHead.Next = &DirtyHead;
 DirtyHead.Top = 0x7FFF;
 DirtyHead.Left = 0x7FFF;
 DirtyHead.Bottom = 0x7FFF;
 DirtyHead.Right = 0x7FFF;
 /* Set 320x200 256-color graphics mode */
 regs.x.ax = 0x0013;
 int86(0x10, &regs, &regs);
 /* Loop and draw until a key is pressed */
 do {
 /* Draw the entities to the system buffer at their current locations,
 updating the dirty rectangle list */
 DrawEntities();
 /* Draw the dirty rectangles, or the whole system buffer if
 appropriate */
 CopyDirtyRectanglesToScreen();

 /* Reset the dirty rectangle list to empty */
 NumDirtyRectangles = 0;
 DirtyHead.Next = &DirtyHead;
 /* Erase the entities in the system buffer at their old locations,
 updating the dirty rectangle list */
 EraseEntities();
 /* Move the entities, bouncing off the edges of the screen */
 for (i = 0; i < NUM_ENTITIES; i++) {
 XTemp = Entities[i].X + Entities[i].XDirection;
 YTemp = Entities[i].Y + Entities[i].YDirection;
 if ((XTemp < 0) || ((XTemp + IMAGE_WIDTH) > SCREEN_WIDTH)) {
 Entities[i].XDirection = -Entities[i].XDirection;
 XTemp = Entities[i].X + Entities[i].XDirection;
 }
 if ((YTemp < 0) || ((YTemp + IMAGE_HEIGHT) > SCREEN_HEIGHT)) {
 Entities[i].YDirection = -Entities[i].YDirection;
 YTemp = Entities[i].Y + Entities[i].YDirection;
 }
 Entities[i].X = XTemp;
 Entities[i].Y = YTemp;
 }
 } while (!kbhit());
 getch(); /* clear the keypress */

 /* Back to text mode */
 regs.x.ax = 0x0003;
 int86(0x10, &regs, &regs);
}
/* Draw entities at their current locations, updating dirty rectangle list. */
void DrawEntities()
{
 int i;
 char far *RowPtrBuffer;
 char *TempPtrImage;
 char *TempPtrMask;
 Entity *EntityPtr;

 for (i = 0, EntityPtr = Entities; i < NUM_ENTITIES; i++, EntityPtr++) {
 /* Remember the dirty rectangle info for this entity */
 AddDirtyRect(EntityPtr, IMAGE_HEIGHT, IMAGE_WIDTH);
 /* Point to the destination in the system buffer */
 RowPtrBuffer = SystemBufferPtr + (EntityPtr->Y * SCREEN_WIDTH) +
 EntityPtr->X;
 /* Advance the image animation pointer */
 if (++EntityPtr->InternalAnimateCount >=
 EntityPtr->InternalAnimateMax) {
 EntityPtr->InternalAnimateCount = 0;
 }
 /* Point to the image and mask to draw */
 TempPtrImage = ImagePixelArray[EntityPtr->InternalAnimateCount];
 TempPtrMask = ImageMaskArray[EntityPtr->InternalAnimateCount];
 DrawMasked(RowPtrBuffer, TempPtrImage, TempPtrMask, IMAGE_HEIGHT,
 IMAGE_WIDTH, SCREEN_WIDTH);
 }
}
/* Copy the dirty rectangles, or the whole system buffer if appropriate,
 to the screen. */
void CopyDirtyRectanglesToScreen()
{

 int i, RectWidth, RectHeight;
 unsigned int Offset;
 DirtyRectangle * DirtyPtr;
 if (DrawWholeScreen) {
 /* Just copy the whole buffer to the screen */
 DrawWholeScreen = 0;
 CopyRect(ScreenPtr, SystemBufferPtr, SCREEN_HEIGHT, SCREEN_WIDTH,
 SCREEN_WIDTH, SCREEN_WIDTH);
 } else {
 /* Copy only the dirty rectangles, in the YX-sorted order in which
 they're linked */
 DirtyPtr = DirtyHead.Next;
 for (i = 0; i < NumDirtyRectangles; i++) {
 /* Offset in both system buffer and screen of image */
 Offset = (unsigned int) (DirtyPtr->Top * SCREEN_WIDTH) +
 DirtyPtr->Left;
 /* Dimensions of dirty rectangle */
 RectWidth = DirtyPtr->Right - DirtyPtr->Left;
 RectHeight = DirtyPtr->Bottom - DirtyPtr->Top;
 /* Copy a dirty rectangle */
 CopyRect(ScreenPtr + Offset, SystemBufferPtr + Offset,
 RectHeight, RectWidth, SCREEN_WIDTH, SCREEN_WIDTH);
 /* Point to the next dirty rectangle */
 DirtyPtr = DirtyPtr->Next;
 }
 }
}
/* Erase the entities in the system buffer at their current locations,
   updating the dirty rectangle list. */
void EraseEntities()
{
   int i;
   char far *RowPtr;

   for (i = 0; i < NUM_ENTITIES; i++) {
      /* Remember the dirty rectangle info for this entity */
      AddDirtyRect(&Entities[i], IMAGE_HEIGHT, IMAGE_WIDTH);
      /* Point to the destination in the system buffer */
      RowPtr = SystemBufferPtr + (Entities[i].Y * SCREEN_WIDTH) +
            Entities[i].X;
      /* Clear the rectangle */
      FillRect(RowPtr, IMAGE_HEIGHT, IMAGE_WIDTH, SCREEN_WIDTH, 0);
   }
}
/* Add a dirty rectangle to the list. The list is maintained in
   top-to-bottom, left-to-right (YX-sorted) order, with no pixel ever
   included twice, to minimize the number of display memory accesses and
   to avoid screen artifacts resulting from a large time interval between
   erasure and redraw for a given object or for adjacent objects. The
   technique used is to check for overlap between the rectangle and all
   rectangles already in the list. If no overlap is found, the rectangle
   is added to the list. If overlap is found, the rectangle is broken into
   nonoverlapping pieces, and the pieces are added to the list by
   recursive calls to this function. */
void AddDirtyRect(Entity * pEntity, int ImageHeight, int ImageWidth)
{
   DirtyRectangle * DirtyPtr;
   DirtyRectangle * TempPtr;
   Entity TempEntity;
   int i;

   if (NumDirtyRectangles >= MAX_DIRTY_RECTANGLES) {
      /* Too many dirty rectangles; just redraw the whole screen */
      DrawWholeScreen = 1;
      return;
   }
   /* Remember this dirty rectangle. Break up if necessary to avoid
      overlap with rectangles already in the list, then add whatever
      rectangles are left, in YX sorted order */
#ifdef CHECK_OVERLAP
   /* Check for overlap with existing rectangles */
   TempPtr = DirtyHead.Next;
   for (i = 0; i < NumDirtyRectangles; i++, TempPtr = TempPtr->Next) {
      if ((TempPtr->Left < (pEntity->X + ImageWidth)) &&
          (TempPtr->Right > pEntity->X) &&
          (TempPtr->Top < (pEntity->Y + ImageHeight)) &&
          (TempPtr->Bottom > pEntity->Y)) {
         /* We've found an overlapping rectangle. Calculate the
            rectangles, if any, remaining after subtracting out the
            overlapped areas, and add them to the dirty list */
         /* Check for a nonoverlapped left portion */
         if (TempPtr->Left > pEntity->X) {
            /* There's definitely a nonoverlapped portion at the left; add
               it, but only to at most the top and bottom of the
               overlapping rect; top and bottom strips are taken care of
               below */
            TempEntity.X = pEntity->X;
            TempEntity.Y = max(pEntity->Y, TempPtr->Top);
            AddDirtyRect(&TempEntity,
                  min(pEntity->Y + ImageHeight, TempPtr->Bottom) -
                  TempEntity.Y,
                  TempPtr->Left - pEntity->X);
         }
         /* Check for a nonoverlapped right portion */
         if (TempPtr->Right < (pEntity->X + ImageWidth)) {
            /* There's definitely a nonoverlapped portion at the right;
               add it, but only to at most the top and bottom of the
               overlapping rect; top and bottom strips are taken care of
               below */
            TempEntity.X = TempPtr->Right;
            TempEntity.Y = max(pEntity->Y, TempPtr->Top);
            AddDirtyRect(&TempEntity,
                  min(pEntity->Y + ImageHeight, TempPtr->Bottom) -
                  TempEntity.Y,
                  (pEntity->X + ImageWidth) - TempPtr->Right);
         }
         /* Check for a nonoverlapped top portion */
         if (TempPtr->Top > pEntity->Y) {
            /* There's a top portion that's not overlapped */
            TempEntity.X = pEntity->X;
            TempEntity.Y = pEntity->Y;
            AddDirtyRect(&TempEntity, TempPtr->Top - pEntity->Y,
                  ImageWidth);
         }
         /* Check for a nonoverlapped bottom portion */
         if (TempPtr->Bottom < (pEntity->Y + ImageHeight)) {
            /* There's a bottom portion that's not overlapped */
            TempEntity.X = pEntity->X;
            TempEntity.Y = TempPtr->Bottom;
            AddDirtyRect(&TempEntity,
                  (pEntity->Y + ImageHeight) - TempPtr->Bottom,
                  ImageWidth);
         }
         /* We've added all non-overlapped portions to the dirty list */
         return;
      }
   }
#endif /* CHECK_OVERLAP */
   /* There's no overlap with any existing rectangle, so we can just
      add this rectangle as-is */
   /* Find the YX-sorted insertion point. Searches will always terminate,
      because the head/tail rectangle is set to the maximum values */
   TempPtr = &DirtyHead;
   while (((DirtyRectangle *)TempPtr->Next)->Top < pEntity->Y) {
      TempPtr = TempPtr->Next;
   }
   while ((((DirtyRectangle *)TempPtr->Next)->Top == pEntity->Y) &&
          (((DirtyRectangle *)TempPtr->Next)->Left < pEntity->X)) {
      TempPtr = TempPtr->Next;
   }
   /* Set the rectangle and actually add it to the dirty list */
   DirtyPtr = &DirtyRectangles[NumDirtyRectangles++];
   DirtyPtr->Left = pEntity->X;
   DirtyPtr->Top = pEntity->Y;
   DirtyPtr->Right = pEntity->X + ImageWidth;
   DirtyPtr->Bottom = pEntity->Y + ImageHeight;
   DirtyPtr->Next = TempPtr->Next;
   TempPtr->Next = DirtyPtr;
}
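The splitting logic in AddDirtyRect can be checked in isolation. The following is a minimal, self-contained C sketch (the rectangle type and the helper name SubtractRect are illustrative, not part of the listing) that subtracts one rectangle from another exactly the way AddDirtyRect does: a vertically clipped left strip, a vertically clipped right strip, and full-width top and bottom strips.

```c
#include <assert.h>

#define MAX(a,b) ((a) > (b) ? (a) : (b))
#define MIN(a,b) ((a) < (b) ? (a) : (b))

typedef struct { int Left, Top, Right, Bottom; } Rect;

/* Subtract Old from New; write the surviving pieces of New into
   Pieces[] and return how many there are (0 to 4). Mirrors the
   left/right/top/bottom cases in AddDirtyRect. */
int SubtractRect(Rect New, Rect Old, Rect Pieces[4])
{
   int n = 0;
   /* No overlap: New survives whole */
   if (Old.Left >= New.Right || Old.Right <= New.Left ||
       Old.Top >= New.Bottom || Old.Bottom <= New.Top) {
      Pieces[0] = New;
      return 1;
   }
   /* Left strip, clipped to the overlap's vertical extent */
   if (Old.Left > New.Left) {
      Pieces[n].Left = New.Left;   Pieces[n].Right = Old.Left;
      Pieces[n].Top = MAX(New.Top, Old.Top);
      Pieces[n].Bottom = MIN(New.Bottom, Old.Bottom);
      n++;
   }
   /* Right strip, clipped the same way */
   if (Old.Right < New.Right) {
      Pieces[n].Left = Old.Right;  Pieces[n].Right = New.Right;
      Pieces[n].Top = MAX(New.Top, Old.Top);
      Pieces[n].Bottom = MIN(New.Bottom, Old.Bottom);
      n++;
   }
   /* Full-width top strip */
   if (Old.Top > New.Top) {
      Pieces[n].Left = New.Left;   Pieces[n].Right = New.Right;
      Pieces[n].Top = New.Top;     Pieces[n].Bottom = Old.Top;
      n++;
   }
   /* Full-width bottom strip */
   if (Old.Bottom < New.Bottom) {
      Pieces[n].Left = New.Left;   Pieces[n].Right = New.Right;
      Pieces[n].Top = Old.Bottom;  Pieces[n].Bottom = New.Bottom;
      n++;
   }
   return n;
}
```

Note that the pieces never overlap one another or the subtracted rectangle, which is precisely the invariant the dirty list depends on.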





[LISTING TWO]

; Assembly language helper routines for dirty rectangle animation. Tested with
; TASM 3.0. Fills a rectangle in the specified buffer. C-callable as:
; void FillRect(char far * BufferPtr, int RectHeight, int RectWidth,
; int BufferWidth, int Color);
 .model small
 .code
parms struc
 dw ? ;pushed BP
 dw ? ;pushed return address
BufferPtr dd ? ;far pointer to buffer in which to fill
RectHeight dw ? ;height of rectangle to fill
RectWidth dw ? ;width of rectangle to fill
BufferWidth dw ? ;width of buffer in which to fill
Color dw ? ;color with which to fill
parms ends
 public _FillRect
_FillRect proc near
 cld
 push bp
 mov bp,sp
 push di

 les di,[bp+BufferPtr]
 mov dx,[bp+RectHeight]
 mov bx,[bp+BufferWidth]
 sub bx,[bp+RectWidth] ;distance from end of one dest scan
 ; to start of next

 mov al,byte ptr [bp+Color]
 mov ah,al ;double the color for REP STOSW
RowLoop:
 mov cx,[bp+RectWidth]
 shr cx,1
 rep stosw
 adc cx,cx
 rep stosb
 add di,bx ;point to next scan to fill
 dec dx ;count down rows to fill
 jnz RowLoop

 pop di
 pop bp
 ret
_FillRect endp
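The SHR CX,1 / REP STOSW / ADC CX,CX / REP STOSB sequence fills each scan a word at a time, then picks up the odd byte, if any, from the carry bit left by the shift (REP STOSW does not disturb the flags). A C rendering of the same fill, offered as a sketch for illustration only (the listing's real routine works on far pointers and is called from C as _FillRect), makes the word/odd-byte split explicit:

```c
#include <assert.h>
#include <string.h>

/* Fill a RectHeight x RectWidth rectangle in a buffer whose scans are
   BufferWidth bytes apart, as the assembly _FillRect does. */
void FillRectC(unsigned char *BufferPtr, int RectHeight, int RectWidth,
               int BufferWidth, int Color)
{
   unsigned char c = (unsigned char)Color;
   unsigned short pair = (unsigned short)(c | (c << 8)); /* MOV AH,AL */
   while (RectHeight-- > 0) {
      unsigned char *dst = BufferPtr;
      int words = RectWidth >> 1;  /* SHR CX,1 */
      int odd   = RectWidth & 1;   /* the carry ADC CX,CX picks up */
      while (words-- > 0) {        /* REP STOSW */
         memcpy(dst, &pair, 2);
         dst += 2;
      }
      if (odd)                     /* REP STOSB (0 or 1 byte) */
         *dst = c;
      BufferPtr += BufferWidth;    /* ADD DI,BX: next scan */
   }
}
```

Since both bytes of `pair` hold the same color, the word-at-a-time store is byte-order independent.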

; Draws a masked image (a sprite) to the specified buffer. C-callable as:
; void DrawMasked(char far * BufferPtr, char * Pixels, char * Mask,
; int ImageHeight, int ImageWidth, int BufferWidth);
parms2 struc
 dw ? ;pushed BP
 dw ? ;pushed return address
BufferPtr2 dd ? ;far pointer to buffer in which to draw
Pixels dw ? ;pointer to image pixels
Mask dw ? ;pointer to image mask
ImageHeight dw ? ;height of image to draw
ImageWidth dw ? ;width of image to draw
BufferWidth2 dw ? ;width of buffer in which to draw
parms2 ends
 public _DrawMasked
_DrawMasked proc near
 cld
 push bp
 mov bp,sp
 push si
 push di

 les di,[bp+BufferPtr2]
 mov si,[bp+Mask]
 mov bx,[bp+Pixels]
 mov dx,[bp+ImageHeight]
 mov ax,[bp+BufferWidth2]
 sub ax,[bp+ImageWidth] ;distance from end of one dest scan
 mov [bp+BufferWidth2],ax ; to start of next
RowLoop2:
 mov cx,[bp+ImageWidth]
ColumnLoop:
 lodsb ;get the next mask byte
 and al,al ;draw this pixel?
 jz SkipPixel ;no
 mov al,[bx] ;yes, draw the pixel
 mov es:[di],al
SkipPixel:
 inc bx ;point to next source pixel
 inc di ;point to next dest pixel
 dec cx
 jnz ColumnLoop
 add di,[bp+BufferWidth2] ;point to next scan to fill

 dec dx ;count down rows to fill
 jnz RowLoop2

 pop di
 pop si
 pop bp
 ret
_DrawMasked endp
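_DrawMasked tests each mask byte and copies the corresponding source pixel only where the mask is nonzero, so transparent parts of the sprite leave the background untouched. The equivalent C, shown as an illustrative sketch (the listing calls the assembly version for speed; the name DrawMaskedC is hypothetical), is:

```c
#include <assert.h>

/* Draw a masked image (a sprite): copy Pixels[X] to the buffer wherever
   Mask[X] is nonzero, stepping BufferWidth bytes per destination scan,
   as the assembly _DrawMasked does. */
void DrawMaskedC(unsigned char *BufferPtr, const unsigned char *Pixels,
                 const unsigned char *Mask, int ImageHeight, int ImageWidth,
                 int BufferWidth)
{
   int X, Y;
   for (Y = 0; Y < ImageHeight; Y++) {
      for (X = 0; X < ImageWidth; X++) {
         if (Mask[X])                /* AND AL,AL / JZ SkipPixel */
            BufferPtr[X] = Pixels[X];
      }
      Pixels    += ImageWidth;       /* next source scan */
      Mask      += ImageWidth;
      BufferPtr += BufferWidth;      /* next destination scan */
   }
}
```

A byte-per-pixel mask costs memory, but it keeps the inner loop branch simple; packing the mask into bits would halve the test to a shift and carry check at the cost of extra bookkeeping.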

; Copies a rectangle from one buffer to another. C-callable as:
; void CopyRect(DestBufferPtr, SrcBufferPtr, CopyHeight, CopyWidth,
; DestBufferWidth, SrcBufferWidth);

parms3 struc
 dw ? ;pushed BP
 dw ? ;pushed return address
DestBufferPtr dd ? ;far pointer to buffer to which to copy
SrcBufferPtr dd ? ;far pointer to buffer from which to copy
CopyHeight dw ? ;height of rect to copy
CopyWidth dw ? ;width of rect to copy
DestBufferWidth dw ? ;width of buffer to which to copy
SrcBufferWidth dw ? ;width of buffer from which to copy
parms3 ends
 public _CopyRect
_CopyRect proc near
 cld
 push bp
 mov bp,sp
 push si
 push di
 push ds

 les di,[bp+DestBufferPtr]
 lds si,[bp+SrcBufferPtr]
 mov dx,[bp+CopyHeight]
 mov bx,[bp+DestBufferWidth] ;distance from end of one dest scan
 sub bx,[bp+CopyWidth] ; of copy to the next
 mov ax,[bp+SrcBufferWidth] ;distance from end of one source scan
 sub ax,[bp+CopyWidth] ; of copy to the next
RowLoop3:
 mov cx,[bp+CopyWidth] ;# of bytes to copy
 shr cx,1
 rep movsw ;copy as many words as possible
 adc cx,cx
 rep movsb ;copy odd byte, if any
 add si,ax ;point to next source scan line
 add di,bx ;point to next dest scan line
 dec dx ;count down rows to fill
 jnz RowLoop3

 pop ds
 pop di
 pop si
 pop bp
 ret
_CopyRect endp
 end

February, 1993
PROGRAMMER'S BOOKSHELF


Roaming the Internet




Ray Duncan


Every year of her life, Laura thought, the Net had been growing more expansive
and seamless. Computers did it. Computers melted other machines, fusing them
together. Television-telephone-telex. Tape recorder-VCR-laser disk[sic].
Broadcast tower linked to microwave dish linked to satellite. Phone line,
cable TV, fiber-optic cords hissing out words and pictures in torrents of pure
light. All netted together in a web over the world, a global nervous system,
an octopus of data. There'd been plenty of hype about it. It was easy to make
it sound transcendently incredible.
--Islands in the Net, Bruce Sterling, 1988
The vision in Bruce Sterling's superb book, written as science fiction and set
in the year A.D. 2023, looks likely to materialize at least a couple of
decades sooner. Multi-MIPS single-chip processors, voice- and pen-recognition
technology, high-speed fiber-optic backbones, wireless networking, pervasive
cellular-telephone access, satellite-based position determination, experiments
with virtual reality, the decline of traditional literacy, and, of course, the
Internet--all these factors and a host of others are rapidly converging to
create a new way of life that even we computer geeks can barely imagine. And
when this Brave New World arrives, the Internet or its successor will
undoubtedly be the foundation.
The very name of the Internet carries with it a certain mystique and even, I
might say, a certain dread among DOS and Mac users. Many historical factors
contribute to this, not the least of which is the Internet's traditional BSD
UNIX power base. Indispensable network utility programs such as ftp, telnet,
and rn are all prime examples of the well-known UNIX tendencies toward
counterintuitive command syntax and cryptic, case-sensitive switches. Another
extremely important factor has been the chronic lack of well-organized,
well-written documentation for the Internet at either the user or the
technical level. Most Internet lore has been carried around in the heads of
networking gurus or stashed away in documentation files that were only
accessible via--you guessed it--the Internet.
What is the Internet, anyway? The Internet is a network of networks, based on
the TCP/IP protocol, composed largely of Sun and DEC VAX hosts interfaced to
special-purpose message-processing computers, routers, and bridges. The
Internet had its origins in packet-switching experiments financed by the
Department of Defense, and its high-speed backbones are still heavily
subsidized by the government, but there is no centralized administration
except for the assignment of computer node or "host" identification numbers
and names. The Internet is run by, and evolves through, an odd sort of
participatory technocracy. Most important technological decisions are made by
an inner cabal of network wizards and hackers that dates back nearly two
decades, but its mandate is based on grass-roots support from the
administrators and users of the participating networks and, of course, an
impressive track record.
The Internet started with just a few computer hosts in the early '70s, reached
100 hosts in about 1980, 1000 hosts in 1984, 10,000 hosts in 1987, 100,000
hosts in 1989, and was nearing a million hosts at the end of 1992. It now
reaches every corner of the globe, including the ex-Communist bloc. Of course,
this kind of exponential growth can't continue forever, but it's reasonable to
assume that the number of directly connected hosts will expand by at least
another couple of orders of magnitude before the curve starts to flatten out.
As it is, the Internet is already much more pervasive than most people
realize. If you use a computer on a local area network in a large company, or
you own a modem and subscribe to any of the popular online services, you very
likely can reach the Internet, even if you aren't aware of it. For example,
CompuServe, BIX, and MCI Mail all offer gateways to the Internet.
So assuming that the Internet is indeed at your disposal, what can you do with
it? For a start, the Internet provides free, or nearly free (from the end
user's point of view anyway), electronic mail to any user on any connected
host--with, for the most part, astonishingly prompt and reliable service.
Another popular service available via the Internet is the so-called
USENET--somewhat analogous to a bulletin-board system with hundreds of
conferences on every conceivable topic, but message postings are automatically
distributed throughout the network in near-real time. (USENET is not
synonymous with the Internet, though; its conferences are also propagated by
several other mechanisms.) Additional Internet facilities that may interest
you include file servers with massive collections of public-domain programs
and data files, the archie file finders, the gopher distributed-information
retrieval system, and the World-Wide-Web hypertext servers.
Getting started with the Internet can be very baffling, especially if you
don't have the benefit of coaching by some experienced user. Even the simplest
dabblings in Internet waters can confront you with software that is almost
unbelievably aggravating by DOS or Mac standards. (For example, the first time
you run a USENET "news reader," it will automatically assume that you want to
subscribe to every single one of the existing conferences. You can only get
rid of the ones you aren't interested in by manually "unsubscribing" them one
by one, or by using the hideous vi editor to modify a hidden configuration
file.) Fortunately, the explosive growth of the Internet has finally attracted
the attention of the trade-book publishers, and at least a dozen reasonably
good books about the Internet have appeared within the last year. I've picked
three user-oriented books to discuss in this installment of the "Programmer's
Bookshelf," and will continue with a sampling of more technically oriented
books in a later issue of DDJ.
Zen and the Art of the Internet is a concise and well-focused introduction to
the Internet, directed at the computer literate and to some degree at the UNIX
literate. This book is almost ideal for the DDJ type of reader; it can be
assimilated in half an hour, and it will get you off the launching pad with
all of the crucial networking programs and facilities. Regrettably, although
the book is extremely useful, it's not very good. It was patched together out
of a variety of Internet samizdat documents, so the writing and editing are
uneven. Technojargon, insider references, and gratuitous admonishments are
rampant. Moreover, the book was apparently designed and typeset by amateurs;
the wide availability of tools like TeX and troff on UNIX systems has
definitely been a double-edged sword.
The Whole Internet User's Guide & Catalog, by Ed Krol, is much more
comprehensive than Zen and also takes much less for granted. The book starts
with an explanation of the Internet, how it came to be, how it works, and what
you're allowed to do on your Internet connection. It continues with detailed
chapters on mail and finding and retrieving remote files, and winds up its
narrative with a beautiful essay on network problem solving. The final part of
the book is devoted to an annotated description of some of the more
interesting Internet databases, news groups, and other resources, a directory
of Internet providers, an international Internet addressing guide, and a
glossary. I can't possibly praise this book too highly; it should serve as a
model for technical writers and publishers everywhere. The writing, editing,
and production are simply splendid.
!%@:: The Directory of Electronic Mail Addressing and Networks is the perfect
coffee-table book or holiday gift for the Internet hacker that you love. It is
basically a yellow-pages directory to approximately 130 of the Internet's
participating networks circa 1990, with maps, mail-addressing guidelines,
contact information for network administrators, and miscellaneous technical
factoids such as the speed and character of each network's links to the
Internet backbone. The book is interesting in a nerdy sort of way, although
it's hard to see how most of it could be useful to anyone but another network
administrator, and much of the book's contents must have been outdated nearly
as soon as it left the printer. [Editor's note: A third edition of this book
should now be available, but as we were going to press the price was as yet
undetermined.]

February, 1993
OF INTEREST





PARTS, Digitalk's new Parts Assembly and Reuse Tool Set technology, looks to
be a powerful, yet straightforward, tool that allows you to quickly create
applications from prefabricated software components in an open,
language-neutral manner. Although PARTS itself is written in Smalltalk,
components can be written in C, C++, Smalltalk/V, or other languages. PARTS
also includes a built-in Smalltalk-like scripting language.
The PARTS Workbench consists of a catalog of prebuilt visual and nonvisual
components and a workbench window. To create an application, simply drag parts
from the catalog into the workbench and then wire them together by drawing
lines between them. In one demo DDJ saw at OOPSLA '92, Digitalk
representatives created a full-function text editor in literally a matter of
minutes by simply "connecting" prefabricated parts.
The PARTS Workbench for OS/2 costs $1995.00. Reader service no. 20.
Digitalk Inc. 9841 Airport Boulevard Los Angeles, CA 90045 310-645-1082
The DynaMind Developer neural-network software package from NeuroDynamX
bundles neural-network training software with several runtime options,
including linkable C routines and neural-network hardware simulation. DynaMind
Developer guides programmers with little neural-network experience through the
process of creating, training, and embedding neural networks.
Network architecture and training parameters are set using pop-up menus and
graphical displays. Training methods supported include a proprietary algorithm
for solving forecasting and sequence-recognition problems; training data can
be taken from spreadsheets and databases.
Networks trained with DynaMind can be run on a stand-alone basis, embedded
into C programs, or used in simulations of neural-network hardware. For
embedding runtime neural networks, the NeuroLink C library includes routines
to load, probe and test networks, and to link multiple networks serially or in
parallel. Networks can be linked in a modular fashion to solve complex
problems.
For putting networks into hardware, DynaMind Developer lets you simulate the
Intel 80170NX Electrically Trainable Analog Neural Network chip. You can
prototype this chip before committing resources for hardware. Networks trained
in simulation mode can later be written to the 80170NX chip using iDynaMind on
the Intel Neural Network Training system. The trained 80170NX chip can run
networks up to 40,000 times faster than software-only networks.
Also included is DftBuild, which creates ready-to-run neural-network filters
that perform a discrete Fourier transform; the filters are embedded into
applications using the NeuroLink library.
The package costs $495.00; separately, DynaMind costs $145.00. Reader service
no. 21.
NeuroDynamX Inc. P.O. Box 323 Boulder, CO 80306 303-442-3539
Iconic Query from IntelligenceWare is an iconic database-access tool that lets
the user point and click at visual representations instead of using a query
language. It comes with a library of icons that can be assigned to different
tables. Relationships are assigned by drawing connecting lines between the
icons, and SQL statements are automatically generated and shown in a separate
window.
Iconic Query uses an extended entity relationship and object model on top of a
SQL system. The mapping between the iconic and SQL layers is also done via
point and click, and entities and relationships are represented by bitmaps,
drawings, and icons. Iconic Query's open architecture merges data from
Paradox, Oracle, Sybase, DB2, and others.
Iconic Query sells for $290.00. Reader service no. 22.
IntelligenceWare Inc. 5933 West Century Boulevard Los Angeles, CA 90045
310-216-6177
Brigent has released BootBug, a Macintosh debugger for all code that loads
before MacsBug and the operating system, such as NuBus, SCSI, ADB, Network,
video drivers, primary inits, and accelerators. BootBug's command set lets you
set conditional breakpoints on A-Traps and addresses, disassemble your
driver's code, display and set memory, and single-step through RAM and ROM.
You can watch your driver execute, one instruction at a time, and debug video
and ADB drivers without interference.
The debugger in BootBug's declaration ROM is loaded into the system before any
other software by plugging the BootBug NuBus card into the first slot of the
system. Because BootBug is loaded before the display or ADB driver is
initialized, it must send the debugging information to an external terminal
through its on-board serial port.
BootBug uses the same command set as MacsBug and costs $450.00. Reader service
no. 23.
Brigent 684 Costigan Circle Milpitas, CA 95035 408-956-1234
Portable Basic for Sun 3 and 4 workstations running SunOS 4.1 is available
from Software Engineering Associates. Compatible with GW-Basic, Portable Basic
provides a migration path from PCs to Sun workstations. Programs compiled with
Portable Basic run several hundred times faster than those interpreted with
GW-Basic.
Portable Basic includes color graphics capabilities, the ability to print to
the screen in multiple fonts, and support of variable-length strings, 32-bit
integers, and single- and double-precision floating-point numbers. Program
size is unlimited, and arrays are not subject to PC-architecture limitations.
Long variable names and source-code lines are supported.
Portable Basic simplifies text processing by not requiring declaration of
maximum string lengths, and high performance is achieved for string operations
using optimization techniques that minimize memory references and generate
common string operations inline.
Prices start at $600.00 for a single-user license. Reader service no. 24.
Software Engineering Associates P.O. Box 7396 Nashua, NH 03060 603-672-1160
Pixel Press is a 32-bit RISC and video-display system from Applied Data
Systems. Within its 3.1x5.1-inch dimensions, the Pixel Press module contains a
32-bit RISC processor (130-ns cycle time), 786 Kbytes of EPROM, a parallel
interface, an RS-232 debug port, a voltage supervisor, and a watchdog timer.
The power requirement is 700 mA at 5 volts and the video output provides
1024x768 non-interlaced resolution with 16 colors.
The module connects directly to either a single-chip processor, an external
bus processor, or a standard Centronics or IBM printer port. User C or
assembly language software can be placed in RISC EPROM for execution on the
internal RISC processor. Internal execution allows the parallel interface to
serve as a user data bus for the control of over 64 user I/O registers.
Pixel Press costs $325.00. A demo kit including a 14-inch color monitor, power
supply, and prototype board costs $750.00. Source code and development tools
are available. Reader service no. 25.
Applied Data Systems Inc. 409A East Preston Street Baltimore, MD 21202
800-541-2003 or 410-576-0335
Tom Sawyer Software is shipping the Graph Layout Toolkit, a topology
management tool for improving the graphics output of various applications via
its automated layout services. The Graph Layout Toolkit is bundled directly
into an application and helps the application clearly represent extensive and
complicated data by implementing object-oriented programming techniques. The
Graph Layout Toolkit can be used for flow charts, telecommunications-network
maps, hypertext navigation, class hierarchies, relational databases, CASE
software, project-management software, entity-relationship diagrams,
function-call graphs, AI rule graphs, and PERT charts.
The Graph Layout Toolkit was written as a set of extensible class libraries
with C++ and ANSI C APIs. The internal developer's release costs $499.00 for
Macintosh and Windows platforms, $699.00 for UNIX. Reader service no. 26.
Tom Sawyer Software Corp. 1824B Fourth Street Berkeley, CA 94710 510-848-0853
COMPEDITOR II is a finite-state program compiler for C and Pascal from AYECO.
It lets you quickly build event-driven, finite-state-automaton, and
decision-table programs. It forms a program's state table with up to 10,000
cells. This table allows the developer to pick and choose a program's response
to any event. The output is an ASCII source file; the source includes a
commented copy of the program's state table.
In simple applications, the designer enters into each state-table cell the
name of a procedure to execute and the program's next state if the conditions
represented by the cell occur during run time. If descriptive names are
assigned to procedures, the state table documents the program's operation
concisely. In complicated applications, the developer can insert source code
directly into an expanded table cell or the state program itself with the
COMPEDITOR's editor.
Any program can be represented as a state program: Events can be derived from
external or internal conditions; a state program can be formed from a
flowchart or a state-transition diagram, or by dividing it into its
operational states.
COMPEDITOR retails for $300.00. Reader service no. 27.
AYECO Inc. 5025 Nassau Circle Orlando, FL 32808 407-295-0930
Now available from National Design is Volante Warp 10, an ISA board that
speeds up Windows performance by as much as 50 times. This is possible because
of the Volante's fast-forward driver optimization, which compacts and reduces
transmission of bits and vectors without losing the integrity of the original
image.
Volante Warp supports 1152x900, 1024x768, and lower Super-VGA resolutions,
with up to 16.7 million colors available. It has a connector to tie to
multimedia capture devices and a Windows instant presenter system. The price
is $299.00. Reader service no. 29.
National Design Inc. 1515 Capital of Texas Highway South Fifth Floor Austin,
TX 78746 512-329-5055
SET Laboratories has announced version 4.0 of PC-METRIC for C, its software
measurement and analysis package. New to this version are additional metrics,
reports, and a revamped interactive query and analysis system that lets you
track metrics across releases. This makes it easy to allocate code-review and
testing resources and examine trends in the code as it evolves.
PC-METRIC uses software complexity metrics to identify those parts of a
program's source code that are particularly complex and therefore more
susceptible to errors. Most of the review and testing resources can then be
allocated to those parts of the code.
PC-METRIC supports Ada, assembly language, C, C++, Cobol, dBase, Fortran,
Jovial, Modula-2, Pascal, and PL/1. The price is $399.00 for a single-user
license. Reader service no. 28.
SET Laboratories Inc. P.O. Box 868 Mulino, OR 97042 503-829-7123

February, 1993
SWAINE'S FLAMES


The Redmond Horror




Michael Swaine


The cold mists creep in from the sea and the unrelenting rains keep that
northwestern valley perpetually blanketed in a dank, unwholesome shroud.
Perhaps it is only this that makes me shudder when I think of that place, or
perhaps it is the ineradicable memory of what I saw on that dreary morning in
December.
I was working in my library on an obscure point of computer lore when my
cousin Corbett dropped in. He had recently taken on the responsibility of
conducting a science-fiction and fantasy writing workshop for computer
programmers in Silicon Valley, he explained, and had come seeking advice on
writing assignments.
Unsure what sort of advice he wanted and wishing to get back to my work, I
pressed upon him the first book that came to hand, Manes and Andrews'
biography of Bill Gates. At first skeptical, he began to brighten as he paged
through the volume.
"Yes, yes, I think I can use this," he said. I watched him go and thought no
more about the matter.
The winds whipped up from the seacoast to my mountain retreat that December,
driving the coastal fog even up to my little aerie and putting me in mind of
another, more sinister, place. One overcast morning, my cousin Corbett
returned with a manila envelope and a bemused expression.
"I gave them the old alternate-universe, what-if assignment," he said,
spreading the contents of the envelope across my desk. "I put the Gates bio on
reserve and told them to consider the consequences of changing one key detail
in the history of Microsoft."
"Sounds fascinating," I muttered, extracting my notes from under his papers.
"They apparently thought so. They turned in some interesting stories, but
there's something about them--well, see for yourself."
It was clear that I wouldn't be rid of him until I had read his students'
stories, so I picked up the nearest, settled back in my chair, and began. I
read: "'No, you may not drop out of college to start a software company,'"
Mary 'Giggles' Gates told her eager, bespectacled son Bill...."
I'll spare you the student prose. The story told of bespectacled Bill's years
at Harvard, culminating in a meeting in his senior year with another student,
Dan Bricklin, who has an idea for a software product. The two form a company,
VisiCalc is a spectacular success, and Gates emerges as the billionaire
puppetmaster of the software universe. I turned to the next story.
This story assumed that Gary Kildall did not go flying when IBM came to
Digital Research for an operating system for its new personal computer. As a
result, DRI rather than Microsoft becomes IBM's system-software partner. Soon,
though, Kildall, tired of CP/M, throws his company's efforts into a new
operating system IBM proposes; immediately, DRI is mired in OS/2 and Microsoft
is selling a CP/M alternative called MS-DOS to clonemakers and plowing the
profits into a marketing blitz for a new graphical operating environment
called Windows. Gates becomes a billionaire and Microsoft dominates the
industry. I started flipping through the other stories.
In one, Apple wins its look-and-feel suit against Microsoft. This forces
cosmetic changes in Windows that marketing spins as a new and improved
Windows, for a huge boost in sales.
In another, IBM buys Microsoft, only to spin it off as a separate company in
its 1992 reorganization. Gates rides out the changes and prospers.
Three stories had Ross Perot buying Microsoft, something he actually
considered and now regrets passing up. In one, he takes the company public,
then leaves in a huff over a disagreement with the board of directors. In
another, he is forced out by IBM, which wants to work with Microsoft but has
unpleasant memories of Perot. In the third, he resigns to run for President.
In each story, Bill Gates steps in at the critical juncture and both he and
Microsoft prosper beyond the dreams of avarice.
They were all like that. MITS retains control of Microsoft Basic, but Gates,
noticing that the entire microcomputer word-processing market is owned by one
programmer who started business with an unlisted phone number, borrows money
from his family, and buys Michael Shrayer's Electric Pencil, beefing up the
company and the product; WordStar never happens. Young system-hacker Gates is
arrested and does time in a juvenile correctional facility in Washington, but
uses all the free time to write versions of Basic for every known
microprocessor and sell them by mail.
"Here's the one that gives me the willies," Corbett said, passing me the last
of the stories.
It was a long, rambling narrative, starting out in the European theater of
World War II and ending in an obscure laboratory in the Pacific northwest. But
I followed it with unhealthy fascination and a growing sense of horror,
reading of the young second lieutenant tragically killed in battle, the
wealthy family intent on keeping its line alive, the unspeakable experiment
performed in that noxious laboratory just before Halloween in 1955, and the
strange child that was the result. I threw it into the fire.
The wind is rising again tonight, and I sit here in my library staring out
into the impenetrable fog, musing on the unknowable designs of a strange and
terrible universe.



March, 1993
EDITORIAL


Sound Bites




Jonathan Erickson


After more than two years as a regular in our "Programmer's Bookshelf"
section, Andrew Schulman is changing DDJ hats, putting aside his book beat to
launch a column called the "Undocumented Corner." Andrew, whose books
Undocumented DOS and Undocumented Windows have made him the de facto guru of
undocumented operating-system features, will be examining programming
interfaces for the likes of DOS, Windows, OS/2, networks, and more.
However, "Undocumented Corner" will be different from columns you'll find in
this and other magazines. While Andrew will write many of the articles
himself, he'll also be working with you to explore undocumented features
you've uncovered. To this end, "Undocumented Corner" is, in the finest DDJ
tradition, a forum where you can interact and share ideas and techniques with
your fellow programmers. This month, for example, Andrew presents Joe Newcomer
and Bruce Horn's investigation into the undocumented RGNOBJ structure of
Microsoft Windows.


John Kemeny, R.I.P.


With the recent passing of John Kemeny, the programming community lost yet
another pioneer. Along with fellow Dartmouth College professor Thomas Kurtz,
Kemeny was the co-inventor in 1964 of the Basic programming language.
Before his stint at Dartmouth (which culminated with his being president from
1970-1981), Kemeny worked on the Manhattan Project, and later served as a
research assistant to Albert Einstein while studying at Princeton. Near the
end of his Dartmouth career, Kemeny chaired the federal commission (known as
the Kemeny Commission) that investigated the Three Mile Island power plant
nuclear accident in Pennsylvania in 1979.
Still, it's Kemeny and Kurtz's research into programming languages that's
relevant to us. While another programming language would surely have come
along, the world of computing would be much different today if Kemeny and
Kurtz hadn't developed the easy-to-grasp language they called the "Beginner's
All-purpose Symbolic Instruction Code," or Basic for short.
You could, in fact, argue whether or not the PC industry would have gotten out
of the gate with such fury had not Basic been there to jumpstart it. (Can you
imagine programming the early TRS-80s, Apples, or Commodore PETs in ROM-based
Fortran or Cobol, for example?)
Nor would Bill Gates and Paul Allen have implemented a version of the language
for the Altair, leading us to wonder whether Microsoft would be the software
giant it is--if it existed at all.
And, of course, without Kemeny and Kurtz's efforts, Bob Albrecht and Dennis
Allison would never have created Tiny Basic--and then launched Dr. Dobb's
Journal of Computer Calisthenics & Orthodontia to present their implementation
to the world.
With all of the current hubbub over C++ and its object-oriented cousins, it's
easy to forget that Basic was one of the keystones of an industry that daily
continues to change the way we live. Kemeny's work made a difference, and we
should all be glad he did what he did, when he did.


Collar ID


The 1100-mile-long Iditarod Trail Sled Dog Race is underway in Alaska, leading
me to think that even embedded systems are going to the dogs.
For the first time, race organizers have resorted to computer technology as a
means of keeping things on track. In the past, individual dogs have been
identified by numbers painted on their backs, making it relatively easy for
unprincipled mushers to substitute fresh dogs at opportune times. Starting
this year, however, veterinarians are injecting computer chips the size of a
grain of rice under each dog's skin using syringes equipped with fat,
hollow-tipped needles. Race officials then pass a hand-held scanner over each
of the 1600 dogs in the race to read their ID codes, which are displayed on
the scanner's LCD. Officials can cross-reference the IDs to a database to
ensure that Fido is indeed Fido, not Bruno.
The computer technology is admittedly being kept on a short leash. The tiny
read-only chips (manufactured by Avid Inc.) store only about four bytes of
data. The EPROM chips consist of a minute ferrite coil, a capacitor, and a
custom IC that attends to power supply, clock signals, memory sequencing, and
output. The IC is encoded by the manufacturer and encapsulated within
biocompatible glass. The one-pound, battery-powered scanner (also made by
Avid) sends RF wakeup signals to the passive chip, which returns the
preprogrammed code in less than 0.04 seconds. The scanner can store up to 450
ID reads and, via an RS-232 interface, send the data to a PC.
Actually, there's nothing new about this application of embedded-systems
technology, known as "transcutaneous animal identification." Services like
"InfoPet," "DataPet," and "PETtrac" will, for $30.00 or $40.00, implant an ID
chip into your pet and register your name with a listing service. If you're
interested, you can find out more about all this in recent issues of Dr.
Dogg's Journal.




March, 1993
LETTERS







Spatial Update


Dear DDJ,
Further to our article, "Spatial Data and the Voronoi Tessellation" (DDJ,
December 1992), we would like to report a significant new development to those
developing geographic applications.
The U.S. Defense Mapping Agency, in cooperation with their counterpart
agencies in Canada, the U.K., and Australia, have recently released the
Digital Chart of the World (DCW). This "chart" consists of over 1.5 gigabytes
of reasonable-quality vector data distributed on four CD-ROMs. The set
provides worldwide coverage of 17 themes, including coast-lines, rivers,
roads, railways, airports, cities, towns, spot elevations, and depths, and
over 100,000 place names. Much of the vector data has been digitized from the
1:1,000,000 scale Operational Navigation Charts (ONCs) used by flyers the
world over. The price tag? $200.00!
For the first time, developers everywhere can affordably include general
geographic objects in their applications. Because the data is in vector
format, with varying amounts of work, proper spatial objects can be
constructed. Then spatial relationships between them and other
application-specific objects can be calculated. But even if an application
might not need this data to construct spatial objects, all can use data like
this for graphic user-orientation purposes.
The PC-access software provided (VPFVIEW) permits you to browse the world,
selecting location, scale, and theme. Selected features can be extracted and
used in your own application, without copyright violation. The access software
is written in C and source is provided. To be sure, the access software is
painfully slow and at times unstable. But no matter, that can be remedied
easily enough. The main thing is that general geographic vector data is now
available at a truly remarkable price! Besides the CD-ROMs (ISO-9660
compatible) and the 80286/287 diskettes, you get installation instructions and
a user's manual. Sources are provided for technical information describing the
external data structures.
Here are four sources of the DCW:
U.S. Geological Survey
P.O. Box 25286
Denver Federal Center
Denver, CO 80225
Digital Distribution Services
Energy, Mines, and Resources Canada
615 Booth Street
Ottawa, ON K1A 0E9
Canada
Director General of Military Survey
(Survey 3)
Elmwood Avenue
Feltham, Middlesex TW13 7AH
United Kingdom
Director of Survey, Australian Army
Department of Defense
Campbell Park Offices (CP2-4-24)
Campbell ACT 2601
Australia
Hats off to all four governments for this latest "open skies" initiative.
John Russell
Calgary, Alberta
Canada


Adamantly Ada


Dear DDJ,
Let me first say that I really enjoy your magazine. Keep up the outstanding
work on your indispensable publication--it is the best of its type anywhere!
However, I was disappointed that your annual issue on object-oriented
programming did not address the current enhancements being made to Ada 9X.
Perhaps you can address this next year when the revision is finalized.
Speaking of languages, I'd like to say a few words about the
uncharacteristically biased presentation in your October 1992 issue, "Safe
Programming with Modula-3" by Sam Harbison. In his zeal to promote his
favorite language, Mr. Harbison doesn't look before he leaps. Specifically, he
says, "Even Ada ... does not protect you against dangling pointers...." On the
contrary, Ada specifically addresses dangling pointers and does provide some
measure of protection. Pointers in Ada (access types) are automatically set to
null when instantiated, and reset to null when the user exits the block they
are declared in. (Of course, it's still possible to dangle pointers within the
declared block, but that's why we train programmers, right?) Furthermore, to
suggest that Modula-3 is more "safe" than Ada is ludicrous! Any runtime type
checking is, by definition, less safe than the static type checking required
by Ada.
Sam declares that there are many features "already found" in Modula-3, and
then insinuates that current standard revisions in Ada and C++ are playing
catch-up. This is a rather immature stance to take. Since Modula-3 is a new
language, one would expect it to incorporate modern concepts. Just because it
is now being "released" while Ada and C++ are being revised doesn't make it a
better language! All useful languages are defined by
standards which, at any point in time, are always in a state of flux. To
assign some sort of qualitative value on this unrelated temporal relationship
is meaningless, especially if the subject features are in the process of being
implemented anyway (which the author points out). If the author's opinion was
that these features are currently supported in Modula-3 and not in the other
languages, this point could have been adequately expressed sans the subtle
religious overtones.
As the article failed to point out, there are several features of both Ada and
C++ that are not found in Modula-3. In addition, several of the language
features of Modula-3 and C++ were borrowed from Ada (exceptions, generics,
separate compilation, etc.). In my opinion, the option to use multiple
inheritance in C++ is one of its greatest assets because a design may be
implemented directly by the language. And Ada's support of concurrent
programming, low-level programming, built-in attributes, and support of
programming in the large are unmatched by any popular language. Furthermore,
both Ada and C++ do not specify automatic garbage collection for a reason--not
only are there many different ways to implement a free-list scheme, but also
language-imposed garbage collection is always machine-dependent, restricts
flexibility, and exacts a runtime cost. Lastly, the automatic initialization
of variables can be just as dangerous as no initialization (which a compiler
can catch), by potentially creating a false sense of security.
Aside from the biased presentation, I found Modula-3 interesting. Like any
language, it has its place depending on the application. However, if the
author's motive was to truly compare the relative utility of a language, a
more useful question to ask would be, "Is there anything I cannot do when
using this language?" I can answer this question positively for every language
I have ever been exposed to and certainly for Modula-3.
In closing, let me say that I advocate the diversity of expression allowed by
the existence of many programming languages. I have never understood the need
of some to extoll the virtues of their pet language at the expense of other
languages (except, of course, to use the other language to better explain the
features of the subject language in an objective fashion). All languages have
varying levels of support for different features. I prefer (and expect) to
read an article that takes a positive approach to introducing these features
over one which seeks to establish some sort of childish ranking scheme.
Judging from the usual high caliber of the letters printed, the readers of Dr.
Dobb's are a little more sophisticated than this article seems to presume. In
the future, just the (correct) facts, ma'am.
Spencer Roberts
Redondo Beach, California


Ada, C, & the Infamous MOD Function



Dear DDJ,
I was surprised to see another letter about the MOD function in the September
1992 issue. Perhaps your readers would be interested in how Ada deals with the
matter. Ada has two operators, rem and mod. rem is the integer-division
remainder operator, and returns the same result as the Pascal mod operator;
the result always has the sign of the left operand. mod implements the
mathematical modulus operation, and the result always has the sign of the
right operand.
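The same distinction can be sketched in C, whose % operator truncates toward
zero and so behaves like Ada's rem; the helper below recovers the mathematical
mod. (The function names are illustrative, not standard library calls.)

```c
/* C's % operator truncates toward zero, so a % b takes the sign of the
 * left operand -- the behavior of Ada's rem (and Pascal's mod). */
int rem_like(int a, int b) { return a % b; }

/* Mathematical modulus: the result takes the sign of the right
 * operand, matching Ada's mod. */
int mod_like(int a, int b)
{
    int r = a % b;
    if (r != 0 && ((r < 0) != (b < 0)))
        r += b;
    return r;
}
```

For example, -7 rem 3 is -1, while -7 mod 3 is 2.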
I was also surprised to see another letter about implementing a generic swap
in C. I should say that I view C as a fairly successful attempt to create a
more-or-less portable assembler, and as such should be considered a low-level
language, or at best a mid-level language. I would reject any claim that C is
a high-level language. Perhaps the many C users among your readers would enjoy
seeing how this operation can be implemented in a high-level language, again
using Ada, as in Example 1.
Example 1

 generic
 type t is private;
 procedure swap (left : in out t; right : in out t);
 procedure swap (left : in out t; right : in out t) is
 temp : t := left;
 begin -- swap
 left := right;
 right := temp;
 end swap;

This will work for any type for which assignment is defined (in the interests
of safety, Ada allows the definition of types for which assignment is not
defined), and it is immediately clear what the code is doing, unlike versions
that make use of multiple exclusive ORs.
Jeffrey R. Carter
Sterling, Virginia


For All the Wrong Reasons


Dear DDJ,
I take exception with Al Stevens's April 1992 "C Programming" column, in which
he states that programmers will not become obsolete. I agree, but not for the
same reasons. There will always be a need for those who dedicate themselves to
a field of expertise.
Al states that, "...a decade later, we are no closer to user-written
programs." Apparently he lives in a closed C environment. My own experience
with Borland's ObjectVision would discount this. There are other examples out
there.
Al uses C++ as his standard programming environment to support his argument
that programmers are still needed. What C++ needs is not programmers but C++
construction experts. Most of the C++ programming effort is not spent writing
a computer program but building the foundation for that program. There are
more efficient methods to create computer programs. More efficient for the
programmer and for the client spending large amounts of money to hopefully get
what he is paying for.
Al states that, "...to be a complete programmer you still need to know how to
design, code, test, and install a program, and for that you need skill,
discipline, and structure, and all that is getting harder, not easier." Does
this mean a doctor or accountant does not require skill, discipline, and
structure? I have known many professionals who have been forced to learn a
computer language, never C, in order to get the required computer program.
Computer programmers may know how to write a program, but damn few know how to
listen to the customer. One thing these professionals do is first select a
computer language that best fits their needs and requirements. Most have
selected Pascal. None have used C.
Because these nonprogrammer professionals took the time to select programming
platforms that efficiently get the job done, patients and clients still have
doctors and accountants who haven't been bankrupted by C++ overkill.
John S. Krill
Santa Ana, California


Linguistic Reflections and Suggestions


Dear DDJ,
Thanks to Al Stevens for "Another Curmudgeon Reflects" (DDJ, December 1992),
an excellent counterpoint to Scott Guthery's piece. I have agonized over the
switch from C to C++; while neither Al nor Scott cured my affliction, both
helped me reflect more intelligently. I suspect the truth lies somewhere
amidst the two perspectives. In any event, the work is appreciated--bravo!
Carson Wilson
Chicago, Illinois
P.S. It's "phylogeny," not "philology."
Al responds: You're absolutely right; I turned an old cliche into a bad joke
and apparently it didn't work.


Touting Toolkits


Dear DDJ,
Having recently evaluated graphical user interface toolkits for our use, it
was interesting to read Ray Valdes's article, "Sizing Up GUI Toolkits" in the
November 1992 issue. I, too, spent time with two of the tested toolkits, but
rejected them in favor of the TEGL Windows Toolkit from TEGL Systems.
Version 2 of the toolkit, the release prior to the current one, contained
everything from fast, low-level graphics routines to complete windows, menus,
and dialogs, as well as a remarkably useful virtual memory manager. With full
source code included, it was one of the best bargains going, at $100.00. On
top of this, the technical support was, and continues to be, excellent. The
current version adds support for protected-mode C compilers while continuing
to support several 16-bit Pascal and C compilers.
This package has given us all that we have needed to produce applications
which perform well and look good. I assume that the reason that this
exceptional product was not included in your test is that you were unaware of
it. Nonetheless, its omission was unfortunate.
Tim Allman
Guelph, Ontario, Canada
Ray responds: Thanks for the suggestion. If we have a follow-up article, they
certainly will be included. Readers who want to contact TEGL Systems Corp. can
do so at 789 West Pender St., # 780, Vancouver, BC, Canada V6C 1H2;
604-669-2577.






March, 1993
THE FLIC FILE FORMAT


A fast and simple file format for graphics and animation


 This article contains the following executables: FLIC.ARC


Jim Kent


Jim is the author of Autodesk Animator and can be contacted at Dancing Flame,
305 Dolan Ave., Mill Valley, CA 94941.


Flic files are a sequence of still frames which can be flipped through rapidly
to achieve the illusion of movement--the software equivalent of movies. I
developed the flic file format for the Autodesk Animator. A number of other
multimedia applications, including IBM's Ultimedia Tool Series and Microsoft's
Video for Windows, have since supported it. Although flics provide only 256
colors and don't contain sound, the speed and simplicity of playing them back
has made them the format of choice for many artists and game developers. This
article explains the flic file format and presents C programs that let you
play flic files.
There are two types of flic files: .FLI and .FLC. The .FLI files are older and
are limited to 320x200 resolution. The newer .FLC files use a faster
compression scheme, have a slightly different header, and can have any
resolution. The program presented with this article is capable of playing both
.FLI and .FLC files.


Overview of File Structure


The key idea behind flic delta compression is simple: You only store the parts
of the picture that change from frame to frame. This helps make the file much
smaller than if each frame had to be stored completely. Delta compression also
makes it possible to quickly display a flic, even on a relatively slow display
device, since only the parts of the screen that are changing need to be
updated. Flic files use a simple-minded, run-length compression scheme on the
bits of the picture that do change from frame to frame. The entire first frame
is stored run-length compressed as well. Figure 1(a) shows an overview of the
file structure.
Figure 1: (a) Overview of the flic-file structure; (b) flic-file hierarchy of
chunks.

 (a)

 file header
 optional prefix chunk
 first frame (run-length compressed)
 second frame (delta compressed)
 ...
 nth frame (delta compressed)
 ring frame (delta between nth frame and first)

(b)

 Flic chunk (includes header and rest of file)
 prefix chunk (in .FLC files only)
 settings chunk (Animator Pro private area)
 position chunk (Offset of cel into screen)
 frame 1 chunk
 postage stamp (icon first frame, .FLC only)
 color data (256 colors)
 pixel data (normally run length encoded)
 frame 2 chunk
 color data (only colors that change)
 pixel data (delta encoded)
...
 ring frame chunk
 color data (only colors that change)
 pixel data (delta encoded)

A flic includes a ring frame so it can be played repeatedly without a
perceptible pause between the last frame and the first. This is necessary
since unpacking the first frame (which is run-length compressed) is generally
much slower than updating the screen from a delta.
Flic files are structured in a hierarchy of chunks. A chunk contains the size
of itself, a type, and some data peculiar to whatever type of chunk it is. The
hierarchy in a flic is three levels deep, as shown in Figure 1(b). The
advantage of the chunk structure is that new types of chunks can be added to
the file format without breaking existing flic readers. A flic player simply
skips over chunks it does not understand or is not interested in.



Reading a Flic


To read a flic, first open the file, read in the 128-byte flic header (see
Table 1), and check that the type field has an appropriate number. If the type
field indicates it is an .FLI file, convert the speed field from 1/70th to
1/1000th of a second. If it's an .FLC file, seek to the first frame position
as defined by the oframe1 field of the header. You can then read in a 16-byte
frame header (see Tables 2 and 3). It's good to check the type field here, as
well, and report an error if it doesn't match hexadecimal F1FA. The size of
the data portion of the frame will vary. You can allocate buffer space
according to the size field of the header minus 16 (since the size field
includes the size of the header) and read the rest of the frame into the
buffer. To decode the data in the buffer, loop through each chunk, dispatching
the ones you recognize to routines specific to that chunk type. Table 4 lists
the chunk types that occur inside a frame.
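The header-checking steps above can be sketched in C. The field offsets and
meanings follow Table 1, but parse_flic_header and the get16/get32 helpers are
our own illustrative names, not routines from READFLIC.C:

```c
#include <stdint.h>

/* Little-endian field extraction from the raw 128-byte header buffer. */
static uint16_t get16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}
static uint32_t get32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

#define FLI_TYPE 0xAF11
#define FLC_TYPE 0xAF12

/* Returns 0 on success, -1 if the type field is not a known flic type.
 * On success *speed_ms holds the frame delay in milliseconds and
 * *oframe1 the seek offset of the first frame (for .FLI files, whose
 * oframe1 field is 0, the first frame follows the 128-byte header). */
int parse_flic_header(const uint8_t hdr[128],
                      uint16_t *type, uint32_t *speed_ms, uint32_t *oframe1)
{
    *type = get16(hdr + 4);
    if (*type == FLI_TYPE) {
        *speed_ms = get32(hdr + 16) * 1000u / 70u; /* 1/70 sec -> ms */
        *oframe1 = 128;
    } else if (*type == FLC_TYPE) {
        *speed_ms = get32(hdr + 16);               /* already in ms */
        *oframe1 = get32(hdr + 80);                /* oframe1 field */
    } else {
        return -1;
    }
    return 0;
}
```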
Table 1: Flic header structure (128 bytes). Fields marked .FLC only are set to
0 in an .FLI.

 Offset Size Name Description

 0 4 size Size of entire file.
 4 2 type File-format identifier. Always hex
 AF12 for .FLC; AF11 for .FLI files.
 6 2 frames Number of frames in flic.
 8 2 width Screen width in pixels.
 10 2 height Screen height in pixels.
 12 2 depth Bits per pixel (always 8).
 14 2 flags Set to hex 0003.
 16 4 speed Time delay between frames. For .FLI
 files, in units of 1/70 second; for .FLC,
 in milliseconds.
 22 4 created MS-DOS-formatted date and time of
 file's creation (.FLC only).
 26 4 creator Animator Pro sets this to serial number
 of copy of program that created flic.
 Safe to set to 0 and ignore (.FLC only).
 30 4 updated MS-DOS-formatted date and time of
 file's most recent update (.FLC only).
 34 4 updater Serial number of program that last
 updated file. See creator (.FLC only).
 38 2 aspectx X-axis aspect ratio of display where file
 was created. A rectangle of dimensions
 aspectx by aspecty will appear square.
 For 320x200 display, aspect ratio is
 6:5. For most other displays, 1:1.
 (.FLC only).
 40 2 aspecty Y-axis aspect ratio of flic (.FLC only).
 42 38 reserved Set to 0.
 80 4 oframe1 Offset from beginning of file to first frame
 (.FLC only).
 84 4 oframe2 Offset from beginning of file to second
 frame (.FLC only).
 88 40 reserved Set to 0.

Table 2: Prefix header structure (16 bytes). Most programs just seek over the
prefix chunk if it is present.

Offset Size Name Description

 0 4 size Size of prefix chunk, including subchunks.
 4 2 type Prefix chunk identifier. Always hex F100.
 6 2 chunks Number of subchunks.
 8 8 reserved Set to 0.

Table 3: Frame header structure (16 bytes).

Offset Size Name Description

 0 4 size Size of frame chunk, including subchunks
 and this header.
 4 2 type Frame chunk identifier. Always hex F1FA.
 6 2 chunks Number of subchunks.

 8 8 reserved Set to 0.

Table 4: Frame subchunk types.

Value Name Description

 4 COLOR_256 256-level color palette information (.FLC
 only).
 7 DELTA_FLC Delta compression (.FLC only).
 11 COLOR_64 64-level color palette information (.FLI only).
 12 DELTA_FLI Delta compression (.FLI only).
 13 BLACK Entire frame set to color 0.
 15 BYTE_RUN Byte run-length compression (first frame).
 16 LITERAL Uncompressed pixels.
 18 PSTAMP Miniature image of flic for visual file
 requestor. First frame only, safe to ignore.

Decoding subsequent frames is just like decoding the first. Some chunks appear
only in the first frame and others appear only in subsequent frames, but it is
convenient to decode all frames with a single routine that includes all chunk
types in its central switch. A flic reader reads and decodes one frame after
the other until it comes to the last (ring) frame. If the flic reader only
wants to play the flic once, it stops here. If it wishes to play the flic
again, it reads and decodes the ring frame, then seeks back to the second
frame of the flic to start over.
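The play-once/play-again order just described can be sketched as follows. The
play_flic routine and the counting harness are our own illustrative
scaffolding, not part of READFLIC.C; the decode and seek steps are passed in
as callbacks so the sketch stays self-contained:

```c
typedef void (*frame_fn)(void *ctx);

/* Play a flic `repeats` times through. The first pass decodes frames
 * 1..n (the last being the ring frame's predecessor set); each repeat
 * decodes the ring frame, seeks back to frame 2, then decodes frames
 * 2..n again, so the slow run-length-compressed first frame is only
 * unpacked once. */
void play_flic(int frames, int repeats,
               frame_fn decode, frame_fn seek_frame2, void *ctx)
{
    for (int i = 0; i < frames; i++)   /* frames 1..n */
        decode(ctx);
    for (int r = 1; r < repeats; r++) {
        decode(ctx);                   /* ring frame */
        seek_frame2(ctx);              /* back to second frame */
        for (int i = 1; i < frames; i++)
            decode(ctx);
    }
}

/* Tiny demonstration harness: counts calls instead of decoding. */
static int g_decoded, g_seeks;
static void count_decode(void *ctx) { (void)ctx; g_decoded++; }
static void count_seek(void *ctx)   { (void)ctx; g_seeks++; }
```

Playing a 5-frame flic 3 times decodes 15 frames but seeks (and skips the
first-frame unpack) only twice.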


Decoding Chunk Types


There are a variety of different chunk types, and understanding how to decode
them is central to writing programs that support flics. The file READFLIC.C
(Listing One , page 92) presents routines that read and decompress a flic. The
routines assume Intel byte ordering, but are otherwise fairly portable. The
file starts with the low-level decompression routines: first for colors, then
for pixels. It then goes to the higher-level exported flic_xxxx routines as
prototyped in READFLIC.H (Listing Two, page 96). The program requires other
files, such as those that attend to machine-specific details. Because of their
length, these files and an executable version of the program are available
only electronically; see "Availability," page 5. In this section, I'll
describe in detail the available chunk types.
Chunk Type 16 (FLI_COPY). No compression. This chunk contains an uncompressed
image of the frame. The number of pixels following the chunk header is exactly
the width of the animation times its height. The data starts in the upper-left
corner with pixels copied from left to right and then top to bottom. Chunk 16
is created when the preferred compression method generates more data than the
uncompressed frame image--a relatively rare situation.
Chunk Type 13 (BLACK). No data. This chunk has no data following the header.
All pixels in the frame are set to color-index 0.
Chunk Type 15 (BYTE_RUN). Byte run-length compression. This chunk contains the
entire image in a compressed format. Usually this chunk is used in the first
frame of an animation, or within a postage-stamp image chunk.
The data is organized in lines. Each line contains packets of compressed
pixels. The first line is at the top of the animation, followed by subsequent
lines moving downward. The number of lines in this chunk is given by the
height of the animation. The first byte of each line is a count of packets in
the line. This value is ignored--it is a holdover from the original Animator.
It is possible to generate more than 255 packets on a line. The width of the
animation is now used to drive the decoding of packets on a line; continue
reading and processing packets until width pixels have been processed, then
proceed to the next line.
Each packet consists of a type/size byte, followed by one or more pixels. If
the packet type is negative, it is a count of pixels to be copied from the
packet to the animation image. If the packet type is positive, it contains a
single pixel that is to be replicated; the absolute value of the packet type
is the number of times the pixel is to be replicated.
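A minimal sketch of the BYTE_RUN packet loop described above, assuming src
already points just past the line's (ignored) packet-count byte; the function
name is ours, not READFLIC.C's:

```c
#include <stdint.h>

/* Decode one BYTE_RUN line: process packets until `width` pixels have
 * been produced. A non-negative type byte replicates the next pixel
 * `type` times; a negative type byte copies -type literal pixels.
 * Returns a pointer just past the consumed source data. */
const uint8_t *decode_byte_run_line(const uint8_t *src, uint8_t *dst,
                                    int width)
{
    int x = 0;
    while (x < width) {
        int8_t type = (int8_t)*src++;
        if (type >= 0) {               /* replicate a single pixel */
            uint8_t pix = *src++;
            for (int i = 0; i < type && x < width; i++)
                dst[x++] = pix;
        } else {                       /* copy -type literal pixels */
            for (int i = 0; i < -type && x < width; i++)
                dst[x++] = *src++;
        }
    }
    return src;
}
```

A run packet {3, 7} followed by a literal packet {-2, 1, 2} decodes to the
five pixels 7, 7, 7, 1, 2.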
Chunk Type 12 (DELTA_FLI). Byte-oriented delta compression. This chunk
contains the differences between the previous frame and this frame. This
compression method was used by the original Animator, but is not created by
Animator Pro. This type of chunk can appear in an Animator Pro file, however,
if the file was originally created by Animator and some (but not all) frames
were modified using Animator Pro.
The first 16-bit word following the chunk header contains the position of the
first line in the chunk. This is a count of lines (down from the top of the
image) unchanged from the prior frame. The second 16-bit word contains the
number of lines in the chunk. The data for the lines follows these two words.
Each line begins with two bytes. The first byte of each line contains the
number of packets for the line. Unlike BRUN compression, the packet count is
significant (because this compression method is only used on 320x200 flics).
Each packet consists of a single-byte column skip, followed by a packet
type/size byte. If the packet type is positive, it is a count of pixels to be
copied from the packet to the animation image. If the packet type is negative,
it contains a single pixel which is to be replicated; the absolute value of
the packet type gives the number of times the pixel is to be replicated. (The
negative/positive meaning of the packet-type bytes in DELTA_FLI compression is
reversed from that used in BYTE_RUN compression. This gives better performance
during playback.)
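The per-line DELTA_FLI logic can be sketched as below, assuming a single-byte
packet count leads each line's data (as the text describes); the function name
is illustrative, not from READFLIC.C:

```c
#include <stdint.h>

/* Decode one DELTA_FLI line into an existing line of pixels: a packet
 * count, then packets of column-skip byte + type/size byte. Note the
 * sign convention is reversed from BYTE_RUN: a non-negative type
 * copies `type` literal pixels; a negative type replicates the next
 * pixel -type times. Returns a pointer past the consumed data. */
const uint8_t *decode_delta_fli_line(const uint8_t *src, uint8_t *dst)
{
    int x = 0;
    int packets = *src++;
    while (packets--) {
        x += *src++;                   /* column skip */
        int8_t type = (int8_t)*src++;
        if (type >= 0) {               /* copy literal pixels */
            for (int i = 0; i < type; i++)
                dst[x++] = *src++;
        } else {                       /* replicate one pixel */
            uint8_t pix = *src++;
            for (int i = 0; i < -type; i++)
                dst[x++] = pix;
        }
    }
    return src;
}
```

Pixels not touched by any packet keep their values from the previous frame,
which is the whole point of delta compression.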
Chunk Type 7 (DELTA_FLC). Word-oriented delta compression. This format
contains the differences between consecutive frames. This is the format most
often used by Animator Pro for frames other than the first frame of an
animation. It is similar to the byte-oriented delta (DELTA_FLI) compression,
but is word oriented instead of byte oriented. The data is organized into
lines, and each line is organized into packets.
The first word in the data following the chunk header contains the number of
lines in the chunk. Each line can begin with some optional words that are used
to skip lines and set the last byte in the line for animations with odd
widths. These optional words are followed by a count of the packets in the
line. The line count does not include skipped lines.
The high-order two bits of the word are used to determine the contents of the
word, as shown in Table 5. The packets in each line are similar to the packets
for the line-coded chunk. The first byte of each packet is a column-skip
count. The second byte is a packet type. If the packet type is positive, the
packet type is a count of words to be copied from the packet to the animation
image. If the packet type is negative, the packet contains one more word which
is to be replicated. The absolute value of the packet type gives the number of
times the word is to be replicated. The high- and low-order bytes in the
replicated word do not necessarily have the same value.
Table 5: High-order two bits of the word determine the contents of the word.

Bit 15 Bit 14 Meaning

 0 0 Word contains packet count. Packets follow this
 word. Packet count can be 0; occurs when only
 last pixel on a line changes.

 1 0 Low-order byte is to be stored in the last byte of
 current line. Packet count always follows this
 word.

 1 1 Word contains a line skip count. Number of lines
 skipped is given by absolute value of word.
 Word can be followed by more skip counts, a
 last byte word, or packet count.
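The bit tests of Table 5 reduce to a few mask operations; this classifier is
our own sketch (names included), not code from READFLIC.C:

```c
#include <stdint.h>

enum flc_word {
    FLC_PACKET_COUNT,  /* bits 15,14 = 00: packets follow this word  */
    FLC_LAST_BYTE,     /* bits 15,14 = 10: low byte -> last pixel    */
    FLC_LINE_SKIP      /* bits 15,14 = 11: |word| = lines to skip    */
};

/* Classify a DELTA_FLC per-line control word according to Table 5. */
enum flc_word classify_flc_word(uint16_t w)
{
    if ((w & 0x8000) == 0)
        return FLC_PACKET_COUNT;
    return (w & 0x4000) ? FLC_LINE_SKIP : FLC_PACKET_COUNT + 1;
}
```

For a line-skip word such as hex FFFE, the skip count is the absolute value
of the word interpreted as a signed 16-bit integer (two lines here).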

Chunk Type 4 (COLOR_256). 256-level color map. The data in this chunk is
organized in packets. The first word following the chunk header is a count of
the number of packets in the chunk. Each packet consists of a 1-byte
color-index skip count, a 1-byte color count, and three bytes of color
information for each color defined.
At the start of the chunk, the color index is assumed to be 0. Before
processing any colors in a packet, the color-index skip count is added to the
current color index. The number of colors defined in the packet is retrieved.
A 0 in this byte indicates that 256 colors follow. The three bytes for each
color define the red, green, and blue components of the color in that order.
Each component can range from 0 (off) to 255 (full on). The data to change
colors 2, 7, 8, and 9 would appear as in Example 1.
Example 1: Data to change colors 2, 7, 8, and 9.

 2 ; two packets
 2, 1, r, g, b ; skip 2, change 1
 4, 3, r, g, b, r, g, b, r, g, b ; skip 4, change 3


Chunk Type 11 (COLOR_64). 64-level color map. This chunk is identical to
FLI_COLOR256 except that the values for the red, green, and blue components
are in the range of 0-63 instead of 0-255.
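One common way to map a 6-bit COLOR_64 component onto the 0-255 range is to shift it left and replicate the top bits, so that 0 maps to 0 and 63 maps to full intensity. This helper is a sketch of that idea (the listings below instead delegate the shifting to screen_put_colors_64, so the name here is hypothetical):

```cpp
#include <cstdint>

// Expand a 6-bit color component (0-63) to the 8-bit range (0-255).
// Shifting left by 2 and folding the top bits back in makes 63 map
// to 255 exactly, rather than 252.
uint8_t expand_6bit(uint8_t c6)
{
    return (uint8_t)((c6 << 2) | (c6 >> 4));
}
```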


Limitations and Conclusions


In the worst case, the Animator will currently produce first frames that are
7102 + width * height bytes and subsequent frames that are 802 + width * height
bytes. Fortunately, in most cases, significant image compression is possible
for hand-drawn and computer-synthesized imagery. Digitized video imagery does
not compress well with the simple lossless run-length delta-compression
schemes used in a flic. Frames of objects moving against a solid dark
background may compress significantly even if taken from video; but
lighter-colored backgrounds often contain enough noise on the video signal to
foil the flic compression scheme even if the camera isn't moving. Still, for
those of us lacking MPEG hardware, flics are one of the best ways of storing
and displaying moving digital imagery on our computers.
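Evaluating those worst-case formulas for the standard 320x200 VGA mode is straightforward; the helper names below are illustrative, not from the listings:

```cpp
// Worst-case flic frame sizes from the formulas above. The constants
// 7102 and 802 are the per-frame overheads quoted in the article.
long first_frame_worst(long width, long height) { return 7102 + width * height; }
long delta_frame_worst(long width, long height) { return  802 + width * height; }
```

For 320x200 this gives 71,102 bytes for the first frame and 64,802 bytes per delta frame, an upper bound that well-compressing imagery rarely approaches.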

_THE FLIC FILE FORMAT_
by Jim Kent


[LISTING ONE]

/* readflic.c -- Routines to read and decompress a flic. Assumes Intel byte
 * ordering, but otherwise should be fairly portable. Calls machine-specific
 * stuff in pcclone.c. This file starts with the low-level decompression
 * routines: first for colors, then for pixels. It then goes to higher-level
 * exported flic_xxxx routines as prototyped in readflic.h.
 * Copyright (c) 1992 Jim Kent. This file may be freely used, modified,
 * copied and distributed. This file was first published as part of
 * an article for Dr. Dobb's Journal March 1993 issue. */

#include <errno.h>
#include <string.h>
#include <io.h>
#include "types.h"
#include "pcclone.h"
#include "flic.h"
#include "readflic.h"

typedef void ColorOut(Screen *s, int start, Color far *colors, int count);
 /* This is the type of output parameter to our decode_color below.
 * Not coincidentally, screen_put_colors is of this type. */
static void decode_color(Uchar huge *data
, Flic *flic, Screen *s, ColorOut *output)
 /* Decode color map. Put results into output. Two color compressions
 * are identical except that RGB values are 0-63 or 0-255. Passing in
 * an output that does appropriate shifting on way to real palette lets
 * us use the same code for both COLOR_64 and COLOR_256 compression. */
{
int start = 0;
Uchar far *cbuf = (Uchar far *)data;
Short far *wp = (Short far *)cbuf;
Short ops;
int count;

ops = *wp;
cbuf += sizeof(*wp);
while (--ops >= 0)
 {
 start += *cbuf++;
 if ((count = *cbuf++) == 0)
 count = 256;
 (*output)(s, start, (Color far *)cbuf, count);
 cbuf += 3*count;
 start += count;
 }

}
static void decode_color_256(Uchar huge *data, Flic *flic, Screen *s)
 /* Decode COLOR_256 chunk. */
{
decode_color(data, flic, s, screen_put_colors);
}
static void decode_color_64(Uchar huge *data, Flic *flic, Screen *s)
 /* Decode COLOR_64 chunk. */
{
decode_color(data, flic, s, screen_put_colors_64);
}
static void decode_byte_run(Uchar huge *data, Flic *flic, Screen *s)
 /* Byte-run-length decompression. */
{
int x,y;
int width = flic->head.width;
int height = flic->head.height;
Char psize;
Char huge *cpt = data;
int end;

y = flic->yoff;
end = flic->xoff + width;
while (--height >= 0)
 {
 x = flic->xoff;
 cpt += 1; /* skip over obsolete opcount byte */
 psize = 0;
 while ((x+=psize) < end)
 {
 psize = *cpt++;
 if (psize >= 0)
 {
 screen_repeat_one(s, x, y, *cpt++, psize);
 }
 else
 {
 psize = -psize;
 screen_copy_seg(s, x, y, (Pixel far *)cpt, psize);
 cpt += psize;
 }
 }
 y++;
 }
}
static void decode_delta_fli(Uchar huge *data, Flic *flic, Screen *s)
 /* Fli style delta decompression. */
{
int xorg = flic->xoff;
int yorg = flic->yoff;
Short huge *wpt = (Short huge *)data;
Uchar huge *cpt = (Uchar huge *)(wpt + 2);
int x,y;
Short lines;
Uchar opcount;
Char psize;

y = yorg + *wpt++;
lines = *wpt;

while (--lines >= 0)
 {
 x = xorg;
 opcount = *cpt++;
 while (opcount > 0)
 {
 x += *cpt++;
 psize = *cpt++;
 if (psize < 0)
 {
 psize = -psize;
 screen_repeat_one(s, x, y, *cpt++, psize);
 x += psize;
 opcount-=1;
 }
 else
 {
 screen_copy_seg(s, x, y, (Pixel far *)cpt, psize);
 cpt += psize;
 x += psize;
 opcount -= 1;
 }
 }
 y++;
 }
}
static void decode_delta_flc(Uchar huge *data, Flic *flic, Screen *s)
 /* Flc-style delta decompression. Data is word-oriented. Much control info
 * (how far to skip, how many words to copy) is still byte-oriented. */
{
int xorg = flic->xoff;
int yorg = flic->yoff;
int width = flic->head.width;
int x,y;
Short lp_count;
Short opcount;
int psize;
union {Short huge *w; Uchar huge *ub; Char huge *b; Pixels2 huge *p2;} wpt;
int lastx;
 lastx = xorg + width - 1;
 wpt.ub = data;
 lp_count = *wpt.w++;
 y = yorg;
 goto LPACK;
SKIPLINES: /* Advance over some lines. */
 y -= opcount;
LPACK: /* do next line */
 if ((opcount = *wpt.w++) >= 0)
 goto DO_SS2OPS;
 if( ((Ushort)opcount) & 0x4000) /* skip lines */
 goto SKIPLINES;
 screen_put_dot(s,(Uchar)opcount,lastx,y); /* eol dot with low byte */
 if((opcount = *wpt.w++) == 0)
 {
 ++y;
 if (--lp_count > 0)
 goto LPACK;
 goto OUT;
 }

DO_SS2OPS:
 x = xorg;
PPACK: /* do next packet */
 x += *wpt.ub++;
 psize = *wpt.b++;
 if ((psize += psize) >= 0)
 {
 screen_copy_seg(s, x, y, (Pixel far *)wpt.ub, psize);
 x += psize;
 wpt.ub += psize;
 if (--opcount != 0)
 goto PPACK;
 ++y;
 if (--lp_count > 0)
 goto LPACK;
 }
 else
 {
 psize = -psize;
 screen_repeat_two(s, x, y, *wpt.p2++, psize>>1);
 x += psize;
 if (--opcount != 0)
 goto PPACK;
 ++y;
 if (--lp_count > 0)
 goto LPACK;
 }
OUT:
 return;
}
static void decode_black(Uchar huge *data, Flic *flic, Screen *s)
 /* Decode a BLACK chunk. Set frame to solid color 0 one line at a time. */
{
Pixels2 black;
int i;
int height = flic->head.height;
int width = flic->head.width;
int x = flic->xoff;
int y = flic->yoff;

black.pixels[0] = black.pixels[1] = 0;
for (i=0; i<height; ++i)
 {
 screen_repeat_two(s, x, y+i, black, width/2);
 if (width & 1) /* if odd set last pixel */
 screen_put_dot(s, x+width-1, y+i, 0);
 }
}
static void decode_literal(Uchar huge *data, Flic *flic, Screen *s)
 /* Decode a LITERAL chunk. Copy data to screen one line at a time. */
{
int i;
int height = flic->head.height;
int width = flic->head.width;
int x = flic->xoff;
int y = flic->yoff;

for (i=0; i<height; ++i)
 {

 screen_copy_seg(s, x, y+i, (Pixel far *)data, width);
 data += width;
 }
}
ErrCode flic_open(Flic *flic, char *name)
 /* Open flic file. Read header, verify it's a flic. Seek to first frame. */
{
ErrCode err;
ClearStruct(flic); /* Start at a known state. */
if ((err = file_open_to_read(&flic->handle, name)) >= Success)
 {
 if ((err = file_read_block(flic->handle, &flic->head, sizeof(flic->head)))
 >= Success)
 {
 flic->name = name; /* Save name for future use. */
 if (flic->head.type == FLC_TYPE)
 {
 /* Seek frame 1. */
 lseek(flic->handle,flic->head.oframe1,SEEK_SET);
 return Success;
 }
 if (flic->head.type == FLI_TYPE)
 {
 /* Do some conversion work here. */
 flic->head.oframe1 = sizeof(flic->head);
 flic->head.speed = flic->head.speed * 1000L / 70L;
 return Success;
 }
 else
 {
 err = ErrBadFlic;
 }
 }
 flic_close(flic); /* Close down and scrub partially opened flic. */
 }
return err;
}
void flic_close(Flic *flic)
 /* Close flic file and scrub flic. */
{
close(flic->handle);
ClearStruct(flic); /* Discourage use after close. */
}
static ErrCode decode_frame(Flic *flic
, FrameHead *frame, Uchar huge *data, Screen *s)
 /* Decode a frame that is in memory already into screen. Here we
 * loop through each chunk calling appropriate chunk decoder. */
{
int i;
ChunkHead huge *chunk;
for (i=0; i<frame->chunks; ++i)
 {
 chunk = (ChunkHead huge *)data;
 data += chunk->size;
 switch (chunk->type)
 {
 case COLOR_256:
 decode_color_256((Uchar huge *)(chunk+1), flic, s);
 break;

 case DELTA_FLC:
 decode_delta_flc((Uchar huge *)(chunk+1), flic, s);
 break;
 case COLOR_64:
 decode_color_64((Uchar huge *)(chunk+1), flic, s);
 break;
 case DELTA_FLI:
 decode_delta_fli((Uchar huge *)(chunk+1), flic, s);
 break;
 case BLACK:
 decode_black((Uchar huge *)(chunk+1), flic, s);
 break;
 case BYTE_RUN:
 decode_byte_run((Uchar huge *)(chunk+1), flic, s);
 break;
 case LITERAL:
 decode_literal((Uchar huge *)(chunk+1), flic, s);
 break;
 default:
 break;
 }
 }
return Success;
}
ErrCode flic_next_frame(Flic *flic, Screen *screen)
 /* Advance to next frame of flic. */
{
FrameHead head;
ErrCode err;
BigBlock bb;
long size;
if ((err = file_read_block(flic->handle, &head, sizeof(head))) >= Success)
 {
 if (head.type == FRAME_TYPE)
 {
 size = head.size - sizeof(head); /* Don't include head. */
 if (size > 0)
 {
 if ((err = big_alloc(&bb, size)) >= Success)
 {
 if ((err = file_read_big_block(flic->handle, &bb, size))
 >= Success)
 {
 err = decode_frame(flic, &head, bb.hpt, screen);
 }
 big_free(&bb);
 }
 }
 }
 else
 {
 err = ErrBadFrame;
 }
 }
return err;
}
static Ulong calc_end_time(Ulong millis, Clock *clock)
 /* Little helper subroutine to find out when to start on next frame. */
{

return clock_ticks(clock) + millis * clock->speed / 1000l;
}
static ErrCode wait_til(Ulong end_time, Machine *machine)
 /* This waits until key is hit or end_time arrives. Return Success if timed
 * out, ErrCancel if key hit. Ensures the keyboard is polled at least once. */
{
 do
 {
 if (key_ready(&machine->key))
 {
 key_read(&machine->key);
 return ErrCancel;
 }
 }
 while (clock_ticks(&machine->clock) < end_time);
 return Success;
}
ErrCode flic_play_once(Flic *flic, Machine *machine)
 /* Play a flic through once. */
{
ErrCode err;
int i;
Ulong end_time;
for (i=0; i<flic->head.frames; ++i)
 {
 end_time = calc_end_time(flic->head.speed, &machine->clock);
 if ((err = flic_next_frame(flic, &machine->screen)) < Success)
 break;
 if ((err = wait_til(end_time, machine)) < Success)
 break;
 }
return err;
}
static ErrCode fill_in_frame2(Flic *flic)
 /* This determines where second frame of flic is (useful for a loop). */
{
FrameHead head;
ErrCode err;
lseek(flic->handle, flic->head.oframe1, SEEK_SET);
if ((err = file_read_block(flic->handle, &head, sizeof(head))) < Success)
 return err;
flic->head.oframe2 = flic->head.oframe1 + head.size;
return Success;
}
ErrCode flic_play_loop(Flic *flic, Machine *machine)
 /* Play a flic until key is pressed. */
{
int i;
Ulong end_time;
ErrCode err;

if (flic->head.oframe2 == 0)
 {
 fill_in_frame2(flic);
 }
 /* Seek to first frame. */
lseek(flic->handle, flic->head.oframe1, SEEK_SET);
 /* Save time to move on. */
end_time = calc_end_time(flic->head.speed, &machine->clock);

 /* Display first frame. */
if ((err = flic_next_frame(flic, &machine->screen)) < Success)
 return err;
for (;;)
 {
 /* Seek to second frame */
 lseek(flic->handle, flic->head.oframe2, SEEK_SET);
 /* Loop from 2nd frame thru ring frame*/
 for (i=0; i<flic->head.frames; ++i)
 {
 if (wait_til(end_time, machine) < Success)
 return Success; /* Time out is a success here. */
 if ((err = flic_next_frame(flic, &machine->screen)) < Success)
 return err;
 end_time = calc_end_time(flic->head.speed, &machine->clock);
 }
 }
}
static char *err_strings[] =
 {
 "Unspecified error",
 "Not enough memory",
 "Not a flic file",
 "Bad frame in flic",
 NULL,
 NULL,
 "Couldn't open display",
 "Couldn't open keyboard",
 "User canceled action",
 };
char *flic_err_string(ErrCode err)
 /* Return a string that describes an error. */
{
 if (err >= Success)
 return "Success"; /* Shouldn't happen really... */
 if (err == ErrOpen || err == ErrRead)
 return strerror(errno); /* Get Disk IO error from DOS. */
 err = -err;
 err -= 1;
 if (err >= ArrayEls(err_strings))
 return "Unknown error";
 return err_strings[err];
}





[LISTING TWO]

/* Readflic.h -- Prototypes and other structural info for readflic program.
 * Copyright (c) 1992 Jim Kent. This file may be freely used, modified,
 * copied and distributed. This file was first published as part of
 * an article for Dr. Dobb's Journal March 1993 issue. */

/* Some handy macros I use in lots of programs: */
#define ArrayEls(a) (sizeof(a)/sizeof((a)[0]))
 /* Count up number of elements in an array */
#define ClearMem(buf,size) memset(buf, 0, size)

 /* Clear a block of memory. */
#define ClearStruct(pt) ClearMem(pt, sizeof(*(pt)))
 /* Clear a structure (pass in pointer) */
/* Data structures peculiar to readflic program: */
typedef struct
 {
 FlicHead head; /* Flic file header. */
 int handle; /* File handle. */
 int frame; /* Current frame in flic. */
 char *name; /* Name from flic_open. */
 int xoff,yoff; /* Offset to display flic at. */
 } Flic;
/* Prototypes peculiar to readflic program: */
ErrCode flic_open(Flic *flic, char *name);
 /* Open flic file. Read header and verify it's a flic. */
void flic_close(Flic *flic);
 /* Close flic file and scrub flic. */
ErrCode flic_play_once(Flic *flic, Machine *machine);
 /* Play a flic through once. */
ErrCode flic_play_loop(Flic *flic, Machine *machine);
 /* Play a flic until key is pressed. */
ErrCode flic_next_frame(Flic *flic, Screen *screen);
 /* Advance to next frame of flic. */
/* Various error codes flic reader can get. */
#define ErrNoMemory -2 /* Not enough memory. */
#define ErrBadFlic -3 /* File isn't a flic. */
#define ErrBadFrame -4 /* Bad frame in flic. */
#define ErrOpen -5 /* Couldn't open file. Check errno. */
#define ErrRead -6 /* Couldn't read file. Check errno. */
#define ErrDisplay -7 /* Couldn't open display. */
#define ErrClock -8 /* Couldn't open clock. */
#define ErrKey -9 /* Couldn't open keyboard. */
#define ErrCancel -10 /* User cancelled. */





Figure 1

(a)

file header
optional prefix chunk
first frame (run-length compressed)
second frame (delta compressed)
 ...
nth frame (delta compressed)
ring frame (delta between nth frame and first)



(b)

Flic chunk (includes header and rest of file)
 prefix chunk (in .FLC files only)
 settings chunk (Animator Pro private area)
 position chunk (Offset of cel into screen)
 frame 1 chunk

 postage stamp (icon of first frame, .FLC only)
 color data (256 colors)
 pixel data (normally run length encoded)
 frame 2 chunk
 color data (only colors that change)
 pixel data (delta encoded)
 ...
 ring frame chunk
 color data (only colors that change)
 pixel data (delta encoded)






March, 1993
FILE CONVERSION USING C++ TEMPLATES


What goes in isn't what comes out




Timothy Butterfield


Tim is a developer with RDI Software Technologies' object technology group. He
can be reached at 6300 N. River Rd., Suite 200, Rosemont, IL 60018, by phone
at 708-518-0181, x618, or on CompuServe at 70304,277.


Converting files from one format to another can be as simple as importing one
line of text into a database record or as complex as importing data from a
variety of accounting systems into a single tax-worksheet program. Likewise,
graphical data often must be converted before it can be printed to different
printers or plotters (each of which requires different command sets) or
displayed in various formats on a screen.
In a recent project I worked on, multi-megabyte billing-report text files on
mainframes needed to be converted to several DBF files. Each page of the
report had to be checked for various criteria in differing locations, split
into different sections, and put into a memo field in its entirety for later
viewing.
Our approach to the problem was to build a parser-based conversion engine
using C++ templates that, as it turns out, works well for both converting
files between different formats and printing or displaying graphical data. By
using templates to design the conversion engine, we were able to design a
"black-box" conversion class. This class allows us to use the various required
data types and processes without rewriting the basics of the conversion engine
for each new combination.
An additional advantage of templates is the division of each major part into
separate, manageable components. Instead of developing the entire conversion
process as one block, each section was individually designed, tested, and
maintained.


Converter-engine Requirements


In its most basic form, a converter must be able to input data and output data
and have input, conversion, and output mechanisms. For example, importing a
text file to a DBF format requires the following: data from the text file; the
data record to write to the DBF file; a function to read in the text-file
data; a function to convert the text to the record format; and finally, a
function to write the output record to the DBF file.
To be classified as an "engine," the individual sections of the converter must
be replaceable. The input data type could be anything from an individual
character or integer to a structure or a class instance. Likewise, the output
data type can range from an individual character through a class instance. The
input class retrieves the input data and supplies it to the conversion engine.
The convert class processes the input data, converting it into the output data
format. The output class receives the output data and puts it where it needs
to go, whether to the screen, to disk, or elsewhere.


Conversion Classes


The Converter class is defined with each of the five conversion requirements
being part of the template, as shown in Figure 1(a), which is excerpted from
the CONVERTR.HPP header file (Listing One, page 98). A pointer is stored to an
instance of each of the five sections with a method provided to allow
on-the-fly changing of each. The current stage and status are stored with a
method for access to them.
Figure 1: (a) The Conversion template class declaration; (b) required
conversion-engine support-class method prototypes.

 (a)

 template
 < class CONVERT, class INPUT, class INTYPE, class OUTPUT, class OUTTYPE >
 class Converter { ... };

 (b)

 INPUT->getData ( INTYPE &id );
 CONVERT->process ( INTYPE &id, OUTTYPE &od );
 OUTPUT->putData ( OUTTYPE &od );

The idea of an engine implies a consistent interface. Therefore, each of the
three required classes has required methods; their prototypes are shown in
Figure 1(b). By using the standard means of access, the engine and each class
can remain consistent for each new usage. Each of the required methods
receives the data by reference and returns a status of OK, Done, or Error.
This can easily be extended to handle more explicit messages, but for ease of
explanation, I use just these three.
Since each class is a stand-alone definition, it may contain more than the
minimum required methods. Each can have its own instance data and additional
methods. Each class, or certain methods, can be made a friend to the other
classes and allow communication beyond that used in the conversion-engine
class. For instance, the convert class may need to call a rewind function in
the input class if it needs to return to previous data.
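That kind of cooperation beyond the required interface might be sketched as follows. Here Convert keeps a pointer to its Input and calls a hypothetical rewind() method; neither the pointer nor rewind() appears in the article's listings:

```cpp
// Minimal sketch: an Input with an extra rewind() method, and a Convert
// that uses it to revisit earlier data. Status codes: 0 = OK, 1 = Done.
class Input
{
  public:
    Input(int b, int e) : begin(b), end(e), cur(b) {}
    int getData(int &id)
    {
        if (cur > end) return 1;      // past the end: "done"
        id = cur++;
        return 0;                     // "ok"
    }
    void rewind() { cur = begin; }    // beyond the minimum required methods
  private:
    int begin, end, cur;
};

class Convert
{
  public:
    Convert(Input &i) : input(&i) {}
    int process(int &id, int &od)
    {
        od = id * 10;                 // trivial "conversion"
        if (id == 3) input->rewind(); // hypothetical reason to revisit data
        return 0;
    }
  private:
    Input *input;
};
```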


A Sample Program


CVTEST.CPP (Listing Two, page 98) demonstrates the use of the Converter class
engine by converting a range of numbers (int) into a char * format and
outputting them onto the console using cout; see Figure 2. This gives an
example of using the template class, though on a much smaller scale than an
actual file conversion.
To demonstrate the flexibility of the engine, I used a simple integer for the
input data type, while a class is used for the output data type. The
OutputData class is a simple means of demonstrating the use of a class for a
data type. It uses the itoa() function to convert the input data integer into
an array of char. A setNumber() method is provided for this as well as a
getNumber() method for returning the address of the character array. To take
advantage of the type checking provided in C++, typedefs are used for the
input and output data types.
The input class constructor accepts a range of numbers to increment through.
In this example, error checking is not provided, though it would probably be
used in most cases. The beginning, ending, and current count are stored as
member variables and are initialized in the constructor. The required
getData() method checks the current count against the ending value and returns
CONVERTR_DONE if it is greater. Otherwise, it copies the current count into
the input data instance (passed by reference), increments, and returns
CONVERTR_OK.
The convert class consists entirely of the required process() method. This
method calls the setNumber() method of the OutputData class, passing it the
input data, and then returning CONVERTR_OK.
Similarly, the output class consists entirely of the required putData()
method. This method uses cout and the OutputData class getNumber() method to
display the output data on the console. It then returns CONVERTR_OK to the
conversion-engine class.



Summary


Though the example used is simple, it demonstrates the usage of the
conversion-engine template class. By replacing the five requirements--input
data type, output data type, input, convert, and output--a file converter or
graphical output mechanism can be developed with relative ease. The full
project mentioned earlier was designed with a paging class for the input data
type and a record-instruction class for the output data type. This allowed for
an entire page of the billing report to be processed as a unit and for the
development of a set of instructions for writing the particular page to the
DBF files.
While there are alternatives to templates, such as using a standard set of
base classes for the conversion engine with derived classes for each section,
the approach described here provides both the flexibility of a variety of
formats and a mechanism for the data and its processing.


Acknowledgments


Special thanks to Eric Nagler (TeamB) for his compilations with the Metaware
High C/C++ compiler and to Pete Becker (Borland) for his explanations of
Borland's implementation of the AT&T CFRONT 3.0 C++ standard.

_FILE CONVERSION USING C++ TEMPLATES_
by Tim Butterfield


[LISTING ONE]

// --------- CONVERTR.HPP --- Converter class ---------

#ifndef _CONVERTR_HPP_
#define _CONVERTR_HPP_

#define CONVERTR_OK 0
#define CONVERTR_DONE 1
#define CONVERTR_ERROR -1

#define STAGE_INPUT 0
#define STAGE_CONVERT 1
#define STAGE_OUTPUT 2
#define STAGE_DONE 3

template
< class CONVERT, class INPUT, class INTYPE, class OUTPUT, class OUTTYPE >
class Converter
{
 public:
 // constructor
 Converter
 ( CONVERT &c, INPUT &i, INTYPE &id, OUTPUT &o, OUTTYPE &od );

 inline void setInput ( INPUT &i ); // set a new INPUT object
 inline void setConvert ( CONVERT &c ); // set a new CONVERT object
 inline void setOutput ( OUTPUT &o ); // set a new OUTPUT object
 inline void setInData ( INTYPE &id ); // set a new INTYPE object
 inline void setOutData ( OUTTYPE &od ); // set a new OUTTYPE object

 int getStatus() { return status; } // return the converter status
 int getStage() { return stage; } // return the current stage

 void run(); // run the converter

 private:
 INPUT *input; // input class
 CONVERT *convert; // conversion class
 OUTPUT *output; // output class
 INTYPE *indata; // passed by reference to input::getData()

 // and convert::process()
 OUTTYPE *outdata; // passed by reference to convert::process()
 // and output::putData()
 int status; // current status - ok == 0, done > 0, error < 0
 int stage; // current process
};

// ------- converter class methods -------
template
< class CONVERT, class INPUT, class INTYPE, class OUTPUT, class OUTTYPE >
Converter < CONVERT, INPUT, INTYPE, OUTPUT, OUTTYPE >::
 Converter ( CONVERT &c, INPUT &i, INTYPE &id, OUTPUT &o, OUTTYPE &od )
{
 setInput( i ); // set input class
 setConvert( c ); // set conversion class
 setOutput( o ); // set output class
 setInData( id ); // set input data item
 setOutData( od ); // set output data item
}
template
< class CONVERT, class INPUT, class INTYPE, class OUTPUT, class OUTTYPE >
void Converter < CONVERT, INPUT, INTYPE, OUTPUT, OUTTYPE >::
 setInput ( INPUT &i )
{
 input = &i; // save pointer to input class
}
template
< class CONVERT, class INPUT, class INTYPE, class OUTPUT, class OUTTYPE >
void Converter < CONVERT, INPUT, INTYPE, OUTPUT, OUTTYPE >::
 setConvert ( CONVERT &c )
{
 convert = &c; // save pointer to conversion class
}
template
< class CONVERT, class INPUT, class INTYPE, class OUTPUT, class OUTTYPE >
void Converter < CONVERT, INPUT, INTYPE, OUTPUT, OUTTYPE >::
 setOutput ( OUTPUT &o )
{
 output = &o; // save pointer to output class
}
template
< class CONVERT, class INPUT, class INTYPE, class OUTPUT, class OUTTYPE >
void Converter < CONVERT, INPUT, INTYPE, OUTPUT, OUTTYPE >::
 setInData ( INTYPE &id )
{
 indata = &id; // save pointer to input data
}
template
< class CONVERT, class INPUT, class INTYPE, class OUTPUT, class OUTTYPE >
void Converter < CONVERT, INPUT, INTYPE, OUTPUT, OUTTYPE >::
 setOutData( OUTTYPE &od )
{
 outdata = &od; // save pointer to output data
}
template
< class CONVERT, class INPUT, class INTYPE, class OUTPUT, class OUTTYPE >
void Converter < CONVERT, INPUT, INTYPE, OUTPUT, OUTTYPE >::
 run()
{

 status = CONVERTR_OK;
 do
 {
 // go through each stage of the conversion
 for ( stage = STAGE_INPUT;
 status == CONVERTR_OK && stage < STAGE_DONE;
 stage++ )
 {
 switch ( stage ) // which stage is the converter in
 {
 case STAGE_INPUT: // get data
 status = input->getData( *indata );
 break;
 case STAGE_CONVERT: // process data
 status = convert->process( *indata, *outdata );
 break;
 case STAGE_OUTPUT: // put data
 status = output->putData( *outdata );
 break;
 default: // shouldn't get here
 break;
 }
 }
 } while ( status == CONVERTR_OK ); // not done, no errors
}
#endif // #ifndef _CONVERTR_HPP_

// ------- EOF: CONVERTR.HPP -------






[LISTING TWO]

// ------- CVTEST.CPP --- Converter class test program -------

#include <stdlib.h> // itoa() in OutputData class
#include <iostream.h> // cout
#include <iomanip.h> // endl
#include "convertr.hpp" // Converter class

// ------- converter data types -------
// simple class for storing the output data
class OutputData
{
 public:
 // constructor
 OutputData ( ) { number[0] = '\0'; } // clear number
 // retrieve the number
 char *getNumber ( ) { return number; }
 // set a new number
 void setNumber ( int n ) { itoa( n, number, 10 ); }

 private:
 char number[6]; // make room for number
};


// use of typedefs allows better type checking
typedef int inType; // input data block type
typedef class OutputData outType; // output data block type

// ------- converter Input class -------

// class declaration
class Input
{
 public:
 Input ( int b, int e ); // constructor
 int getData ( inType &id ); // called by converter::run()
 private:
 int begin; // beginning count
 int end; // ending count
 int cur; // current count
};
// constructor
Input::Input ( int b, int e )
{
 begin = b; // set beginning count
 end = e; // set ending count
 cur = begin; // set current count
}
// get data for the converter
int Input::getData ( inType &id )
{
 if ( cur > end ) // past the end yet ?
 {
 return ( CONVERTR_DONE ); // yes, return "done"
 }
 id = cur++; // put data in "input block", increment count
 return ( CONVERTR_OK ); // return "ok"
}

// ------- converter conversion class -------

// class declaration
class Convert
{
 public:
 int process ( inType &id, outType &od ); // called by converter::run()
};
// process the data for the converter engine
int Convert::process ( inType &id, outType &od )
{
 od.setNumber( id ); // convert to output data type

 return ( CONVERTR_OK ); // return "ok"
}

// ------ converter Output class -------

// class declaration
class Output
{
 public:
 int putData ( outType &od ); // called by converter::run()
};

// put data from the converter
int Output::putData ( outType &od )
{
 // display the output "block"
 cout << "[" << od.getNumber() << "]" << endl;

 return ( CONVERTR_OK ); // return "ok"
}

// ------- main -------
int main ( )
{
 // class instances required by the converter
 Input i(1,10); // initialize input class instance
 Convert c; // initialize convert class instance
 Output o; // initialize output class instance
 // data blocks required by the converter
 inType id; // initialize input data block
 outType od; // initialize output data block class instance
 // create an instance of the converter using the defined types
 Converter< Convert, Input, inType, Output, outType >
 cv( c, i, id, o, od );
 // run the converter
 cv.run();
}

// ------- EOF: CVTEST.CPP -------







March, 1993
COMPOUND DOCUMENTS


What could be better than ASCII?




Lowell Williams


Lowell is a senior software engineer in Digital's electronic publishing
systems engineering group. He can be contacted at 603-884-1241.


Marshall McLuhan was never more right than in the post-Macintosh era. The
media really is the message. Images, artwork, and presentation style have
become as important as text for conveying meaning in computer-generated
documents. As the amount of information represented by nontext features
increases, so too does the need to interchange complete documents, not just
document text. Because it is strictly a text format, ASCII is no longer
adequate to fulfill the role it is continually called upon to play--that of
universal document-interchange standard. Programmers need, and need to learn
how to use, an encoding model that can interchange complete documents the way
ASCII has been used to interchange text.
Let's say, for example, that we wish to encode the newspaper USA Today so that
we will be able to interchange text, photos, layout, and other information
between editorial, printing, and sales offices around the world. At these
locations, different applications perform various operations on the paper,
such as news retrieval, printing, and regional advertising insertion. These
applications need to process a combination of document objects, such as
galleys, photos, artwork, and stylized headlines, in addition to plain vanilla
ASCII. A universal document-encoding model would ensure that each application,
regardless of vendor, could see and therefore work with these objects, just as
ASCII allows current applications to see text.
While most programmers know how to process ASCII, few know how to process
documents from applications for which they don't have source code. One
problem, of course, is a lack of equivalents to ASCII for encoding compound
documents--that is, documents that contain a mix of content types. But a
bigger problem may be adjusting to the idea of multidimensional primitives. An
ASCII primitive (a 7-bit character) expresses meaning in one dimension:
content. A compound ASCII-equivalent primitive (whatever that might be) may
express meaning in three (or more) dimensions: content, structure, and
presentation style. Yet the two situations are parallel. Files of ASCII
primitives encode revisable text. Files of compound-document primitives encode
revisable documents. The question is: What rules for constructing and using
primitives would allow applications to interchange and revise each other's
compound documents in all required dimensions?


The Problem and Some Solutions


Most documents contain a mix of different content types, and are therefore
called compound documents. These content types might include text, line art,
raster graphics (images), tabular data, and eventually even audio and video.
Frequently, a compound document also contains a mix of layout and content
presentation styles. Layout is the way content is arranged on a page, such as
in columns, footnotes, headings, and so on. Presentation styles affect the
rendering of the content itself, such as with boldface, italics, and
underlining. A compound ASCII equivalent, therefore, would faithfully encode
mixed-content, layout, and presentation information across applications.
In addition to providing the rules (called semantics) to encode documents, a
compound ASCII equivalent might also provide the process (runtime system) to
encode documents. Whether a standards body should be in the business of
providing software in addition to an encoding format is a subject of debate.
Experience indicates, however, that even though two applications follow the
same encoding rules, minor differences can still occur when two programs
attempt to reproduce the same document. Having common software perform the
encoding greatly reduces this possibility.
What makes ASCII so popular, despite its limitations, is that it is
interpreted consistently by applications that support ASCII and is universally
accessible to developers. Developers currently have four candidates available
to them as a potential compound ASCII equivalent: private formats, the Open
Document Architecture (ODA) standard, the Standard Generalized Markup Language
(SGML) standard, and Digital's Compound Document Architecture (CDA).
Private formats rely entirely on application-specific semantics and process
for features. If any interchange or cooperation happens, it results from
homogeneous applications and platforms using a pair-wise agreement on what the
data format is. The upside of this approach is that the developer can build
whatever functionality he or she wants into software, without worrying about
achieving consensus with others. The downside is that there may be fewer
applications with which to exchange documents and that time is wasted
reinventing features already present in incompatible software.
The other three solutions are "public" in the sense that the specifications
are published and that documentation and software are available at nominal
cost. Each solution, however, brings a significantly different approach to
encoding revisable compound documents. SGML (ISO 8879:1986) and ODA (ISO
8613:1989) are both entirely in the public domain. To date, however, ODA has
yet to be implemented in real products. Also, ODA does not include runtime
software for encoding documents that comply with ODA semantics, leaving the
door open to slightly inconsistent encodings of the same document by different
ODA-compliant products.
Unlike ODA, SGML is widely supported by many applications, although it, too,
does not specify a particular encoding process. Nor, in fact, does SGML
provide a standard of semantics to encode documents. Rather it is a standard
of semantics to encode the semantics to encode documents. SGML-compliant
semantics are called document type definitions (DTDs). In order to interchange
revisable compound documents, two applications must comply with the same DTD,
of which there are hundreds. A developer may define his or her own DTD or use
one of many public DTD libraries, the most popular of which is CALS
(Computer-aided Acquisition and Logistics Support), a DOD specification
(MIL-M-28001).
Like ODA, CDA provides a consistent set of semantics with which to encode
documents. Like SGML, CDA is also widely implemented. Currently, CDA is
supported by over 200 applications from a list of over 50 vendors, including
Microsoft, Lotus, and WordPerfect. Unlike both ODA and SGML, CDA provides a
set of runtime services which obey CDA-compliant semantics and ensure
universally interchangeable documents.
CDA semantics are called DDIF (Digital Document Interchange Format). The
runtime services, appropriately called the CDA Run Time Services, translate
DDIF documents into a machine-readable format called Digital Document
Interchange Standard (DDIS), a subset of ASN.1 (ISO 8824) encoding. ASN.1 is
an ISO application-level binary format for information interchange between
heterogeneous computer systems. Currently, CDA Run Time Services are available
under VAX/VMS, RISC/Ultrix, OS/2, and Windows. (DDIF specifications and
documentation are available from Digital.)


Programming in DDIF


DDIF is like ASCII in two important respects. First, both models define
classes of objects (characters, in the case of ASCII) that are encoded
consistently, regardless of application or platform. Second, programmers use
both ASCII and DDIF to define objects within documents. The two models are not
equivalent, however, in that ASCII does not provide semantics that let
developers specify how objects are presented and laid out. As far as ASCII is
concerned, an ASCII "A" is an ASCII "A." But in DDIF, a Times 14 bold ASCII
"A" can consistently be a Times 14 bold ASCII "A." Furthermore, the DDIF
document might also define the "A" to be part of a higher-level object, such
as a paragraph or a title, one that maintains a specified position within a
still higher-level object, such as a frame or a page.
Developers typically define DDIF document objects using C, although other
high-level languages are also supported. Object definitions are specified
according to the semantics of DDIF. These blocks result in the ASN.1 (DDIS)
fields that are the primitives of object encodings. As such, they are
analogous to the 7-bit ASCII primitives that define characters.
There are two kinds of DDIF/DDIS primitives--aggregates and items. In each
document segment there are three kinds of aggregates: segment root, attribute,
and content. A segment is a part of a document (for instance, a footnote) that
has different structure, presentation style, or content from surrounding
parts. A segment is encoded as a linked list of aggregates attached to the
segment's root aggregate. A document is encoded as a list of segment
aggregates linked to the document's root aggregate.
Items are attached to aggregates. An item supplies either an attribute (to an
attribute aggregate) or content (to a content aggregate). An attribute
aggregate may have multiple items attached to it that define various
attributes of the segment's layout and presentation style, while a content
aggregate may only have content. Inheritance applies. An attribute value (say,
a font style) specified for a parent segment, applies to a child segment,
unless overruled.
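The inheritance rule lends itself to a simple lookup: to find a segment's effective attribute, walk the parent links until some segment in the chain specifies a value. The sketch below is illustrative only; the Segment type and its field names are assumptions, not part of the actual DDIF/CDA interfaces.

```c
#include <stddef.h>

/* Hypothetical segment node -- names are illustrative, not the CDA API. */
typedef struct Segment {
    struct Segment *parent;   /* enclosing segment; NULL at the document root */
    int font_style;           /* 0 means "not specified on this segment" */
} Segment;

/* Resolve an inherited attribute: the nearest segment (self or ancestor)
   that specifies the attribute wins, matching the rule described above. */
int effective_font(const Segment *seg)
{
    for (; seg != NULL; seg = seg->parent)
        if (seg->font_style != 0)
            return seg->font_style;
    return 0;   /* no segment in the chain specified the attribute */
}
```

A child segment whose font_style is 0 thus picks up its parent's value, while a nonzero value overrules the parent.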
Example 1(a) is one of several text items linked to Example 1(b), a content
aggregate, while Example 1(c) is one of several attributes (a galley frame)
linked to Example 1(d), an attribute aggregate.
Example 1: Examples of DDIF primitives.

 (a)

 aggregate_item = DDIF$_TXT_CONTENT;
 status = CdaStoreItem ( root_aggregate_handle,
 aggregate_handle_stack [ahs_index],
 aggregate_item,
 txt_content_length,
 txt_content,
 0, 0);

 (b)
 aggregate_type = DDIF$_TXT;
 status = CdaCreateAggregate ( root_aggregate_handle,
 aggregate_type,
 &aggregate_handle_stack [ahs_index]);


 (c)

 aggregate_item = DDIF$_SGA_FRM_POSITION_C;
 integer_value = DDIF$K_FRAME_GALLEY;
 status = CdaStoreItem ( root_aggregate_handle,
 aggregate_handle_stack [ahs_index],
 aggregate_item,
 integer_length,
 & integer_value,
 0, 0);

 (d)

 aggregate_type = DDIF$_SGA;
 status = CdaCreateAggregate ( root_aggregate_handle,
 aggregate_type,
 &aggregate_handle_stack [ahs_index]);
 aggregate_item = DDIF$_TYD_ATTRIBUTES;
 status = CdaStoreItem ( root_aggregate_handle,
 aggregate_handle_stack [ahs_index-1],
 aggregate_item,
 aggregate_handle_length,
 & aggregate_handle_stack [ahs_index],
 0, 0);

DDIF-capable modules process DDIF primitives in a way somewhat analogous to
that in which ASCII-capable modules process ASCII primitives. Program A, for
example, can process primitives written by Program B. What Program A decides
to do with those primitives is up to its developer. Program A may simply
recreate some or all of Program B's primitives on the local system. It may
combine Program A's primitives with those from other programs; or it may
simply scan the primitives for information.
The essential differences between the two models are as follows: An ASCII
primitive always represents just one class of object (content) and just one
particular category of content, a character. An ASCII primitive also contains
no information about how to present the content (for example, a font size) or
where the content belongs within a higher-order object, such as a table,
graphic, or document. A DDIF primitive can represent either content or
attributes (layout/presentation). In addition, the DDIF primitive provides
information, via the linked list, that allows a processing application to
understand and/or assert where a primitive belongs within a segment and
document. Figure 1 illustrates key differences between ASCII and DDIF. Note in
this figure that both models encode revisable documents from primitives that
are interchangeable between heterogeneous applications. ASCII primitives are
predetermined to be the 128 ASCII character set. DDIF primitives are defined
by C modules employing predetermined DDIF semantics.
The semantics that define particular DDIF primitives are interpreted by
routines within the CDA Run Time Services and are included in programs that
process the primitives. This is analogous to a compiler supplying ASCII 7-bit
codes to a program, so that the source code doesn't have to. Document content
may be defined as some literal content or as space assigned to hold the
literal content (as in Example 1). Definitions may also reference data
generated in other programs via external pointers called live links. When the
value of the linked item changes in a remote document (say, a spreadsheet),
that change is reflected in the local document (a bar chart, for example).
Programmers can define DDIF documents within their applications by handcoding
each and every primitive directly or by using scripting tools supplied with
CDA Run Time Services, called High Level Routines. As with any scripting
language, the High Level Routines streamline encoding by allowing programmers
to replace commonly used and lengthy routines with a single call. A few
examples provide a flavor of what it's like to define a DDIF document using
these routines. For instance, Listing One (page 101) shows the syntax that
would write the title at the top of Figure 2(a). The title is defined to be
some literal text that is black, highlighted, underlined, and centered at a
specific location on the page.
Note that each High Level Routine uses the hlr_ prefix. The color and font
style were taken from arrays created earlier in the program from which the
following statements were taken: fonts[0].fontname = "-Adobe-Helvetica-Bold-
R-Normal--*-100-*-*-p-*-*ISO8859-1"; and colors[3].red=0; colors[3].green=0;
colors[3].blue=0;. The three rectangles are drawn with the routine in Listing
Two (page 101).
The text in Figure 2(b) illustrates how to use DDIF to create galleys, which
specify an area of space in a document in which text is allowed to flow. For
example, a galley might contain a newspaper story, the contents of which
appear on various pages. In the figure, Galley 0 is the paragraph at the top
of the page. The other galleys are shown as nine boxes (bordered) at the
bottom. Galley 0 has no border. The syntax in Listing Three (page 101) defines
the galleys. The code in Listing Four (page 101) inserts text in each of the
galleys and employs a previously defined paragraph style, paragraph (2).
As these examples illustrate, defining a document is a straightforward
process, directly analogous to creating a document on a high-quality word
processor. First, the high-level structures (segments such as footnotes,
figures, galleys, and so forth) are declared. Next, the writer declares the
overall attributes of those objects (segment layout, fonts, color, and so on).
Finally the programmer provides (or points to) the content for the various
segments (artwork, text, images, and so on). Prior to document definition
itself, the developer may wish to create arrays that define font styles,
paragraph styles, colors, recurring text strings, or other features to be
referenced within the definition.
At some point, the thought of using ASCII to interchange revisable documents
may seem as obsolete as the thought of using character-cell terminals does now
for creating documents. When that happens, programmers will take for granted
that file primitives can be interchanged among heterogeneous applications
without necessarily belonging to a static set of predefined bit patterns.
Employing these new primitives requires standardized semantics and possibly
standardized runtime software. Given the promise of universal document
interchange, that's not a lot to get used to.

_COMPOUND DOCUMENTS_
by Lowell Williams


[LISTING ONE]

/* specify the page in the document */
setup_subtitle_page_1 ()
{
/* Change the color to black and font to helvetica */
status = hlr_set_color (3);
status = hlr_set_font (0);

/* Turn on highlighting and underlining */
status = hlr_set_text_rendition (hlr_c_highlight);
status = hlr_set_text_rendition (hlr_c_underline);
/* draw a page title */
status = hlr_draw_text (
 1800, /* upper left x */
 12500, /* upper left y */
 1200*5, /* line length */
 300, /* text height */
 "SAMPLE TEST - continued ", /* text string */
 hlr_c_text_center /* center text */
 );

}






[LISTING TWO]


{
/* Set the line width */
status = hlr_set_line_width (25);
/* draw three rectangles */
status = hlr_draw_rectangle (3600, 8000, 5200, 9700);
status = hlr_draw_rectangle (3800, 8400, 5400, 10000);
status = hlr_draw_rectangle (4000, 8800, 5600, 10400);
return;
}






[LISTING THREE]


i = 0;
status = define_galls (galls4, "gal0", i, 1200, 10800, 8400, 12000, "gal1",
 hlr_c_default, hlr_c_no_border_on);
status = define_galls (galls4, "gal1", ++i, 1200, 5500, 3600, 7000, "gal2",
 hlr_c_default, hlr_c_no_border_on);
status = define_galls (galls4, "gal2", ++i, 3601, 5500, 6000, 7000, "gal3",
 hlr_c_default, hlr_c_no_border_on);
status = define_galls (galls4, "gal3", ++i, 5*1201, 5500, 7*1200, 7000, "gal4",
 hlr_c_default, hlr_c_no_border_on);
status = define_galls (galls4, "gal4", ++i, 1201, 4000, 3600, 5500, "gal5",
 hlr_c_default, hlr_c_no_border_on);
status = define_galls (galls4, "gal5", ++i, 3*1201, 4000, 5*1200, 5500, "gal6",
 hlr_c_default, hlr_c_no_border_on);
status = define_galls (galls4, "gal6", ++i, 5*1201, 4000, 7*1200, 5500, "gal7",
 hlr_c_default, hlr_c_no_border_on);
status = define_galls (galls4, "gal7", ++i, 1201, 2500, 3600, 4000, "gal8",
 hlr_c_default, hlr_c_no_border_on);
status = define_galls (galls4, "gal8", ++i, 3*1201, 2500, 5*1200, 4000, "gal9",
 hlr_c_default, hlr_c_no_border_on);
status = define_galls (galls4, "gal9", ++i, 5*1201, 2500, 7*1200, 4000, "gal5a",
 hlr_c_default, hlr_c_no_border_on);






[LISTING FOUR]

int create_page_1 ()
{
status = hlr_set_galley (4,0);

status = hlr_set_paragraph (2);
status = hlr_write_text ("This is the first paragraph of the first galley "
 "containing the first flow for this sample program. ");
status = hlr_write_text ("The text is justified and is displayed above the "
 "path-based text that is displayed above the table.");
status = hlr_set_new_galley (1);
status = hlr_set_paragraph (2);
for (i=1; i < 9; i++) {
 sprintf (temp1, "%s%d%s", "This is flow ", i, ". ");
 status = hlr_write_text (temp1);
 /* change galleys */
 if (i < 9) {
  status = hlr_set_new_galley (1);
  status = hlr_set_paragraph (2);
 }
 }
}








March, 1993
DESIGNING COMPLEX DATACENTRIC APPLICATIONS


HyperChem brings client/server architecture to computational chemistry




Paul Bonneau


Paul is the former director of software development for Hypercube Inc. He
writes the "Windows Q&A" column for the Windows/DOS Developer's Journal and
can be contacted through the DDJ offices.


HyperChem is a software tool for designing, visualizing and analyzing
molecular structures. This molecular modeling tool runs on 386 PCs (and
better) under Microsoft Windows and on Silicon Graphics workstations. With
HyperChem (which is marketed by Autodesk), you can create three-dimensional
atomic structures, visualize and manipulate their structural relationships,
and perform classical and semi-empirical quantum mechanical calculations.
HyperChem is implemented in about 500,000 lines of C code. This article
focuses on some of the key architectural features necessary in a product of
this complexity, and presents one of the algorithms for processing a data
structure representing atoms and molecules. Hopefully, the strategies
discussed here are transferable to the design and implementation of other
large GUI systems, such as CAD and IC design systems.


Top-level Architecture


HyperChem is implemented as a client/server system. The client, or front end,
presents the Windows- or Motif-based user interface. The server, or back end,
performs scientific computations such as molecular-dynamics simulations.
There are several benefits to this architectural strategy. By using a
well-defined communication protocol, HyperChem is not restricted to providing
the user with only one type of computation. Different back ends can be
substituted for particular computational requirements. Four back ends are
included with release 2.0.
The Windows version uses dynamic data exchange (DDE) for interprocess
communication; the UNIX-hosted SGI package uses sockets. Because the
communication protocol is independent of the communication medium, a future
version of the Windows product can utilize an SGI-hosted back end in the
presence of a TCP/IP network. Thus, a network of PCs will be able to utilize
the raw computational power of an expensive workstation.
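One common way to achieve this medium independence is to write the protocol layer against a small table of transport operations and plug in DDE, sockets, or any future medium behind it. The sketch below is a guess at the shape of such a layer, not HyperChem's actual code; all names are invented, and a loopback buffer stands in for the real medium.

```c
#include <string.h>

/* Hypothetical transport vtable: protocol code calls only these two
   operations, so the medium (DDE, sockets, ...) can be swapped freely. */
typedef struct Transport {
    int (*send)(const char *msg);      /* returns 0 on success */
    int (*recv)(char *buf, int cap);   /* returns 0 on success */
} Transport;

/* A trivial loopback medium for the sketch: recv returns what send stored. */
static char loop_buf[128];

static int loop_send(const char *msg)
{
    strncpy(loop_buf, msg, sizeof loop_buf - 1);
    loop_buf[sizeof loop_buf - 1] = '\0';
    return 0;
}

static int loop_recv(char *buf, int cap)
{
    strncpy(buf, loop_buf, (size_t)cap - 1);
    buf[cap - 1] = '\0';
    return 0;
}

/* The protocol layer: send a command and collect the reply without
   knowing what carries the bytes. */
int exec_command(Transport *t, const char *cmd, char *reply, int cap)
{
    if (t->send(cmd) != 0)
        return -1;
    return t->recv(reply, cap);
}
```

A Windows build would fill the table with DDE-based routines and an SGI build with socket-based ones; exec_command itself never changes.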
Two of the back ends utilize parallel algorithms. On multiple-processor SGI
machines, a back end can be split into multiple processes, one per processor,
with a speedup nearly proportional to the number of processors. A typical
calculation will run nearly twice as fast on a dual-processor machine as on a
single-processor machine. Because Windows NT supports multiprocessor machines,
a future Windows NT version will implement a similar scheme. Thus, the
client/server model allows the back ends to be scalable.
Another benefit of this client/server design is that it allows the unique
computational needs of the client and the server to be addressed. The back end
benefits from maximum utilization of the processor and memory. The front end
requires a tight coupling to the GUI windowing system (X/Motif for the SGI
product) to achieve snappy user response.
We used Watcom's 32-bit Windows compiler (C8.5 and now C9.0) in implementing
the back ends for the Windows version. As a result, the back ends run an
average of four times faster than when compiled with a 16-bit compiler. Using
32-bit code in 16-bit Windows does have its drawbacks. The big problem is
that, in general, Windows 3 does not know how to handle USE32 segments. Thus,
all data must either be copied to a USE16 segment, or to the bottom 64K of a
USE32 segment, when copying data to or from a 32-bit application. If the
frequency of calls to the Windows API is high, the benefits of 32-bit code are
lost due to the overhead of the 16/32-bit mapping layer. In fact, the Windows
front end is implemented as 16-bit code. A native 32-bit environment such as
Windows NT does not suffer these problems when running a 32-bit application.
In general, a 32-bit front end running on Windows NT runs slightly faster than
a 16-bit front end on Windows 3.1. This is significant, since at the time of
this writing the overhead of the Windows NT operating system is greater than
that of Windows 3.1.


External Interfaces


In addition to the user interface supplied by the front end, HyperChem also
exposes an ASCII-string scripting interface. The interface is divided into
menu equivalents, state variables, and commands. Menu equivalents use a naming
scheme to correspond to each HyperChem menu item. For example, the script-menu
equivalent for File/New is menu-file-new. State variables provide an interface
for examining and/or manipulating the state of HyperChem. The script interface
allows a user to set or query a state variable. State variables have read and
write attributes; current-file-name, for example, is read-only--you must
actually read a file to change the current filename. In addition to the set of
static hardcoded state variables, there are also dynamic ones. For instance,
atom-count refers to a particular molecule, because multiple molecules can be
present in the system. Therefore, atom-count is qualified with the index of a
molecule. Changing the value of a writable state variable will cause HyperChem
to perform the appropriate action to make sure the internal state remains
consistent. For example, setting the state variable show-stereo will cause
HyperChem to immediately switch to stereo viewing mode.
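A table-driven dispatcher is one plausible way to implement such an interface. The fragment below sketches the idea using two of the variables mentioned above; it is not HyperChem's implementation, and the StateVar type, the table, and both functions are assumptions.

```c
#include <string.h>
#include <stddef.h>

/* Hypothetical state-variable entry: a script name, a value, and a
   writable flag modeling the read/write attributes described above. */
typedef struct StateVar {
    const char *name;
    int value;
    int writable;
} StateVar;

static StateVar vars[] = {
    { "atom-count",  0, 0 },   /* read-only: changes only as a side effect */
    { "show-stereo", 0, 1 },   /* writable: setting it triggers an action */
};

/* Set a variable by script name; refuse unknown or read-only names. */
int set_var(const char *name, int value)
{
    size_t i;
    for (i = 0; i < sizeof vars / sizeof vars[0]; i++)
        if (strcmp(vars[i].name, name) == 0) {
            if (!vars[i].writable)
                return -1;
            vars[i].value = value;   /* real code would also update the UI */
            return 0;
        }
    return -1;
}

/* Query a variable by script name. */
int get_var(const char *name, int *out)
{
    size_t i;
    for (i = 0; i < sizeof vars / sizeof vars[0]; i++)
        if (strcmp(vars[i].name, name) == 0) {
            *out = vars[i].value;
            return 0;
        }
    return -1;
}
```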
Finally, script commands cause HyperChem to take an action that is not
expressible by the change of a single state variable, but frequently has a UI
analogue. An example is the command add-amino-acid, which behaves as if the
user pressed an amino-acid push button on the Amino Residues dialog box. The
result is the creation of a new amino-acid residue, attached to a growing
chain if present. Several state variables are affected by the change,
including the number of atoms, the total weight of the system, and the state
of the newly created atoms.
One important command does not have a user-interface equivalent.
Notify-on-update takes as an argument the name of a state variable, and
instructs HyperChem to generate a message whenever the value of that variable
changes. So notify-on-update atom-count will cause a message to be generated
each time the number of atoms in the system is changed, either through user
interaction, an executing script, or another application talking to HyperChem
via DDE or sockets. Because DDE already has this concept embedded in the DDE
Request message, notify-on-update is implemented with WM_DDE_REQUEST for the
Windows product. HyperChem can support multiple simultaneous conversations
with client applications. By using notify-on-update, arbitrarily complex
systems can be built using two or more applications, with HyperChem mediating
the flow of information among them.


Internal Architecture


Internally, HyperChem is a datacentric application. Because it is presenting a
model of chemical systems, at its heart lies a representation of that data in
the form of a tree of undirected graphs. The tree maintains a hierarchy of
objects, with each level containing unique properties and acting as a
container for the level below it. There are four levels to this hierarchy.
The atom class is at the lowest level. Atom properties include coordinates,
atomic number (for example, 6 for carbon), electrical charge, velocity (this
is used for molecular dynamics calculations where the user perceives the
system in motion), the number of other atoms to which it is chemically bonded
(neighbors), and a neighbor list.
The second level is the residue class. A residue consists of at least one
atom, and provides a way to group atoms into a repetitive unit. Currently,
HyperChem supports two predefined residue types, amino-acid and nucleic-acid
residues (the building blocks of proteins and DNA, respectively). However, the
user can specify user-defined residue groups via an ASCII template file.
Because residues are commonly chained together, an important residue property
specifies the angles to use to connect a residue to its residue neighbor.
The third level contains molecule objects. A molecule is composed of at least
one atom and residue. Every neighbor of an atom of a given molecule must be a
member of the same molecule. Another way of saying this is that no atom of a
molecule has a neighbor in a different molecule. A strict hierarchy has not
been implemented, since a molecule can have both atoms and residues as
descendants. A molecule is a very simple object that has a name, an atomic
weight (the combined mass of all atoms it contains), and a count of the number
of its descendants that are residues.
At the very top of the tree is the system class. A system is a collection of
molecules, and its main property is the overall temperature of all the
molecules. Usually there is only one system present at one time. However,
during operations such as loading a new system from a file, the new system is
created as a sibling of the current system. Only after the entire file has
been successfully loaded is the old system deleted and replaced with the new
one.
The tree provides a convenient and natural way to contain and access the
really important data, the atoms. Each atom contains a list of its neighbors;
a collection of atoms represents an undirected graph. So each molecule
contains a graph, and a system can be thought of as a collection of graphs.
Because molecules in nature frequently contain rings of atoms, these graphs
are not acyclic; they cannot generally be represented by trees.
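The containment hierarchy and the neighbor lists described above might be declared roughly as follows. This is a sketch for illustration only; the type and field names are assumptions, not HyperChem's actual declarations.

```c
#define MAX_NEIGHBORS 6   /* assumed cap on chemical bonds per atom */

/* Lowest level: an atom carries its own properties plus the undirected
   graph edges (the neighbor list) that represent chemical bonds. */
typedef struct Atom {
    int atomic_number;                      /* e.g. 6 for carbon */
    double x, y, z;                         /* coordinates */
    double charge;
    int neighbor_count;
    struct Atom *neighbors[MAX_NEIGHBORS];
} Atom;

/* Second level: a residue groups atoms into a repetitive unit. */
typedef struct Residue { Atom **atoms; int atom_count; } Residue;

/* Third level: a molecule contains a connected graph of atoms. */
typedef struct Molecule {
    const char *name;
    double weight;                          /* combined mass of its atoms */
    Residue *residues;
    int residue_count;
} Molecule;

/* Top level: a system is a collection of molecules. */
typedef struct System {
    double temperature;
    Molecule *molecules;
    int molecule_count;
} System;

/* A bond is undirected, so it must appear in both neighbor lists. */
int bond(Atom *a, Atom *b)
{
    if (a->neighbor_count >= MAX_NEIGHBORS ||
        b->neighbor_count >= MAX_NEIGHBORS)
        return -1;
    a->neighbors[a->neighbor_count++] = b;
    b->neighbors[b->neighbor_count++] = a;
    return 0;
}
```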
This leads to interesting graph-manipulative algorithms inside HyperChem. As
an example, I recently had to write an import dynamic link library (DLL)
(similar to the use of DLLs by Word for Windows in importing foreign file
formats) to read a foreign molecule file format. This particular file format
consisted of a list of atoms and a list of bonds, where each bond is
represented as a pair of atoms. What is missing is a way of collecting those
atoms belonging to unique molecules, so one of the tasks of the DLL was to
group atoms into molecules. One way to approach this is to grow a spanning
tree. Such a tree will contain every atom of a molecule.
This method can be used to implement a molecule-grouping routine, as follows.
For each atom, test to see if it has been marked as belonging to a molecule.
If so, skip it and move on to the next atom. If not, grow a spanning tree with
this atom as the root of the tree, and mark each atom of the tree with a new
molecule identifier. The challenge is to write the code to generate a spanning
tree that is efficient in both execution speed and memory utilization. Listing
One (page 102) presents a sanitized version of the code that does this. It
implements a depth-first, nonrecursive search.
It is very important to avoid recursion in the Windows environment because the
theoretical limit of a stack is 64K. (In practice it is much less than this.)
Depending on the size of the stack frame and the size of the molecule being
searched, a recursive solution could exceed the size of the stack. In the
interest of clarity, the Walk() routine of Listing One accepts an array of
atoms to walk (instead of a system node). There are numerous other interesting
algorithms for dealing with the atom tree, but these lie outside the scope of
this article.

_DESIGNING COMPLEX DATACENTRIC APPLICATIONS_
by Paul Bonneau


[LISTING ONE]

/**************************************************************************

 MARK.C -- determines the molecules in an array of atoms
***************************************************************************/
#define catmMax 6 /* Maximum number of neighbors. */
#define NULL 0

typedef struct atm
{
 struct atm * patmParent; /* Parent atom during walk. */
 struct atm * patmRoot; /* Root atom. */
 int catm; /* Number of neighbors. */
 struct atm * rgpatm[catmMax]; /* Neighbor list. */
} ATM; /* AToM. */
/* Prototypes */
void Mark(ATM *, int);
ATM * PatmNextChild(ATM * rgatm, ATM * patm);
void Walk(ATM *, ATM *);

/***************************************************************************/
/* void Mark(ATM * rgatm, int catm) */
/* Mark each atom in the array with its parent molecule. */
/* Upon return, each atom will point to the root atom of the molecule */
/* containing it via the patmRoot field. */
/* rgatm : The array of atoms to mark. */
/* catm : The number of atoms in the array. */
/***************************************************************************/

void Mark(ATM * rgatm, int catm)
{ ATM * patmLim = rgatm + catm; /* The limit of the array. */
 ATM * patm; /* An array iterator. */

 /* First, make sure the mark fields are zeroed out. */
 for (patm = rgatm; patm < patmLim; patm++)
 patm->patmParent = patm->patmRoot = NULL;

 /* Now mark the atoms. If an atom has already been marked, skip it. */
 /* If an atom has not been marked, then start a new molecule, and */
 /* grow a spanning tree and mark each atom in the tree with the new */
 /* molecule. We conveniently use the first atom as the root to mark */
 /* all atoms in the tree with. */
 for (patm = rgatm; patm < patmLim; patm++)
 if (!patm->patmRoot)
 Walk(rgatm, patm);
}
/**********************************************************************/
/* void Walk(ATM * rgatm, ATM * patm) */
/* Mark all atoms in this molecule with the root atom. */
/* Depth first non-recursive marking. */
/* rgatm : Array of all atoms. */
/* patm : First atom in molecule. */
/**********************************************************************/
void Walk(ATM * rgatm, ATM * patm)
{ ATM * patmRoot;

 patmRoot = patm->patmRoot = patm; /* Remember root atom. */

 /* Mark the parent of the root atom with a special value, so we can */
 /* detect it is the root atom when encountered during the walk. */
 patm->patmParent = (ATM *)-1;


 for (;;) /* Walk the tree. */
 {
 ATM * patmNext;
 /* Given the current atom, get the next one to visit. */
 if (!(patmNext = PatmNextChild(rgatm, patm)))
 {
 /* No more kids, move back to parent. If parent is */
 /* the root, stop. */
 if ((patm = patm->patmParent) == (ATM *)-1)
 break; /* We're done. */
 }
 else
 {
 patm = patmNext; /* Move to "child". */
 patm->patmRoot = patmRoot; /* Mark child. */
 }
 }
}
/**********************************************************************/
/* ATM * PatmNextChild(ATM * rgatm, ATM * patm) */
/* Return the next unvisited neighbor. */
/* Return null if all neighbors have been visited. */
/* rgatm : Array of all atoms. */
/* patm : Atom to find an unvisited neighbor of. */
/**********************************************************************/
ATM * PatmNextChild(ATM * rgatm, ATM * patm)
{ ATM ** ppatmLim; /* Limit of atom's neighbor list. */
 ATM ** ppatm; /* Neighbor list iterator. */

 /* Loop over all neighbors of this atom, looking for an unmarked one. */
 ppatmLim = patm->rgpatm + patm->catm;
 for (ppatm = patm->rgpatm; ppatm < ppatmLim; ppatm++)
 if (!(*ppatm)->patmParent)
 {
 (*ppatm)->patmParent = patm;
 return *ppatm;
 }
 return NULL;
}























March, 1993
A DOS REDIRECTOR FOR SCSI CD-ROM


Putting the pieces together


 This article contains the following executables: CDROM.ARC


Jim Harper


Jim has a BSEE and an MBA from Union College in Schenectady, New York. He is a
member of the technical staff at Sun Microsystems' Rocky Mountain Technology
Center in Colorado Springs, Colorado. You can reach Jim via the Internet at
james.harper@sun.com or through CompuServe at 72440,171.


The article, "Inside the ISO 9660 Filesystem Format," by Bill and Lynne Jolitz
(DDJ, December 1992) described how information is stored on a CD-ROM using the
ISO-9660 format. However, the ISO-9660 file format is just one aspect of the
entire process of getting the data from the CD-ROM disc over to where your
application can use it. On the PC/DOS platform, this is a somewhat complicated
affair, involving multiple software components that sometimes interact in
obscure ways.
This article describes how this process happens under DOS and presents the
code for an MSCDEX-like extension to DOS (a redirector) that allows access to
either High Sierra or ISO-9660 CD-ROMs. My redirector works in conjunction
with a TSR-based driver for SCSI devices which I also wrote and whose code is
presented here.
My SCSI TSR supports the Seagate ST-01 host adapter, a dumb but inexpensive
(about $35.00 retail) SCSI adapter card, and can be modified to support other
adapters. The SCSI TSR is independent of, and loads separately from, the
redirector. The two communicate via the INT 2F software interrupt. The
redirector is device independent; to work with another host adapter, you just
need to write a TSR that uses the same INT 2F protocol. I've also
included a small C program that exercises the SCSI TSR so you can read any
sector on the CD-ROM, displaying it in hex and ASCII format. These various
software components total about 2000 lines of code. The complete code is
available electronically (see "Availability," page 5), and excerpts are shown
in Listings One and Two (discussed in the following sections).
Although I wrote these programs some time ago--way before the current
multimedia explosion--to my knowledge no complete system has been published
anywhere. It's one thing to write code that follows the SCSI spec or the
ISO-9660 spec, and quite another to glue all the pieces together into a
working whole. The code here works under DOS 3.3, 4.0, and 5.0.


About Redirectors


MS-DOS has built-in support for networks, via its redirector protocol, and you
can use this feature to install a CD-ROM file system as a "foreign" file
system. The protocol is seamless enough that if you run Microsoft Windows, the
redirected CD-ROM drive appears as a network drive.
Several layers of software mediate between the data on CD-ROM and the point
where it gets used by your application. Looking at these layers from the top
down, the first layer is where your application code makes an open() or read()
call on your system's CD-ROM drive, say, drive K:. This library call turns
into an INT 21 request for DOS services. In DOS versions 3.x, 4, and 5, the
operating system checks if the network bit is set for that particular drive
letter. If so, the drive is not one of your conventional hard drives, but is
either a network device, CD-ROM, or other source of data. At this point, a
redirector can take over. The redirector does the necessary magic to make the
network or foreign file system appear as DOS-like as possible. The redirector
uses a lower-level component--in my case a TSR--to communicate with the
physical device, using the appropriate protocol--in my case, SCSI.
The best known redirector is Microsoft's MSCDEX, created as part of
Microsoft's big push for CD-ROM a few years ago. Even though data is stored in
High Sierra or ISO-9660 format on the CD-ROM drive, MSCDEX allows it to appear
as a DOS file system to your application program.
When a drive is declared as a network drive (by setting the network flag bit
in DOS corresponding to the drive we want to redirect), DOS handles high-level
I/O requests differently than for native drives. All the details for opening,
reading, writing, searching and parsing pathnames, and so on, are handed off
to the redirector via the "Multiplex Interrupt," a fancy name for INT 2F. Your
redirector must hook into this 2F chain, examining all interrupts, looking for
those in which AH equals 11H. In such cases, AL contains a value identifying a
specific redirector function. For example, a value of 15H in AL represents an
OPEN system call. In my code, the header file REDIR.H contains #defines for
most of the system calls for redirectors. A redirector signifies an error to
DOS by returning with the carry bit set in the flag register, and with an
error code in AX.
To successfully implement a redirector, you need to access some undocumented
structures within DOS. The important one is the List of Lists (LoL), which
points to other essential structures, such as the Current Directory Structure
(CDS) array, the System File Table (SFT), and the Last Drive variable. (For
more about the CDS array, see the letter entitled "Delving into Drive Paths"
in the February 1990 DDJ.) I also use the DOS Transient Area (DTA) and a
filename buffer (FN1) within DOS. In the early days of DOS, these undocumented
structures were mysterious, but in recent years they have been well-explored
by a number of authors. For further information, I recommend reading Chapter 4
of Undocumented DOS by Andrew Schulman et al. (Addison Wesley, 1990).
My redirector isn't a complete MSCDEX replacement, because MSCDEX supports
additional functions such as the playing of audio tracks (there are special
calls for music start/stop and so on). My redirector limits itself to file
redirection only. CD-ROM applications that test if MSCDEX is present via the
"test for presence" call will not work here. Actually, if your CD-ROM
application works with audio data in the form of digitized DOS files, as
opposed to the standard CD-ROM audio format, then your application should work
fine. There are a few applications that rely on standard CD-ROM audio tracks
but will still manage to work without MSCDEX. For example, the "Mammals"
CD-ROM from the National Geographic Society works in this manner: If MSCDEX is
absent, the program still functions, but does not allow you to select the
audio portions.


About ISO-9660


ISO-9660 is a hierarchical file system and therefore not totally different
from DOS. The file-system hierarchy is anchored by a "volume descriptor,"
which is usually at block 16 (10H) on the disc. Copies of this block (for
redundancy) may also be present. The volume descriptor contains, among other
things, the volume name, the date, and the location of the root directory.
Directories and files are almost always contiguous, for performance reasons. A
directory consists of a sequence of directory records, as many as will fit in
a sector. Each directory record consists of fields specifying record length,
file size, flag byte, starting block, date, and filename (referred to as a
File ID). Each multibyte integer such as file size is stored twice, once in
the MSB and again in LSB. Macintosh programmers will be pleased to note the
existence of the ASSOCIATED FILE bit in the flag byte of a directory entry;
this bit supports bifurcated files (files consisting of both a resource fork
and a data fork).
For more detail, see my header file CD-ROM.H in the electronic version of the
listings. This file contains C structures defining ISO-9660 and High Sierra
directory entries.


The Redirector Code


The file CD-ROM.C can be divided into roughly three parts. Part one is main(),
where the TSR installs itself into the multiplex interrupt chain (INT 2F),
finds the required structures within DOS using undocumented DOS calls, and
makes itself resident. Part two is New2F(), where a check is made to handle
the call or pass it on. The third part is everything else, where most of the
real work is done.
Listing One (page 103) shows main() and New2F(). The main() function first
collects the command argument specifying which drive to redirect, saving it in
a global for access by the interrupt handler New2F(). It then checks that the
SCSI TSR is loaded and reads block 10H, the volume descriptor. Then it looks
for the string "CDROM" or "CD001" within the descriptor, identifying a High
Sierra or ISO-9660 disk, respectively. Depending on the format found, it sets
certain important offset values. Next, the program gets the address of the LoL
and uses it to set pointers to other structures, such as the CDS array and the
SFT.
Then, main() sets a bit within the CDS which turns on the redirector, so
New2F() will get drive calls normally handled by DOS. Last, it initializes its
own structures, closes all open file handles, hooks into the 2F interrupt
vector, and calls DOS to terminate and stay resident.
After main() has completed its work, New2F() is part of the multiplex
interrupt chain. When called, New2F() checks if the interrupt is a redirector
call, signified by AX=11xx. If not, control is passed to the next interrupt
handler in the chain. If New2F() is supposed to handle the interrupt, it
switches to its own stack and calls particular subroutines, based upon the
value in AL. These different subroutines handle such file-system functions as
Open, Close, Read, Seek, Change Directory, Find First, and Find Next. In each
case, the data on the CD-ROM disc must be navigated, and a reasonable mapping
made so that this data is intelligible to DOS. These functions call
lower-level routines to read and write to the SCSI device. For example, the
DoRead() routine calls ScsiRead(), sets up a SCSI command descriptor block
(CDB) for the SCSI read command (CDBs are part of the SCSI protocol), and then
uses INT 2F to communicate with the TSR-based device driver (discussed in the
following section). When the DoRead() subroutine finishes, New2F() switches
stacks back and returns.
Because multiple redirectors may be installed (as in the case of a network),
each subroutine must determine in its own way (usually by checking the global
DriveNo) whether the call is for itself or for another redirector.


The SCSI Transport Layer


SCSI is an interface standard defined in 1986 by the American National
Standards Institute (ANSI). This standard evolved from an earlier effort by
Seagate to develop a "high-level" interface for transferring data between CPUs
and disk drives. Although in use on the Macintosh and Sun platforms for a long
time, SCSI has only recently begun to take off on the PC platform, supplanting
ESDI in the marketplace as the high-end interface of choice. Eventually, even
IDE may be pushed out of the mainstream by the versatility of SCSI, which
allows scanners, printers, tape drives, and even graphics terminals to be
connected simply and at high bandwidths. Newer flavors of SCSI include SCSI-2
and SCSI-3.
I implemented the SCSI transport layer using a TSR called ST01.C. My code
manipulates the hardware directly, setting up the CDBs required by the SCSI
protocol, and then reading and writing memory-mapped registers specific to the
Seagate ST01 SCSI host adapter. If you know SCSI, you'll understand the
comments. It is beyond the scope of this article to delve into the details of
the SCSI protocol, defined in ANSI document X3.131-1986, "Small Computer
System Interface," available from ANSI. A good introduction is the slim volume
SCSI by NCR Corp. (Prentice Hall, 1990). Basically, hosts (CPUs) and
peripheral devices (such as CD ROMs or scanners) follow a bus-oriented
protocol broken up into a series of phases, beginning with the bus-free phase,
and moving to the arbitration phase, the selection phase, and finally the
command phase. During the command phase, a CDB is sent from host to
peripheral. A command can be rather powerful, such as "format the device" or
"read 1000 blocks." In my implementation, I use only the most basic commands.
The heart of the module is the DoScsi() routine, shown in Listing Two (page
104). Basically, it tracks the dialogue between host and peripheral through
the various phases, sending the bytes in the CDB during the COMMAND phase, and
transferring data bytes during the DATAIN or DATAOUT phases. You may notice my
optimization in the data-transfer portion, in which inline assembler is used
to maximize speed.



Conclusion


Each of the subject areas in this article--redirectors, CD-ROM file systems,
TSRs, and SCSI protocols--can consume many hours of study. The working code
I've presented here should give you a head start on your own projects.

_A DOS REDIRECTOR FOR SCSI CD-ROM_
by Jim Harper


[LISTING ONE]

/******************************************************************************
 * CDROM.C -- by Jim Harper (EXCERPTED LISTING)
 * A CD-ROM redirector for High Sierra and ISO 9660 disks.

*****************************************************************************/

/*...#include directives removed...*/
#define SetC(X) (X) |= 0x01
#define ClrC(X) (X) &= ~0x01

extern unsigned _psp, /* Runtime gives us these variables */
 end;
char *IOBuf; /* I/O Buffer ptr */

/* Table of saved open SystemFileTable's (SFT's) for DoCloseAll() */
struct SFT _far *CloseTab[MAXCLOSEALL];
unsigned StkSeg, DataSeg,
 DriveNo, DriveFlags,
 TsrStkSeg, TsrStkPtr,
 AppStkSeg, AppStkPtr,
 CDType, FIDoff,
 Nameoff, Dateoff,
 Flagsoff, Blkoff,
 Sizeoff, BlkSize,
 ChainFlag,
 MyStack[STACKSIZE];
unsigned _AX,_BX,_CX,_DX,_DS,_ES,_DI,_FLAGS;
struct IntRegs {
 unsigned ES; unsigned DS;
 unsigned DI; unsigned SI;
 unsigned BP; unsigned SP;
 unsigned BX; unsigned DX;
 unsigned CX; unsigned AX;
 unsigned IP; unsigned CS;
 unsigned FLAGS;
};
int Active = 0;
struct isoVolDesc *isoVolDescp;
struct isoDirRec *isoDp;
struct hsVolDesc *hsVolDescp;
struct hsDirRec *hsDp;
struct Cmd Cmd;
struct DirEnt RootEnt,
 DirCache[CACHESIZE];
 /* Important pointers */
struct SDB _far *SDBp; /* Ptr to Dos Search Data Blk */
struct FDB _far *FDBp; /* Ptr to Dos Found Data Blk */
struct LOL _far *LOLp; /* Ptr to list of lists */

struct CDS _far *CDSp; /* Ptr to cur dir tab entry */
/* These pointers are set according to DOS 3.xx or 4.xx */
char _far *SWAPp, /* Ptr to Dos swap area */
 _far *FN1p, /* Ptr to Dos resolved name */
 _far * _far *DTApp, /* Ptr to Ptr to Current DTA */
 _far * _far *SFTpp, /* Ptr to Ptr to Current SFT */
 _far *DosDp, /* Ptr to dir ent for file */
 _far *Sattrp, /* Ptr to search attr */
 _far *OpenModep; /* Ptr to open mode */
unsigned _far *PSPp; /* Ptr to current PSP */
char *HiSierra = "HISIERRA ",
 *Iso9660 = "ISO9660 ";
/*...function prototypes removed...*/
/***********************************************************************/
main(int argc, char *argv[])
{ union REGS regs;
 struct SREGS sregs;
 unsigned _far *EnvBlkp;
 int i, Junk, CdsLen, ProgSize;
 DriveNo = 999;
 if (argc > 1) {
 if (!strcmp(argv[1],"-u") || !strcmp(argv[1],"-U")) {
 regs.h.ah = 0x11;
 regs.h.al = DEINSTALL;
 int86x(INT2F,&regs,&regs,&sregs);
 exit(0);
 }
 if (argv[1][0] >= 'A' && argv[1][0] <= 'Z' &&
 argv[1][1] == ':' && argv[1][2] == '\0') {
 DriveNo = argv[1][0] - 'A';
 }
 if (argv[1][0] >= 'a' && argv[1][0] <= 'z' &&
 argv[1][1] == ':' && argv[1][2] == '\0') {
 DriveNo = argv[1][0] - 'a';
 }
 }
 if (DriveNo > 26) {
 MsgOut("usage: cdrom [A-Z]:\r\n");
 exit(1);
 }
 segread(&sregs); /* Get our stack and data segments */
 StkSeg = sregs.ss;
 DataSeg = sregs.ds;
 regs.h.ah = FUNCID; /* Check if SCSI TSR is present */
 regs.h.al = INSTALLCHK;
 int86x(INT2F,&regs,&regs,&sregs);
 if (regs.h.al != 0xff) {
 MsgOut("Scsi tsr not found!\r\n");
 exit(1);
 }
 /* Check if there's a High Sierra or ISO9660 disk in the drive. */
 if (ScsiRead(0x10L)) {
 MsgOut("IO error.\r\n");
 exit(1);
 }
 hsVolDescp = (struct hsVolDesc *) IOBuf;
 isoVolDescp = (struct isoVolDesc *) IOBuf;
 CDType = UNKNOWN;
 Blkoff = 2;

 Sizeoff = 10;
 Dateoff = 18;
 FIDoff = 32;
 Nameoff = 33;
 strcpy(RootEnt.FName,"ROOT-CDROM ");
 if (strncmp(hsVolDescp->ID,"CDROM",5) == 0) { /* it's High Sierra */
 CDType = HIGHSIERRA;
 Flagsoff = 24;
 hsDp = (struct hsDirRec *)hsVolDescp->DirRec;
 RootEnt.Fattr = _A_SUBDIR;
 RootEnt.FTime = ToDosTime(hsDp->Date);
 RootEnt.FDate = ToDosDate(hsDp->Date);
 RootEnt.BlkNo = hsDp->ExtLocLSB;
 RootEnt.FSize = hsDp->DataLenLSB;
 RootEnt.ParentBlk = hsDp->ExtLocLSB;
 BlkSize = hsVolDescp->BlkSizeLSB;
 MsgOut("High Sierra disk...\r\n");
 }
 if (strncmp(isoVolDescp->ID,"CD001",5) == 0) { /* it's ISO 9660 */
 CDType = ISO9660;
 Flagsoff = 25;
 isoDp = (struct isoDirRec *)isoVolDescp->DirRec;
 RootEnt.Fattr = _A_SUBDIR;
 RootEnt.FTime = ToDosTime(isoDp->Date);
 RootEnt.FDate = ToDosDate(isoDp->Date);
 RootEnt.BlkNo = isoDp->ExtLocLSB;
 RootEnt.FSize = isoDp->DataLenLSB;
 RootEnt.ParentBlk = isoDp->ExtLocLSB;
 BlkSize = isoVolDescp->BlkSizeLSB;
 MsgOut("ISO 9660 disk...\r\n");
 }
 if (CDType == UNKNOWN) {
 MsgOut("Unknown disk type..\r\n");
 exit(1);
 }
 regs.h.ah = 0x52; /* Get Address of List of Lists */
 int86x(0x21,&regs,&regs,&sregs);
 FP_SEG(LOLp) = sregs.es;
 FP_OFF(LOLp) = regs.x.bx;
 regs.x.ax = 0x5d06; /* Get address of Dos Swap area */
 int86x(0x21,&regs,&regs,&sregs);
 FP_SEG(SWAPp) = sregs.ds;
 FP_OFF(SWAPp) = regs.x.si;
 if (DriveNo > LOLp->LastDrive) {
 MsgOut("Drive # to high.\r\n");
 exit(1);
 }
 MsgOut("DOS version "); ToHex(_osmajor); MsgOut("."); ToHex(_osminor);
 /* Now set the offsets within Dos according to 3.3x, 4.xx or 5.xx */
 if ( _osmajor == 3 && _osminor >= 30) {
 CdsLen = 0x51;
 PSPp = (unsigned _far *)(SWAPp + 0x10U);
 FN1p = SWAPp + 0x0092U;
 Sattrp = SWAPp + 0x023aU;
 DosDp = SWAPp + 0x01a7U;
 SDBp = (struct SDB _far *) (SWAPp + 0x0192U);
 DTApp = (char _far * _far *)(SWAPp + 0x000cU);
 SFTpp = (char _far * _far *)(SWAPp + 0x0268U);
 } else if (_osmajor == 4 || _osmajor == 5) {

 CdsLen = 0x58;
 PSPp = (unsigned _far *)(SWAPp + 0x10U);
 FN1p = SWAPp + 0x009eU;
 Sattrp = SWAPp + 0x024dU;
 DosDp = SWAPp + 0x01b3U;
 SDBp = (struct SDB _far *) (SWAPp + 0x019eU);
 DTApp = (char _far * _far *)(SWAPp + 0x000cU);
 SFTpp = (char _far * _far *)(SWAPp + 0x027eU);
 } else {
 MsgOut("Not DOS 3.3x, 4.xx or 5.xx\r\n");
 exit(1);
 }
 /* Cast ptr to table entry pointer */
 CDSp = (struct CDS _far *) (LOLp->CDS + DriveNo * CdsLen);
 DriveFlags = CDSp->Flags; /* Turn on network & physical bits */
 CDSp->Flags = 0xC000;
 CDSp->RootOff = 2;

 CDSp->CurDir[0] = 'A' + DriveNo; /* Set to root */
 CDSp->CurDir[1] = ':';
 CDSp->CurDir[2] = '\\';
 CDSp->CurDir[3] = '\0';

 for (i = 0; i < STACKSIZE; i++) /* Initialize our stack */
 MyStack[i] = 0x4141;
 TsrStkSeg = DataSeg; /* Initialize stack and bottom of program. */
 TsrStkPtr = (unsigned)&MyStack[STACKSIZE];
 if (TsrStkPtr & 0x1U) /* Make sure stack is on a word boundary */
 TsrStkPtr--;
 /* Program size in paragraphs w/o a heap */
 ProgSize = (StkSeg + (((unsigned)&end) >> 4)) - _psp + 1;
 for (i = 1; i < CACHESIZE - 1; i++) { /* Initialize cache */
 DirCache[i].Forw = &DirCache[i+1];
 DirCache[i].Back = &DirCache[i-1];
 }
 DirCache[0].Forw = &DirCache[1];
 DirCache[0].Back = &RootEnt;
 DirCache[CACHESIZE-1].Forw = &RootEnt;
 DirCache[CACHESIZE-1].Back = &DirCache[CACHESIZE-2];

 /* Root dirent provides anchor into the cache */
 RootEnt.Forw = &DirCache[0];
 RootEnt.Back = &DirCache[CACHESIZE-1];
 /* Close files */
 _dos_close(0); /* stdin */
 _dos_close(1); /* stdout */
 _dos_close(2); /* stderr */
 Old2F = _dos_getvect(INT2F); /* Grab multiplex interrupt */
 _dos_setvect(INT2F,New2F);
 FP_SEG(EnvBlkp) = _psp; /* Free the environment */
 FP_OFF(EnvBlkp) = 0x2c;
 _dos_freemem(*EnvBlkp);
 _dos_setblock(ProgSize,_psp,&Junk); /* Shrink our program size */
 _dos_keep(0,ProgSize); /* TSR ourself */
}
/***** New2F(struct IntRegs IntRegs) -- our interrupt 2F handler. *****/
void _interrupt _far New2F(struct IntRegs IntRegs)
{
 /* See if we handle this function */

 if ((IntRegs.AX >> 8U) != 0x11 || Active)
 _chain_intr(Old2F);
 if ((IntRegs.AX & 0xff) == INSTALLCHK) { /* Install check?? */
 IntRegs.AX = 0x00ff;
 return;
 }
 Active++; /* Set flag saying we're active */
 ChainFlag = 0; /* Don't chain out by default */
 /* Save needed regs from stack */
 _AX = IntRegs.AX; _BX = IntRegs.BX;
 _CX = IntRegs.CX; _DX = IntRegs.DX;
 _DS = IntRegs.DS; _ES = IntRegs.ES;
 _DI = IntRegs.DI; _FLAGS = IntRegs.FLAGS;
 _asm /* Switch to own stack */
 {
 cli ; Interrupts off
 mov WORD PTR AppStkPtr,sp ; Save app stack
 mov WORD PTR AppStkSeg,ss
 mov sp,WORD PTR TsrStkPtr ; Load new stack
 mov ss,WORD PTR TsrStkSeg
 sti ; Interrupts on
 }
 switch(_AX & 0xff) /* handle the command */
 {
 case DEINSTALL: DeInstall(); break;
 case CHDIR: DoChDir(); break;
 case CLOSE: DoClose(); break;
 case READ: DoRead(); break;
 case GETSPACE: DoGetSpace(); break;
 case GETATTR: DoGetAttr(); break;
 case OPEN: DoOpen(); break;
 case FINDFIRST: DoFindFirst(); break;
 case FINDNEXT: DoFindNext(); break;
 case SEEK: DoSeek(); break;
 case CLOSEALL: DoCloseAll(); break;
 case PATHNAME: Spoof(); /* hack */
 case 0x25:
 default: ChainFlag = 1; break;
 }
 _asm /* Switch back to app stack */
 {
 cli ; Interrupts off
 mov sp,WORD PTR AppStkPtr ; Load app stack
 mov ss,WORD PTR AppStkSeg
 sti ; Interrupts on
 }
 if (ChainFlag) { /* If anyone set the chain flag, chain out */
 Active = 0;
 _chain_intr(Old2F);
 }
 /* Restore (possibly modified) registers */
 IntRegs.AX = _AX; IntRegs.BX = _BX;
 IntRegs.CX = _CX; IntRegs.DX = _DX;
 IntRegs.FLAGS = _FLAGS;
 Active = 0; /* Clear Active Flag */
}






[LISTING TWO]


/******************************************************************************
 * ST01.C -- by Jim Harper. (EXCERPTED LISTING) -- A simple SCSI transport TSR
 * that communicates with CDROM.C module via INT2F. The DoScsi() routine below
 * handles the actual work of transferring data to/from the SCSI device.

*****************************************************************************/
void DoScsi(void)
{
 struct Cmd _far *Cmdp;
 unsigned Phase,
 NumBytes,
 Byte = 0,
 i;
 FP_SEG(Cmdp) = _DS;
 FP_OFF(Cmdp) = _DX;

 FP_SEG(Datap) = Cmdp->DSeg;
 FP_OFF(Datap) = Cmdp->DOff;

 NumBytes = 512;

 Endp = Datap;
 Cmdp->Count = 0L;

 /* Clear control reg */
 *RegPort = 0x00;

 /* Bus has gotta be free */
 if ((*RegPort & BUSYBIT) != 0x00) {
 Cmdp->Stat = 0x80;
 return;
 }
 /* Clear control reg */
 *RegPort = 0x00;
 /* Assert HBA's address */
 *DataPort = 0x80;
 /* Set the arbitration bit */
 *RegPort = (ARBITSTART | PENABLE);
 /* Wait for arbitration to complete */
 for (Timer1 = HZ * 3; (*RegPort & ARBITDONE) == 0x00;)
 if (!Timer1) {
 Cmdp->Stat = 0x81;
 return;
 }
 /* OR the target & our ID bits into the data reg */
 *DataPort = 0x80 | (0x01 << (unsigned)Cmdp->ID);

 /* Assert SELect, bus enable, deassert arbitration */
 *RegPort = (SEL | PENABLE | BUSENABLE);
 for (Timer1 = 2; Timer1;)
 ;
 /* Wait for BUSY */
 for (Timer1 = HZ * 3; (*RegPort & BUSYBIT) == 0x00;)
 if (!Timer1) {
 *RegPort = 0x00;

 Cmdp->Stat = 0x82;
 return;
 }
 /* Drop Select */
 *RegPort = (PENABLE | BUSENABLE);

 /* Wait for command phase */
 for (Timer1 = HZ * 3; ((Phase = (*RegPort & PHASEMASK)) != COMMAND);)
 if (!Timer1) {
 *RegPort = 0x00;
 Cmdp->Stat = 0x83;
 return;
 }
 Cmdp->Stat = 0x00;
 for (Timer1 = HZ * 10;;) { /* Cmd must complete in 10s */
 if (!Timer1) {
 Cmdp->Stat = 0x84;
 return;
 }
 Phase = *RegPort & PHASEMASK;
 switch(Phase) {
 case COMMAND:
 while ((*RegPort & PHASEMASK) == COMMAND)
 *DataPort = Cmdp->CDB[Byte++];
 break;
 case DATAIN:
 _asm
 {
 push es
 push ds
 mov cx,NumBytes
 les di,Datap
 lds si,DataPort
 cld
 repeat1:
 movsb
 dec si
 loop repeat1
 pop ds
 pop es
 }
 Datap += NumBytes;
 break;
 case DATAOUT:
 _asm
 {
 push es
 push ds
 mov cx,NumBytes
 les di,DataPort
 lds si,Datap
 cld
 repeat2:
 movsb
 dec di
 loop repeat2
 pop ds
 pop es
 }

 break;
 case STATUS: Cmdp->Stat = *DataPort; break;
 case MSGIN: Cmdp->Sense = *DataPort; break;
 case BUSFREE: *RegPort = 0x00;
 Cmdp->Count += Datap - Endp;
 return;
 }
 /* Delay long enough */
 for (i = 0; (*RegPort & REQ) && i < 5; i++)
 ;
 }
}


















































March, 1993
TOOLS FOR EMBEDDED-SYSTEMS DEBUGGING


Emulators and logic analyzers can be a low-level programmer's best friends




Christopher Perez


Chris is engineering manager at Triage Corp. and can be contacted at 9900 SW
Wilshire, Suite 250, Portland, OR 97225.


Referring to application programming as "high level" leaves those of us who
twiddle bits for a living with the rather unfortunate title of "low-level"
programmers. Working in an alternative reality where software meets hardware,
we face a class of problems that would mystify application programmers. The
title notwithstanding, our work is not lowly--just misunderstood.
Programming anything from a simple microcontroller to a new computer system
begins with the need to control hardware. Programmers who work in assembly
language are typically concerned with devices that push that hardware around.
Debugging software at this level often means looking into registers where each
bit has a dramatic impact on system performance. It may involve integrating
I/O functions that require more speed than the system's designer allowed for.
Or it can mean that we test a device driver the sole purpose of which is to
chew up five critical days in a development schedule that was already
unreasonably short.
Most frequently, low-level debug means probing software interactions using
poorly understood application software while running on top of flaky
hardware--and proving, if at all possible, that the problem isn't yours.
Theoretically, low-level debug should happen early in the development process,
before everything else gets layered on the system. In a perfect world, every
project would implement a well-thought-out design created using a structured
methodology. The design team would sit down and design the hardware/software
interface completely. In reality, it's more common to end up desperately
pushing a probe while your cigar-smoking manager leans over your shoulder,
screaming, "Why the hell doesn't this thing work?" Low-level software
development is the foundation of all that is software, and if it's not right,
everything above and below may be in serious trouble.
Life is easier if the software and hardware teams understand each other's
jobs. Many companies working in the fast lane with the newest technologies
find that success requires that engineering teams work closely together on a
project from cradle to grave. This is essential, recognizing how tough the
low-level environment is: System clock rates can be absurdly high or
components can be A or B phase parts whose masks haven't even been fully
verified.
Life outside the fast lane isn't much easier. Embedded code for manufacturing
applications must often be written in environments that are, to say the least,
barbaric. More often than not, the engineering budget is only enough for one
or two software-development PCs that get used for everything from creating
small tasking matrices to overly optimistic schedules. To make matters worse,
some cross-development, debug, and analysis tools are too great an investment
for many departments, meaning the programmer has to do without.
In this less-than-perfect world, someone actually has to isolate the problem
and make the "simple fix," working only with the available equipment. Like
plumbers walking into a house with pipes backed up beyond Tidy-bowl's reach,
programmers require special tools. Microprocessor emulators, on-chip debug
circuitry, and logic analyzers can make this job easier.


In-circuit Emulators


In-circuit emulators are valuable for gaining insight into low-level software
behavior. They can be simple to install, with one insertion point at the
microprocessor. In-circuit emulation is great for loading a program, stepping
through the software, exercising a target system, or recreating a fault, and
can tell you what the microprocessor's registers are doing and whether the
parameters are correct. Because they often provide a clock and some useful
memory to work in, emulators also allow development of firmware long before
the actual hardware becomes available.
For instance, I once helped develop an intelligent Z80-based I/O card. Not
surprisingly, the hardware specifications on the card were in flux when I
joined the team, but the specification for the software seemed firm.
Fortunately, I had access to a Z80 emulator capable of running on its own that
provided memory into which I could load my program. I was able to begin
implementation and debug my software long before the hardware guys had sorted
out their specifications with the customer.
Unfortunately, device-level debug with in-circuit emulation introduces
electrical loads that change the basic characteristics of the system itself.
I've seen major software-implementation cycles run using emulators, only to
find that once the emulators were unplugged and replaced with the stand-alone
devices, the systems crashed. While emulator manufacturers try to reduce the
amount they interfere with a system under test, electrical and timing
intrusion is inevitable in emulator operation, particularly when system-clock
rates begin to crowd the emulator's ability to keep pace.
A similar option is an evaluation board that runs off of a PC using the
vendor's development software. Purchasing such a system can give you a
complete development environment in which assumptions, software designs, and
implementations may be checked out.


On-chip Emulation


Because microprocessor clock rates keep on increasing and chip designers keep
adding more gates to their designs, it's become harder to build a traditional
emulator. Cache information is unreachable, intermodule bus interaction is
unfathomable, and out-of-order opcode and data fetches and writes make
software debug increasingly difficult. Fortunately, alternatives are emerging.
One approach is on-chip emulation via the JTAG (IEEE 1149.1) specification, an
IEEE interface standard developed by an industry-wide committee.
Currently, the major JTAG entries are the Texas Instruments TMS 320C40 and
Motorola DSP96002 digital signal processors and the AMD 29200 microprocessor.
How do they work? With the right drivers to a development system and the right
probe, on-chip emulators let you look at registers, step through processor
cycles, or make the microprocessor do everything an emulator can do--in real
time and at native-processor clock rates. On-chip emulation is not
electrically intrusive--a major advantage when working at high clock rates. It
also helps you get around the problem of what's going on in the on-chip cache
or any of the on-chip peripherals such as I/O, shared memory, or floating
point. Coupled with an oscilloscope and timing analyzer, on-chip emulation is
every debugger's dream come true. For more details, see the accompanying text
box entitled, "Inside JTAG."
The current limitation of the technology is the availability of
software-development tools and JTAG-supported processors. Still,
JTAG-compatible chips may lead to a probing approach that can keep up with the
clock speeds and shorten debugger support-tool development cycles.


Logic Analyzers


Logic analyzers, the workhorses of assembler-level software developers, isolate
software problems by providing a more complete view of the hardware/software
interface and device-level behavior. A logic analyzer can combine software
disassembly with timing information and a way of viewing simultaneous events
on different parts of a system. It can also offer other types of data
acquisitions from the same platform, making it possible to probe a
microprocessor's bus, track other signals on the board, look at software
performance and at oscilloscope signals--all at the same time. Logic analyzers
can be configured to include digital oscilloscope functions,
software-performance analyzers, and pattern generators. Using these features
in concert with the traditional bus and timing acquisitions allows low-level
software developers to acquire and interpret system information and behavior
in new and unique ways. The text box, "Using Logic Analyzers" describes the
use of logic analyzers in more detail.
Perhaps the biggest disadvantage of a logic analyzer is the sheer volume of
information it produces. I've seen cases where logic-analyzer users made
incredibly large bus and high-speed timing acquisitions, only to spend four
days trying to understand what they captured. Smaller acquisitions usually do
the job more quickly by enabling you to understand each step as you debug.
Another disadvantage is that logic analyzers typically provide no active means
of controlling the target system. Plain-vanilla logic analyzers are passive.
Others come with some form of explicit microprocessor controller, using
ROM-emulation technology and requiring few system resources to provide
emulator-like capability. Once the logic analyzer is configured, the user can
hook directly into the system under test, download software and data,
single-step through code, read or write the register set, and read or write
any physical or virtual memory. These features are implemented by inserting a
small amount of monitor code residing in system memory or EPROM. It allows the
system microprocessor to execute in real time and maintains the use of the
other logic-analyzer tools.


Beating Bytes into Submission


Real-world examples best demonstrate the power and limitations of the toolsets
available to the assembler-level programmer.
For example, I recently confronted a D/A converter that was not putting out
the desired
signal. The hardware guys swore that all was well in their domain, and the
applications-software team insisted that nothing was wrong in their lives. I
was volunteered to settle the dispute.
Using a logic analyzer configured with the ability to control the
microprocessor, I downloaded various versions of software and watched its
execution. From the logic analyzer's keyboard, I modified some software
versions to use different parameters or software functions and reexecuted the
system under test. I even saved a few of my changes for later use and review.
Using this approach, I was able to quickly prove (to the delight of the
software team) that the code was driving the D/A properly.

But the problem still wasn't solved. By using the timing analyzer, the logic
analyzer's built-in digital oscilloscope, and cross-correlated, time-stamped
displays, it was possible to generate a great amount of data. I watched the
various devices and their outputs, then compared this information against the
logic analyzer's scope. The time-stamp information allowed me to
cross-correlate events to a very narrow time-slice. Thus, I was able to
demonstrate that the real problem lay somewhere in the D/A
circuitry--completely independent of the microprocessor.
At the opposite end of the spectrum, I was once asked to force a solution very
rapidly. A system needed to be developed that demonstrated the capabilities of
a potential new product. The microprocessor was brand-new, hellaciously fast,
and had very little software-development support. All I had were a few code
fragments someone else had written, an assembler, a tenuous method of loading
the executable into the system, and very little time.
As I began using the existing software, I quickly realized there was no means
of seeing how functions were being executed. Several on-chip caches were being
used, and it seemed impossible to extract the information I needed. While
ranting and raving against the injustices of engineering life, I grabbed a
logic analyzer and began reviewing the microprocessor's user's manual to see
how exceptions were handled.
I discovered that as an exception was being taken, several useful pieces of
information were forced onto an exception stack, including the address where
the exception was taken and the exception type. I verified that I could
disable the caches and force data reads and writes to off-chip memory. I also
noted that even though there were no explicit software breakpoint
instructions, there were several meagerly documented illegal instructions.
Armed with this and the logic analyzer, I built an exception handler whose
sole task was to turn off the caches and write stack and
microprocessor-register information to off-chip memory. In this way, I could
use the logic analyzer to trace and display the information my handler
provided. The exception address information would let me know which illegal
instruction had been executed, and the register dump would allow me to trace
function-parameter information. The most important thing about this approach
was that I suddenly had access to any part of that otherwise unfathomable
microprocessor. I then assembled the new exception table with the address of
my exception handler and inserted illegal instructions at different locations.
I was then able to rapidly map the existing software and its functions.
This approach is valuable for extracting information when the only available
tools are a logic analyzer and the microprocessor of the system under test.
Any microprocessor may be probed in this fashion.
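
In outline, the handler amounted to something like the following C sketch.
This is a simulation of the technique rather than code for any particular
processor: the frame layout, register file, and trace-buffer names are all
hypothetical stand-ins for the processor-specific details.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical layout of the exception stack frame described in the text:
   the processor pushes the faulting address and the exception type. */
struct exception_frame {
    uint32_t fault_address;   /* where the illegal instruction was hit */
    uint32_t exception_type;  /* which exception vector fired */
};

/* Hypothetical register file; a real handler captures the live set. */
struct reg_file {
    uint32_t r[8];
};

/* "Off-chip" trace buffer the logic analyzer would watch: with the caches
   disabled, writes here appear on the external bus and can be traced. */
#define TRACE_SLOTS 16
static uint32_t trace_mem[TRACE_SLOTS * 10];
static int trace_count = 0;

static void disable_caches(void) { /* processor-specific; stubbed here */ }

/* Sole task of the handler: disable caches, then copy the frame and the
   registers to off-chip memory. Returns the trace slot used, or -1. */
int illegal_instruction_handler(const struct exception_frame *frame,
                                const struct reg_file *regs)
{
    if (trace_count >= TRACE_SLOTS)
        return -1;                          /* trace buffer full */
    disable_caches();
    uint32_t *slot = &trace_mem[trace_count * 10];
    slot[0] = frame->fault_address;         /* which breakpoint was hit */
    slot[1] = frame->exception_type;
    memcpy(&slot[2], regs->r, sizeof regs->r);  /* parameter registers */
    return trace_count++;
}
```

The analyzer then triggers on writes to the trace region, and each captured
slot identifies the breakpoint and the register contents at that point.
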


Bringing All Weapons to Bear


The world of low-level software development is a messy place to work. Problems
range from application-software failure to hardware test and design
verification. Low-level programmers need to use all available weapons in order
to get out of desperate situations.
Emulators are wonderful, if you don't ask them to do things they're not
designed for. They can transfer executable code and step through instruction
sequences, but emulators have a hard time correlating different events on the
same board, and they typically can't handle a system being developed at the
highest possible clock rates. Logic analyzers provide a bigger and better
picture of what's happening at the software/hardware interface. Used in
combination with digital oscilloscopes, high-speed timing, and
software-performance analyzers, logic analyzers help cross-correlate data to
create sophisticated views of a system and its behavior. When combined with
software breakpoints and monitors, the logic analyzer becomes a powerful,
low-level software-analysis tool. Ultimately, there is hope that more
microprocessor manufacturers will provide access to on-chip information,
either through JTAG or some other yet-to-be-discovered approach for low-level
software debugging.
No matter how loudly we proclaim that "It's not the software driver's fault,"
low-level programmers are usually the ones stuck between the rock and the
cigar smoke. Finding the real culprit and saving our good name requires the
right tools and a little thought.


Inside JTAG


Tony Coomes, Andy Fritsch, and Reid Tatge
The authors are engineers at Texas Instruments and can be contacted at P.O.
Box 1443, Houston, TX 77251-1443.
As computer components and circuit boards become increasingly integrated, a
problem arises. How do you test, validate, or verify the functionality of a
specific output, function, or feature? Traditionally, this has meant
connecting the system under investigation to oscilloscopes, logic analyzers,
and microprocessor emulators. But as functions join other functions on the same
piece of silicon, the ability to test each function individually becomes lost.
For example, devices such as DSPs with on-chip memory, peripherals, caches,
and internal buses for interconnect--as well as ASIC devices with core
logic--can have up to 300 pins with almost inaccessible internal connections.
Packaging advances such as surface-mount technology also contribute to
significant test problems at the board level. With surface-mount packages,
nodes must be accessed from the component side, and there may be components on
both sides of the board. This presents problems for bed-of-nails testers.
Additionally,
the development of standard automatic test equipment (ATE) fixtures to test
surface-mount boards has been so difficult that something else is required.
As a result, physical-device probing is becoming impractical, and testing is
becoming more expensive. For this reason, considerable effort has been
invested in developing techniques that allow devices to test themselves.
To begin to address the concern of system and component testability, the Joint
Test Action Group (JTAG) was formed. Members include Alcatel, AT&T, Digital
Equipment, the Department of Defense, Hewlett-Packard, IBM, Philips,
Siemens-Nixdorf, Texas Instruments, Unisys, and others. As a result of their
efforts, JTAG is now an international organization, and its boundary-scan
specification has become an IEEE standard (IEEE 1149.1).
Building devices to meet the JTAG specifications requires that the device have
the ability to test itself (called "built-in self-test," or BIST). The
hardware built to perform this function allows access to circuits and nodes
that cannot be observed, controlled, or emulated with even the most
sophisticated ATE fixtures. This ability is based upon a test-access port and
a boundary-scan architecture. A test port allows control and access to a
boundary-scan capability and other test functions on the device.
For example, the JTAG serial-test access port (TAP) on Texas Instruments'
TMS320C40 (C40) DSP chip is the pathway to on-chip emulation and the means by
which TI's XDS510 emulator (a PC card) and the C40 communicate. A cable runs
from the PC card to an active buffer pod. A cable links the pod to a 14-pin
JTAG connector attached to the DSP under examination.
JTAG and the boundary-scan methodology allow an internal view into a DSP so
that it can be emulated directly through a TAP. Traditional emulation
features, such as displaying/changing registers and memory, single or multiple
stepping, and running/halting are handled by the XDS510 through this port.
These ports can be connected so that a system like the XDS510 can look at the
inner operations of multiple C40s simultaneously. Thus connected, application
software executing on these systems can be simultaneously debugged.
This is accomplished by designing access to internal nodes of the DSP through
the test-access port. These internal nodes are said to be "on the scan
boundary," and their state can be read out through the JTAG port.
Boundary-scan technology allows a chip's internal digital logic to be tested
without being probed. All that's needed is on-chip standardized test/scan
circuitry.
Certain mainframe manufacturers have been using the boundary-scan concept for
years to test their complex systems. The functions were not implemented at the
internal chip level, but at the computer-subsystem level. The boundary-scan
idea is not new; what is new is the acceptance of a standard (JTAG) for
boundary scan and the test/scan circuitry on the chip. Now test-equipment
vendors, semiconductor manufacturers, and software/hardware toolmakers can
design common devices that work together.
To implement boundary scan, every chip to be tested must include test/scan
circuitry called "boundary-scan cells." The cell is implemented between each
chip pin and the circuitry to which it is connected. In addition to the
connection to the package pins and the chip's working logic, the boundary-scan
cells are connected to each other to form a shift-register path around the
periphery of the integrated circuit (hence the name, "boundary scan").
During normal chip operation, data flows between the chip pins and its
internal logic as if the scan cells were not there. In test mode, however,
these cells are directed by a test program to pass data along the
shift-register path. Additionally, data loaded into test cells can be used in
place of data flowing into, or out of, the chip pins. This allows both the
chip's internal logic and its external, chip-to-chip connections to be
tested. With JTAG boundary-scan technology, test/emulation sequences can be
put wherever needed.
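
A toy model makes the shift-register path concrete. The C sketch below
simulates an 8-pin boundary-scan register; the cell layout and bit ordering
are illustrative assumptions, not the 1149.1 TAP state machine itself.

```c
#include <stdint.h>
#include <string.h>

/* Toy model of a boundary-scan register: one cell per pin, wired into a
   serial shift path around the chip (values here are single bits). */
#define NPINS 8

struct boundary_scan {
    uint8_t cell[NPINS];  /* one scan cell between each pin and core logic */
};

/* In test mode, the pin states are captured into the scan cells, letting a
   tester observe pins it could never physically probe. */
void bscan_capture(struct boundary_scan *bs, const uint8_t pins[NPINS])
{
    memcpy(bs->cell, pins, NPINS);
}

/* Shift one bit in at TDI; the bit that falls off the far end of the path
   emerges at TDO, exactly as in the serial path the standard describes. */
uint8_t bscan_shift(struct boundary_scan *bs, uint8_t tdi)
{
    uint8_t tdo = bs->cell[NPINS - 1];
    memmove(&bs->cell[1], &bs->cell[0], NPINS - 1);
    bs->cell[0] = tdi & 1;
    return tdo;
}
```

Clocking NPINS shifts after a capture streams every pin state out of TDO,
last pin first; shifting a test pattern in drives the pins the same way.
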
Using these building blocks, a chip manufacturer allows the device user to
build a testable, debuggable system. This is driven by increasing integration
of device functions and features, which, in the near future, may well become a
requirement.



Using Logic Analyzers


When you first confront a logic analyzer, the octopus of wires, leads,
probes, pods, and ribbon cables can be daunting. Connecting a logic analyzer
to the system to be probed can seem to require a small army of people to
oversee the task, while logic-analyzer user interfaces sometimes seem best
suited for communicating with visitors from other planets. It's easy to see
why the advantages and sophistication of logic analyzers get passed over for
simpler but much less powerful instruments. A valid question for programmers
new to embedded-systems debugging may be, "Is it worth the effort?"
The real answer depends on what manner of bugs you intend to do battle with
and how much of your system you choose to validate. For general
embedded-systems development/integration/test/validation, no other debug tool
is better suited.
Embedded-systems development most often requires:
Validating device timing and functions.
Validating exception handling and I/O functions.
Validating device events against embedded software commands.
Reviewing debug data from different points of view.
Logic analyzers have been designed to perform these exact functions (and
more). They do this by providing device timing and microprocessor state tools.
Each tool has one or more displays that the user can use to review the results
of data acquisition. The other two major debug tools used in embedded-system
development are the oscilloscope and the microprocessor emulator. In contrast
to the logic analyzer, the oscilloscope provides only a couple of channels of
debug information; the microprocessor emulator provides good software-debug
capabilities, but little or no hardware information. A good logic analyzer
provides the best of both, giving the user state and timing tools. Some new
logic analyzers even provide digital oscilloscope tools, software-performance
analysis, and embedded software load and manipulation tools. It's not uncommon
to see logic-analyzer tool systems running state, timing, oscilloscope, and
program-load functions simultaneously. If you choose the logic-analyzer
functions and features carefully, the learning curve is no steeper than those
of an oscilloscope and a microprocessor emulator combined. The following
steps describe how to set up a logic analyzer:
Select a good general-purpose logic analyzer.
Select and connect the microprocessor probe made for your system's
microprocessor to the system to be tested.
Select and load to the logic analyzer the disassembler package that matches
the microprocessor's instruction set.
Connect device-timing probes to various points of interest on the system to be
tested.
Power on the system to be tested, select initial logic-analyzer trigger
conditions, and start an acquisition.
Several good general-purpose logic analyzers are available. To keep setup time
to a minimum, many manufacturers sell logic-analyzer packages that include the
timing probes and leads, the microprocessor probe and leads, and the
microprocessor disassembler package best suited for your specific system.
Logic analyzers configured in this way are nearly plug-and-play, thus making
it relatively easy to immediately begin debugging the system.
Take the first several logic-analyzer data acquisitions to learn about the
logic analyzer and to understand some of the debugging possibilities. Display
the timing information in waveform as well as in bit form (1s and 0s). Review
the software acquisition as disassembled data in its various forms (some logic
analyzers let you organize assembly opcodes and data in different ways) as
well as raw 1s and 0s (basic state-displayed information). Compare the
disassembler display against your assembly-code listing. Watch the program
flow as the logic analyzer has captured it. Then look at isolating hardware
events based upon software commands and vice versa (that is, look at what
happens when the system being tested handles a microprocessor exception).
Following this procedure will help you understand how data is gathered and
displayed. You can then move confidently onto more complex probing situations
and trigger setups as you continue to learn and begin to debug your system.
--C.P.

FOR MORE INFORMATION

EMULATORS

Ice Master-PE
8051
MetaLink
P.O. Box 1329
Chandler, AZ 85244-1329
602-926-1197

EZ-ICE
8031/32/51/52, 68HC11
AMS Inc.
160 SW Third Street
Pompano Beach, FL 33069
305-784-0900

AN196-MC Emulator
Annapolis Micro Systems Inc.
190 Admiral Cochrane Drive, #130
Annapolis, MD 21401
410-841-2514

NICE-51
8051
Tribal Microsystems Inc.
44388 S. Grimmer Blvd.
Fremont, CA 94538
510-623-8859

EMUL-PC
8051, 68HC11/16, 68332
Nohau Corp.
51 E. Campbell Ave.
Campbell, CA 95008
408-866-1820

MICE-V
80C186, 68000/10/30, 68302
Microtek Inc.
3300 NW 211th Terrace
Hillsboro, OR 97124
503-645-7333

UEM Emulator
Z181
Softaid Inc.
8300()() Guilford Rd.
Columbia, MD 21046
41()- '90-7760

Emulator/Analyzer
8051, 68HC11, 80C196, Z80, 6805, 68000, 68302
Orion Instruments
180 Independence Dr.
Menlo Park, CA 94025
800-767-4666

Zaxpax
6800, 80C186, V50
Zaxtek
42 Corporate Park
Irvine, CA 92714
714-474-1170

Background Mode Emulators
68332/68340, 68HC16
Embedded Support Tools Corp.
10 Elmwood Street
Canton, MA 02021
617-828-5588

8051 In-Circuit Emulator
8031/32/51/52
Vail Silicon Tools
692 S. Military Trail
Deerfield Beach, FL 33442
305-570-5580

LOGIC ANALYZERS

GPX Logic Analyzer
Tektronix Inc.
P.O. Box 4600, M.S. 94-86
Beaverton, OR 97076
800-426-2200

ML4400 Logic Analyzer
American Arium
14281 Chambers Road
Tustin, CA 92680
714-731-1661

HP 1660 Logic Analyzer
Hewlett-Packard
P.O. Box 58059, M.S. 51
Santa Clara, CA 95051
800-452-4844

PM 3585 Logic Analyzers
John Fluke Manufacturing
P.O. Box 9090
Everett, WA 98206
206-347-6100

JTAG

JTAG IEEE 1149.1
IEEE Standards Office
445 Hoes Lane
P.O. Box 1331
Piscataway, NJ 08855
800-678-IEEE

TMS320C40
Texas Instruments
P.O. Box 1443
Houston, TX 77251-1443

DSP96002
Motorola
3501 Ed White Blvd.
Austin, TX 78721

Am29000
Advanced Micro Devices
P.O. Box 3453
Sunnyvale, CA 94088-3000


March, 1993
INSIDE BTRIEVE FILES


File recovery using undocumented features




Douglas Reilly


Doug owns Access Microsystems, a software-development house specializing in
C/C++ software development. He is also the author of the BTFILER and BTVIEWER
Btrieve file utilities. Doug can be contacted at Access Microsystems Inc., 404
Midstreams Road, Brick, NJ 08724, or on CompuServe at 74040,607.


It's the call all Btrieve developers dread: "I have a Btrieve file with 98,000
records, it's damaged, and I don't have a current back-up!" (or "...it's
damaged and my tape back-up didn't work!"). In either case, your client--and
possibly you as well--has a problem.
Let's begin with some definitions. Btrieve from Novell is a key-indexed
record-management system. A Btrieve file can have between 0 and 24 keys. An
individual key can be composed of more than one "segment," or contiguous
section of the record, as long as there are not more than a total of 24
segments.
Unlike some systems used to do similar jobs, Btrieve is a record manager only.
It has no user interface, but instead provides an API that allows Btrieve to
be used from C/C++, Pascal, Cobol, Basic, and assembler. Btrieve comes in a
client-based version (BTRIEVE.EXE) and a server-based version (BREQUEST.EXE).
For our purposes, both versions can be considered identical since the
resulting files are identical.


Turn the Page...


Btrieve files consist of "pages." When a Btrieve file is created, a page size
is specified. The page size must be between 512 and 4096, divisible by 512,
and large enough to hold at least one fixed-length portion of the record
being created. Btrieve allows variable-length records, but each
variable-length record has a fixed-length portion as well. There are three
general types of pages: a single header page (possibly with an additional page
for an "Alternate Collating Sequence"), 0 or more index pages, and 0 or more
data pages. (In fact, most files have one or more index and data pages, but
Btrieve allows data-only and key-only files, too.) All pages within a Btrieve
file are the same size. Figures 3 through 5 are hex dumps of pages from a
Btrieve file with a page size of 512 and a single index starting at position 1
for 11 bytes. The variable data page (Figure 5) is from a variable-length data
file.
Most of what we are about to discuss is undocumented by Novell. A warning:
these details are undocumented precisely to hide the implementation from the
outside world, so that end users are shielded when the details change. Most of
the information presented here is valid for Btrieve files created by version
4.11 through version 5.1x. Btrieve version 6.0 (currently available only as
part of Netware SQL) will almost certainly change some of these details. Of
course, there are good reasons to dive into the undocumented aspects of a
system now and then; one reason is to recover data unavailable elsewhere.
The first page of any Btrieve file is the "header" page or File Control
Record; see Figure 1. The details of what is where on the header page are
undocumented. A second header-like page will be present if an alternate
collating sequence is defined to allow the file to be sorted in other than
strict alphabetical order; see Figure 2. For instance, a standard alternate
collating sequence is defined in a file, UPPER.ALT, included with the Btrieve
developer's kit. In addition to some overhead bytes, the page containing the
alternate collating sequence has the hex value AC as an identifier, an
eight-character name of the sequence associated with the file (in this case
"UPPER") and one byte for each of the 256 possible values for a byte. In
Figure 2, note that the bytes describing this alternate collating sequence
start at 0 and end at FF hex, with the exception of the fact that the
uppercase alphabetic characters (41 through 5A hex) replace the lowercase
alphabetic characters (61 through 7A hex). This allows upper- and lowercase
characters to be sorted identically.
The next type of page in a Btrieve file is the "index" page; see Figure 3. The
number of index pages (if any) depends upon the number of indexes in the file,
as well as the number of records. These are of little use independent of the
actual data, but one might find it interesting to track the pointers near each
"key" on the index pages back to the data they represent. In a data-only
record, it's possible no index page will exist.
The next type of page is the "data" page. There are at least two types of data
pages: the "normal" data page (see Figure 4) for fixed-length records (or
fixed-length portions of variable-length records) and the variable-length data
page (see Figure 5). In a key-only file, no data pages will exist. The
header can be rebuilt (as we will see) and indexes can be recreated (given the
correct data and enough time), but the data is the one thing that cannot be
derived from the other components.
The organization of records on data pages gives us some more information we
can use to recover Btrieve files, and also allows for the selection of the
page size that will make the best use of each byte of the page. Fixed-length
records, or fixed-length portions of variable-length records, are stored on
the data page one after another, with some other information. Depending upon
page size, more or less space will be left over at the end of the page.
Determining the "best" page size, from a space utilization standpoint,
requires calculating the number of records that will fit on each page, and
from that number, the number of bytes wasted.
Let's use as an example a file with a fixed record length of 128 with two keys
that allow duplicates. There is an overhead of six bytes for each fixed-length
data page. (This information will be useful when trying to recover data.) The
physical record length, that is, the actual number of bytes taken up by a
Btrieve record, is (logical_record_length + (number_of_dup_keys * 8)), or in
our example (128 + (2 * 8)), or 144. The eight bytes for each key that allows
duplicates are used for "prev" and "next" pointers. Since each page starts out
with six bytes less than the stated size, we next search for numbers between
506 and 4090 that have the smallest remainders when divided by 144. The best
fit allows seven records on a page size of 1024, with ten bytes wasted. If 128
is not a "magic number" for our record, we can use more of the page by using a
record length of 129, thus wasting only three bytes. There are shareware and
public-domain utilities that do this calculation for you. The important thing
to remember is that on every data page, six bytes are used by the system, and
eight bytes are added on to the physical record length for every key that
allows duplicates.
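
The calculation just described can be sketched in C. The function names are
mine; the 6-byte page overhead and the 8 bytes per duplicate-allowing key
come from the discussion above.

```c
/* Physical bytes one record occupies on a Btrieve data page:
   logical length plus 8 bytes for each key that allows duplicates. */
int physical_record_length(int logical_len, int dup_keys)
{
    return logical_len + dup_keys * 8;
}

/* Try every legal page size (512..4096 in steps of 512) and return the
   one wasting the fewest bytes; *wasted receives the leftover count. */
int best_page_size(int logical_len, int dup_keys, int *wasted)
{
    int phys = physical_record_length(logical_len, dup_keys);
    int best = 0, best_waste = 1 << 30;
    for (int size = 512; size <= 4096; size += 512) {
        int usable = size - 6;            /* 6 bytes of page overhead */
        if (usable < phys)
            continue;                     /* page must hold one record */
        int waste = usable % phys;
        if (waste < best_waste) {
            best_waste = waste;
            best = size;
        }
    }
    *wasted = best_waste;
    return best;
}
```

For the example above (128-byte records, two duplicate-allowing keys), this
picks a 1024-byte page with ten bytes wasted; bumping the record length to
129 drops the waste to three bytes, as the text notes.
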
The nature of variable-length records is that the data is placed on the data
page such that no "optimum page size" exists for the variable portion of a
Btrieve record. Btrieve also allows compressed records, with which the entire
record is placed on a variable data page, and the fixed data pages contain
only pointers to the variable pages. It is important to note that Btrieve's
data compression is not nearly as sophisticated as that in programs like
PKZIP. Btrieve compresses only those strings of data that contain repeated
characters. Other data compression techniques take advantage of the fact that,
for instance, English text has many "e" and "i" characters and relatively few
"z" and "x" characters. Therefore, they encode the common characters in less
than one byte, and the less-common characters in one byte (or, I suppose,
possibly more than one byte).
Btrieve compression only occurs when you have five or more characters repeated
one right after another! In one sample purchase-order file I looked at, the
only time data compression kicked in was when the PO number had at least five
consecutive identical digits (like "144444" or "777777"). Not much of a
savings there. Keep this in mind as we discuss possible data-recovery
scenarios and how data compression affects them.
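
The five-character threshold is easy to model. The C sketch below counts how
many bytes of a buffer fall in runs long enough to be candidates for
Btrieve's run-length compression; it models only the threshold rule described
here, not Novell's actual on-disk encoding.

```c
#include <stddef.h>

/* Count the bytes eligible for Btrieve-style compression under the rule
   given in the text: only runs of 5 or more identical consecutive
   characters are candidates; shorter runs are stored verbatim. */
size_t compressible_bytes(const char *data, size_t len)
{
    size_t total = 0, i = 0;
    while (i < len) {
        size_t run = 1;
        while (i + run < len && data[i + run] == data[i])
            run++;
        if (run >= 5)           /* below 5, no compression occurs */
            total += run;
        i += run;
    }
    return total;
}
```

Run against the purchase-order example, only strings like "144444" or
"777777" register at all; ordinary text yields zero compressible bytes.
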


When Bad Things Happen to Good Btrieve Files...


With that background, we can go back to the Btrieve-based system-developer's
dilemma described at the start of this article. The first step, of course, is
to make sure that the user has a backup of this damaged file. This is
critical. If the user is unable to copy the damaged file, then we have a DOS
problem, and Btrieve-specific cures must await its resolution.
The next step is to determine what sort of error is occurring. Btrieve has
many status codes; those related to damaged files are listed in Table 1.
Several of the status codes that could lead to a damaged Btrieve file refer to
a "pre-image" file.
Table 1: Common Btrieve status codes relating to damaged files.

 Code Meaning

 2 I/O Error
 14 Pre-image Open Error
 15 Pre-image I/O Error
 30 Not a Btrieve File Error
 42 Incomplete Accelerated Access
 56 Incomplete Index

One way to protect data in Btrieve files is through the use of a pre-image
file created when a file is first updated. The pre-image file contains images
of the pages in the file before the update occurred. If the operation is
interrupted (for instance, the program hangs or the power fails), the next
time the file is opened the pre-image file is read, and the file is restored
to the condition it was in just after the last complete operation. If these
pre-image files are disturbed before the next successful open, the file cannot
be automatically restored by Btrieve. Pre-image files share the filename with
the Btrieve data file, with a .PRE extension.
The most common (and maddening) Btrieve error code is status 2, an I/O error.
This could be caused by almost anything. Novell has a list of 33 distinct
causes of status 2, and it's possible they missed a couple. Look back at
Figure 1 through Figure 4 and notice that in byte 3 of each page, there's a
number that acts like a "page count," with the header page having a 0, the
alternate collating sequence having a 1, the index page having a 2, and so on.
Btrieve appears to use this number as a "sanity check." On the index page,
change that 3 to a 7 and a status 2 will appear. Your data is still just fine,
but changing that one byte (or any of the four bytes that are part of the page
count) will cause a Btrieve status 2.
Start with the easiest recovery option first. Btrieve allows reading of
records by logical positioning (that is, via an index) or by physical
positioning. To recover data from a damaged Btrieve file, physical positioning
must be used; see the pseudocode in Listing One (page 106). To recover the
damaged file, we start from the first physical record and read towards the end
of the file. If we encounter an error and the error is not an end-of-file
error, we save the physical-positioning information from the most recent
successfully read record and get the last physical record (using STEP_LAST,
Btrieve function 34). We read towards the beginning of the file, stopping when
a Btrieve error occurs or when we read the last record we read previously. Now
we close the files and exit.
The recovery procedure just described is what Novell's utility BUTIL does
during a -RECOVER operation, saving records to a sequential file in a special
format. In many cases that is all that needs to be done. But what happens in
the case of multiple nonadjacent errors in the file? In that case, we start at
the beginning of the file, step forward until we find an error, then go to the
end of the file and step back until we find an error or read the same record
as the last successfully read record when we were stepping forward. If the
error we find stepping back is not the error we find stepping forward, the
records between the two errors will not be read. Here is where our knowledge
of Btrieve files (and some undocumented aspects of Btrieve files) comes in
handy.
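
The forward/backward stepping logic can be sketched in C. Here the file is
modeled as an array of per-record status codes standing in for STEP_NEXT and
STEP_PREVIOUS results; the real loop, of course, issues Btrieve calls.

```c
/* Model of the bidirectional recovery pass: step forward until a read
   fails, then step backward from the end of the file until a failure or
   until we reach the last record recovered going forward.
   statuses[i] >= 0 means record i reads cleanly; < 0 models an error.
   Marks recovered[i] for each readable record; returns the count. */
int recover(const int *statuses, int nrecs, char *recovered)
{
    int count = 0, i;
    for (i = 0; i < nrecs; i++)
        recovered[i] = 0;
    for (i = 0; i < nrecs; i++) {            /* forward (STEP_NEXT) pass */
        if (statuses[i] < 0)
            break;
        recovered[i] = 1;
        count++;
    }
    int last_forward = i - 1;                /* last good forward record */
    for (i = nrecs - 1; i > last_forward; i--) {  /* backward pass */
        if (statuses[i] < 0)
            break;
        recovered[i] = 1;
        count++;
    }
    return count;
}
```

With a single damaged record this recovers everything else; with two
nonadjacent errors, the records between them are skipped, which is exactly
the limitation described above.
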
In Listing One, we talk about saving physical-positioning information. This
takes the form of a four-byte value that is, in fact, the offset in the file
where the record occurs. (We'll ignore for a moment the fact that Btrieve for
DOS allows files to extend over more than one disk drive.) Btrieve also allows
you to set the position in a file. Usually, this set position function
(GET_DIRECT, Btrieve function 23) uses a 4-byte block obtained from a previous
call to GET_POSITION (Btrieve function 22). But this time we have to figure it
out ourselves.
We know that every Btrieve file has a 6-byte overhead on every data page. We
can prove this by creating a new Btrieve file and then seeing where the record
is placed on the data page. From this we know that the only possible place the
first record on a page can be found is six bytes from the start of the data
page. But how do we know what pages are data pages and what pages are index
pages? We could use DOS functions to read through the file, checking the bytes
at the beginning of each page. Data pages do have some distinctive values in
two of the six overhead bytes that are not the "page count" discussed above,
but this seems a little too low level. Thankfully, there is a better way.
We can ignore the first page of the file, since we know it will always be a
header page. We can calculate the total number of pages in the file by
dividing the file length by the page size. Then for each page we do a
GET_DIRECT at six bytes from the beginning of the page. If there is no error,
or the error that does occur is not a status 43 (Invalid Positioning
Information), we have found a data page. We save the record through the
GET_DIRECT operation, and calculate where the next record will start. From our
earlier discussion, we can tell the physical record length of any record:
(logical_record_length + (number_of_dup_keys * 8)).
We can calculate the number of keys that allow duplicates by using a Btrieve
status function (STAT, Btrieve function 15). Then we can simply take the
position of the first record, add the physical record length, and call
GET_DIRECT again. We can repeat this until the position is past the page on
which we are working, and we can repeat the entire procedure (look for a
record at six bytes into the page) for every page in the file.



Losing Your Head


A second type of problem can occur in Btrieve files. This problem will show
itself by reporting a status 30 error (Not a Btrieve File) during an OPEN
call. You can try to recover a file using the procedures just described, but
this will not work since Btrieve can't open the file. When Btrieve reports a
status 30, it is really saying that the header of the file is not consistent
with what Btrieve expects in the header.
The solution uses a "clone" of the damaged file--a Btrieve file with the exact
same page size, record length, key structure, and so on. Copy the header from
the good clone file over the header in the damaged file, remembering that the
page size is some multiple of 512 bytes, up to a maximum of 4096. Now we have
to do one more thing before we can use the data-recovery method described
earlier.
In bytes 26 hex through 29 hex, Btrieve stores the length of the file,
measured in pages. If we do a directory listing on the file used to generate
Figure 1, we find that it is 9216 bytes. Dividing this value by 1536, we get
6. So, if we were to repair this file, we'd start with the 4-byte hex value,
as in Table 2. In our example in Figure 1, bytes 26 through 29 do contain "00
00 06 00" (hex values). If the header page was damaged and we inserted the "00
00 06 00" into the file after replacing the header page with a header page
from a clone file, we would then be able to use the file-recovery procedure
described earlier. File recovery is still needed because some information
needed for accessing the file (for instance, actual record counts) is still
missing. C code to rebuild a damaged file header is provided in Listing Two
(page 106).
Table 2: Repairing damaged files.

 26 27 28 29

 Start with 6 00 00 00 06
 Flip byte 26 with byte 27 00 00 00 06
 Flip byte 28 with byte 29 00 00 06 00
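The byte shuffle in Table 2 (and performed by Listing Two) can be captured in a small helper; this is a sketch of the same transformation, with the function name invented here:

```c
/* Given a page count, produce the four bytes Btrieve stores at header
   offsets 26h through 29h. This mirrors the b1..b4 assignments in
   Listing Two: the two 16-bit halves are stored high-word first, but
   each half is stored low-byte last. */
void pagecount_to_header(long num_pages, unsigned char out[4])
{
    unsigned char b1 = (unsigned char)((num_pages >> 24) & 0xFF); /* highest */
    unsigned char b2 = (unsigned char)((num_pages >> 16) & 0xFF);
    unsigned char b3 = (unsigned char)((num_pages >>  8) & 0xFF);
    unsigned char b4 = (unsigned char)( num_pages        & 0xFF); /* lowest */

    out[0] = b2;  /* byte 26h */
    out[1] = b1;  /* byte 27h */
    out[2] = b4;  /* byte 28h */
    out[3] = b3;  /* byte 29h */
}
```

For the six-page example file, this yields the "00 00 06 00" sequence shown in Table 2.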



Conclusion


Btrieve as a file manager is fairly robust. Many Btrieve developers go for
years without seeing some of these conditions. The Btrieve developer's kit
provides the BUTIL program that handles the most common of the file-recovery
procedures. In addition, several commercial, shareware, and public-domain
programs exist to recover damaged Btrieve files.
But what you don't know can hurt you. For instance, some of the data-recovery
methods described depend upon knowing where records start. With
Btrieve-compressed records such methods cannot be used, since the starting
position for records depends upon the data in previous records. If, as a
developer, you do not understand exactly what it means to use Btrieve data
compression, you can end up with a file that is no smaller than a similar file
without compression, and somewhat slower to boot. Look as you might, you will
not find an adequate description of the implementation details of Btrieve's
data compression. It is for times like these that getting out the hex editor
and rooting around in your Btrieve data files can make you a hero when the
dreaded call for help comes.

_INSIDE BTRIEVE FILES_
by Douglas Reilly


[LISTING ONE]

OPEN (Btrieve function 0) damaged file
Open destination file (Could be Btrieve file, could be DOS Sequential file)
if STEP_FIRST (Btrieve function 33) does not return an error
 DO
 write record to destination file
 save the physical position from call to GET_POSITION (Btrieve function
 22) in last_saved_pos.
 WHILE STEP_NEXT (Btrieve function 24) does not return an error.

if last Btrieve error is not an end of file error
 if STEP_LAST (Btrieve function 34) does not return an error
 DO
 write record to destination file
 save the physical position from call to GET_POSITION (Btrieve function
 22) in cur_pos.
 WHILE STEP_PREV (Btrieve function 35) does not return an error AND
 cur_pos is not equal to last_saved_pos

CLOSE (Btrieve function 1) source
close destination file (either Btrieve or DOS sequential file).





[LISTING TWO]

#include "stdio.h"


/*
Given a FILE pointer, return the length of the file.
*/

long filelen(FILE *t)
{
 long loc;
 long begloc;

 extern int errno;
 errno=0;
 if ( (begloc=ftell(t))==-1L ||
 (fseek(t,0L,SEEK_END))!=0 ||
 (loc=ftell(t))==-1L ||
 (fseek(t,begloc,SEEK_SET))!=0 )
 {
 return(-1L);
 }
 return(loc);
}

/*
Given the name of a damaged file, a good "clone" of the damaged file,
 rebuild the header on the damaged file. Don't use this file directly,
 but next do a simple recover on the file so that the rest of the
 information in the header is updated.

 This should NOT be tried on any file except one that returns a Btrieve
 status 2 (I/O error) or 30 (Not a Btrieve file) on OPEN.
*/
int do_hdr_rebld(char *damagedFile,char *goodClone,int pageSize)
{

 int ret;
 char headerbuf[4096];
 long damaged_size=0L;
 long num_pages=0L;
 long b1,b2,b3,b4;
 FILE *in;
 FILE *out;

 /* Open the good clone file, if you can */
 if ( (in=fopen(goodClone,"rb"))==NULL )
 {
 printf("\nCan't open undamaged file to read header. Exiting");
 return(0);
 }
 /* read the header from the good file */
 fread(headerbuf,1,pageSize,in);
 fclose(in);
 if ( (out=fopen(damagedFile,"r+b"))==NULL )
 {
 printf("\nCan't open damaged file. Exiting.");
 return(0);
 }
 /* get the length of the file, in bytes. */
 if ( (damaged_size=filelen(out))==-1 )
 {

 printf("\nCan't read damaged file. Exiting.");
 return(0);
 }
 /* convert to pages. We could do a sanity check to ensure
 that the file length is evenly divisible by the page
 length.
 */
 num_pages=damaged_size/(long)pageSize;
 /* isolate each byte. Yep, there probably is a better way
 (shifting bits), but this seems clearer to me, and we won't
 use this routine very often.
 */
 b1=(num_pages&0xFF000000L);
 b1=b1/0x1000000L;
 b2=(num_pages&0x00FF0000L);
 b2=b2/0x10000L;
 b3=(num_pages&0x0000FF00L);
 b3=b3/0x100;
 b4=(num_pages&0x000000FFL);
 headerbuf[0x26]=(char)(b2);
 headerbuf[0x27]=(char)(b1);
 headerbuf[0x28]=(char)(b4);
 headerbuf[0x29]=(char)(b3);
 /* get to the start of the file... */
 fseek(out,0L,SEEK_SET);
 /* write the header */
 ret=fwrite(headerbuf,1,pageSize,out);
 fclose(out);
 return(ret);
}




March, 1993
EXAMINING PC AUDIO


Welcome to the wild and wooly world of PC sound


 This article contains the following executables: AUDIO.ARC


John W. Ratcliff


John is the president of THE Audio Solution and an independent software
developer. He produces interactive multimedia education products for Milliken
Publishing Company, is the author of 688 Attack Sub for Electronic Arts, and
is establishing an industry standard for digitized sound on MS-DOS machines.
John can be contacted at 747 Napa Lane, St. Charles, MO 63303.


Over the past few years, PCs have made extensive use of graphical user
interfaces. With Windows 3.1, which provides an API for producing digitized
sound and music output at system level, there's now a standardized way to
incorporate sound and music into the user interface. But in their base
configuration, IBM-compatible PCs don't offer sound capabilities, other than
being able to produce a square wave on a dinky speaker. Consequently, a
plethora of third-party sound boards, like those described in Table 1, have
stepped into the gap. Overall there are more than two million third-party
sound solutions from over 20 different vendors--each with its own hardware
specification. And therein lies the problem.
Programmers who want to add broad-based sound support to their software really
have only two choices: use a third-party API to talk to the wide array of
hardware, or purchase every single third-party audio device they want to
support and write hardware-specific code for each one. The second option is
not only impractical, it isn't even possible under Windows. Talking directly
to a piece of sound hardware under Windows generates an exception error. Under
Windows, you pretty much have to use the provided system calls. Under DOS,
however, a variety of practical solutions are available.


Sounding Off With Your PC


Generally speaking, there are three ways to make sound or music with a
computer: synthesizing, digitizing, and using MIDI. Synthesized effects are
created by modulating a waveform like a sine wave to produce unique sounds.
The results range from simple little ditties to sophisticated music. Still,
producing sound in this fashion is extremely complex and very difficult, if
not impossible, for programmers who don't have musical backgrounds.
Digitized sound is the same as that found on a compact disc. It's simply
recorded audio from a microphone which is analog-to-digital (A/D) converted at
some fixed sampling rate. The quality of the sound is controlled by the bit
resolution of the samples (16 bit for compact disc) and the recording
frequency (44 KHz for compact disc). Most third-party audio boards capable of
dealing with digitized sound use 8-bit resolution, and audio is generally
recorded anywhere between 5 and 22 KHz. The advantage of digitized sound is
that you can simply record whatever you want. Among its disadvantages: Not all
third-party boards support digitized sound; digitized sound takes up huge
amounts of data storage (often impractical for anything other than CD-ROM);
and digitized-sound playback is CPU and memory intensive.
The third way to deal with audio is through MIDI, a pseudostandardized file
format for music. MIDI allows you to specify multiple channels of sound
simultaneously, each with its own instrument. Each channel contains events
such as "play this note at this time." For programmers, this is great. You
hire a musician to compose some music, and tell the black box to play it.
Still, there are problems. First of all, MIDI isn't really a standard, because
each MIDI device responds differently to channel assignments and patch
changes. (On one MIDI device a flute might come out sounding like a violin, or
an entire channel might not be played because that device doesn't support that
particular channel.) Roland Corp.'s Sound Canvas (a "general" MIDI device)
addresses this problem by providing a base patch set (description of available
instruments) that works across a wide variety of MIDI systems. With it, you
can orchestrate a single piece of MIDI music as general MIDI and expect it to
play properly under many different hardware configurations.
Is MIDI the final solution? Unfortunately not, since few people actually own
MIDI devices at this time. This seems to be changing, however, as a number of
third-party manufacturers are readying sound devices that provide general MIDI
support.
Another approach, MIDI emulation, is seen in Windows 3.1, where MIDI commands
are emulated in software to produce music on non-MIDI devices like the
SoundBlaster or Adlib card, utilizing their on-board FM synthesis chip. There
are a number of other emulation solutions, some better than others in overall
musical quality.
What API solutions are available to you if you want to add sound and music to
your application? Under Windows, just use the system calls provided under
Windows 3.1. Another option is Miles Design's WAIL (Windows Audio Interface
Library), which provides MIDI emulation for Adlib-compatible devices through a
Windows DLL.
Under DOS, there are a number of options. The first is to purchase the SDKs
sold by each sound-board manufacturer. If you want to program the audio
hardware directly, this is a must. However, the API libraries provided are
often impractical for commercial products, and, of course, only support their
respective hardware platforms.
Looking again at Table 1, note that some manufacturers provide MIDI
pass-through on their sound cards. This allows the user to connect an external
MIDI device through this MIDI port. I've only noted sound cards that provide
on-board MIDI support. In the MIDI API section of the table, most sound
devices do not have on-board MIDI capability, but do support FM synthesis
(through YM3812 or YM OPL3 chipsets). A number of third-party APIs allow
software emulation of MIDI commands for these systems:
Windows MIDI Driver. Allows MIDI music to be played back under Microsoft
Windows 3.1.
DOS MIDPAK Driver. A set of sound drivers and instrument files from THE Audio
Solution (my company) that provide MIDI emulation under DOS.
Voyetra VAPI Driver. A set of sound drivers and instrument files from Voyetra
Technologies that provide MIDI emulation under DOS.
AIL MIDI Driver. A set of MIDI drivers under DOS and Windows provided by Miles
Design.
The Windows Wave Driver. A board-manufacturer provided driver that allows
Microsoft Windows to produce digitized sound output and input.
DOS DIGPAK Drivers. A set of drivers provided by THE Audio Solution that act
as the equivalent of Windows Wave drivers, but for DOS products.
Linkway Driver. A set of drivers provided by board manufacturers to support
IBM's Ultimedia sound for OS/2.
Voyetra VAPI Driver. DOS Wave drivers provided by Voyetra Technologies.
AIL Driver. A DOS- or Windows-based driver provided by Miles Design.


DIGPLAY.ASM


Having said all this, I now turn to a sound driver that will produce
high-quality digitized sound on any PC without requiring any extra hardware.
This driver installs as a TSR and has a simple API to produce digitized sound
output on the internal PC speaker. You simply fill a data structure that
describes where in memory the desired sound effect is, the length of the sound
sample, and the frequency that you want played back. The AX register contains
a 688H, the DS:SI registers point to this data structure, and you perform an
INT 66H. Your application should first check to see if the sound driver has
been loaded in memory. You can do this via a call to CheckIn, which is located
in DIGPLAY.ASM (see Listing One, page 107). The sound passed is simply raw
8-bit PCM data. A number of audio file formats (WAVE and VOC, for example)
typically contain a header followed by the raw data itself. The utility
SNDCONV (provided electronically--see "Availability," page 5) strips
unnecessary header information off of sound files to be played back with the
IBM-SND driver.
Most IBM compatibles come with an extremely poor-quality speaker that's
positioned ineffectively and has no volume control. Some PC manufacturers
don't even provide a speaker, merely a piezoelectric wafer that produces a
barely audible sound. Originally, the only sound produced on an IBM computer
was "beep." Later, music composed of simple tones was incorporated into some
entertainment products. Recently, relatively good-quality digitized sound has
been produced out of the IBM speaker. However, none of these approaches can
overcome the poor quality of the speaker itself.
There's only one way to make sound on the IBM. Under software control, you can
apply 5 volts to the internal speaker, then turn the speaker off. Turning the
speaker on causes it to travel out until it hits its maximum position; turning
it off causes it to come back to rest. There's no way to directly control
volume; this would involve causing the speaker to rest in intermediate
positions by applying different voltage levels.
There are two ways to turn the IBM speaker on and off: either directly
toggling bit 1 at port address 61H, or tying the speaker to Timer channel 2 of
the 8253 timer chip. This is enabled using bit 0 of port address 61H. With
this method, the speaker is turned on and off at whatever frequency Timer 2
has been set, which causes a square wave tone to be generated. This is the
method by which most simple sounds are created on the IBM.
A digitized sound sample is generally recorded as an 8-bit value (0-255),
where each data sample represents a voltage level. Many samples are taken to
approximate a continuous waveform. The sampling rate for the human voice is
generally above 5 KHz. Good-quality recordings of music are often at 22 KHz,
and a compact disc records data at 16 bits of resolution with a 44-KHz
sampling rate. The simplest way to do digitized sound on the IBM speaker is as
1-bit sound. If your 8-bit data sample is less than 128, keep the speaker
position off; if it is greater than 127, turn the speaker on. This works fine,
except that 1-bit digitized sound sounds awful. To get higher-quality sound we
need to find a way to hold the speaker in intermediate positions.
There's a simple way to do this. The speaker is a physical device. Even though
you can apply voltage to the speaker at an effective, instantaneous rate (the
speed of light is still pretty fast), the speaker cannot travel from its
minimum to its maximum position instantaneously. How long does it take? The
answer isn't exact because it depends upon the speaker inside of the
particular machine, but it is roughly 60 millionths of a second. That's pretty
fast, but since the 8253 clocks at roughly 1 MHz, we can control the rate at
which we toggle the speaker with about six bits of resolution.
More Details.
By sending a digitized sound sample to Timer 2 of the 8253, we effectively
hold the speaker at intermediate positions that correlate with different
voltage levels. The drawback is that we're doing this at a particular carrier
frequency. That means that if you send out 9-KHz audio data, you also get a
9-KHz tone as the carrier frequency. This carrier will actually drown out the
digitized sound data you are sending, and cause you and your friends to run
screaming from the room in pain as your ears quickly begin to burn from a
sound that could only appeal to a bat. My solution is to send the audio data
out at a carrier frequency out of the range of human hearing. By playing back
a 9-KHz sound sample with a carrier that is double the playback rate, we
achieve an 18-KHz carrier that can't be heard. What's left is the digitized
sound data itself, which is of surprisingly good quality.
As you might guess, the intermediate positions that the speaker holds are
nonlinear. Therefore, when you transform your 8-bit source data down to six
bits, the data needs to be modified to form a distribution that fills the
bandwidth within which we can modulate the speaker.
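As a rough sketch of the reduction step (a plain shift is shown here; as noted above, the driver's actual mapping is nonlinear to compensate for the speaker's response, and the function name is invented):

```c
/* Reduce an 8-bit sample (0-255) to the 6-bit range (0-63) that can be
   sent to Timer 2 as a countdown value. A linear shift is the simplest
   possible mapping; a real driver remaps through a nonlinear table. */
unsigned char to_6bit(unsigned char sample)
{
    return (unsigned char)(sample >> 2);   /* 0-255 -> 0-63 */
}
```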
There are some technical challenges in pushing data out of the speaker at
frequencies above 18 KHz. This is why the audio driver presented in this
article does not operate in the background. The foreground driver, IBMSND,
effectively shuts the machine down while playing the sound out of the internal
speaker, thus providing maximum clarity during playback. You can play
digitized sound in the background, but it requires setting up an 18-KHz
interrupt on the machine, which runs a very high risk of causing conflicts
under DOS. (For example, protected-mode programs that intercept hardware
interrupts cause a lot of problems for code with a very high interrupt rate.)
The IBMSND sound driver presented here simply takes the input 8-bit audio
sample and reduces it to 6-bit values that are sent to the Timer 2 channel of
the 8253. The timer interrupt is revved up to run at twice the sampling rate
of the audio sample to get the carrier frequency out of the range of human
hearing. All other interrupts in the machine are shut down while the digitized
sound is pushed out of the speaker. The Timer 2 channel is reprogrammed to
accept a 1-byte countdown timer value rather than the normal 2-byte value. At
each interrupt, we simply execute the routine in Example 1 with DS:SI set up
to point at the audio data (already in 6-bit format). In the foreground, the
program waits for the SI register to reach the end of the audio data. The
interrupt is used to achieve very precise timing on all hardware. This code
worked even on 4.77-MHz 8088 machines, at 18 KHz! Notice that the hardware
interrupt expects certain registers to be set up and does not save and restore
the AX register. This is an especially good reason to shut off all other
interrupts in the machine, because it is doubtful that it will run effectively
for very long with the hardware interrupt installed.
Example 1: Reprogramming the Timer 2 channel.


 TimerInterrupt:
 lodsb ; Get a data sample (DS:SI points to audio data)
 out 42h,al ; Send 1 byte timer countdown to timer 2.
 mov al,20h ; non-specific EOI
 out 20h,al ; Send it to interrupt controller.
 iret

The file DIGPLAY.DOC (available electronically) documents the API interface to
the IBMSND digitized sound driver. A callable link layer into the digitized
sound driver has been provided through DIGPLAY.ASM. This assembly language
file provides C function-call hooks into the digitized sound driver as well as
the ability to detect the presence of a resident sound driver through the call
CheckIn. Simply include the C header file DIGPLAY.H in your code (see Listing
Two, page 107) and link, regardless of memory model, to the object module
DIGPLAY.OBJ. The assembly source is written taking advantage of Borland's
Turbo Assembler IDEAL mode, which provides a powerful 8086 syntax. A sample C
program that reads sound effects into memory and plays them back is enclosed
in the file SPLAY.C (also available electronically).


Conclusion


Adding sound to your software can be very rewarding. It's an excellent method
of communicating information, mood, and emotion to the user. It is also
entertaining, and makes computers more fun to use. But developers should be
cautioned to use sound wisely. Just as we have learned to hire professional
graphic artists to create attractive graphics for GUI-based software, we need
to look to professional musicians to obtain quality sound effects and music
for our products.
For More Information


Lunch With the Fat Man, or Music in Computer Software


With multimedia raging about us, the question arises: "Why, when, and how
should music and sound be used in computer programs?" For answers, I turned to
George Alistair Sanger, aka "The Fat Man." From his home in Austin, Texas, the
Fat Man works on audio production and composition for interactive
entertainment and multimedia products, including award-winning projects like
Wing Commander, Ultima Underworld, and The 7th Guest. --J.R.
Why should music and sound be used in computer programs? For the same reasons
we've all surrendered to using icons--because it's the next logical step in
computer-software development. Icons and graphics save viewers from having to
figure out what they're supposed to be seeing; music helps them know what
they're supposed to be feeling.
I don't intend to demean the intelligence of the average user, but there's a
little bit of analytical work (which side of the brain does that again? Brains
ought to be color coded to match the big companies; one side blue and the
other rainbow) that goes into reading. Reading is slow, somewhat detached from
our more natural experiences, and the user doesn't really like to do it.
That's what makes icons work. It's already been established that icons speed
productivity by giving us images from the "real world" (whatever that is) and
letting us relate our feelings about, say, a file cabinet or a wizard, to the
items represented on the screen. Music and sound take this to the next logical
step.
Users associate with graphics the feelings appropriate to the real-world
object represented. Music and sound are even more direct: They expose users
directly to those feelings. (They can now experience an angry wizard or an
efficient filing cabinet.) And feelings can be very useful tools.
Sound and music can increase throughput, enhance the pleasantness of computer
experience, and increase entertainment value of a program.
When should music and sound be used? Not just in games.
Although computers and film are very different media, the relative maturity of
the latter makes it useful for developers to look at film as a model for the
future of some aspects of computer software. Especially with the importance of
multimedia, we can look for examples not only in feature films, but all kinds
of video applications: educational and industrial films, training,
advertising, and news programs--anything that's been on a film or videotape.
When do they use music and sound? To enhance emotion where it already exists.
Some folks like happy faces or other graphics when their computers start up--a
happy tune can triple the happy effect.
A business logo can be enhanced by an audio swirl. Action games can be scored
like mysteries or documentaries. Music and sound also inject emotion where it
is desired, but doesn't already exist. If the attention of the user is really
required, consider how much more effective a klaxon is than the word
"Warning!" in a dialog box. Clicking keystrokes gives a sense of security to
some typists. Any place an interesting graphic is used to change boredom to
interest, an interesting piece of sound can enhance that change.
Sound also manipulates or changes an emotion that might already exist. A
child-safety multimedia presentation might show a picture of a cute baby near
a swimming pool, and something like the Jaws theme might keep the user from
focusing on the cuteness of the baby. A short, simple title tune might make a
database program seem less complex and frightening. In the case of the "bad
news" dialog box, consider how much more palatable a warning a pleasant "ping"
is than an explosive sound (or a picture of a bomb, for that matter). With
audio, the developer, like a film director, is able to control the degree of
emotion the user feels.
Moreover, it's a mistake to use music or sound when an emotional response or a
reaction is inappropriate. When music is just thrown in as filler, or is so
poorly composed that it doesn't adequately support the purpose of the program,
it annoys the user and cheapens the software. And it gives music itself a bad
name--"Muzak."
How should music and sound be used in computer programs? Like graphics, they
have to be done right. Filling up space with spectacular graphics done by an
"artist friend" (everybody's got one) is simply not the way to create an
effective program, and the same applies to music. There's as much an art to
using music and sound in software as there is in film. And of course it's too
big a subject to address here.
By way of cheap conclusion, here's a good rule of thumb. To judge the quality
of your sound, regardless of the technical limitations of your platform, ask
yourself these three questions:
Does every bit of the audio support the emotional direction of the program?
Does the audio maximize musical interest?
Can you dance to it?
Say, this was great. Let's have lunch more often.
--Fat

More Details.


Media Vision's Programming Contest


If you've been making noise about adding audio support to your applications,
Media Vision's Sound Programming Contest is an opportunity for you to really
sound off.
The programming contest, sponsored by sound-board manufacturer Media Vision,
seeks to reward outstanding use of sound in software. Sound-supported
applications must: use 16-bit digital sound and run on a Pro AudioSpectrum 16
or compatible sound card; be shareware or freeware; and be reasonably bug
free.
Entries will be rated on a scale of one to ten, with four points being
allocated for use of sound, four more for the "fun factor," and two for ease
of use. A panel of judges will nominate 53 finalists, from which first,
second, and third places will be selected.
Prizes will include CompuServe Electronic Mall shopping sprees (sponsored by
Media Vision and Computer Express) worth $5000.00, $2000.00, and $1000.00 for
first, second, and third places, respectively, and items worth $50.00-$100.00
for each of the 50 finalists. As a special bonus, ten additional prizes
consisting of $100.00 shopping sprees each will be awarded to the best "early
bird" entries received before midnight, March 31, 1993. All entries will
receive a Media Vision T-shirt and free CompuServe time.
To enter, either download the entry kit from CompuServe or Media Vision's BBS
510-770-0968, or dial 800-356-7886 or 408-655-6014 x211 to ask for the kit.
Send your entry form to Media Vision and post the program in the PAS16
Shareware contest section on CompuServe (GO PAS16CONTEST). Media Vision will
use your CompuServe ID to verify the author names on the application and the
entry form. You must provide periodic technical support for your application.
Anyone can try out the applications on CompuServe (GO PAS16CONTEST) or the
Media Vision BBS (shareware contest area).
The deadline for entries is midnight on July 15, 1993.
--Editors

For More Information

Activision
P.O. Box 67001
Los Angeles, CA 90049
310-207-4500

Adlib Corp.
20020 Grande Allee East, #850
Quebec City, PQ
Canada GIR 2J1
418-529-9676

Advanced Gravis
#111, 7400 MacPherson Ave.
Burnaby, BC
Canada V5J 5B6
604-434-7274

Advanced Strategies Corp.
60 Cutter Mill Road, Suite 502
Great Neck, NY 11021
516-482-0088

AMD
5204 E. Ben White Blvd. MS 56
Austin, TX 78741
800-292-9263 or 512-462-5651

Artisoft
691 East River Rd.
Tucson, AZ 85704
800-846-9726

ASC Computer Systems
26401 Harper Ave.
St. Clair Shores, MI 48080
313-882-1133

ATI
3761 Victoria Park Ave.
Scarborough, ON
Canada M1W 3S2
416-756-0718

Covox Inc.
675 Conger St.
Eugene, OR 97402
503-342-1271

Creative Labs Inc.
1901 McCarthy Blvd.
Milpitas, CA 95035
408-428-6600

Digispeech
2464 Embarcadero Way
Palo Alto, CA 94303
415-494-8086

Media Vision
47221 Fremont Blvd.
Fremont, CA 94538
510-770-8600

Miles Design Inc.
10926 Jollyville, #308
Austin, Texas 78759
512-345-2642

Roland Corp.
7200 Dominion Circle
Los Angeles, CA 90040-3647
213-685-5141

Sequoia Development Group
12517 Cascade Canyon Drive
Granada Hills, CA 91344
818-368-7221

Sequioa Systems Inc.
400 Nickarson Rd.
Malboro, MA 01752
800-562-0011

Street Electronic Corp.
6420 Via Real
Carpentina, CA 93013
805-684-4593

THE Audio Solution
P.O. Box 11688
Clayton, MO 63105
314-567-0267

Turtle Beach Systems
Cybercenter Unit 33
1600 Pennsylvania Ave.
York, PA 17404
717-843-6916

Voyetra Technologies
333 Fifth Ave.
Pelham, NY 10803
914-738-4500

Walt Disney software
P.O. Box 290
Buffalo, NY 14207-0290



_EXAMINING PC AUDIO_
by John W. Ratcliff


[LISTING ONE]


;; DIGPLAY.ASM John W. Ratcliff
;;
;; This piece of source provides C procedure call hooks down into
;; the resident TSR sound driver. Use the call CheckIn to find out
;; if the sound driver is in memory. See the C header file DIGPLAY.H
;; for prototype information.
;;
;; This file is in the format for Turbo Assembler's IDEAL mode. The
;; IDEAL mode syntax makes a lot more sense for 8086 than the old
;; MASM format. MASM has recently been updated to provide some of the
;; functions that Turbo Assembler has had for a number of years. I prefer
;; to consider Turbo Assembler the standard for 8086 assemblers.
;; IDEAL mode functionality includes true local labels, real data structures,
;; typecasting, automatic argument passing and local memory.
;; Converting any of this code into MASM format is an exercise left for
;; the student.


 LOCALS ;; Enable local labels

 IDEAL ;; Use Turbo Assembler's IDEAL mode
 JUMPS

 INCLUDE "PROLOGUE.MAC" ;; common prologue

SMALL_MODEL equ 0 ; True if wanting to assemble near procs.

SEGMENT _TEXT BYTE PUBLIC 'CODE' ;; Set up _TEXT segment
 ENDS

 ASSUME CS: _TEXT, DS: _TEXT, SS: NOTHING, ES: NOTHING

SEGMENT _TEXT

Macro CPROC name ; Macro to establish a C callable procedure.
 public _&name
IF SMALL_MODEL
Proc _&name near
ELSE
Proc _&name far
ENDIF
 endm

;; int DigPlay(SNDSTRUC far *sndplay); // 688h -> Play 8 bit digitized sound.
CPROC DigPlay
 ARG DATA:DWORD
 PENTER 0
 push ds
 push si

 call CheckIn ; Is sound driver in memory?
 or ax,ax ; no -> don't invoke interrupt...
 jz @@EXT ;
 mov ax,0688h ; Function #1, DigPlay
 lds si,[DATA] ; Data structure.
 int 66h ; Do sound interrupt.
 mov ax,1 ; Return sound played.
@@EXT:

 pop si
 pop ds
 PLEAVE
 ret
 endp

;; int SoundStatus(void); // 689h -> Report sound driver status.
CPROC SoundStatus
 mov ax,0689h ; Check sound status.
 int 66h ; Sound driver interrupt.
 ret
 endp

;; void MassageAudio(SNDSTRUC far *sndplay); // 68Ah -> Preformat 8 bit digitized sound.
CPROC MassageAudio
 ARG DATA:DWORD
 PENTER 0
 push ds
 push si

 mov ax,068Ah ; Identity
 lds si,[DATA] ; Data structure.
 int 66h ; Do sound interrupt.

 pop si
 pop ds
 PLEAVE
 ret
 endp

;; int DigPlay2(SNDSTRUC far *sndplay); // 68Bh -> Play preformatted data.
CPROC DigPlay2
 ARG DATA:DWORD
 PENTER 0
 push ds
 push si

 mov ax,068Bh ; Identity
 lds si,[DATA] ; Data structure.
 int 66h ; Do sound interrupt.

 pop si
 pop ds
 PLEAVE
 ret
 endp

;; int AudioCapabilities(void); // 68Ch -> Report audio driver capabilities.
CPROC AudioCapabilities
 mov ax,068Ch ; Check sound status.
 int 66h
 ret
 endp

;; int ReportSample(void); // 68Dh -> Report current sample address.
CPROC ReportSample
 mov ax,068Dh ; Report audio sample.
 int 66h
 ret

 endp

;; void SetCallBackAddress(void far *proc); // 68Eh -> Set procedure callback address.
CPROC SetCallBackAddress
 ARG COFF:WORD,CSEG:WORD
 PENTER 0

 mov bx,[COFF]
 mov dx,[CSEG]
 mov ax,68Eh
 int 66h

 PLEAVE
 ret
 endp

;; void StopSound(void); // 68Fh -> Stop current sound from playing.
CPROC StopSound
 mov ax,68Fh
 int 66h
 ret
 endp

CPROC ReportCallbackAddress
 mov ax,691h
 int 66h
 ret
 endp

CPROC WaitSound
@@WS: mov ax,689h
 int 66h
 or ax,ax
 jnz @@WS
 ret
 endp

;; int CheckIn(void); // Is sound driver available?
CPROC CheckIn
 call CheckIn
 ret
 endp

Proc CheckIn near
 push ds ; Save ds register.
 push si

 mov si,66h*4h ; get vector number
 xor ax,ax ; zero
 mov ds,ax ; point it there
 lds si,[ds:si] ; get address of interrupt vector
 or si,si ; zero?
 jz @@CIOUT ; exit if zero
 sub si,6 ; point back to identifier

 cmp [word si],'IM' ; Midi driver?
 jne @@NEX
 cmp [word si+2],'ID' ; full midi driver identity string?
 jne @@NEX

;; Ok, a MIDI driver is loaded at this address.
 mov ax,701h ; Digitized Sound capabilities request.
 int 66h ; Request.
 or ax,ax ; digitized sound driver available?
 jnz @@OK ; yes, report that to the caller.
 jz @@CIOUT ; exit, sound driver not available.
@@NEX:
 cmp [word si],454Bh ; 'KE' -- first word of driver ID?
 jne @@CIOUT ; exit if not equal
 cmp [word si+2],4E52h ; 'RN' -- second word of driver ID?
 jne @@CIOUT
@@OK: mov ax,1
@@EXT:
 pop si
 pop ds
 ret
@@CIOUT: xor ax,ax ; Zero return code.
 jmp short @@EXT
 endp


 ends
 end





[LISTING TWO]

/* Bit flags to denote audio driver capabilities. */
/* returned by the AudioCapabilities call. */
#define PLAYBACK 1 // Bit zero true if can play audio in the background.
#define MASSAGE 2 // Bit one is true if data is massaged.
#define FIXEDFREQ 4 // Bit two is true if driver plays at fixed frequency.
#define USESTIMER 8 // Bit three is true if driver uses timer.

typedef struct
{
 char far *sound; // address of audio data.
 unsigned int sndlen; // Length of audio sample.
 int far *IsPlaying; // Address of play status flag.
 int frequency; // Playback frequency.
} SNDSTRUC;

int far DigPlay(SNDSTRUC far *sndplay); // 688h -> Play 8 bit digitized sound.
int far SoundStatus(void); // 689h -> Report sound driver status.
void far MassageAudio(SNDSTRUC far *sndplay); // 68Ah -> Preformat 8 bit digitized sound.
void far DigPlay2(SNDSTRUC far *sndplay); // 68Bh -> Play preformatted data.
int far AudioCapabilities(void); // 68Ch -> Report audio driver capabilities.
int far ReportSample(void); // 68Dh -> Report current sample address.
void far SetCallBackAddress(void far *proc); // 68Eh -> Set procedure callback address.
void far StopSound(void); // 68Fh -> Stop current sound from playing.
void far *far ReportCallbackAddress(void); // 691h -> Report current callback address.

/* Support routines */
void far WaitSound(void); // Wait until sound playback completed.
int far CheckIn(void); // Is sound driver available? 0 no, 1 yes.






March, 1993
PROXY: A SCHEME-BASED PROTOTYPING LANGUAGE


High-level data structures for rapid software design and development


 This article contains the following executables: PROXY.ZIP


Burt Leavenworth


Burt is a consultant and former professor of computer science specializing in
software engineering and programming languages. He can be reached via
CompuServe at 70262, 1074.


Scheme has long been regarded as an ideal language for prototyping because it
is interactive, extensible, and contains simple, yet powerful features.
Although the Scheme syntax is generally straightforward, many programmers are
nonetheless put off by it because of its heavy use of parentheses. For this
reason, I've developed Proxy, an interactive language with a C-like syntax.
Proxy provides all of the high-level data structures--sets, maps, sequences,
and objects--useful for software design and prototyping. The Proxy interpreter
is available electronically (see "Availability," page 5).
Proxy, which is written in Scheme, is implemented by translating Proxy
expressions and function definitions into Scheme-language statements. These
statements are then executed by a Scheme interpreter. The current
implementation allows Proxy functions to call Scheme functions, which can then
send results back to the Proxy program.
There are no pointers or arrays in Proxy (although a map can be thought of as
a generalized array). This keeps data structures at a high level suitable for
prototyping. A software system can be represented as a "state" and a
collection of functions that operate on components of the state. The user may
develop these operations incrementally or define them in files that are loaded
prior to execution. It's also possible to manipulate objects (defined by
classes) that encapsulate local states; this allows the user to define a
software model as a hierarchy of submodels. The use of Proxy data structures
requires some mathematical sophistication, but nothing beyond the ability of
an experienced software developer.
Since Proxy uses infix notation, it avoids excessive use of parentheses. It
further reduces extra parentheses by using unary operators in many cases where
functional notation would be required. A comparison of the use (and disuse) of
parentheses is shown in Figure 1, which provides both a Proxy and Scheme
rendering of a simple recursive function that adds up the numbers in a
sequence.
Figure 1: Recursive function written in both Proxy and Scheme to add up
numbers in a sequence. (a) Proxy (6 parens); (b) Scheme (16 parens).

 (a)

 reduce (x) {
 if (x==[]) return 0; else
 return hd x + reduce (tl x);};

 (b)

 (define (reduce x)
 (if (null? x) 0
 (+ (car x) (reduce (cdr x)))))



Sets and Maps


A mathematical set is, of course, an unordered collection of elements. From a
formal point of view, each element should have the same type. However, since
Proxy has latent types (types not declared by the programmer), set elements
are not restricted to having the same type.
The simplest set operation is to enumerate the set, delimiting its elements by
curly brackets, for example, {1,2,3,4,5}. This same set can be constructed by
using the range notation, {1..5} (not to be confused with the range of a map,
to be defined below).
Two predicates can be applied to sets: exists and all. When the existential
quantifier exists is applied to a set, the predicate returns True if at least
one element of the set satisfies the predicate, and False otherwise. For
example, the statement (exists x in {2,3,5,7}; even(x)); returns True. On the
other hand, all returns True only if all elements of the set satisfy the
predicate. For example (all x in {2,3,5,7}; even(x)); returns False.
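For readers who think in a mainstream language, Proxy's quantifiers map directly onto Python's built-in any and all; a Python illustration of the semantics above (not Proxy code):

```python
# Python analogues of Proxy's quantifiers (illustration only, not Proxy code).
s = {2, 3, 5, 7}

def even(x):
    return x % 2 == 0

# exists x in {2,3,5,7}; even(x) -- True because 2 is even.
print(any(even(x) for x in s))   # True

# all x in {2,3,5,7}; even(x) -- False because 3, 5, and 7 are odd.
print(all(even(x) for x in s))   # False
```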
A more general way of constructing sets is given by the form shown in Example
1. The syntax x<-expr is called a "generator" and is required. expr is
evaluated to yield a set. The values of the set are successively assigned to
x, and a new set is formed from elements obtained by applying the function f
to the successive values of x which satisfy pred(x). pred(x), however, is
optional. Example 2 should make this clearer.
Example 1: A general form for constructing sets.

 { f(x): x<- expr ; pred(x) }

Example 2: General method for constructing sets.

 { x: x<- {1,2,3,4,5}}; returns {1,2,3,4,5}
 { x: x <- {1,2,3,4,5};x>2}; returns {3,4,5}
 { x*x: x <- {1,2,3,4,5}}; returns {1,4,9,16,25}

It is possible to have two generators, in which case an example is:

 { x+y: x <- {1,2}, y <- {3,4}} returns {4,5,6}



Only three elements are returned because two of the additions (1+4 and 2+3)
yield the same value, 5.
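Proxy's general set construction has a close analogue in Python set comprehensions, which can serve as a cross-check on the examples above (Python illustration, not Proxy code):

```python
# { f(x): x <- expr ; pred(x) } in Proxy corresponds to
# { f(x) for x in expr if pred(x) } in Python.
base = {1, 2, 3, 4, 5}

print({x for x in base})            # {1, 2, 3, 4, 5}
print({x for x in base if x > 2})   # {3, 4, 5}
print({x * x for x in base})        # {1, 4, 9, 16, 25}

# Two generators: 1+3=4, 1+4=5, 2+3=5, 2+4=6; the duplicate 5
# collapses because the result is a set.
print({x + y for x in {1, 2} for y in {3, 4}})   # {4, 5, 6}
```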
Maps are also enumerated by delimiting their elements with curly brackets.
(Formally, maps are sets of ordered pairs.) Maps are created using map
construction similar to that of sets. The syntax of maps is shown in Example
3. Note that the first and second elements of each ordered pair in this
example are separated by the mapping symbol ->.
Example 3: Syntax for using maps.

 m = {1->2,3->4,5->6};

The domain of a map is obtained by forming a set composed of all the first
elements of the ordered pairs. Returning to Example 3, dom m returns the set
{1,3,5}. Likewise, the range of a map is obtained by forming a set composed of
all the second elements of the ordered pairs (rng m returns the set {2,4,6}).
A single-valued map must have all its first elements unique. Given a domain
element, we can obtain the corresponding range element much like a table
lookup. For example, m[1] returns the value 2 and m[5] returns 6. If a domain
element is given that does not exist in the map, a false value will be
returned. However, it is possible to supply your own value to be returned in
this situation. This is done by supplying an additional argument called the
"default error return." For example, m[7, "not found"] returns "not found"
instead of False.
Domain restriction (dr) and domain subtraction (ds) produce new maps from a
given map by either allowing only certain domain elements in the given map to
appear in the result map, or taking away certain domain elements from the
given map. The domain elements are given in a set that is the second argument
of these operations, as shown in Example 4(a).
Example 4: (a) Domain restriction and subtraction; (b) using the overwrite
operator.

 (a)

 {1->2,3->2,5->6} dr {3,5}; returns {3->2,5->6}
 {1->2,3->2,5->6} ds {3,5}; returns {1->2}

 (b)

 {1->2,3->4} overwr {3->5,4->6}; returns {1->2,3->5,4->6}

The overwrite operation m1 overwr m2 is defined as follows: Each mapping in m1
is included in the result, unless its domain element occurs in the domain of
m2. In that case, it is replaced by the mapping from m2. Every mapping in m2
whose domain element does not occur in the domain of m1 is included in the
result; see Example 4(b). The map update m[d] = r is an assignment and a
special case of overwrite where the second operand (of overwrite) contains a
single ordered pair. Assuming that m is equal to {1->2, 3->4}, m[3]=5 is
equivalent to m overwr {3->5} and assigns to m the map {1->2,3->5}.
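A single-valued Proxy map corresponds closely to a Python dict, which makes the map operators easy to check (a Python sketch of the semantics, not Proxy code):

```python
# A Proxy map as a Python dict (illustration only).
m = {1: 2, 3: 4, 5: 6}
dom_m = set(m.keys())     # dom m -> {1, 3, 5}
rng_m = set(m.values())   # rng m -> {2, 4, 6}

# Lookup with a "default error return", like m[7, "not found"] in Proxy:
missing = m.get(7, "not found")

# Domain restriction (dr) and domain subtraction (ds):
m2 = {1: 2, 3: 2, 5: 6}
dr = {k: v for k, v in m2.items() if k in {3, 5}}       # {3: 2, 5: 6}
ds = {k: v for k, v in m2.items() if k not in {3, 5}}   # {1: 2}

# overwr: on shared domain elements, the right operand's mapping wins.
overwr = {**{1: 2, 3: 4}, **{3: 5, 4: 6}}               # {1: 2, 3: 5, 4: 6}
print(dom_m, rng_m, missing, dr, ds, overwr)
```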


Sequences and Strings


A sequence is a collection of ordered elements which may be selected by their
ordinal position (index) in the sequence. The types of the elements are not
necessarily the same. A sequence can be enumerated by delimiting its elements
by square brackets, [1,2,3,4,5]. This same sequence can be constructed by
using the range (not to be confused with the range of a map) notation, [1..5].
Another example of enumeration would be s1 = [1,2,2,3,4].
Concatenation of two sequences is performed using the concatenation operator
conc. For example, the statement [1,2,3,4] conc [3,4,5] returns
[1,2,3,4,3,4,5]. If s is equal to [3,4,5], then an element of the sequence can
be selected using its index in the sequence. For example, s[1] returns 3, s[3]
returns 5, s[4] returns False, and s[4,0] returns 0.
Other selection operators on sequences are hd, tl, last, and butlast. The
first element of the sequence is returned by hd; tl returns a sequence
consisting of every element but the first; last returns a sequence consisting
of only the last element; and butlast returns a sequence consisting of every
element but the last. Proxy also provides a general way of constructing
sequences analogous to set construction. The expression expr in Example 5(a)
is evaluated to yield a sequence. The values of the sequence are successively
assigned to x, and a new sequence is formed from elements obtained by applying
the function f to the successive values of x that satisfy pred(x); see
Examples 5(b).
Example 5: (a) General form for constructing a sequence; (b) constructing the
sequence [4,5].

 (a)

 [ f(x): x<- expr ; pred(x) ]

 (b)

 [len x: x<-["abc","defg","hijkl"];len x > 3]; returns [4,5]
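The sequence operators likewise line up with Python list operations, remembering that Proxy indexes from 1 while Python indexes from 0 (Python illustration, not Proxy code):

```python
# Proxy sequence operators rendered as Python list operations (illustration).
print([1, 2, 3, 4] + [3, 4, 5])   # conc -> [1, 2, 3, 4, 3, 4, 5]

s = [3, 4, 5]          # Proxy s[1] is Python s[0]
print(s[0])            # hd s      -> 3
print(s[1:])           # tl s      -> [4, 5]
print(s[-1:])          # last s    -> [5]
print(s[:-1])          # butlast s -> [3, 4]

# [len x: x <- ["abc","defg","hijkl"]; len x > 3] -> [4, 5]
print([len(x) for x in ["abc", "defg", "hijkl"] if len(x) > 3])
```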

Strings can be considered as sequences of characters, although a character is
not a primitive type in Proxy. Since the string operators are similar to the
sequence operators already discussed, we will not bother to give examples.


Structs


A struct declaration is essentially a class declaration that defines the names
of the components of that class. The declaration in Example 6(a), for
instance, defines item to be a struct with the field names partno, code, and
quantity. Various items are instantiated by providing the struct name and
values for the fields. The only restriction is that a struct component may not
be a function.
Example 6: (a) Syntax for the struct declaration; (b) assigning values to
fields in a struct.

 (a)

 struct item {partno, code, quantity;};

 (b)

 i1.quantity=22; assigns 22 as the new value of the quantity field

Components of structs may be selected using a dot notation similar to records
in Pascal and structures in C. Component values may be updated using
assignment and dot notation, as shown in Example 6(b).
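A Proxy struct compares to a minimal Python class; the field values below are invented for illustration:

```python
# Rough Python analogue of 'struct item {partno, code, quantity;};'.
# The field values are hypothetical, chosen only for the example.
class Item:
    def __init__(self, partno, code, quantity):
        self.partno = partno
        self.code = code
        self.quantity = quantity

i1 = Item(partno="P100", code="A", quantity=10)
i1.quantity = 22       # dot-notation update, as in Example 6(b)
print(i1.quantity)     # 22
```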



Class Definition and Instantiation


A class definition is similar to a function definition except that the class
name is preceded by the class keyword. Also, the body of the definition
consists of a collection of function definitions (called "methods"). Unlike
C++, Proxy classes may contain only functions. Listing One (page 90) shows
a FIFO queue, where the state component rep is declared as a local variable in
the class header, and the initialization of this component is performed in a
function with the same name as the class name. This function is called a
"constructor" in C++ and is automatically invoked when the class is
instantiated. If, when the class is instantiated, the user neglects to define
the queue constructor, a diagnostic is triggered.
The function definitions may optionally be preceded by the public: keyword to
indicate that they are exported or made visible outside the class. There are
cases when one wants to define functions that are private. In this situation,
the keywords private: and public: are used (in that order). Listing Two (page
90) gives an example of a priority queue.


A Sample Session: Prototyping


The software-development cycle usually starts with an unambiguous, complete,
and consistent requirements statement and the development of a proper
interface specification and module decomposition. In this process, it's
necessary to execute some representation of the formalized requirements to
have some confidence that the requirements are correct. In addition,
experience with the module-decomposition process tells us that it's difficult
to get the decomposition right without elaborating some of the inner
structure. (Conversely, it's hard to go too far with the details of inner
structure without settling on the decomposition.) The conclusion is that an
iterative process is called for with prototyping at a high level and
implementation details suppressed.
To see how Proxy might be used in a prototyping session, consider the problem
(adapted from Hekmatpour and Ince) of developing a tool to record the
relationships between the modules of a software system. The requirements may
be stated informally as consisting of the following routines: Add a module to
the system, delete a module from the system, list what modules a given module
may use, list what modules may use a given module, and list all recursive
modules.
We follow the "me too" paradigm (see Alexander and Jones) for prototyping an
application in three steps. The model step identifies the entities and
associated operations of an application. In this case, we select an entity
called "cross usage" (xu). This entity can be thought of as a set of pairs,
where each pair consists of a module and the set of modules it may use. The
operations are add, delete, uses, used_by, and rec_mod (recursive modules).
The specification step implements the entities and operations in terms of
sets, maps, sequences, and so on. We will represent xu as a map from modules
to sets of modules. Each module is represented by a string. The validation
step is where the operations are executed to determine if the model and
implementation are appropriate. The three steps are iterated as necessary
until a satisfactory design is obtained.
Figure 2 shows a module-structure diagram for a given software system. Listing
Three shows the specification step of the paradigm in terms of the global
state xu. The add_mod operation simply adds a module and the set of modules it
uses by using map update. The effect of the del_mod operation is that all
occurrences of the deleted module will be removed from the range elements of
the map and the entry for the module itself will be removed from the map.
The uses and used_by operations are straightforward. Finally, rec_mod is
defined in terms of an auxiliary function, reaches, which returns True if a
module can reach another module through a sequence of one or more calls.
Listing Four shows the validation step of the paradigm. After initializing the
database with calls to add_mod, the resulting value of xu is shown. Calls are
then made to the routines uses, used_by, and rec_mod. Finally, after calling
del_mod, the new value of xu is shown.


Conclusion


The Proxy interpreter, together with documentation and examples, is available
electronically; see "Availability," page 5. I'm also developing an extension,
Concurrent Proxy, that will allow modeling of concurrent and distributed
software, and direct execution of data-flow modules. Contact me directly for
more information on this package.


References


Abelson, H. and G.J. Sussman. Structure and Interpretation of Computer
Programs. Cambridge, MA: MIT Press, 1985.
Alexander, H. and V. Jones. Software Design and Prototyping using me too.
Englewood Cliffs, NJ: Prentice-Hall, 1990.
Hekmatpour, S. and D. Ince. Software Prototyping, Formal Methods and VDM.
Reading, MA: Addison-Wesley, 1988.

_PROXY: A SCHEME-BASED PROTOTYPING LANGUAGE_
by Burt Leavenworth


[LISTING ONE]

class queue(;rep) {
 queue() {rep=[];}
 enqueue(x) {rep=rep conc [x];}
 dequeue(;x) {x=hd rep;
 rep=tl rep;
 return x;}
 empty() {return rep==[];}};
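Read alongside the Proxy code, Listing One's queue behaves like the following Python sketch (an illustration of the semantics only; in Proxy, the ";rep" in the header declares rep as the instance's local state):

```python
# Python rendering of Listing One's FIFO queue (illustration only).
class Queue:
    def __init__(self):             # the queue() constructor: rep = []
        self.rep = []
    def enqueue(self, x):
        self.rep = self.rep + [x]   # rep = rep conc [x]
    def dequeue(self):
        x = self.rep[0]             # x = hd rep
        self.rep = self.rep[1:]     # rep = tl rep
        return x
    def empty(self):
        return self.rep == []

q = Queue()
q.enqueue(1)
q.enqueue(2)
print(q.dequeue())   # 1
print(q.empty())     # False
```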







[LISTING TWO]

class pqueue(;rep) {
 pqueue() {rep=[];}
 private: insrt(x,y) {if(y == []) {rep = [x]; return rep;} else
 if(x < hd y) {rep = [x] conc y;return rep;} else
 {rep = [hd y] conc insrt(x,tl y);
 return rep;}}

 public: insert(x) {rep = insrt(x,rep); return rep;}
 remove(;x) { if(rep == []) return "queue empty";
 x = hd rep; rep = tl rep; return x;} };






[LISTING THREE]

add_mod(m,ms) {xu[m]=ms;};

del_mod(m) {xu= {x->xu[x] diff {m}:x <- (dom xu diff {m})};};

uses(m) {return xu[m];};

used_by(m) {return {ms:ms<-dom xu;m in xu[ms]};};

rec_mod() {return {m:m<-dom xu;reaches(m,m)};};

reaches(m1,m2) {return ((m2 in xu[m1]) || (exists m in xu[m1];reaches(m,m2)));};





[LISTING FOUR]

add_mod("mod2",{"mod4","mod5","mod2"});
add_mod("mod3",{"mod5"});
add_mod("mod4",{"mod1","mod2"});
add_mod("mod5",{});
add_mod("mod1",{"mod2","mod3"});

xu = {"mod2"->{"mod4","mod5","mod2"},"mod3"->{"mod5"},
 "mod4"->{"mod1","mod2"},"mod5"->{},"mod1"->{"mod2","mod3"}}

uses("mod1") returns {"mod2","mod3"}
used_by("mod2") returns {"mod2","mod4","mod1"}
rec_mod() returns {"mod2","mod4","mod1"}

del_mod("mod3");

xu = {"mod2"->{"mod4","mod5","mod2"},"mod4"->{"mod1","mod2"},
 "mod5"->{},"mod1"->{"mod2"}}






March, 1993
PROGRAMMING PARADIGMS


Serius Pet Tricks




Michael Swaine


The late physicist and Nobel laureate Richard Feynman once trained his dog to
retrieve socks by indirection. The dog would spot a sock on the floor and zip
off, circling around the house, entering the back door, grabbing the sock, and
retracing its steps to emerge a minute later from the front, sock in mouth,
all the while exhibiting supreme confidence that it was behaving in a
perfectly rational manner. Feynman had convinced the dog that this was the
only reasonable way to retrieve socks.
That, according to his biographer James Gleick, was Richard Feynman. He would
teach old physicists the new trick of thinking of particles traveling backward
in time, and the very perverseness of the idea would appeal to him. His sense
of humor was legendary: While working under the highest security on the atomic
bomb at Los Alamos, he used to crack colleagues' safes for laughs.
I've tried, with less perverseness and less luck, to teach my dog Molly a few
tricks. Molly is not dumb. She's certainly smarter than, say, Michael Abrash's
dog. (She's bigger than Jeff Duntemann's dogs, too, being more or less a
Labrador, and since I found her abandoned in the company parking lot, I
sometimes call her the DDJ Lab.)
I've had some success in teaching Molly commands like "sit" and "stay," and
I've taught her the difference between "mailbox" and "walk." Anytime I step
out the door, she assumes we're going for a walk and runs on ahead. The
question for her is: Are we going up the trail into the woods (her favorite),
or down the driveway to the mailbox (no big deal)? She listens attentively for
what I'm going to say and then takes off running in one direction or the
other. Sometimes for a stretch of weeks she will seem to understand perfectly.
Then she'll seemingly forget and run up the hill every time, no matter what I
say. My theory is, she's trying to teach me which way to go.
After all, she taught herself what car keys are for. With no help from me, she
figured out that every time I step out the door and take my keys out of my
pocket, I am going to get in the car and drive off without her. Now, as soon
as she sees the keys, she sits back down and watches mournfully. I sometimes
wonder just how smart she is.


The Dog Star


"Sirius" was the name of the superintelligent dog in Olaf Stapledon's classic
science-fiction novel of the same name. "Serius" is also the name of the
software product that the UPS driver tossed out of the truck onto my driveway
a while ago. (His delivery technique owes a lot to his attitude toward large
dogs, and I've had no luck at all in training him.) This is version 3.0 of
Serius, and the product has learned some new tricks since its last release.
Serius is an object-oriented development system that can be used profitably by
casual and experienced programmers, from people who don't want to write code
at all to those who debug C code in their sleep.
Earlier in its life, Serius attracted some pats on the head from programmers
and pundits. Charles Seiter in MacWorld called it one of the few new
programming schemes to take seriously. Of course, in that publication he was
talking to a Macintosh audience, and not an exclusively technical one. But
J.D. Hildebrand, writing about Serius in Computer Language, said that he was
extremely impressed by it and implied that it was reason enough to buy a
Macintosh.
Serius seems to be particularly popular with people who (1) are developing
multimedia products, and (2) while not necessarily put off by programming, want
to spend the bulk of their efforts on content development. At a gathering of
multimedia developers last year, I kept hearing the name Serius, and a session
panelist confirmed that most of the serious multimedia developers he knew were
in fact Serius developers. That was when I began to get interested.
Serius is for both Macintosh and Windows. The Mac version plays by Mac rules
and supports Mac-model interapplication communication and publish and
subscribe.
It really consists of two products, which can be purchased separately: Serius
Workshop and Serius Developer Pro. Serius Workshop is a user-developer system,
while Serius Developer Pro requires real programming experience; it's a tool
for serious developers.
I wish the folks at Serius Corp. hadn't used such generic names. The
capitalization helps here, but I would hate to have to give a talk about these
products to Serius programmers or to serious programmers.
Whether you're a potential Serius programmer or a serious programmer, Serius
Workshop is where you start. (See what I mean?)


Serius Workshop


Serius Workshop is a visual programming system that produces stand-alone
applications.
All visual programming systems look at least a little like flowcharts: boxes
and lines. Labeled boxes and awkwardly labeled lines. Too often, flowcharts
hide everything interesting or difficult about a program in a box labeled
"process." Flowchart-like diagrams are good for representing flow, but they
need some help when it comes to representing functionality. And when a visual,
flowchart-like model is used to create programs rather than just to represent
their structure and flow, this can be an even bigger problem.
Flowcharts were developed to represent procedural programs. Like a lot of
visual programming systems, Serius claims to be an object-oriented system.
Does that make a difference? Object-oriented programming systems at least make
it clear where to put the functionality. The boxes are objects, and objects
have behaviors associated with them. Unfortunately, most visually oriented
programming systems fall short of being real object-oriented programming
systems. There may be a Catch-22 at work: The ideal paradigm for visually
oriented systems is OOPS; the ideal user of visually oriented systems is a
novice; the ideal paradigm for a novice is something other than OOPS. It's a
theory, anyway.
The authors of Serius did not accept that theory. Serius Workshop is a pretty
serious application-development system with an object-oriented flavor,
accessible to people who do not think of themselves as serious developers. Not
complete novices, perhaps, but people like multimedia content developers who
need to program, but don't think of programming as what they do.
Among the object-oriented features are plug-and-play reusability and object
persistence. What's chiefly missing is inheritance.
Serius Workshop is also missing the ability to create your own objects, but
Serius Developer Pro supplies that. Serius Corp. itself supplies a lot of
ready-made objects. Workshop comes with objects such as menu bars, menus, menu
items, windows, buttons, radio-button groups, other controls, lists, date and
time objects, text and number objects, pictures, and cursors. Program control
is handled via loop and subroutine objects, user interaction via notification
objects, and file operations via file objects. Two kinds of peripheral
devices, printers and scanners, have objects, and there is a painter object,
which is really the guts of a full paint program. Serius also sells packages
of objects for serial communications, database development, interapplication
communication, multimedia development, and Serius XCMD, which lets you
incorporate HyperCard external commands and functions (original or new XCMD
format) into Serius applications. Windows versions are forthcoming for all but
XCMDS.


Short Subjects


Serius Workshop uses a project-and-subproject model. A project is just an
application under development. Subprojects, which the Serius authors--in a
less than serious moment--decided they would shorten to "subjects," are
reusable. A typical reusable subject would be a standard menu-bar subject,
containing objects for a menu bar and the File and Edit menus that are
standard in most Macintosh applications.
To build a subject, you work in two windows. In one, you just deposit icons
for the objects you want to use, clicking on each icon to edit it. Each object
has its own custom editor for setting its characteristics. What you are
creating are instances of objects, so of course you can have several instances
of the menu object, each with its own characteristics. In the other window,
you deposit icons for functions specific to the objects you've chosen, and
string them together in a way that looks suspiciously like procedural
programming. Underneath, Serius is creating objects and messages, but what you
build in the window are short function chains. This is probably clearer to the
casual user than a more visible object-oriented approach.
You also wire the function icons to the object icons in the other window, for
example, drawing a line from the icon for the Quit menu item in the objects
window to the icon for the Quit function in the functions window. The screen
can get messy fast, with lines connecting chains of functions, lines
connecting functions to objects, and labels on the lines and above the
function icons. (A line from a menu-item object to the function triggered when
it is selected would have the label "selected." The labels on function icons
are for parameters.)
I know: The lines and boxes sound messy, the confounding of paradigms sounds
confusing, and you are wondering about performance. But I find that for small
subjects, the model is easy to grasp, feels pretty natural, and doesn't
generate a lot of visual spaghetti. And for small, simple projects, it
generates pretty efficient, compiled, stand-alone applications. I admit I
haven't written anything that would really test it, like a distributed
database system.


ObjecTalk


For those who absolutely have to work in a textual paradigm, the designers
have provided ObjecTalk, a scripting language. Anything you can do by wiring
up icons on the screen, you can also accomplish via ObjecTalk code. An
ObjecTalk utterance consists of a function name and probably some parameters;
for example: Open window My-Paint.
That's pretty simple, but there are a couple of noteworthy points about this
language.
First, it is not an alternative to the entire visual programming paradigm of
Serius Workshop, but only for functions. You can define functions either via
icons or via ObjecTalk, and that's all. In practice, you choose whether you
want to work in ObjecTalk or ObjecSketch (the icon mode) for each subject. You
can switch back and forth at any time; the two map one-to-one. Each open
subject can have its own mode, so you may see textual programming in one
window and wired icons in another.

Second, there's the fuzz. If you hate and scorn scripting languages like
HyperTalk for their overhead dedicated to user-friendly Anglicization, then
you'll really be annoyed by ObjecTalk. ObjecTalk, its designers trumpet with
only slight exaggeration, has no syntax. Just get the idea down somehow or
other, and the fuzzy-logic parser will figure out what you mean.
Well.
It's easy to be sarcastic, but the idea has merit. It is appropriate for its
target user, it raises technical problems worth being raised, and the
ObjecTalk implementation is actually pretty good. It's not by any stretch of
the imagination syntaxless--I can't imagine what a syntaxless language would
be. It's also not finished, a fact that the Serius authors frankly
acknowledge. They expect to continue exploring the possibilities of fuzzy
parsing in future revisions. But the current example is interesting. For
example, all the following can be equivalent:
 Open MyPaint
 Open MyPaint window
 Show MyPaint
 Open the window named MyPaint
 Display MyPaint
 Display My_Paint
 Open MyPint
 Aperis fenestram Mypaint
Most of the variations above are built-in equivalences, like "show" =
"display" = "open" or optional words like "the" and "named" and "window"
(optional if leaving it out is not irremediably ambiguous), but there are some
other things afoot as well. You can create your own language-element names,
including translating ObjecTalk into a foreign language. (Serius is in the
process of developing foreign-language versions, though apparently Latin is
not a high priority.) And misspelling is tolerated--that's the real
fuzzy-logic aspect. The programmer has two settable parameters--the Match
Threshold and the Ambiguity Threshold--for controlling the degree of ambiguity
accepted and for resolving it.
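To make the two thresholds concrete, here is a hypothetical sketch (not Serius code; all names are mine) of how a fuzzy verb matcher might use them: an edit-distance score must clear the match threshold, and the gap between the two best candidates must clear the ambiguity threshold before the parser commits to a verb.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Classic dynamic-programming edit distance between two strings.
static int editDistance(const std::string &a, const std::string &b)
{
    std::vector<int> prev(b.size() + 1), cur(b.size() + 1);
    for (size_t j = 0; j <= b.size(); ++j)
        prev[j] = (int)j;
    for (size_t i = 1; i <= a.size(); ++i) {
        cur[0] = (int)i;
        for (size_t j = 1; j <= b.size(); ++j) {
            int sub = prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
            cur[j] = std::min(sub, std::min(prev[j] + 1, cur[j - 1] + 1));
        }
        std::swap(prev, cur);
    }
    return prev[b.size()];
}

// Return the best-matching verb, or "" when no candidate clears the
// match threshold or the best two are too close to disambiguate.
std::string resolveVerb(const std::string &word,
                        const std::vector<std::string> &verbs,
                        int matchThreshold,      // max distance accepted
                        int ambiguityThreshold)  // min gap between best two
{
    int best = 1 << 20, second = 1 << 20;
    std::string bestVerb;
    for (size_t i = 0; i < verbs.size(); ++i) {
        int d = editDistance(word, verbs[i]);
        if (d < best) { second = best; best = d; bestVerb = verbs[i]; }
        else if (d < second) second = d;
    }
    if (best > matchThreshold)
        return "";              // nothing close enough
    if (second - best < ambiguityThreshold)
        return "";              // two candidates too close to call
    return bestVerb;
}
```

With this scheme, "Opn" resolves cleanly to "open," while a word equally close to two verbs is rejected rather than guessed at.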
My guess is that most serious programmers would enjoy the challenges of
developing such an interface or at least studying how Serius Corp. has
approached it, while most Serius programmers would actually enjoy using it.
But only serious programmers would get anything out of Serius Developer Pro.


Serius Developer Pro


Serius Developer Pro is a separate product that allows you to create your own
Serius objects and functions. Your objects and functions have exactly the same
status as objects and functions developed by Serius, and can be sold to Serius
users license-free.
Serius Workshop doesn't emit any code. It just spits out finished, compiled,
stand-alone applications, using supplied objects. Serius Developer Pro
produces objects and functions for Serius Workshop. It isn't a compiler; you
have to write the code for these objects and functions using a C or Pascal
compiler, then import the code into Serius Developer Pro.
When you work with Serius Workshop, you only have to think about objects and
functions. When you work with Serius Developer Pro, you deal with object
types, functions, methods, and events. You can use Serius Developer Pro to
construct any of these four kinds of things.
The Serius Developer Pro manual is as important as the program itself, since
it tells you how to write the code that Serius Developer Pro turns into an
object type, function, method, or event.
The program walks you through the remaining steps in creating these things.
For example, to create a new object type, you have to specify a type ID,
supply any needed resources used by objects of this type, define data
structures for any data stored by objects of this type, supply code used in
the creation and assignment of objects of this type, and provide a user dialog
for editing objects of this type--that is, for allowing the user to set object
properties. Serius Developer Pro guides you through the process of packaging
these things into an object type that Serius Workshop can use.
It doesn't sound like Serius Developer Pro does a whole lot, since it doesn't
produce code or finished applications, and since I say the manual is as
important as the program. But it does something very important: It imposes a
single, strict approach to object development. Your objects will look and act
like they were developed by Serius, and users will have no more trouble using
them than Mac users have using their second or third Mac application. By
exerting this level of control, Serius comes closer to the promise of reusable
components than object-oriented programming usually does.
This, I think, makes Serius worth a look even if the product itself doesn't
interest you. You may not program on the Mac, and Serius may not meet your
needs, but it provides an interesting example of an approach to reusable
components that actually works.
For More Information

Serius Corp.
6400 Commerce Park
488 East 6400 South, Suite 100
Salt Lake City, UT 84107-7590
801-261-7900

Serius Workshop, $395.00
Serius Developer Pro, $1495.00
Serius Communications Object Library, $145.00
Serius Multimedia Object Library, $195.00
Serius Interapp Object Library, $95.00

























March, 1993
C PROGRAMMING


Control Classes Continued


 This article contains the following executables: DFPP01.ARC


Al Stevens


I taught a C++ class this week. The students were C programmers who are
undertaking the port of a DOS system to Windows. They have to write a word
processor of sorts. That's right, the ones out there aren't quite good enough.
Only our government could land on logic like that. The wrinkle is that the
system has to use SGML, the Standard Generalized Markup Language, to identify
fonts, headings, page breaks, and such. The Windows SGML editors available are
expensive--not as expensive as writing your own, but the users need a lot of
licenses. The software is for A&E firms to use in preparing engineering
specifications for proposals. They have to use the system to be eligible to
bid. Uncle Sam doesn't want to tell them that they have to buy a $1000 text
editor just to respond to a request for proposal. I wish they'd be that
considerate of me every April.
The first question out of the class was, "We have to design an object-oriented
program. What are the objects?" I remember asking that same question several
years ago. It took a while to figure out the answer. We kicked around some
possible objects, mostly taken from exercises in the book, and then they
wanted me to help them find the objects in their project's design. They were
intimidated by the notion that if you don't call everything an object, you
aren't doing it right. That's a flawed notion. Their compiler is rich with
classes in a container-class library and a Windows-API class library. There
are already plenty of objects for them to use. What they did not understand
was that there are generic data structures into which parts of their problem
will fit, and the container classes support those generic structures. If your
class library is complete enough, you may never need to identify an object.
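The point about generic data structures can be made with a small, hypothetical sketch (using the standard library rather than the vendor's container classes, and with names of my own invention): a well-formedness check for mark-up tags needs nothing but an off-the-shelf stack, not a hand-rolled "document object."

```cpp
#include <cassert>
#include <stack>
#include <string>
#include <utility>
#include <vector>

// A stream of tag events: true = opening tag, false = closing tag.
typedef std::vector<std::pair<bool, std::string> > TagStream;

// Check that every closing tag matches the most recent open tag and
// that nothing is left open at the end. The generic container holds
// all the state; no bespoke class is required.
bool tagsBalanced(const TagStream &events)
{
    std::stack<std::string> open;
    for (size_t i = 0; i < events.size(); ++i) {
        if (events[i].first)
            open.push(events[i].second);
        else {
            if (open.empty() || open.top() != events[i].second)
                return false;   // close without a matching open
            open.pop();
        }
    }
    return open.empty();        // all tags closed
}
```

The "object" here is just `std::stack`; the problem fits the generic structure, which is exactly the lesson the class needed.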
You never really know a subject until you've taught it. Each of my four
students had a 486 with Borland C++ installed. They'd ask a question, and
before I could look up the answer in the ARM, they'd have it typed in and
running through the debugger to see what happened. We used my book, Teach
Yourself C++. (Mine, not Herb Schildt's. Get your own class, Herb.) By the end
of the week, they not only had a good foundation in C++, but they had found
some errors in the book as well.


Visual Basic: Add C and Stir


I've been working on a Windows program that provides access to a big
static-text database. Most of the text compression, indexing, and search
functions come from earlier work, much of which you've seen here in this
column. I blanched at the thought of mastering the SDK for one program, so I
decided to use Visual Basic for the user interface and build my existing C
code into a DLL. It has been a smooth project primarily because the C code was
already checked out in DOS programs. Borland's Turbo C++ for Windows is a good
tool for editing and building the DLL directly in Windows, although their
documentation is weak when it comes to DLLs. They tell you how to build one,
but they leave some stuff out. In desperation I switched to Microsoft C 7.0,
which gave me a much needed error message that Borland left out. I fixed the
error and returned to TCW, and everything is okay now. Visual Basic is a great
tool for constructing a quick and dirty Windows application. Jeff Duntemann
pleads for a Visual Pascal. Listen up, Microsoft. We really need a Visual
C/C++, too.


D-Flat++ Controls


Last month's column discussed the D-Flat++ Application class and introduced
the Control class. In a CUA user interface, a "control" is a generic term that
applies to any of the input mechanisms a user has for entering data values.
Text entries, options, and file selections all come to the program from the user by
way of one or more controls, which include edit boxes, lists, buttons, and the
like. The DF++ class library implements controls with specific control classes
derived from the Control class. Last month we discussed the first of the
controls, the TextBox class. Most of the other controls will derive from the
text box, because they require the basic text-displaying functions that it
supports.
This month we look at five more control classes, all of which derive from the
TextBox class. One, the ListBox class, will be the base class for pop-down
menus. It will also provide listbox support for dialog boxes. The other
classes are radio and command buttons, the check box, and the base class for
buttons.


The ListBox


Listing One, page 146, is listbox.h, the source file that describes the
ListBox class. The class has few data members of its own. The addmode variable
will indicate that the list box is in extended-selection mode, a feature that
I have not implemented yet. The anchorpoint and selectcount variables are for
that feature as well. The only data member in use in the current
implementation is the selection variable, which subscripts the current user
selection on the list box. Menus, which are implemented in the first version
and which are derived from the list box, use that variable. The constructors
and destructor are all inline functions. The constructors call the OpenWindow
function to initialize the list box.
Listing Two, page 146, is listbox.cpp, which contains the member functions for
the ListBox class. The OpenWindow function initializes the data members and
calls the base TextBox class's OpenWindow function. A list box differs from a
text box in that the list box has a selection cursor bar. When the user
scrolls a list box up and down, the selection bar moves with the scrolling
action. The ListBox class must override the TextBox's keyboard and scrolling
messages to accommodate this feature. In addition, there are SetSelection and
ClearSelection public member functions to allow the using program to modify
the selected entry in the list box. The paint, scrolling, and paging messages
all use the TextBox class's messages and then augment them by repainting the
selected line if it is in view. The selected line must display in highlighted
colors so that the user can distinguish it from other entries in the list.
The ListBox class intercepts mouse messages to know when the user has changed
the selection or chosen one. The user "selects" a listbox entry by a single
click or by moving the selection bar to the entry with the keyboard arrow
keys. In this context, "selecting" means pointing. No action is taken until
the user "chooses" the selection by pressing the Enter key or double-clicking
the selection. These terms come from the Windows and CUA lexicon. Pop-down
menus behave differently, so even though they derive from the list box, they
will need to override some of the list box's behavior. I'll discuss them in a
later column.
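The select/choose distinction can be modeled apart from DF++ in a few lines. This hypothetical miniature (my own names, not library code) shows the essential contract: selecting only moves the highlight, and nothing happens until the user chooses.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Minimal model of the CUA select/choose distinction: Select() points
// at an entry; Choose() acts on whatever is currently selected.
class MiniList {
    std::vector<std::string> items;
    int selection;
    std::string chosen;           // records the last chosen entry
public:
    MiniList(const std::vector<std::string> &it)
        : items(it), selection(-1) { }
    void Select(int i)            // arrow key or single click
        { if (i >= 0 && i < (int)items.size()) selection = i; }
    void Choose()                 // Enter key or double click
        { if (selection != -1) chosen = items[selection]; }
    int Selection() const { return selection; }
    std::string Chosen() const { return chosen; }
};
```

Moving the selection bar up and down fires `Select()` repeatedly at no cost; only `Choose()` commits the user to an action, which is why the two are kept as separate virtual functions in the real class.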


Buttons


Users have different kinds of buttons with which to specify options and
selections. There is a tendency in user interfaces to make the screen look
like things more familiar to users. So, we have buttons, because our home
appliances have buttons, and people know how to press them. Lyndon Johnson
worried about the grave responsibility of a president who could "mash" a
button and destroy the world. The command button is a familiar device on
dialog boxes. Most of them have OK and Cancel command buttons, and users are
accustomed to "mashing" them to accept or reject the dialog's content. Check
boxes record the user's selection with respect to simple binary decisions.
Either the option is enabled or not. (Toggled menu commands perform a similar
function, and the designer is often unsure about which one to use.) Radio
buttons resemble the radio buttons on an old car. They come in groups and when
you press one, the others pop out. They represent mutually exclusive choices
within a group.


Command Buttons


Listing Three, page 146, and Listing Four, page 147, are pbutton.h and
pbutton.cpp, the source files that define the PushButton class, the class that
implements command buttons. A command button has an executable function
associated with it. The function carries out the command. When the user
presses the button, the function executes. The class, therefore, has three
data members, a Boolean variable that indicates that the button is pressed,
the window handle of the window that will receive the message, and the address
of the function that will execute. The function is a member function of the
DFWindow class. Usually it will be a member function of the application window
class that you derive from the Application class, discussed last month. The
SetButtonFunction member function of the Button class will initialize the
window and function addresses. Command execution occurs when the user clicks
the button or selects it (tabs to it) with the keyboard and presses the Enter
key. The execution occurs when the user releases the mouse button or Enter
key. While the mouse button or Enter key is down, the class paints the button
in a depressed state. When the user releases the key or button, the class
executes the function if the window and function pointers are initialized. The
member functions that manage the user's selection and choice and the Paint
member function are virtual functions so that you can derive new command
buttons with different formats.


Generic Buttons


Listing Five and Listing Six, page 148, are button.h and button.cpp, the
source files that implement a generic Button base class from which the
RadioButton and CheckBox classes are derived. The Button class is derived from
the TextBox class and adds only one data member, the setting variable, which
records if the button object is set on or off. The constructor and destructor
are protected, so the Button class is an abstract base class, which means that
you cannot instantiate an object of its type. You must instantiate one of its
derived types instead. The Button class defines keyboard and mouse processes
for its derived classes and paints the button and its label.


Radio Buttons



Listing Seven and Listing Eight, page 148, are radio.h and radio.cpp, the
source files that implement radio buttons. Radio buttons come in groups, and
only one button of the group may be set on at any time. Therefore, if you set
one button on, the software must set the others off. The first matter to
resolve is how to assign radio buttons to a group. A dialog box might have
more than one group of radio buttons, so the grouping must be based on
something other than the common parent of the button objects. When the user
selects a radio button, the software needs to know which other buttons to turn
off. I developed a technique for D-Flat that works well. Radio buttons are
grouped if they share the same window X coordinate and if there are no blank
lines between them. So far, that has not been a serious restriction, so I
decided to keep it in DF++. The program begins by building a table of all the
radio buttons that have the same parent window and the same left screen
coordinate as the one being selected. Then it purges the ones that are not in
a contiguous group in the Y coordinate. Next it turns off all the radio
buttons. Finally, it turns on the one being selected.
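Stripped of the window machinery, the grouping rule can be restated as a stand-alone sketch (hypothetical names; the listing that follows is the real implementation): buttons belong to the pressed button's group when they share its x coordinate and form an unbroken run of y values containing it.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Btn { int x, y; bool on; };

// Press the radio button at index 'target': release every button in
// its group (same column, contiguous rows), then set the target on.
void pressRadio(std::vector<Btn> &btns, int target)
{
    // collect the row numbers that share the target's column
    std::vector<int> ys;
    for (size_t i = 0; i < btns.size(); ++i)
        if (btns[i].x == btns[target].x)
            ys.push_back(btns[i].y);
    // grow the contiguous run of rows around the target's row
    int lo = btns[target].y, hi = btns[target].y;
    while (std::find(ys.begin(), ys.end(), lo - 1) != ys.end())
        --lo;
    while (std::find(ys.begin(), ys.end(), hi + 1) != ys.end())
        ++hi;
    // release everything in the group, then set the chosen button
    for (size_t i = 0; i < btns.size(); ++i)
        if (btns[i].x == btns[target].x &&
                btns[i].y >= lo && btns[i].y <= hi)
            btns[i].on = false;
    btns[target].on = true;
}
```

A blank line between two buttons breaks the run, so a second group in the same column (below the gap) is left untouched when a button in the first group is pressed.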
The base Button class's constructor builds a string with open and close
parentheses characters followed by the button's label, which is the default
display for a radio button. A radio button will display with text like this:
 ( ) Option description
If you put a tilde (~) in the label, the letter that follows it will display in
a highlighted font and the tilde will not display. This allows you to identify a
shortcut key for the radio button. When the button is selected, the program
modifies the text to include a bullet character (ASCII 7) between the paren
characters. The character comes from the setchar data member, which the radio
button's constructor provides.
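The tilde convention amounts to a small string transformation. This sketch (a helper of my own, not DF++ code, assuming the tilde marks the letter after it) shows the label cleanup and shortcut extraction in isolation.

```cpp
#include <cassert>
#include <string>

// Strip the tilde from a label and report which character should get
// the shortcut highlight; the tilde itself is never displayed.
std::string stripTilde(const std::string &label, char &shortcut)
{
    std::string out;
    shortcut = 0;
    for (size_t i = 0; i < label.size(); ++i) {
        if (label[i] == '~' && i + 1 < label.size()) {
            shortcut = label[i + 1];   // next letter is the shortcut
            continue;                  // drop the tilde from output
        }
        out += label[i];
    }
    return out;
}
```

So "E~xit" displays as "Exit" with the x highlighted as the shortcut key.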


Check Boxes


Listing Nine and Listing Ten, page 149, are checkbox.h and checkbox.cpp, the
source files that implement the CheckBox class. The class definition looks
just like that of the RadioButton class. The difference is in the behavior as
defined by the member functions. The constructor builds a check box's text
display with brackets rather than parentheses characters so that the check box
looks like this:
 [ ] Option description
A tilde in the label has the same effect as it does for a radio button. The
check box displays the X character inside its brackets when it is selected. I
have a baseball cap with a selected checkbox on the front. I thought it was a
Dan Quayle signature campaign cap, and I still haven't figured out why the
guys down at the Moose Lodge glare at me when I wear it.


How to Get the Source Code


D-Flat++ is still in its preliminary version. I am writing this column in
December and have just uploaded the first version. It does not include a full
CUA package, but there is enough of an implementation to give you an idea of
how it will work and how it will differ from D-Flat. You can download DF++
from the CompuServe DDJ forum or from M&T Online. You can also get it by
sending a stamped, self-addressed diskette mailer and a formatted diskette to
me at Dr. Dobb's Journal, 411 Borel Ave., San Mateo, CA 94402. The software is
free, but if you wish, include a dollar for my Careware charity, the Brevard
County Food Bank.


Bloated Commentary


John Dvorak, columnist, bon vivant, raconteur, and windbag at large, recently
commented about a new C++ compiler for supercomputers from Cray Research,
saying with typical bombast, "I'm glad to see that bloated code is now
universal." As a rule, PC Magazine, where Dvorak's column appears, has good
programming commentary, featuring the works of Duncan, Petzold, Prosise,
Schulman, and other esteemed technical superstars. But not all PC Mag
columnists are as qualified to discuss programming issues. Most of them stick
judiciously to their own turf, but Dvorak is not so disciplined, and his
remark evinces a serious lack of understanding about programmers' concerns. He
compounds the evidence of his ignorance when he says in a subsequent column
that anyone who uses the phrase "paradigm shift" probably does not know what
they are talking about. My guess is that Dvorak himself does not know what
they are talking about and, therefore, dismisses as inconsequential and
invalid that which he does not understand.
All that notwithstanding, the guy writes an entertaining, irreverent, and
controversial column and has devoted followers who, unless otherwise informed,
might believe that he always speaks with well-founded authority. You should be
wary when nonprogramming writers make casual remarks about programming; they
would do better to stay with operating systems, applications, and the business
activities of the industry and leave programming issues to those who
understand them.


InterNetworking


Networks fascinate me. I couldn't get by without e-mail. For some time I've
gotten messages that come by way of Internet. I wasn't really sure how
Internet worked and how I could get into it except through a CompuServe
gateway, but it seemed as if it should have more to offer than just mail. I
looked for books about it and only just recently found one. It's called The
Internet Companion, by Tracy LaQuey and Jeanne Ryer (Addison-Wesley, 1992). It
does a nice job of explaining what Internet is, where it came from, how you
get onto it, and how you use it. But the really interesting part is that the
foreword is by Senator Al Gore, written in August 1992. Besides making me
wonder how an author can get a vice-presidential candidate to write a
foreword, this event is news. Read the foreword and you realize that a single
heartbeat away from the presidency is a man who understands some of what we
do. I leave it to you to decide whether that's good news or bad news.

_C PROGRAMMING COLUMN_
by Al Stevens


[LISTING ONE]

// -------- listbox.h

#ifndef LISTBOX_H
#define LISTBOX_H

#include "textbox.h"

const int LISTSELECTOR = 4; // selected list box entry
class ListBox : public TextBox {
 Bool addmode; // adding extended selections mode
 int anchorpoint; // anchor point for extended selections
 int selectcount; // count of selected items
 virtual void SetColors();
protected:
 int selection; // current selection
 virtual void ClearSelection();
public:
 ListBox(char *ttl, int lf, int tp, int ht, int wd,
 DFWindow *par) : TextBox(ttl, lf, tp, ht, wd, par)

 { OpenWindow(); }
 ListBox(char *ttl, int ht, int wd, DFWindow *par)
 : TextBox(ttl, ht, wd, par)
 { OpenWindow(); }
 ListBox(int lf, int tp, int ht, int wd, DFWindow *par)
 : TextBox(lf, tp, ht, wd, par)
 { OpenWindow(); }
 ListBox(int ht, int wd, DFWindow *par) : TextBox(ht,wd,par)
 { OpenWindow(); }
 ListBox(char *ttl) : TextBox(ttl)
 { OpenWindow(); }
 virtual ~ListBox()
 { if (windowstate != CLOSED) CloseWindow(); }
 // -------- listbox API messages
 virtual void OpenWindow();
 virtual void CloseWindow();
 virtual void Paint();
 virtual void Keyboard(int key);
 virtual void SetSelection(int sel);
 virtual void ButtonReleased(int mx, int my);
 virtual void LeftButton(int mx, int my);
 virtual void DoubleClick(int mx, int my);
 virtual void Choose();
 virtual void ScrollUp();
 virtual void ScrollDown();
 virtual void ScrollRight();
 virtual void ScrollLeft();
 virtual void PageUp();
 virtual void PageDown();
 virtual void PageRight();
 virtual void PageLeft();
};
#endif








[LISTING TWO]

// ------------ listbox.cpp
#include "listbox.h"
#include "keyboard.h"

// ----------- common constructor code
void ListBox::OpenWindow()
{
 windowtype = ListboxWindow;
 if (windowstate == CLOSED)
 TextBox::OpenWindow();
 selection = -1;
 addmode = False;
 anchorpoint = -1;
 selectcount = 0;
 SetColors();
}

// -------- set the fg/bg colors for the window
void ListBox::SetColors()
{
 colors.fg = YELLOW;
 colors.bg = BLUE;
 colors.sfg = LIGHTGRAY;
 colors.sbg = BLACK;
 colors.ffg = LIGHTGRAY;
 colors.fbg = BLUE;
 colors.hfg = BLACK;
 colors.hbg = LIGHTGRAY;
}
void ListBox::CloseWindow()
{
 TextBox::CloseWindow();
}
void ListBox::ClearSelection()
{
 if (selection != -1)
 WriteTextLine(selection, colors.fg, colors.bg);
}
void ListBox::SetSelection(int sel)
{
 ClearSelection();
 if (sel >= 0 && sel < wlines) {
 selection = sel;
 WriteTextLine(sel, colors.sfg, colors.sbg);
 }
}
void ListBox::Keyboard(int key)
{
 int sel = selection; // (ClearSelection changes selection)
 switch (key) {
 case UP:
 if (sel > 0) {
 ClearSelection();
 if (sel == wtop)
 ScrollDown();
 SetSelection(sel-1);
 }
 return;
 case DN:
 if (sel < wlines-1) {
 ClearSelection();
 if (sel == wtop+ClientHeight()-1)
 ScrollUp();
 SetSelection(sel+1);
 }
 return;
 case '\r':
 Choose();
 return;
 default:
 break;
 }
 TextBox::Keyboard(key);
}
// ---------- paint the listbox
void ListBox::Paint()

{
 TextBox::Paint();
 if (text != NULL)
 WriteTextLine(selection, colors.sfg, colors.sbg);
}
void ListBox::ScrollUp()
{
 TextBox::ScrollUp();
 WriteTextLine(selection, colors.sfg, colors.sbg);
}
void ListBox::ScrollDown()
{
 TextBox::ScrollDown();
 WriteTextLine(selection, colors.sfg, colors.sbg);
}
void ListBox::ScrollRight()
{
 TextBox::ScrollRight();
 WriteTextLine(selection, colors.sfg, colors.sbg);
}
void ListBox::ScrollLeft()
{
 TextBox::ScrollLeft();
 WriteTextLine(selection, colors.sfg, colors.sbg);
}
void ListBox::PageUp()
{
 TextBox::PageUp();
 WriteTextLine(selection, colors.sfg, colors.sbg);
}
void ListBox::PageDown()
{
 TextBox::PageDown();
 WriteTextLine(selection, colors.sfg, colors.sbg);
}
void ListBox::PageRight()
{
 TextBox::PageRight();
 WriteTextLine(selection, colors.sfg, colors.sbg);
}
void ListBox::PageLeft()
{
 TextBox::PageLeft();
 WriteTextLine(selection, colors.sfg, colors.sbg);
}
// ---------- Left mouse button was clicked
void ListBox::LeftButton(int mx, int my)
{
 if (my != prevmouseline) {
 if (ClientRect().Inside(mx, my)) {
 int y = my - ClientTop();
 if (wlines && y < wlines-wtop)
 SetSelection(wtop+y);
 }
 }
 DFWindow::LeftButton(mx, my);
}
void ListBox::DoubleClick(int mx, int my)
{

 if (ClientRect().Inside(mx, my)) {
 my -= ClientTop();
 if (wlines && my < wlines-wtop)
 Choose();
 }
 DFWindow::DoubleClick(mx, my);
}
void ListBox::ButtonReleased(int, int)
{
 prevmouseline = -1;
}
void ListBox::Choose()
{
 // --- does nothing yet
}







[LISTING THREE]


// ----------- pbutton.h
#ifndef PBUTTON_H
#define PBUTTON_H

#include "textbox.h"
class PushButton : public TextBox {
 virtual void SetColors();
 Bool pressed;
 DFWindow *owner; // window that gets the command
 void (DFWindow::*cmdfunction)(); // selection function
public:
 PushButton(char *lbl, int lf, int tp, DFWindow *par);
 virtual ~PushButton()
 { if (windowstate != CLOSED) CloseWindow(); }
 // -------- API messages
 virtual void OpenWindow();
 virtual void CloseWindow();
 virtual Bool SetFocus();
 virtual void ResetFocus();
 virtual void Paint();
 virtual void Shadow();
 virtual void Keyboard(int key);
 virtual void LeftButton(int mx, int my);
 virtual void ButtonReleased(int mx, int my);
 virtual void MouseMoved(int mx, int my);
 virtual void KeyReleased();
 virtual void PressButton();
 virtual void ReleaseButton();
 virtual void ButtonCommand();
 void SetButtonFunction(DFWindow *wnd,
 void (DFWindow::*cmdfunc)())
 { owner = wnd; cmdfunction = cmdfunc; }
};
#endif







[LISTING FOUR]

// ------------- pbutton.cpp
#include "pbutton.h"
#include "desktop.h"

PushButton::PushButton(char *lbl, int lf, int tp, DFWindow *par)
 : TextBox(lf, tp, 1, strlen(lbl)+2, par)
{
 OpenWindow();
 String lb(" ");
 lb += lbl;
 lb += " ";
 SetText(lb);
}
// ----------- common constructor code
void PushButton::OpenWindow()
{
 windowtype = PushButtonWindow;
 if (windowstate == CLOSED)
 TextBox::OpenWindow();
 SetColors();
 pressed = False;
 cmdfunction = NULL;
}
void PushButton::CloseWindow()
{
 TextBox::CloseWindow();
}
// -------- set the fg/bg colors for the window
void PushButton::SetColors()
{
 colors.fg = BLACK;
 colors.bg = CYAN;
 colors.sfg = WHITE;
 colors.sbg = CYAN;
 colors.ffg = BLACK;
 colors.fbg = CYAN;
 colors.hfg = DARKGRAY; // Inactive FG
 colors.hbg = CYAN; // Inactive BG
 shortcutfg = RED;
}
void PushButton::Paint()
{
 if (visible) {
 COLORS fg;
 COLORS bg;
 if (!pressed) {
 if (isEnabled()) {
 if (this == desktop.InFocus())
 fg = colors.sfg;
 else
 fg = colors.fg;

 WriteShortcutLine(0, fg, colors.bg);
 }
 else
 WriteTextLine(0, colors.hfg, colors.hbg);
 }
 else {
 // ---- display a pressed button
 fg = ClientBG();
 bg = Parent()->ClientBG();
 int wd = Width();
 WriteWindowChar(' ', 0, 0, fg, bg);
 for (int x = 0; x < wd; x++) {
 WriteWindowChar(220, x+1, 0, fg, bg);
 WriteWindowChar(223, x+1, 1, fg, bg);
 }
 }
 }
}
void PushButton::Shadow()
{
 if (visible && (attrib & SHADOW)) {
 COLORS bg = Parent()->ClientBG();
 int wd = Width();
 WriteWindowChar(220, wd, 0, BLACK, bg);
 for (int x = 1; x <= wd; x++)
 WriteWindowChar(223, x, 1, BLACK, bg);
 }
}
Bool PushButton::SetFocus()
{
 TextBox::SetFocus();
 Paint();
 return True;
}
void PushButton::ResetFocus()
{
 TextBox::ResetFocus();
 Paint();
}
void PushButton::Keyboard(int key)
{
 if (key == '\r')
 PressButton();
 else
 TextBox::Keyboard(key);
}
void PushButton::LeftButton(int mx, int my)
{
 if (ClientRect().Inside(mx,my)) {
 PressButton();
 CaptureFocus();
 }
}
void PushButton::MouseMoved(int mx, int my)
{
 if (desktop.FocusCapture() == this)
 if (ClientRect().Inside(mx,my))
 PressButton();
 else

 ReleaseButton();
}
void PushButton::ButtonReleased(int, int)
{
 ReleaseFocus();
 ButtonCommand();
}
void PushButton::KeyReleased()
{
 ButtonCommand();
}
void PushButton::ButtonCommand()
{
 if (pressed) {
 ReleaseButton();
 if (cmdfunction != NULL && owner != NULL)
 (owner->*cmdfunction)();
 }
}
void PushButton::PressButton()
{
 if (!pressed) {
 pressed = True;
 Paint();
 }
}
void PushButton::ReleaseButton()
{
 if (pressed) {
 pressed = False;
 Paint();
 Shadow();
 }
}







[LISTING FIVE]

// ----------- button.h
#ifndef BUTTON_H
#define BUTTON_H

#include "textbox.h"
class Button : public TextBox {
 virtual void SetColors();
 Bool setting;
protected:
 char setchar;
 Button(char *lbl, int lf, int tp, DFWindow *par);
 virtual ~Button()
 { if (windowstate != CLOSED) CloseWindow(); }
public:
 // -------- API messages
 virtual void OpenWindow();

 virtual void CloseWindow();
 virtual Bool SetFocus();
 virtual void ResetFocus();
 virtual void Paint();
 virtual void Keyboard(int key);
 virtual void LeftButton(int mx, int my);
 virtual void InvertButton();
 virtual void PushButton();
 virtual void ReleaseButton();
 Bool Setting() { return setting; }
};
#endif





[LISTING SIX]

// ------------- button.cpp
#include "button.h"
#include "desktop.h"
Button::Button(char *lbl, int lf, int tp, DFWindow *par)
 : TextBox(lf, tp, 1, strlen(lbl)+5, par)
{
 OpenWindow();
 String lb("( ) ");
 lb += lbl;
 lb += " ";
 SetText(lb);
}
// ----------- common constructor code
void Button::OpenWindow()
{
 if (windowstate == CLOSED)
 TextBox::OpenWindow();
 SetColors();
 setting = False;
 setchar = ' ';
}
void Button::CloseWindow()
{
 TextBox::CloseWindow();
}
// -------- set the fg/bg colors for the window
void Button::SetColors()
{
 colors.fg =
 colors.sfg =
 colors.ffg =
 colors.hfg = Parent()->ClientFG();
 colors.bg =
 colors.sbg =
 colors.fbg =
 colors.hbg = Parent()->ClientBG();
 shortcutfg = RED;
}
void Button::Paint()
{

 if (visible) {
 (*text)[1] = setting ? setchar : ' ';
 if (isEnabled())
 WriteShortcutLine(0, colors.fg, colors.bg);
 else
 WriteTextLine(0, colors.hfg, colors.hbg);
 }
}
Bool Button::SetFocus()
{
 TextBox::SetFocus();
 desktop.cursor().normalcursor();
 desktop.cursor().SetPosition(Left()+1, Top());
 desktop.cursor().Show();
 return True;
}
void Button::ResetFocus()
{
 TextBox::ResetFocus();
 desktop.cursor().Hide();
}
void Button::Keyboard(int key)
{
 if (key == ' ')
 InvertButton();
 else
 TextBox::Keyboard(key);
}
void Button::LeftButton(int mx, int my)
{
 if (ClientRect().Inside(mx,my))
 InvertButton();
}
void Button::InvertButton()
{
 if (setting)
 ReleaseButton();
 else
 PushButton();
}
void Button::PushButton()
{
 setting = True;
 Paint();
}
void Button::ReleaseButton()
{
 setting = False;
 Paint();
}





[LISTING SEVEN]

// ----------- radio.h
#ifndef RADIO_H

#define RADIO_H

#include "button.h"
class RadioButton : public Button {
public:
 RadioButton(char *lbl, int lf, int tp, DFWindow *par);
 virtual ~RadioButton()
 { if (windowstate != CLOSED) CloseWindow(); }
 // -------- API messages
 virtual void InvertButton();
 virtual void PushButton();
 virtual void OpenWindow();
};
#endif






[LISTING EIGHT]

// ------------- radio.cpp
#include "radio.h"
#include "desktop.h"

RadioButton::RadioButton(char *lbl, int lf, int tp,
        DFWindow *par) : Button(lbl, lf, tp, par)
{
    OpenWindow();
    setchar = 7;
}
void RadioButton::OpenWindow()
{
    if (windowstate == CLOSED)
        Button::OpenWindow();
    windowtype = RadioButtonWindow;
}
void RadioButton::InvertButton()
{
    PushButton();
}
void RadioButton::PushButton()
{
    int i;
    int ht = desktop.screen().Height();
    DFWindow **rd = new DFWindow *[ht];
    for (i = 0; i < ht; i++)
        rd[i] = NULL;
    // - build a table of radio buttons at the same x coordinate
    DFWindow *sib = Parent()->First();
    while (sib != NULL) {
        if (sib->WindowType() == RadioButtonWindow) {
            if (sib->Left() == Left()) {
                int tp = sib->Top();
                if (tp < ht)
                    rd[tp] = sib;
            }
        }
        sib = sib->Next();
    }
    // ----- find the start of the radiobutton group
    i = Top();
    while (i >= 0 && rd[i] != NULL)
        --i;
    // ---- ignore everything before the group
    while (i >= 0)
        rd[i--] = NULL;
    // ----- find the end of the radiobutton group
    i = Top();
    while (i < ht && rd[i] != NULL)
        i++;
    // ---- ignore everything past the group
    while (i < ht)
        rd[i++] = NULL;
    // ------ release all the radio buttons in the group
    for (i = 0; i < ht; i++)
        if (rd[i] != NULL)
            ((RadioButton *)rd[i])->ReleaseButton();
    delete [] rd;
    // ----- set the chosen radio button
    Button::PushButton();
}





[LISTING NINE]

// ----------- checkbox.h
#ifndef CHECKBOX_H
#define CHECKBOX_H

#include "button.h"
class CheckBox : public Button {
public:
    CheckBox(char *lbl, int lf, int tp, DFWindow *par);
    virtual ~CheckBox()
        { if (windowstate != CLOSED) CloseWindow(); }
    // -------- API messages
    virtual void OpenWindow();
};
#endif





[LISTING TEN]

// ------------- checkbox.cpp
#include "checkbox.h"
#include "desktop.h"
CheckBox::CheckBox(char *lbl, int lf, int tp, DFWindow *par) :
        Button(lbl, lf, tp, par)
{
    OpenWindow();
    setchar = 'X';
    (*text)[0] = '[';
    (*text)[2] = ']';
}
// ----------- common constructor code
void CheckBox::OpenWindow()
{
    if (windowstate == CLOSED)
        Button::OpenWindow();
    windowtype = CheckBoxWindow;
}






March, 1993
STRUCTURED PROGRAMMING


Action at a Distance




Jeff Duntemann KG7JF


I have discovered a Great and Terrible Truth: New cars just aren't worth it
anymore. It doesn't matter if they're produced here, in Japan, or in Outer
Brungaria--what you get is in no way a fair return for the money you offer.
Having spent close to a year poking around the auto industry, kicking tires
and slamming doors, I've come away with a feeling of terrible disgust. A
$10,000 car (what few remain) is a plastic toy, built so shoddily that I can
only imagine them coming to pieces with virtually no resale value inside of
five years. To get what I consider a car, you now have to lay out $25,000 or
more--a sum I refuse to spend on a device that one drunk in an old Ford can
destroy utterly in 200 milliseconds flat.
I've hesitated to speak my suspicions until now; that is, until Shakespeare
rolled out of Gary Turney's body shop, gleaming like a brand new car, minus
dents and plus a new coat of blinding white paint, with rechromed bumpers so
shiny you can comb your hair in them. Another week at the reupholstery shop
gave him brand new seats, a new headliner, new door panels, new armrests, and
new electric-blue carpeting. With a start I realized that all that new vinyl
had given Shakespeare a reborn dose of new-car smell.
Total cost on the Shakespeare project so far: $5700. For that I've gotten a
car that looks new, smells new, roars like cars haven't roared since the
Sixties--and is made of enough metal so that I have a fighting chance in a
crackup against that drunk in the old Ford. I figure buying anything close to
that quality in a '93 would cost $30,000.
It's true that there was a certain amount of effort involved. I made a few
trips down to Dagley's Junkyard, searching out parts, and spent some time on
the phone getting quotes on various jobs I didn't feel like fooling with
myself. But I was amazed at the support network out there; there's even an
outfit in Phoenix called Arizona Classic Chevelle Parts. You can buy almost
anything for a Chevelle, either leftover GM stock or brand-new repro parts. I
did buy a lot of parts and did lay out some money for work--the car originally
cost me just $1800--but the end result is eminently worth the time and the
money.
If you can't stand the thought of paying more for a car than some people still
pay for a house, well, then support the other American auto industry, the one
Detroit doesn't want you to know about: The people who recycle old cars and
bring them back to life, often better than they were originally, and always
for a fraction of what a similar vehicle would cost today.


Between Building and Buying


What makes software expensive is the same thing that makes cars expensive:
time. The more people-hours spent on a product, the more that product will
have to cost to make money in a given market. Time is now more expensive than
it has ever been, drastically so. What this means for the auto industry is
that a lot more people are going to be buying used cars. What it means for the
software industry is that fewer and fewer products are going to be built "from
scratch." The days when you could afford to create every single line of a
major application in C or Pascal are mostly past, not only because time is
expensive, but also because users' expectations have gone through the roof in
recent years. A word processor now has to be a page-layout program, too--as
unnecessary as this often is. (I still use WordPerfect 4.2 on my laptop. I'd
use it in the office, too, if it weren't so hard to teach to my staff.)
Between buying a finished application in a can and writing it all yourself
there is a lot of room for hooking together commercial subsystems with "glue"
code of your own creation. This is now easier than it used to be, thanks to a
growing family of well-defined application programming interfaces (APIs) that
define how your code should communicate with a commercially purchased
subsystem.
One of the very best APIs is the Windows spec for dynamic link libraries
(DLLs). DLLs remove language dependency from library calls. The DLL API
dictates the details of how a language calls its code, rather than leaving
such details to the language itself. This allows a Windows DLL to be called
from any language at all (assuming the language bothers to pay attention to
the API), and gives a vendor of DLLs a much larger market than ever before.
The same binary DLL can be called from Borland C, Turbo Pascal for Windows,
Visual Basic, or Smalltalk/V Windows. Small language vendors can now count on
a ready-made, third-party aftermarket immediately upon releasing their
language to the public.
Other emerging APIs define calls to sound boards (MCI) and access to extended
memory (DPMI). The one that has fascinated me most, however, concerns database
management and the use of database engines in commercial applications.


Separating the Interface from the Database


Most people don't think of it in quite these terms, but the ascendance of
standard APIs has made possible a remarkable concept called client-server
database management. It has been a mainframer's concept most of its short
life, but there's nothing about it that demands a mainframe. I've been doing
it on my own desktop machine for a little while now, and it works well there,
even if much of the concept's power is wasted.
From a height, this is the client-server idea: You take a cold chisel and
split your database-oriented application right down the middle. The parts of
the application specific to it (the menus, dialogs, help screens, and so on)
stay in one chunk, and the general-purpose data-management code goes into the
other chunk. The application-specific chunk handles interaction with the user,
while the data-management code stores the data, performs searches and queries,
and ensures that all one-to-many relationships among data files remain intact
and valid.
You might say this reminds you of things like the Paradox Engine, B-Tree
Filer, and other database engines of the sort that you link into your
application to handle data management. Well, yes--except that in a
client-server system you don't link the two pieces together. The application
and the database engine are not part of a single .EXE file. They don't have to
be in the same subdirectory. They don't even have to be on the same machine.
Now you're talking client-server.
In the client-server lexicon, your application is the client. The client
application can be anywhere, as long as it is connected somehow to the machine
that houses the database server. The connection could be a T1 mainframe link,
an Ethernet-style local area network, or just a serial cable between two
machines. Furthermore, nothing says that the client and server must be on
separate machines. One of the most common links between clients and servers is
the Windows DLL interface. (This is, in fact, the one I have tested most
thoroughly myself.) What matters is not the nature of the connection, but the
fact that the client and the server both speak the same language; that is,
both understand the API that lies in common between them.


Enter SQL


There are a number of client-server APIs, most of which originated in
mainframeland, or at least in places where networking is king. The usefulness
of such APIs is directly proportional to how widely they are adopted, which in
nearly all cases is not especially widely. Mainframe vendors have always been
prone to the deadly NIH virus, for which there is no cure but extinction--and
most would rather die than adopt an API that somebody else concocted.
Only one client-server API has come into anything that I'd call "general use":
IBM's Structured Query Language (SQL), which first appeared in the early '80s,
and is still evolving. IBM has been good enough not to make silly ownership
claims to the specification a la Ashton-Tate, and in consequence SQL is
spreading rapidly through the database community as the lingua franca of
client-server.
SQL is about as high-level as any API is ever going to get. It says nothing
about calling conventions, stacks, or big-endian vs. little-endian conflicts.
It is strictly a group of commands written as simple, English-like, 7-bit
ASCII text. The client passes those commands across the connection to the
server, which obeys them and passes back data to the client in response to the
commands.
One interesting wrinkle here is that the "client" doesn't have to be an
application. Because SQL is strictly text, and tolerably English-like, the
client can be a human being typing at a terminal. This is analogous to working
with DOS through Windows: Windows ordinarily handles the somewhat cryptic DOS
command language for the user, but when the user really needs to, he or she
can tickle the DOS-prompt icon, go out to the DOS prompt, and type commands
directly to DOS.
Most use of SQL, however, is in the form of embedded SQL, a series of SQL
commands embedded into the source code of some other host language like Basic,
C, Pascal, or (God knows) Cobol. The application mediates between the end user
and the SQL syntax and passes along the end user's wishes to the database
server in the form of SQL commands.
I've sketched this out in Figure 1. Your application is Here, and the actual
database-manager code is There. The gap between them is crossed by some
connection or another; as I said before, it doesn't matter what the nature of
that connection actually is. SQL commands are issued by your application,
cross the gap, and are interpreted by the database manager's SQL-command
interface. These commands typically set up a database query of some sort,
which the database manager executes. The results of the query, usually rows
from a table, are sent back over the gap to your application.
Keep in mind that SQL is an API. It's an interface language, not something
that compiles to independent native code files. The database manager simply
has a module that interprets SQL statements as they come in and responds
appropriately. The database manager often has its own proprietary command
interface, or it may support still other command interfaces like Xbase.
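The round trip across the gap can be sketched in miniature. The following Python sketch (an illustration only; the article's discussion is language-neutral) uses an in-process SQLite database to stand in for the server. The `send_sql` function and the table contents are invented for the example; the point is that the client composes plain SQL text, hands it across the "gap" (here just a function call), and receives rows back.

```python
import sqlite3

# The "server": owns the data and interprets SQL text sent by clients.
server = sqlite3.connect(":memory:")
server.execute("CREATE TABLE ConBase (ConID INTEGER PRIMARY KEY, LName TEXT)")
server.executemany("INSERT INTO ConBase VALUES (?, ?)",
                   [(1, "Erickson"), (2, "Duntemann")])

def send_sql(command):
    """The 'gap': SQL text goes across; rows come back."""
    return server.execute(command).fetchall()

# The "client": knows nothing about storage, only the SQL it speaks.
rows = send_sql("SELECT LName FROM ConBase WHERE ConID = 2")
```

In a real client-server system the function call would be replaced by a network link or a DLL interface, but the shape of the exchange is the same.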


Little Action Here, Big Action There


That's how most people envision client-server database management operating,
and that's the way it has usually been done, up until now. The client
application has generally been some simple user-interface front end without a
great deal of power or intelligence; hardly more than a smart terminal.
You can take it further than that. This past November, Microsoft introduced
its first home-grown database manager, Access, at Comdex/Fall, with an interesting
feature: In addition to being a fully relational database manager with its own
internal command language (a dialect of Basic), it can also act as a SQL front
end. I've drawn this one out in Figure 2.
Access can send SQL commands to a remote database manager on a big system, and
bring home a subset of the big system's database as one or more query results.
The Access user can then work with the local tables (retrieved through SQL
commands) by using Access's own language and macros.
This all resolves to the opportunity to do all pertinent data management on
the desktop, while leaving the big system to do whatever data management must
be done on all company data. If a company has local offices in 14 states, each
state office can have its own 486 with a copy of Access to do its local data
management at home, close to the people who do the work and need the results.
The big machine back at CHQ still contains all the data from all local
offices, and can do company-wide queries and reports as needed. Someone (I
forget who) coined the term "rightsizing" to cover systems like this.
(I'm not implying, by the way, that Microsoft Access is the first or the only
database manager to be able to do this. It's only the first in my experience.
There are probably others, and there will be more in the near future.)



Learning SQL


I think it's a good idea to learn SQL even if you aren't in a shop that has to
bridge the gap between the big systems and the important systems. The gap
between client and server can be as small as the gap across a function call
and in such a situation, using a SQL database manager is pretty much the same
thing as linking in a database library like the Paradox Engine or B-Tree
Filer. That's pretty much what I've been doing, and the process has been
delightful.
I've been using a product called Ocelot2 (whose full name is Ocelot2--the
SQL!, including the exclamation point) from Ocelot Computer Services in
Edmonton, Alberta. The same box gives you a Windows DLL and a DOS linkable
library. It's not what I call cheap ($700.00), but SQL products are generally
expensive, and it's less expensive than many I've seen.
One good thing about SQL's being a strong standard is that numerous books have
been written on it, nearly all of which are better than Ocelot's somewhat lame
documentation. Any sizable bookstore will offer you a number of books on SQL,
and most are at least readable. (One to avoid at all costs, however, is SQL
Structured Query Language, by Dr. Carolyn Hirsch and Dr. Jack L. Hirsch,
published by Windcrest/McGraw-Hill. This has the dubious honor of being one of
the worst computer books I've read, and cements my conviction that one should
never buy computer books by people who insist on putting "Dr." in front of
their names--especially when neither doctorate has anything to do with
computer science.) I learned SQL by skimming a few books and then just hacking
around interactively with the Ocelot2 SQL back end, through a simple "SQL
terminal" application provided with the Ocelot2 product. The terminal simply
allows you to type in a SQL command and then transfer it to the back-end
database manager. Any responses from the database manager are displayed for
examination.
It was a lot of fun. The Ocelot2 product is solid and fast, and I do recommend
it. The documentation should be rewritten and reprinted by the time you read
this.


The Structure of a SQL Command


SQL commands have a relatively simple underlying structure that is awesomely
cluttered with qualifiers. Its data-management power is in the qualifiers--but
its advantage for learning is that you can shovel away the qualifiers and see
the bones of the language in the form of simple (if not necessarily useful)
commands.
SQL commands typically begin with a SQL reserved word, followed by a string of
qualifiers, and terminated by a semicolon. Unlike Pascal's, SQL's semicolons
are terminators, not separators. Each statement must be terminated by a
semicolon, regardless of that statement's position in a sequence. Some
statements may have other statements embedded within them--hence the
"structured" in Structured Query Language. The language is not case sensitive,
but standard practice is to place all SQL identifiers and reserved words in
upper case.
Listing One (page 126) is a series of SQL CREATE commands that I used to
create a database. If you tuned in last month, you'll recall a three-table
database of contact names, locations, and phone numbers shown in last month's
Figure 3. Listing One is the SQL code it took to create that database through
Ocelot2.
The first statement creates the database, which in SQL is a named umbrella
covering all of the database's diverse components. This umbrella is called a
catalog, and it contains information summarizing the current state of the
tables and indexes comprising the database. The CREATE TABLESPACE and CREATE
INDEXSPACE commands direct SQL to store the database's various tables in a
single file called CONTABLE.TBL, and all its indexes in a single file called
CONINDEX.IND. This reduces file clutter somewhat, though it may also reduce
performance with larger files.
The CREATE TABLE statements define the individual tables and their component
fields. If a table has a primary key, the primary-key field is marked PRIMARY
KEY. If a table contains a foreign key (that is, another table's primary key)
to link it to its parent's table, that foreign key is marked by the REFERENCES
qualifier, followed by the name of the parent table. The REFERENCES qualifier
assumes that the name of the primary-key field in the parent table is the same
as the name of the foreign-key field in the child table. That is, if a field
ConID references the table ConBase, the field it references in ConBase must
also be named ConID.
Most of the field definitions should be self-explanatory. Mostly what they do
is name a field, give it a type, and then specify how large it is. NOT NULL
means that SQL must disallow a record update that leaves a NOT NULL field
empty.
I created an index for ConBase to speed queries, but indexes are optional and
the database will work well (if slowly) without them.
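Listing One itself is not reproduced here, but statements of the kind just described can be sketched. This Python/SQLite fragment is a guessed reconstruction: ConBase and ConID come from the article, while the other table and field names are assumptions, and the Ocelot-specific CREATE TABLESPACE/INDEXSPACE statements are omitted because SQLite stores everything in one file anyway.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    -- Parent table: one row per contact. ConBase/ConID appear in the
    -- article; the remaining names are illustrative guesses.
    CREATE TABLE ConBase (
        ConID     INTEGER NOT NULL PRIMARY KEY,
        FName     CHAR(15),
        LName     CHAR(20) NOT NULL,
        Specialty CHAR(20),
        Tag       CHAR(1)
    );
    -- Child table: many phone numbers per contact. The foreign-key
    -- field carries the same name as the parent's primary key, per the
    -- REFERENCES convention the article describes.
    CREATE TABLE ConPhone (
        PhoneNo CHAR(14) NOT NULL,
        ConID   INTEGER NOT NULL REFERENCES ConBase
    );
    -- Optional index to speed queries on the parent table.
    CREATE INDEX ConNameNdx ON ConBase (LName);
""")
```

Note how `REFERENCES ConBase` names only the parent table; the engine resolves it against the parent's primary key, which is why the field names must match.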


Asking Questions


Ocelot2 has a nice feature that is evidently nonstandard SQL: It can import a
properly structured ASCII comma-delimited text file into a SQL table. This
allowed me to suck in a file I had exported from Paradox 3.5 as ASCII, and
immediately begin work with a 500-record database. That sure beat typing in
beaucoup lines of demo data!
But once you've created a database and gotten data into it somehow, the
interesting stuff becomes possible. The SELECT statement is the most-used one
in all SQL, and through it you create subset tables from existing tables,
according to the qualifiers you place after the SELECT reserved word.
SELECT statements read easily until they get heavily nested. Here's a simple
one that selects all records from ConBase with the string "Editor" in the
Specialty field:
 SELECT * FROM ConBase WHERE Specialty='Editor';
The SELECT * clause means "select all fields." You could also have written
SELECT FName, LName and gotten only the first-name and last-name fields.
A host of logical qualifiers is available so that you can pin things down any
way you want. You could get very choosy, like this:
 SELECT * FROM ConBase WHERE (Specialty='Editor' OR Specialty='Writer') AND
(Tag='A');
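Both forms can be tried against any SQL engine. Here is a minimal sketch using Python's built-in SQLite; the table layout and sample rows are invented for the demonstration, not taken from the article's data.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE ConBase (FName TEXT, LName TEXT, Specialty TEXT, Tag TEXT)")
db.executemany("INSERT INTO ConBase VALUES (?, ?, ?, ?)", [
    ("Jon",  "Erickson",  "Editor", "A"),
    ("Jeff", "Duntemann", "Writer", "A"),
    ("Pat",  "Smith",     "Writer", "B"),   # wrong Tag: filtered out below
])

# SELECT * brings back every field of every matching row...
editors = db.execute("SELECT * FROM ConBase WHERE Specialty='Editor'").fetchall()

# ...while naming fields narrows the columns, and the logical
# qualifiers combine with AND/OR just as in the choosier example above.
picks = db.execute(
    "SELECT FName, LName FROM ConBase "
    "WHERE (Specialty='Editor' OR Specialty='Writer') AND Tag='A'"
).fetchall()
```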


Bringing Back the Bacon


Once you've successfully executed a query in the server, you've got to get the
results table home somehow. When the server executes a query, it retains the
result table internally. The server does not automatically squirt the whole
result table back over the link to the client. The client has to ask for it
and fetch it back over the link, one row at a time.
A SQL cursor is an invisible pointer to one row of a table. When a table is
created and a cursor is defined for it, the cursor initially points to the
first row in the table. You can use the FETCH command to position the cursor
to some row in the table and then retrieve that row, one field at a time.
FETCH can position the cursor to the next or the prior row, to the first or
last row in the table, or to a row specified by row number -- either the
absolute row number or some number relative to the current cursor position.
EXEC SQL FETCH NEXT MyCursor INTO :ln :fn :tg :cl :sp :tx;
Here, the named cursor MyCursor is moved to the next row in the results table,
and brings back the row into six host variables named ln, fn, tg, cl, sp, and
tx. The host variables are separated by their prefixed colons.
FETCH is only available as an embedded command; that is, a user cannot
interactively type a FETCH command from a terminal. This is the reason for the
EXEC SQL immediately before FETCH. EXEC SQL indicates that what follows is a
SQL statement and not just another host language (Pascal, Basic, Cobol, and
the like) statement. Combining SQL statements and host language statements is
still an ugly business and could be made a lot easier (as I'll emphasize a
little later) by the host-language compiler vendors.
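The row-at-a-time discipline of FETCH has a close analogue in call-level interfaces. In Python's sqlite3, for instance, a cursor's fetchone() plays the role of FETCH NEXT, landing each row's fields in host variables one row at a time. This is only an analogy: real embedded SQL goes through a precompiler rather than an API call, and the table here is invented for the sketch.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE ConBase (LName TEXT, FName TEXT)")
db.executemany("INSERT INTO ConBase VALUES (?, ?)",
               [("Erickson", "Jon"), ("Duntemann", "Jeff")])

# Executing the query builds the result table on the "server" side;
# nothing crosses the link until the client asks for it.
cur = db.execute("SELECT LName, FName FROM ConBase ORDER BY LName")

fetched = []
row = cur.fetchone()          # the analogue of EXEC SQL FETCH NEXT ... INTO
while row is not None:
    ln, fn = row              # fields land in host variables, one row at a time
    fetched.append((ln, fn))
    row = cur.fetchone()
```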


The Boundaries of Standardization


There's a whole lot more to SQL than I've shown on this quick tour. There are
additional reserved words for updating and deleting databases, for
restructuring them, and for handling issues like concurrent access and
security.
On the whole, the SQL standard is remarkably strong, stronger than Xbase
(which has lots of loose ends) with only a few, probably necessary exceptions.
Embedded SQL (at least through Ocelot2) requires that the source file be
precompiled by a host-language specific SQL precompiler, which takes standard
SQL statements and translates them into host-language statements that
implement the SQL statements in the host language. Other SQL vendors may
handle embedded SQL in a different fashion; the standard says little or
nothing about the process of creating an embedded SQL application.
And because SQL is platform independent, there are going to be some bumps when
you finally have to hook the logical to the physical somehow and store SQL
databases in DOS files. Ocelot2's mechanisms make sense to me, but they are
not identical to those used by other SQL vendors.


Statements Held in Common


If you're working on a data-management application in Pascal, you're wasting
serious time trying to implement the data manager proper from scratch. You
might as well smelt your own steel and hammer out a replacement fender in a
blacksmith's forge. Auto parts are widely available and don't cost that much,
and neither do the multitude of database engines and development tools of
various kinds.
The truly ugly question has been turning up more and more often: Do I need to
work in Pascal at all? The current generation of high-end database managers
are lightning-fast and contain everything you need to create a whole
application, and every time I've done it, I've accomplished in days what in
Pascal would have taken weeks. I know that a lot of people I've spoken with
have been forced by productivity squeezes to abandon traditional languages
entirely and work in "database languages" instead.
Over time this could hurt the Pascal industry. One way out has become pretty
obvious to me: Borland (and all Pascal language vendors) should think very
hard about agreeing on an embedded SQL standard, and incorporating that
standard into the language, just as x86 assembly language has now been
incorporated into Turbo Pascal through BASM. The data-manager executable
itself doesn't necessarily have to be included with the Pascal product, but
the full embedded SQL syntax should be understood by the Pascal compiler, and
the means by which SQL statements may be passed to the data manager should be
standardized as well. Then the programmer could choose from among many
third-party, back-end SQL data managers, all of which would respond to
identical SQL commands generated from within a Pascal application.

Like it or not, we're going to have to start shaving the time it takes to
produce useful things in traditional languages, and one way to do this is to
start using standardized parts. SQL is one such standardized part. It should
be a lot easier to make use of the SQL standard from Pascal. The SQL people
have done pretty much what they can. The ball is now in the Pascal vendors'
(and specifically Borland's) court.


Products Mentioned


Ocelot2--the SQL!
Ocelot Computer Services Inc.
Suite 1104, Royal Trust Tower, Edmonton Center
Edmonton, AB T5J 2Z2 Canada
403-421-4187
$695.00






March, 1993
UNDOCUMENTED CORNER


Inside Windows Regions




Joseph M. Newcomer and Bruce Horn


Dr. Joseph M. Newcomer received his PhD in the area of compiler optimization
from Carnegie Mellon University in 1975. Bruce Horn was one of the original
members of the Macintosh design team, having written the Finder (with Steve
Capps), and designed and implemented the notion of the Macintosh "resources."
He is currently finishing his PhD in Computer Science at Carnegie Mellon
University. The work described in this article was done for the Maya Design
Group (Pittsburgh, Pennsylvania).


The "Undocumented Corner" is a new DDJ column that will cover undocumented
aspects of the programming interfaces for DOS, Windows, OS/2, networks, and
any other operating systems for which DDJ readers are programming. The intent
is to not just explain obscure or nonintuitive interfaces, but to present
solid reference material that really isn't in the manuals -- but should be.
For example, an explanation of how to use the Windows DDE interface wouldn't
belong here because, while extremely difficult, that interface is documented.
If, however, you've found five new DDE messages that are genuinely useful or
important (this is a purely hypothetical example), but aren't mentioned in the
manuals that come with the Windows SDK, then this is the place to talk about
them. Likewise if you have, say, disassembled Windows and want to talk about
how the documented DDE interface is implemented. In other words, in this
column DDJ readers will be writing in to talk about undocumented interfaces
and operating-system internals.
One purpose of this column is to provide a forum for readers of Undocumented
DOS (Addison-Wesley, 1990) and Undocumented Windows (Addison-Wesley, 1992),
for which I was a coauthor. While generally recognized as standard works on
the subject, both books contain errors and omissions that deserve some public
attention before the next editions appear. (The second edition of Undocumented
DOS is due to appear in April 1993. It will be about 150 pages longer than the
first edition, and will cover topics such as DR-DOS, the relation between
Windows and undocumented DOS, and disassembly of the DOS kernel.)
Future "Undocumented Corners" might cover, for example, undocumented LAN
Manager named-pipe functions available under DOS, the SmartDrive 4 "BABE"
interface, the PIF file format, FDISK internals, walking the Windows VxD
chain, undocumented DPMI, undocumented DR-DOS, and undocumented NT. (If you
have NT and want a sneak preview, just run coff -dump -imports pview.exe.)
Of course, the usual cautions about relying on undocumented operating-system
features apply. These features can change in the next version of the operating
system, because the vendor is under no obligation to support them. On the
other hand, the most useful undocumented features tend to get documented (look
at Microsoft's MS-DOS Programmer's Reference for DOS 5, and the
Get/SetSelectorBase/Limit functions in Windows 3.1), so knowing about and
using these features may do nothing more dangerous than giving you a year or
two jump on your more cautious competitors.
If you have material or a question for an "Undocumented Corner" column, please
write to me on CompuServe at 76320,302, or send Internet mail to
andrew@pharlap.com. (I no longer work at Phar Lap Software, but the mail will
get forwarded to me.) Articles for the "Undocumented Corner" will appear under
your name, and you'll be paid for your contribution.
Our first column, on the RGNOBJ structure in Microsoft Windows, is a beautiful
example of what we're looking for. Windows provides functions such as
CreateEllipticRgn(), CreatePolygonRgn(), and CreateRectRgn() that return an
HRGN, a handle to a region; functions such as CombineRgn() and PtInRegion()
operate on these HRGN handles. Given the black-box "you can use a handle
without knowing what it points to" philosophy of Windows, it's not surprising
that the structure of a RGNOBJ, to which an HRGN points, is not documented.
Regions are maintained by the Windows graphics-device interface (GDI). If you
turn to the entry for RGNOBJ in the GDI chapter in Undocumented Windows (pp.
589), this is all the enlightenment you'll receive:
This is the structure behind the documented HRGN handle. It is stored in GDI's
default heap segment, and is made up of two parts. The first is the
undocumented GDIOBJHDR structure described in the GDI object header entry
earlier in this chapter [pp. 566-568], and the second, which contains the
actual region data, is not currently understood.
In other words, a total cop out! Frankly, though, I was surprised by the large
number of Undocumented Windows readers who wanted this information. We just
didn't think it was that important, compared to the structures we did
thoroughly cover, like the WND, DC, and TDB. After all, even in QuickDraw on
the Macintosh (which seems a lot better documented than Windows), Inside
Macintosh simply says that the "region definition data for nonrectangular
regions is stored in a compact way that allows for highly efficient access by
QuickDraw routines," and leaves the rest to your imagination.
But treating regions as a black box works on the Macintosh because QuickDraw
provides a relatively complete set of operations that work on regions,
including UnionRgn( ), DiffRgn( ), XorRgn( ), and the powerful MapRgn( ). (For
a good description of Macintosh regions, see Howard Katz's "Region Maker,"
Byte, January 1987.)
Windows does not provide a complete set of region operations, and this is
where the otherwise-excellent black-box idea (that you can use an HRGN without
knowing what a RGNOBJ is) breaks down. Windows brings UnionRgn( ), DiffRgn( ),
and like operations under the umbrella of CombineRgn( ) with options like
RGN_OR (union) and RGN_DIFF. But porting code from the Macintosh to Windows is
quite difficult because Windows doesn't have MapRgn( ) -- and without knowing
what a RGNOBJ looks like, your hopes of implementing MapRgn( ) for Windows are
pretty low.
It turns out that the structure of the region-data bytes in a RGNOBJ are far
from straightforward. Since CreateEllipticRgn( ) and CreateRectRgn( ) are
called with nothing more than the two (x,y) bounding coordinates for the
region, you might think there's nothing more here than a rectangle. However,
creating some regions and then exploring the GDI local heap with Microsoft's
HeapWalk utility, you can see that, whereas CreateRectRgn(10,20,100,200)
generates a 42-byte object, CreateEllipticRgn(10,20,100,200) generates one
that's 846 bytes! What the heck is going on? Several readers of Undocumented
Windows got out their debuggers, disassemblers, and hip-waders, and figured
out what a RGNOBJ actually looks like. Ed Andrews (eandrews@ingres.com) did so
by looking at how CombineRgn( ) works. Joseph Newcomer and Bruce Horn, whose
article appears here, spent a long time figuring out that an HRGN is an
LMEM_MOVEABLE pointer into GDI's local heap. They then stared long and hard at
the data. Note their comment below, complete with a great Knuth citation,
about the need to test with unique values. Joe and Bruce's article is a fine
combination of low-level, nitty-gritty detail and a high-level appreciation
of what matters.
--Andrew Schulman
Regions are one of the more mysterious graphics-device interface (GDI) objects
in Microsoft Windows. Fundamentally, a region is an arbitrarily bounded area
that can be used for a variety of purposes, such as filling, outlining, and
clipping. However, one operation that is missing is the ability to scale a
region, even though this ability is available on the Macintosh and in the X
Window system. This means that multiplatform software that depends on this
ability cannot be readily ported to Windows. We were confronted with exactly
this problem, and therefore had to look "under the floor" to find out how
Windows regions were implemented. This article tells what we discovered and
what we did with it.
A note of caution: This code is highly sensitive to the current Windows
3.0/3.1 implementation. There's no guarantee that it will work in any future
version of Windows, and in particular it will not work in Windows NT. The
usual cautions of using undocumented features apply: Use this at your own
risk! Perhaps in a future version of Windows, Microsoft will include a
region-scaling operation and this hack will no longer be necessary.


Locating the GDI Data


The first problem we confronted was discovering exactly what a region looked
like. Precisely where were the bits stored? Undocumented Windows (by Schulman,
Maxey, and Pietrek, Addison-Wesley, 1992), hereafter referred to as UndocWin,
describes in a less-than-useful fashion the key component of the RGNOBJ
structure as "not currently understood" (p. 589). We were therefore on our
own in reverse-engineering the actual GDI data.
Several hours of tracing assembly code with Codeview began to reveal some of
the truth. The critical question was how, given the HRGN region handle, to get
an address for an actual region.
Regions are objects that change size. They are managed in a local storage heap
of the Windows GDI module. However, because they can change size, GDI wants to
be able to shift their position around, to allocate new storage and compact
existing storage. The particular function that does this is GdiSeeGdiDo( ),
described on page 570 of UndocWin. So an indirection table, similar to the
Macintosh's master pointer block, is used. The region handle is the offset in
a GDI data segment of the actual pointer to the region data. This pointer
remains fixed during the entire lifetime of the region. As the contents of the
region change (based on operations such as CombineRgn( )), the actual region
data may shift position. In other words, the region handle is an LMEM_MOVEABLE
pointer into the GDI heap. In Windows, LMEM_MOVEABLE is a pointer to a
pointer, just like handles on the Macintosh (see UndocWin, p. 305).
We first had to convert the handle to a far pointer, then use this far pointer
to access another pointer, which was the pointer to the actual region data.
This second pointer can be thought of as a handle to a movable chunk of
storage that contains the actual region data. We thus have to convert this
second handle to a global segment:offset pointer, simulating the effect of
LocalLock.
As this work was begun in Windows 3.0, Microsoft's TOOLHELP.DLL was not yet
available. Without ToolHelp, obtaining the segment portion of the address is
one of those truly ugly and disgusting hacks that's always required when
you're overreaching the interface specifications. The code we used depended
critically upon the procedure entry sequence, documented in the Windows 3.1
SDK (Programmer's Reference: Volume 1, p. 410). The prolog for an exported
Windows library module procedure starts with the instruction MOV AX,DGROUP. The
code in Listing One (page 150) returns the segment of the GDI heap by
extracting the DGROUP value from the actual instruction. The set of casts is
singularly ugly C code.
As shown in Listing Two (page 150), it's much easier to get the segment of the
GDI heap using ToolHelp. This code will work under either Windows 3.1 or 3.0,
but in the case of 3.0 you would need to ship TOOLHELP.DLL along with your
software.
This is all illustrated in Figure 1, where the user has an HRGN variable which
has the value 0x003e. The GDI segment, hGDISegment (the name of the
SYSHEAPINFO field; see Listing Two), is 0x0617. At location 0617:003e is the
value 0x04ac. Thus, 0617:04ac is the far address of the RGNOBJ for this
region. If the region were recomputed, such as by specifying it as the
destination region of CombineRgn( ), the value at 0617:003e would most likely
change to point to a newly allocated chunk of GDI heap memory for the newly
computed region. The memory at location 0x4ac would be freed. This is all
handled by the previously mentioned internal function GdiSeeGdiDo( ).
Having obtained the GDI segment, we can convert the region handle to a
reference to the region data by extracting the near pointer at GDIseg:handle,
as shown in Listing Three (page 150). The DOS.H file provides the macros
FP_SEG( ) and FP_OFF( ).


Decoding the Region Data


Having carefully studied the way in which regions were created and combined,
we knew that one of the WORDs was a length of the region data, which
simplified some of the printout. We knew we only had to print out a fixed
number of words to get the complete region displayed. Our test cases were two
rectangles, a hollow box formed from them, an L-shaped region formed from two
rectangles, and an ellipse.
Our next job was to decode the region. The adventures here, including the
false starts, guesses, and general staring at the code and data would make a
great series of boring war stories told over suitable beverages, but we'll
proceed directly to the final result, with only a brief diversion about
reverse engineering.
Some years ago, Donald Knuth wrote a paper on the early Babylonian
mathematicians' representation of algorithms ("Ancient Babylonian Algorithms,"
Communications of the ACM, July 1972, pp. 671-677; thanks to Jeremy Gibbons in
New Zealand for this citation). They had no notion of algebra, nor of
variables, but expressed the algorithms numerically. The art was to choose the
initial values so that all intermediate values were unique; thus an integer
represented the equivalent of a variable of that "name." Once the sequence of
operations in terms of the numbers was memorized, the person executing the
algorithm had only to associate the numbers representing the "variables" or
"parameters" at each step with the actual numbers of the particular
computation.
Whenever you are reverse-engineering something, it's important to similarly
choose values that guarantee that, insofar as possible, you have unique values
internally. For example, a rectangular region with coordinates {100, 100, 200,
200} would be an exceedingly poor choice, because not only would it not be
possible to tell the X[0] coordinate from the Y[0] coordinate, or X[1] from
Y[1], but if the internal representation happened to store width and height
instead of absolute coordinates, the internal representation might be {100,
100, 100, 100}. The values we used were chosen so that not only were the
values unique, but all differences in X and Y were also unique values.


The Representation of a Region



A region, we found, is stored as a sequence of horizontal stripes, kept in
vertical order, which we call "Y-spans." Each Y-span contains a sequence of
one or more rectangles, which we call "X-spans." [Editor's note: Ed Andrews
uses the terminology "stripes" and "bands."] In a simple example, the
difference between the two rectangular regions shown in Figure 2, an outer box
and an inner box, creates the hollow box shown in Figure 3. Code to create the
two boxes and the difference is shown in Listing Eight (page 150). Initially,
each of the rectangular regions consists of a single Y-span defining one
single rectangle. However, when they are combined, a new region is created.
On examining the region defining the hollow box, we find that the region
consists of three Y-spans. The first is the top segment of the box and has a
single X-span, labeled A. The second Y-span consists of two X-spans, one for
the left edge of the box (B) and one for the right edge of the box (C). The
third Y-span consists of the bottom segment of the box (D). This is shown in
Figure 4.
A simple ellipse took 78 horizontal stripes. An example of an elliptical
region is shown in Figure 6.
A region has a region header followed by the region data. The region header
consists of:
A GDI object header [words 0..4]. This follows the format given as GDIOBJHDR
in UndocWin, p. 567. [Editor's note: This must be adjusted for the Debug
version of Windows!]
The length of the region object, in words (includes the length of the header)
[word 5].
The number of Y-spans [word 6].
What appears to be the length of the longest X-span data, in words [word 7].
The bounding box for the entire region, in the order left, top, right, bottom
[words 8..11].
Following the region header, the region data appears. This consists of a
sequence of Y-spans, as given by word 6 of the region header. Each Y-span has
the following format, where the offsets are the offsets in the Y-span:
The number of coordinates nx in the X-spans (which is always an even number
since each X-span has two coordinate values) [Y-span word 0].
The starting Y-coordinate of the Y-span [Y-span word 1].
The ending Y-coordinate of the Y-span [Y-span word 2].
The starting X-coordinate of X-span i (for i=0..nx/2-1) [Y-span word 3+2*i].
The ending X-coordinate of X-span i [Y-span word 4+2*i].
A repeat of the number of X-spans in the region [Y-span word 3+nx].


Implementing the Region-mapping Function


We had a choice of how to implement region mapping. The X Window system
implements two primitives: XOffsetRegion( ), which translates a region, and
XShrinkRegion( ), which shrinks a region by specified integral amounts in the
X and Y directions. The Macintosh Toolbox
implements a much more general mechanism, MapRgn. We chose to implement the
latter.
MapRgn( ) is defined as mapping a region based on the relationship of two
rectangles, the source and destination rectangles. The vertical dimensions of
the region are scaled according to the ratio of the destination-rectangle
height to the source-rectangle height. The horizontal dimensions of the region
are scaled according to the ratio of their widths. The final (absolute)
position of the region places the top-left corner of its bounding box in the
same scaled relationship to the destination rectangle that the top-left
corner of the unmapped region's bounding box had to the source rectangle.
This can be done by mapping each of the X,Y coordinates of the region
according to the formula dst=(src-pos_src) * scale+pos_dst, where: dst is the
value for x or y in the mapped region; src is the value for x or y in the
source region; pos_src is the value for x or y of the top-left corner of the
source rectangle; pos_dst is the value for x or y of the top-left corner of
the destination rectangle; and scale represents the ratio of the
destination-rectangle width or height to the corresponding source-rectangle
dimension.
The dimensions of the rectangles are expressed as integer values. We have to
deal with nonintegral ratios, such as a mapping of 0.7 or 1.2. The solution is
obvious: We need to use floating-point computations. This means that to map
the points, we need to convert each point in the region to floating point,
perform a floating-point computation, and convert the result back to an
integer. This is not particularly attractive from the viewpoint of
performance, particularly if a math coprocessor is not available.
There is an alternative: fixed-point arithmetic. We convert each 16-bit
integer to a 32-bit fixed-point number, where the high-order 16 bits are the
integer part of the number and the low-order 16 bits are the fractional part
of the number. We need to implement the following operations: convert integer
to fixed point, convert fixed point to integer (rounded), multiply two
fixed-point numbers generating a fixed-point result, and divide two integers,
producing a fixed-point result.
A fixed-point number is described by the declaration typedef long tFixed;. An
integer is converted to a fixed-point number with the ToFixed procedure. This
simply shifts the integer left 16 bits. This is shown in Listing Four (page
150). Multiplication is performed by the computation: sign * (int(abs(num[1])
* frac(abs(num[2]))) + abs(num[1]) * int(abs(num[2]))), where: num[1] and
num[2] are the two operands; abs(x) gives the integer absolute value of x;
frac(x) gives the low-order 16 bits (the fraction part) of x; int(x) gives a
16-bit value which is the high-order 16 bits (the integer part) of x; and sign
is -1 if the signs of num[1] and num[2] differ and 1 if they are the same.
The code to implement this is shown in Listing Five (page 150). The comments
show that this will fail if the largest negative integer value, -32,768,
appears as the integer part of either input value, because the absolute value
is taken.
The ratio of two integers is computed by simply converting the numerator to a
fixed-point number and dividing that number by the integral denominator, as
shown in Listing Six (page 150).
The final conversion back to an integer (rounded) is shown in Listing Seven
(page 150). Positive numbers are rounded up and negative numbers are rounded
down; that is, rounding is done to the nearest integer value. One implication
of this is that some rectangles may shrink to a 0 dimension in width or
height. Such rectangles could be eliminated entirely from the region
description, but such a change involves modifications to the region size.
We're not certain of the implications of this with respect to the way regions
are allocated; for example, Windows might not reclaim storage beyond what the
region claims to own. We have therefore chosen to leave these collapsed
rectangles in place. This incurs a minor performance penalty but does not seem
to affect the rendering of regions. We could handle this with the GdiSeeGdiDo
internal call if we really cared.
The MapRgn code is shown in Listing Nine (page 150). This code walks across
the region structure and maps each point. In addition to mapping the points
for each of the internal rectangles, it maps the bounding box. The general
MapScalar procedure is used to map each of the points. The result of
performing the mapping is shown in Figure 5. In this example, the larger
rectangle on the left represents the source rectangle, and the smaller
rectangle on the right represents the destination rectangle. The rectangles
surrounding the ellipses represent the bounding box of the region. The
grayscale shown in the illustration is displayed as color banding on the
display. Note the proportional relationship of the source and destination
rectangles to the shape of the ellipses.


Limitations of the Method


The region coordinates are stored as integers. Consequently, applying scaling
several times to a region will tend to accumulate round-off errors. Scaling a
smoothly curved region, such as an ellipse, or sloped regions, such as polygon
regions, to a larger value will exhibit the usual effects of such scaling; the
quantization that was appropriate for the smaller image will be magnified in
the larger image, as shown in Figure 6. This technique will work best when
applied to regions composed of rectangles.
This quantization will also become apparent when a rectangle shrinks to a 0
horizontal or vertical dimension. Scaling will leave these dimensions as 0, so
the original shape may not be realizable; for example, scaling by 0.01 and
then by 100 may not produce the original shape because some subregions shrank
to 0 dimension.


Why This Representation?


You may be wondering: Why this particular implementation of regions? As we
analyzed this data, we came up with some conjectures of the rationale for the
particular representation chosen.
One representation that might be chosen is to keep the properties as abstract
as possible; for example, when an elliptical region is created, to store the
parameters of the elliptical region and mark the region type as elliptical;
similarly for a rectangular region. A significant problem with this
representation is the operations of intersection or difference on the regions;
for example, imagine what it would take to represent an elliptical region with
a rectangular hole in the middle of it. Four elliptical chords could do it,
but these are expensive to compute in the first place and even more expensive
to do subsequent operations with. The organization of rectangular banding
seems a natural way to reduce the computational complexity.
The problem with this is that, as we observed earlier, rectangular
approximations to nonrectangular polygons (and the "infinite-sided polygon"
elliptical regions) are only as accurate as the quantization. Making smaller
(down to one
pixel) bands gives the best approximation, but at the highest cost of storage;
making larger bands reduces the number of rectangles, at the cost of accuracy.
The Windows NT GDI represents regions as trapezoids, which allows for better
approximations with fewer region segments, at a slightly higher computational
cost.
Another representation would be a linked list of segments. One such
representation is described in the paper "Scan-Line Coherent Shape Algebra,"
by Jonathan E. Steinhart (Graphics Gems II, James Arvo, ed., Academic Press,
1991). A region is represented not by rectangular segments, as described here,
but as a linked list of horizontal spans. Each span contains its top
Y-coordinate, a pointer to the next span, and a pointer to the segment list.
The bottom Y coordinate is obtained from the top Y coordinate of the next span
(and a discontinuity in the spans is thus represented as a span with no
segments). The segment list is a linked list of segments ordered by X
coordinates. In the paper Steinhart describes how to implement the operations
of intersection, union, and translation.
As in the Steinhart paper, the Microsoft GDI maintains ordering in both Y and
X. This serves two purposes. For one thing, it makes it easy to clip regions;
you ignore all regions above your clipping region, and when you finally
encounter a region whose top is below your clipping region, you can stop.
Since the segments are maintained in X-coordinate order within each span, you
can apply the same technique for clipping left and right; only the subregions
partially intersecting the clipping region require computation. More
importantly (especially when polygonal regions are used), it allows for a more
efficient computation of the intersection, union, difference, XOR, and other
operations because you can ignore certain computations outside the bounding
box of the other region, and need only compute new rectangles in the
overlapping areas. For example, in computing an intersection of two regions A
and B, any rectangles in A that are outside the bounding box of B can be
ignored, as can any rectangles in B that are outside the bounding box of A.
Why a vector instead of a linked list? We conjecture that this is because the
GDI space is limited to a single 64K local heap; the vector representation is
much more compact than a linked-list representation. We also conjecture that
the repeated number of X-spans in the region (Y-span word 3+nx) is present to
allow the structure to be traversed in a backward direction, for example, to
locate the start of the previous Y-span.
One aspect of the representation we find dismaying is that it appears that
coordinates are kept in the screen space rather than in a user-space
coordinate system. This means that scaling a rectangle of a region down to a
0 dimension and then scaling it back up by a multiple greater than 1 will not
give the same effect as scaling the original directly to the final size; for
example, scaling a region by 0.1 and then scaling the resulting region by 20
is not necessarily the same as scaling the original region by 2.
We suppose this was done because it has an impact only when scaling, and the
absence of that operation made the more-general representation superfluous.
Windows was originally committed to run efficiently on an 8088, and adding
the necessary transformations to the internals of the region operations would
have made them unacceptably inefficient. However, since
this representation is an undocumented interface, there's no reason for
Microsoft to maintain it across releases. Perhaps a future release of Windows
could use a representation more amenable to scaling, and we could have a
supported scale-region operation.
In conclusion, with some considerable effort we were able to enhance the
Windows region operations with a new and very useful operation. It is the fond
hope of the authors that such an operation will become part of the official
Windows API in the future; but we were faced with the problem that we needed
this operation and had no other choice but to add it ourselves.


Acknowledgments


We would like to acknowledge Maya Design Group and Jim Morris for permission
to publish this work.


_UNDOCUMENTED CORNER_
edited by Andrew Schulman


[LISTING ONE]

WORD GetGDIsegment(void)
 {
 HRGN (FAR PASCAL * A)(int, int, int, int);
 unsigned short far * B;
 WORD GDIseg; /* GDI segment */
 A = CreateRectRgn;
 /* The exported-function prolog begins MOV AX,DGROUP; skip the
    one-byte opcode, then read the 16-bit DGROUP operand. */
 B = (unsigned short far *)((char far *)A + 1);
 GDIseg = *B;
 return GDIseg;
 }






[LISTING TWO]


#include <toolhelp.h>
WORD GetGDISegment(void)
 {
 SYSHEAPINFO info;
 info.dwSize = sizeof(info);
 SystemHeapInfo(&info);
 return info.hGDISegment;
 }






[LISTING THREE]

#include <dos.h>
static unsigned short far * ConvertRegionToAddress(HRGN region)
 {
 WORD GDIseg;
 unsigned short far * * p; /* pointer to GDI segment */
 unsigned short far * pv; /* pointer to region data */
 GDIseg = GetGDIsegment();
 FP_SEG(p) = GDIseg;
 FP_OFF(p) = region;
 pv = (unsigned short far *) *p;
 FP_SEG(pv) = GDIseg;
 return pv;
 }








[LISTING FOUR]

static tFixed ToFixed(int value)
 {
 return ((tFixed) value) << 16;
 }






[LISTING FIVE]

static tFixed tFixMul(tFixed num1, tFixed num2)
 {
 tFixed result;
 int negated = 0; /* nonzero if the result is negative */
 unsigned short int2;
 unsigned short frac2;

 /* 'negated' is the sign of the result. We in effect store
 the sign bits. This will fail if we give it -32768.0000 */
 num1 = ((long) num1 < 0) ? negated = !negated, -num1 : num1;
 num2 = ((long) num2 < 0) ? negated = !negated, -num2 : num2;

 /* Extract the highword (integer part) and lowword (fraction
 part) of the second operand */
 int2 = (unsigned long) num2 >> 16;
 frac2 = (unsigned long) num2 & 0xFFFF;

 result=(((unsigned long) num1*frac2) >> 16)+((unsigned long) num1*int2);
 return (negated ? -result : result);
 }






[LISTING SIX]

static tFixed tFixRatio(int num, int denom)
 {
 tFixed result;
 result = ((long) num << 16) / (long) denom;
 return result;
 }






[LISTING SEVEN]


static short tFixRound(tFixed num)
 {
 int negated = ((long) num < 0);
 short result = negated ? (-num) >> 16 : num >> 16;
 unsigned short low = negated ? (-num) & 0xFFFF : num & 0xFFFF;
 if (negated)
 return -result - (low > 0x7FFF ? 1 : 0);
 else
 return result + (low > 0x7FFF ? 1 : 0);
}







[LISTING EIGHT]

HRGN outer = CreateRectRgn(100,200,300,400);
HRGN inner = CreateRectRgn(110,210,290,390);
HRGN hollow = CreateRectRgn(0,0,0,0);
CombineRgn(hollow, outer, NULL, RGN_COPY);
CombineRgn(hollow, hollow, inner, RGN_DIFF);





[LISTING NINE]

static short MapScalar(short value, short srcOffset, short dstOffset,
 tFixed scale)
 {
 return tFixRound(tFixMul(scale, ToFixed(value-srcOffset))) + dstOffset;
 }
void MapRgn(HRGN rgn, RECT * srcRect, RECT * dstRect)
 {
 register short srcX = srcRect->left;
 register short srcY = srcRect->top;
 register short dstX = dstRect->left;
 register short dstY = dstRect->top;
 /* Compute the X-scale and Y-scale as the ratio of the
 length of the source and destination width and height */
 tFixed scaleX = tFixRatio( (dstRect->right - dstX),
 (srcRect->right - srcX));
 tFixed scaleY = tFixRatio( (dstRect->bottom - dstY),
 (srcRect->bottom - srcY));
 short ySpans; // number of Y-spans in the region
 short yspan; // current Y-span being processed
 short xSpans; // number of X-spans in this Y-span
 short xspan; // current X-span being processed
 short RgnLength; // length of region, in words
 short * regionbase; // base of region data
 short * scanC; // current region word

 regionbase = ConvertRegionToAddress(rgn); // Point to beginning of region
 scanC = regionbase; // start at base of region
 scanC += 5; // skip GDI header, point to region length

 if (GetSystemMetrics(SM_DEBUG) && (GetVersion() != 3))
 scanC += 2; // adjust for 3.1+ Debug version!!
 RgnLength = *scanC++; // word 5: region length
 ySpans = *scanC++; // word 6: number of Y-spans
 scanC += 1; // Skip to bounding box.

 // Map bounding box for the region. Read *scanC before advancing:
 // "*scanC++ = MapScalar(*scanC, ...)" has no defined evaluation order.
 *scanC = MapScalar(*scanC, srcX, dstX, scaleX); scanC++; // left
 *scanC = MapScalar(*scanC, srcY, dstY, scaleY); scanC++; // top
 *scanC = MapScalar(*scanC, srcX, dstX, scaleX); scanC++; // right
 *scanC = MapScalar(*scanC, srcY, dstY, scaleY); scanC++; // bottom
 for (yspan = 0; yspan < ySpans; yspan++)
 { /* foreach y-span */
 xSpans = *scanC++ / 2;

 // Now we are pointing to the y start and y end points. Scale them.
 *scanC = MapScalar(*scanC, srcY, dstY, scaleY); // startY
 scanC++;
 *scanC = MapScalar(*scanC, srcY, dstY, scaleY); // endY
 scanC++;
 // We are pointing to the x limit pairs for this y-span.
 // Iterate down these, scaling each of them
 for (xspan = 0; xspan < xSpans; xspan++)
 { /* foreach x-span */
 // now we are pointing to the x start and end points. Scale them
 *scanC = MapScalar(*scanC, srcX, dstX, scaleX); // startX
 scanC++;
 *scanC = MapScalar(*scanC, srcX, dstX, scaleX); // endX
 scanC++;
 } /* foreach x-span */
 // The next word is the second copy of the number of X spans * 2.
 // Ignore it.
 scanC++;
 } /* foreach y-span */

}





March, 1993
PROGRAMMER'S BOOKSHELF


Adaptation, Emergent Computation, and Artificial Life




Ray Valdes


The computer industry is so fast moving that many products have a shelf life
not much longer than last night's guacamole. It's therefore a salutary
experience to encounter books that won't be outdated by this season's Comdex.
In the preface to the new 1992 edition of his classic work on Adaptation in
Natural and Artificial Systems, John Holland writes, "When this book was
originally published [in 1975, after seven years of writing], I was very
optimistic, envisioning extensive reviews and a kind of 'best seller' in the
realm of monographs. Alas! That did not happen." What did happen is that the
book sold a hundred or so copies a year, steadily, through the early 1980s,
until an explosion of interest occurred in the middle of that decade.
The area that John Holland pioneered with this book--that of genetic
algorithms--is now extensively covered by a whole raft of books and
publications, including five sets of conference proceedings. This area is of
more than academic interest, if recent articles in the business press are any
indication. For example, Business Week ran a special section on "The New
Rocket Science" (11/2/92) describing how financial trading on Wall Street is
being transformed by software technologies such as genetic algorithms, neural
nets, and chaos theory.
You don't need to read Holland's book to work with genetic algorithms. You can
simply find a good introduction, such as Richard Spillman's "Genetic
Algorithms" in the February 1993 issue of DDJ, get the sample code, and start
working (or playing) immediately. If you have access to the Internet, there
are public-domain tools such as Genitor and GATool that let you build systems
easily.
The value of Holland's book is the carefully written, lucid exposition of his
formal theory of adaptive systems. Genetic algorithms are but one aspect of
this mathematical theory. Holland begins by defining a general framework that
can rigorously describe a wide range of adaptive systems. He illustrates its
power by applying it to cases in genetics, economics, game theory, searching
and pattern recognition, control theory, and physiology. The examples are not
discussed in elaborate detail, but enough to make the connection:
Searches occur as the principal element in most problem-solving and
goal-attainment attempts, from maze-running through resource allocation to
very complicated planning situations in business, government, and research.
Games and searches have much in common and, from one viewpoint, a game is just
a search (perturbed by opponents) in which the object is to find a winning
position. The complementary viewpoint is that a search is just a game in which
the moves are the transformations (choices, inferences) permissible in
carrying out the search.
Likewise, evolutionary processes in nature can be seen as a sophisticated
search over a fitness landscape to ultimately arrive at life forms that are
optimally matched to their environment. If you've read the articles mentioned
earlier, this will come as no news. One critical point, however, is ignored by
much of the press coverage, including many technical articles.
This critical point is fully elaborated on in Chapters 5, 6, and 7, which
constitute the heart of the book. Holland explains, with full mathematical
rigor, why genetic algorithms work as well as they do. You need a good
background in probability, combinatorics, and system theory to follow his
reasoning. His proof relies on the central-limit theorem and the theorem of
large deviations--subjects sufficiently abstruse that I had to struggle to
keep up. Nevertheless, I understood enough to grasp the essential idea behind
genetic algorithms, which is that evolution is not a glorified random search,
but one that is highly efficient because of the genetic operators of crossover
and inversion. Sexless evolution--evolution by mutation alone--is slow and
uninteresting, not much better than an exhaustive enumeration of all
possibilities. But when all three genetic operators (crossover, inversion, and
mutation) are present, the result is a system in which the number of
better-than-average solutions increases exponentially to arrive at a solution
close to optimal.
The movement through the large search space follows a rapid trajectory because
of what Holland calls "implicit parallelism": When one individual is being
evaluated by the fitness function, many different comparisons are being made,
because the members of the current population form a compact representation of
a much larger candidate set, which includes much historical information.
Unlike mutation alone, the full genetic algorithm contains, at the beginning
of any cycle, the retained performance of past members of the population.
Holland's analysis does not stop there. Not only does he create an entire
field of research with one book, he also lays the groundwork for its most
penetrating criticism. He realizes that the main shortcoming of genetic
algorithms is the total dependence on how the attributes of a problem are
represented and the way in which those attributes are evaluated. If you
consider attributes to be data, then you can consider the way in which the
attributes are represented and evaluated to be analogous to code. Holland's
insight that carries us beyond the realm of genetic algorithms is to allow the
code to evolve along with the data. To this end, he develops a "language of
algorithms" that is amenable to transformation by genetic operators. This path
eventually leads to Holland's recent work on "classifier systems," explained
in Chapter 10. In the 1992 edition of the book, Chapter 10 is entirely new,
and provides a very good summary of Holland's recent work with the Santa Fe
Institute (SFI).
It's quite possible that history will regard SFI as one of the most important
scientific developments in the late 20th century. In Holland's words, it's
basically a "collection of Nobel Laureates, MacArthur Fellows, Old and Young
Turks, and bright young postdocs" dedicated to the study of complex adaptive
systems. The systems have
...a crucial role in a wide range of human activities... Economies, ecologies,
immune systems, developing embryos, and the brain are all examples of complex
adaptive systems. Despite surface dissimilarities, all complex adaptive
systems exhibit a common kernel of similarities and difficulties. [They all]
involve large numbers of parts undergoing a kaleidoscopic array of
simultaneous non-linear interactions. Because of the non-linear interactions,
the behavior of the whole is not, even to an approximation, a simple sum of
the behaviors of its parts.
Two other recent books provide a glimpse into the activities of this
intriguing and diverse group. The first is Emergent Computation, edited by
Stephanie Forrest. Although this book is not an official SFI publication, it
includes contributions by many of the key people at SFI, including John
Holland. The second is Artificial Life II (AL2) published in mid-1992 as part
of an SFI series for Addison-Wesley. AL2 consists of the proceedings of a 1990
conference at SFI; it follows an earlier volume on a similar conference in
1988. As discussed further on, artificial life and emergent computation are
two overlapping fields of research. Together, these two books contain over 60
different papers, bearing titles such as: "Computation at the Edge of Chaos,"
"Algorithmic Chemistry," "The Dynamics of Programmable Matter," and
"Interactions between Learning and Evolution."
Stephanie Forrest introduces the basic idea of emergent computation: "to
explore computational models in which the behavior of the entire system is in
some sense more than the sum of its parts." The traditional approach to
computing focuses on building systems that conduct specific tasks precisely
planned in advance. By contrast, the physical and biological sciences readily
accept the idea that interactions among simple deterministic systems can
produce not-quite-predictable, but interesting and complex, global behaviors.
The premise of emergent computation is that, "...interesting and useful
computation systems can be constructed by exploiting interactions among
primitive components, and further, that for some kinds of problems (such as
modeling intelligent behavior) it may be the only feasible method."
Compare emergent computation with conventional programming:
The standard approach to programming language design minimizes the potential
for emergent computation. The notation or syntax used to express computer
programs is for the most part context free. Roughly, this means that legal
programs are required to be written in such a way that the legality (whether
or not the program is syntactically correct) of any one part of the program
can be determined independently of the other parts. While this is a very
powerful property (among other things, it makes it possible to build efficient
compilers) emergent computations are almost certainly not context-free since
they arise from interactions among components.
Instead of trying to minimize side effects that lead to inadvertent
interactions, such as bashing a global variable, emergent computation is
"primarily computation by side effect." In other words, please set aside all
the careful efforts over the past few decades toward making the programming
process more controllable and predictable--the hallowed precepts of
information hiding, encapsulation, layered abstraction, and
modularization--and revel instead in the unbridled chaos of natural growth
processes. If this sounds a bit New Age and Californian, it's probably because
some of it is. Some of the key figures in emergent computation and artificial
life came from the hippie-ish Dynamical Systems Collective in Santa Cruz, so
vividly depicted in James Gleick's bestseller Chaos.
Nevertheless, there is serious, rigorous, world-class science here, along with
exuberant speculation and inspired creativity. Included with the paper by John
Holland are papers by his students and coworkers. Stephanie Forrest, one of
his PhD students, has a paper on the subject of classifier systems. John Koza,
another PhD student, has a paper entitled, "Genetic Evolution and Co-Evolution
of Computer Programs." After completing his studies in the '70s, Koza went on
to make a fortune doing traditional computing during the early '80s, then
became a venture capitalist, and has now returned to the unfettered study of
genetic algorithms.
I don't have anywhere near the space to describe the articles in these two
volumes, but if you ever get the feeling you haven't encountered anything
really new lately, thumb through either one of these books, and that notion
will be dispelled immediately. The subject of artificial life (AL) deserves a
whole review in itself. One definition of the field of AL is that it seeks to
model and understand natural living processes in the same way that artificial
intelligence (AI) seeks to model natural intelligence. The field of AL
overlaps with emergent computation in the same way that AI overlaps with
symbolic programming.
In the remainder of this review, I'll briefly highlight some of the articles
that struck my fancy.
Thomas Ray's "An Approach to the Synthesis of Life," in AL2, provides a
technical description of his work that has recently been touted in the popular
press, such as the New York Times. Ray takes the most literal approach
possible to implementing genetic algorithms, by implementing a virtual world
inside the computer where self-replicating programs compete to survive. The
precious resources are CPU time and memory space, and organisms evolve
strategies to exploit one another. CPU time is analogous to life-giving
energy, and memory is analogous to inhabited territory. The inhabitants of
this territory (or "primordial soup") are small, machine-language creations
that are a result of mutation from the one original program written by Ray.
In stark contrast with the work of John Holland, Ray provides little theory or
mathematical framework. Ray spent 16 years in the Central American jungle as a
biologist, and is content to let his empiricism stand (or fall) on its own.
The results of his experiments are enchanting: Over a time scale of thousands
of generations, "biodiversity" appears, parasites evolve, hosts evolve
immunity to parasites, and parasites circumvent the immunity. (Parasites are
creatures that do not contain the complete code for their self-replication;
they rely on other creatures' genomes to reproduce.) Ray's work might not have
the long-term impact of Holland's masterpiece, but his captivating concepts
have made his code a popular choice for ftp-ing on the Internet.
Another gem of a piece is by a different Ray--Alvy Ray Smith. Although Smith
is now well-known for his work in computer graphics and special effects, his
paper in AL2 summarizes his PhD thesis of 20 years ago. The subject is not
computer graphics, but rather a proof of the existence of non-trivial,
self-reproducing machines. The term "nontrivial machine" refers to a Universal
Turing machine capable of processing any computable function. The theorem that
Smith tackles is one proved by John Von Neumann in 1953, but which originally
required a book-length proof. Smith's equivalent proof fits in two pages.
Along the way, he provides the clearest explanation I've seen of the
equivalence between a Turing Machine and a cellular automaton.
In another interesting piece presented in both AL2 and Emergent Computation,
Danny Hillis, founder of Thinking Machines (makers of the massively parallel
Connection Machine), describes his use of "co-evolving parasites" in
addressing the minimal sorting-network problem.
A sorting network is a sorting algorithm that takes a set of inputs and sorts
them using a sequence of comparisons and exchanges of data in a predetermined
order.
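For instance, any four inputs can be sorted by a fixed sequence of five comparison-exchanges, chosen in advance and never dependent on the data. A small C sketch (the names are mine, not Hillis's):

```c
/* Compare-exchange: ensure v[i] <= v[j]. */
static void cmp_swap(int v[], int i, int j)
{
    if (v[i] > v[j]) { int t = v[i]; v[i] = v[j]; v[j] = t; }
}

/* A five-comparator network that sorts any four inputs; the sequence
 * of comparisons is predetermined, independent of the data. */
void sort4(int v[4])
{
    cmp_swap(v, 0, 1);
    cmp_swap(v, 2, 3);
    cmp_swap(v, 0, 2);
    cmp_swap(v, 1, 3);
    cmp_swap(v, 1, 2);
}
```

The minimal-network problem asks for the shortest such sequence for a given number of inputs; five is optimal for four inputs, but for 16 inputs the answer is still open.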
Hillis writes, "Finding good networks is a problem of considerable practical
importance, since it bears directly on the construction of optimal sorting
programs, switching circuits, and routing algorithms in interconnection
networks." Hillis uses a genetic algorithm, a pretty sophisticated one that
includes not just mutation and recombination, but also dominant and recessive
genes and competitive mating.
With this algorithm, the system produces a sorting network almost as good as
the one discovered by Donald Knuth in an early study of sorting networks. To
improve on this rather impressive result, Hillis introduces a whole new
population, one which interacts with the original population of networks in a
host-parasite relationship (equivalent to a predator-prey relationship). In
this interaction, hosts (sorting networks) are scored by how well they sort,
while parasites (test cases) are scored by how well they find flaws in sorting
networks. After successive waves of epidemic and immunity (feast and famine),
the result is a 61-exchange sorting network, very close to the best known, and
better than Knuth's.






















March, 1993
JULIAN AND GREGORIAN CALENDARS


Date-conversion functions for yesterday and today


 This article contains the following executables: DATECNV.ARC


Peter J.G. Meyer


Peter, who currently works in AI research, is the author of the Dolphin C
Toolkit and Dolphin Encrypt. He can be reached at 4815 W. Braker Lane, #502-
111, Austin, TX 78759 or via the Internet at meyer@mcc.com.


The Western calendrical system--known as the Julian calendar and consisting of
a year of 12 months and of 365 days with an extra day every fourth year--was
established by Julius Caesar (following the advice of the Alexandrian
astronomer Sosigenes) in 46 B.C. The extra day may not have been added
consistently until A.D. 8, during the reign of Augustus. Subsequently, this
calendar became widespread as a result of the expansion of the Roman Empire.
The system of numbering years Anno Domini was instituted in A.D. 525 by the
Roman abbot Dionysius Exiguus.
The Julian calendar assumes that the average length of a year is 365 days and
six hours (since one day is added every four years). The length of the year
assumed in the Julian calendar exceeds the current true value by about 11
minutes, resulting in an error of about three days every 400 years. Thus, as
the centuries passed the Julian calendar became increasingly inaccurate with
respect to the solar year as defined in terms of the solstices and the
equinoxes. This was especially troubling to the Church because it affected the
determination of the date of Easter, which by the sixteenth century was
slipping gradually into summer. To resolve these problems, the calendar was
reformed in 1582 on the authority of Pope Gregory XIII, and the modified
calendar is called the Gregorian calendar.
In this article, I'll present a C function which converts any date within an
11-million-year period in either the Gregorian calendar or the Julian calendar
into a unique long int, a number in the range of approximately -2,000,000,000
through 2,000,000,000. A function is also given for conversion of a long int
back into a date in one of the calendars. This permits conversion between
dates in the Julian and Gregorian calendars and provides a basis for other
date-manipulation functions. The date-conversion functions given in this
article are used in a general C-function library that I developed, the Dolphin
C Toolkit.


Universal Date Conversion


According to the Gregorian reform, ten days (or more exactly, dates) were
omitted from the calendar. It was decreed that the day following October 4,
1582 (which was October 5, 1582 in the old calendar) would thenceforth be
known as October 15, 1582. In addition, the rule for leap years was changed.
In the Julian calendar, a year is a leap year if it is divisible by 4. In the
Gregorian calendar, a year is a leap year if it is divisible by 4, with the
added criterion that years divisible by 100 must also be divisible by 400.
Thus the years 1600 and 2000 are leap years, but 1700, 1800, 1900, and 2100
are not. Finally, it was decreed that new rules for the determination of the
date of Easter would be adopted.
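The two leap-year rules reduce to a line of C apiece. A minimal sketch (Listing Two implements the same tests with a calendar parameter):

```c
/* Julian rule: every year divisible by 4 is a leap year. */
int is_julian_leap(long year)
{
    return year % 4 == 0;
}

/* Gregorian rule: divisible by 4, except that century years must
 * also be divisible by 400 (so 1600 and 2000 qualify; 1700, 1800,
 * 1900, and 2100 do not). */
int is_gregorian_leap(long year)
{
    return year % 4 == 0 && (year % 100 != 0 || year % 400 == 0);
}
```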


Day Numbers


Astronomers use a system of numbering days called Julian-day numbers. The term
"Julian-day number" (unlike the term "Julian calendar") does not derive from
the name of Julius Caesar. This numbering system is said to have been named
after Julius, the father of its inventor. The astronomical system of
Julian-day numbers should not be confused with the simpler system of the same
name, which associates a date with the number of days elapsed since January
first of the same year (according to which December 31, 1993 is Day 365).
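That simpler scheme is just a table lookup. The following sketch (an illustrative helper of my own, not part of the Toolkit) computes it:

```c
/* Day-of-year for the simple numbering scheme: the day's ordinal
 * position within its own year, so December 31, 1993 is Day 365.
 * days_before[m] is the number of days preceding month m in a
 * non-leap year; index 0 is unused. */
int day_of_year(int day, int month, int is_leap)
{
    static const int days_before[13] =
        { 0, 0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334 };
    return days_before[month] + day + (is_leap && month > 2);
}
```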
The astronomical Julian-day number of the day specified by a date in either
the Julian or the Gregorian calendar is the number of days elapsed from the
day designated as January 1, 4713 B.C. in the Julian proleptic calendar. (See
the textbox entitled, "The Proleptic Calendars" for more information.) Thus,
the Julian-day number of 1/1/-4712 (J) is 0. Note that 4713 B.C. is the year
-4712. The Julian-day number of 10/10/1992 is 2,448,906.
There is a simple relationship between Gregorian-day numbers, as used in this
method of date conversion, and Julian-day numbers. Given a Gregorian-day
number, the corresponding Julian-day number is obtained by adding 2,299,161
(the Julian-day number of October 15, 1582).
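In code, the relationship is a single addition or subtraction (the constant name is mine):

```c
/* Julian-day number of October 15, 1582 (Gregorian), the day with
 * Gregorian-day number 0. */
#define JDN_OF_GDN_ZERO 2299161L

long gdn_to_jdn(long gdn) { return gdn + JDN_OF_GDN_ZERO; }
long jdn_to_gdn(long jdn) { return jdn - JDN_OF_GDN_ZERO; }
```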
Listing One, page 158, contains the header file, DATECONV.H, used by the
date-conversion functions presented in this article. A structure Date contains
values for day, month, and year, plus a value gdn (for Gregorian-day number,
as explained shortly) and a flag indicating the validity of the date. A
day/month/year value is ambiguous until the particular calendar is specified.
A date is completely specified using an instance of the structure together
with an instance of a separate char variable that has the value G or J.
We first need to ascertain whether a given year is a leap year (in the Julian
or Gregorian calendars). Functions to do this are given in DATECONV.C (see
Listing Two, page 158). A universal date-integer conversion method, as
understood here, consists of two functions: The first takes a date (either
Julian or Gregorian) and returns a positive or negative integer (long int);
the second takes a long int and a calendrical specification (G or J) and
returns a date in that calendar. It does not matter which date corresponds to
day 0 as long as there is a quickly computable, one-to-one correspondence
between dates in a calendar and numbers.
The method presented here converts dates into the number of days before or
after October 15, 1582--the day that the Gregorian calendar came into effect.
Thus, October 15, 1582 (Gregorian) corresponds to day 0; October 16, 1582
(Gregorian) to day 1; and October 14, 1582 (Gregorian) to day -1. The number
corresponding to a date is thus called the "Gregorian-day number." Dates in
the Julian calendar, as well as those in the Gregorian calendar, are mapped
into Gregorian-day numbers. Thus, the day preceding October 15, 1582 in the
Gregorian calendar is both October 4 in the Julian calendar and October 14 in
the Gregorian calendar--both have Gregorian-day number -1.
The code for the function to convert a date in one of the calendars to a
Gregorian-day number, and for the function to convert a Gregorian-day number
to a date in a specified calendar, is given in Listing Two. The Date structure
uses long int variables (signed) for the year and the Gregorian-day number.
The largest integer representable as a signed long int is 2,147,483,647. Due
to the method of calculation, however, the largest Gregorian-day number that
can be used is 2,146,905,911. This corresponds to the dates July 11, 5,879,611
(Gregorian) and October 19, 5,879,490 (Julian). By that time, dates in the
Gregorian and Julian calendars will differ by about 121 years. (This will be
of no practical importance since by that time both calendars will likely have
been superseded.)


More Details


To use the conversion functions, declare a structure of type Date (defined in
Listing One) and pass to the functions a pointer to the structure along with a
calendrical specification (G or J). If you are converting from Gregorian-day
number to date, define the gdn structure variable before calling
gdn_to_date(). Conversely, define the variables day, month, and year before
calling date_to_gdn(). On return from the functions, extract either gdn or the
date values from the structure. Before using the gdn value, it is advisable to
check the validity flag, which will be TRUE if the date passed was a valid
date in the specified calendar, and FALSE otherwise. For example, the attempt
to convert February 29, 1900 (Gregorian) to a Gregorian-day number will
produce a FALSE in the validity variable.
The lfloor() function in Listing Two is analogous to the Microsoft
floating-point library floor() function, and overcomes a small problem in
integer arithmetic. The date-conversion functions described earlier need a
long-integer division operation such that, for all long integers a and b, a/b
is the largest integer not greater than the real number a/b. MSC's division
operator produces this result if a and b are positive, but not if a is
negative and b is positive. The lfloor() function provides what is needed.
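The pitfall is easy to demonstrate: C's division truncates toward zero, so -1/4 yields 0, while the desired floor is -1. The same lfloor() logic, shown here as a standalone function:

```c
#include <stdlib.h>   /* labs() */

/* Floor division: the largest integer not greater than the real
 * quotient a/b, assuming b is positive. For negative a that is not
 * a multiple of b, plain a/b truncates toward zero and is one too
 * large, so the result is computed from labs(a) instead. */
long lfloor(long a, long b)
{
    return a >= 0L ? a / b : (a % b == 0L) - 1L - labs(a) / b;
}
```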
Finally, DATECONV.C also contains a function to convert a Gregorian-day number
into a day of the week. This is independent of the particular calendar used.


Testing


No function or program can be relied upon unless it is tested thoroughly. The
program DATETEST.C in Listing Three (page 159) tests the functions in Listing
Two. DATETEST takes two numbers, n and m, on the command line and performs
conversions for n, n+m, n-m, n+2*m, n-2*m, and so on, up to the largest
integer that can be handled. It first converts the number to a date using
gdn_to_date(), then converts the resulting date back to a number using
date_to_gdn(). The program does this for both the Gregorian and the Julian
calendars. If the number resulting from converting a number to a date and back
to a number were different from the initial number, then a bug would be
revealed.
If DATETEST is run with command-line parameters 0 and 1, then all dates
forward and backward from October 15, 1582 (Gregorian) are tested, although
the program displays the results only for every Nth conversion. (N is
currently defined as 3000 in Listing Three.)
Since there are over four billion input numbers that could be tested,
exhaustive testing is not practical unless you can run the program on a very
fast computer. It has been shown, however, that when using DATETEST 0 1, no
bugs are revealed for all Gregorian-day numbers in the range -14,235,000
through 14,235,000. The corresponding dates include all dates in both
calendars for all years from -37,390 to 40,555.


Derivative Calendrical Functions


Given the basic date-number conversion functions, it is not difficult to
develop other calendrical functions. In fact, there are 38 date functions in
the Dolphin C Toolkit, including functions to determine which of two dates is
earlier, how many days separate two dates, and how many weekdays separate two
dates. There is also a function to format a date in hundreds of different
ways.


Conclusion



The Dolphin C Toolkit includes a demonstration program, CAL_FNS, which allows
conversion between dates and Julian-day numbers. This program also provides
the day of the week of any date and the number of days between two dates.
CAL_FNS allows us to determine, for example, that the storming of the Bastille
occurred on a Tuesday. We can also discover that on the evening of April 18,
1521, when Luther defended himself against charges of heresy before the Holy
Roman Emperor, Charles V, it was a Thursday. Finally, the coronation of
Charlemagne as Holy Roman Emperor in Rome on Christmas day, 800, occurred on a
Friday. We may reasonably suppose that festivities continued throughout the
weekend.


Adopting the Gregorian Calendar


There was no necessity for 10 days, rather than, say, 12 days to have been
omitted from the calendar. In fact, the calendar could have been reformed so
as to keep the year in step with the seasons without omitting any days at all,
since only the new rule for leap years is required to keep the calendar
synchronized with the equinoxes. As it happened, ten days were omitted in order to
fix the date for the Spring equinox at March 21, which was the date of the
equinox at the time of the Council of Nicea in the fourth century.
Upon the promulgation of Pope Gregory's decree, the Gregorian calendar was
adopted immediately in Italy, Spain, Portugal, and Poland, and shortly
thereafter in France and Luxembourg. During the next two years, most Catholic
regions of Germany, Belgium, Switzerland, and the Netherlands came on board.
Hungary followed in 1587. The rest of the Netherlands, Denmark, Germany, and
Switzerland made the change in 1699-1701.
By the time the British were ready to acquiesce, the old calendar had drifted
off by one more day, requiring a correction of 11 days, rather than 10, to
locate the Spring equinox (usually) at March 21. The Gregorian calendar was
adopted in Britain (and in the British colonies) in 1752, with September 2,
1752 being followed immediately by September 14, 1752.
In many countries, the Julian calendar was used by the general population long
after the official introduction of the Gregorian calendar. Thus, events were
recorded in the sixteenth to eighteenth centuries with various dates,
depending on which calendar was used. Dates recorded in the Julian calendar
were marked "O.S." for "Old Style," and those in the Gregorian calendar were
marked "N.S." for "New Style."
To complicate matters further, the first day of the year was celebrated in
different countries and regions on January 1, March 1, March 25, or December
25. With the introduction of the Gregorian calendar in Britain and the
American colonies, people ceased to celebrate New Year's Day on March 25, as
had been their custom, and instead began to celebrate it on January 1.
Previously, March 24 of one year had been followed by March 25 of the next
year. Thus George Washington's birthday, which was 2/11/1731 O.S., became
2/22/1732 N.S.
Sweden adopted the Gregorian calendar in 1753, Japan in 1873, Egypt in 1875,
China and Albania in 1912, Bulgaria in 1915 or 1916, Romania in 1919, and
Turkey in 1927. Following the Bolshevik Revolution in Russia, it was decreed
that the day following January 31, 1918 O.S., would become February 14, 1918
N.S.
In 1923, the Eastern-Orthodox church adopted a modified form of the Gregorian
calendar. Whereas in the Gregorian calendar a century year is a leap year only
if it is divisible by 400, in the Eastern system a century year is a leap year
only if division of the year by 900 leaves a remainder of 200 or 600. This
renders the calendar slightly more accurate. October 1, 1923 in the Julian
calendar became October 14, 1923 in the Eastern-Orthodox calendar. The date of
Easter is determined by reference to modern lunar astronomy (in contrast to
the more approximate, rule-based, lunar model of the Gregorian system.)
--P.M.



The Proleptic Calendars


Every date recorded in history prior to October 15, 1582 (Gregorian), such as
the coronation of Charlemagne as Holy Roman Emperor on Christmas day in the
year 800, is a date in the Julian calendar, since on those dates the Gregorian
calendar had not yet been invented. We can, however, identify particular days
prior to October 15, 1582 (Gregorian) by means of dates in the Gregorian
calendar simply by projecting the Gregorian dating system back beyond the time
of its implementation. A calendar obtained by extension earlier in time than
its invention is called "proleptic."
For example, although the Gregorian calendar was implemented on October 15,
1582 (Gregorian), we can still say that the date one year before was October
15, 1581 (Gregorian), even though people alive on that day would have said
that the date was October 5, 1581 (the Julian date at that time). As another
example, the date of the coronation of Charlemagne was December 29, 800 in the
Gregorian proleptic calendar.
Similarly, dates after October 15, 1582 (Gregorian) have equivalent, but
different, dates in the Julian calendar. For example, this article was
completed on October 10, 1992 in the Gregorian calendar, but we could equally
well say that it was completed on September 27, 1992, in the Julian calendar.
As another example, the date of the winter solstice in the year 2012 is
December 21, 2012 (Gregorian), which is December 8, 2012 (Julian).
Thus, any day in the history of the Earth, either in the past or in the
future, can be specified as a date in either of these two calendrical systems.
The dates will generally be different. In fact, they will be the same only for
dates from March 1, 200 to February 28, 300. The dates in neither calendar
will coincide with the seasons in the distant past or distant future, but that
does not affect the validity of these calendars as systems for uniquely
identifying particular days.
Astronomers designate years B.C. by means of negative numbers. In order to
avoid a hiatus between the year 1 and the year -1, there has to be a year 0.
Thus, astronomers adopt the convention: A.D. 1 is equal to Year 1; 1 B.C. is
equal to Year 0; 2 B.C. is equal to Year -1; and so on. More generally, a year
popularly designated n B.C. is designated by astronomers as the year -(n-1).
Finally, the rules for leap years in both calendars are valid for the year 0
and for negative years, as well as for positive years.
--P.M.
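The astronomers' convention amounts to a one-line conversion each way (the function names are mine):

```c
/* n B.C. in popular usage is year -(n - 1) to astronomers: 1 B.C.
 * is year 0, 2 B.C. is year -1, 4713 B.C. is year -4712. */
long bc_to_astronomical(long n_bc)
{
    return -(n_bc - 1L);
}

long astronomical_to_bc(long year)   /* meaningful for year <= 0 */
{
    return 1L - year;
}
```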


_JULIAN AND GREGORIAN CALENDARS_
by Peter J.G. Meyer


[LISTING ONE]

/* DATECONV.H -- Header file for date functions -- Last mod.: 1992-10-10 */
#define TRUE 1
#define FALSE 0
typedef struct
 {
 int day;
 int month;
 long year;
 long gdn; /* gdn and valid are for internal use */
 int valid; /* by functions in this library */
 } Date;
/* declarations of functions defined in DATECONV.C */
void date_to_gdn(Date *dt, char calendar);
void gdn_to_date(Date *dt, char calendar);
long lfloor(long a, long b);
int is_leap_year_c(long year, char calendar);
int is_leap_year(long year);
void set_feb_length(long yr, char calendar);
void reset_feb_length(void);
int day_of_week(long gdn);







[LISTING TWO]

/* DATECONV.C -- Universal date conversion functions --Last mod: 1992-10-10 */
#include <STDLIB.H> /* for labs() */
#include <CTYPE.H>
#undef toupper /* so as to use function, not macro */

#include "DATECONV.H"
int month_length[14]
 = {
 0, 31, 28, 31, 30, 31, 30,
 31, 31, 30, 31, 30, 31, 0
 };
/* months 0 and 13 have 0 days */
/* the calendar parameter is always 'G' or 'J'; G = Gregorian, J = Julian */
int is_leap_year_c(long year,char calendar)
{
calendar = (char)toupper((int)calendar);

if ( year%4 ) /* if year not divisible by 4 */
 return ( FALSE );
else
 {
 if ( calendar == 'J' )
 return ( TRUE );
 else /* calendar == 'G' */
 return ( ( year%100 != 0L || year%400 == 0L ) ?
 TRUE : FALSE );
 }
}
int is_leap_year(long year)
{
return ( is_leap_year_c(year,'G') );
}
void set_feb_length(long yr,char calendar)
{
month_length[2] = 28 + is_leap_year_c(yr,calendar) ;
}
void reset_feb_length(void)
{
month_length[2] = 28;
}
/* function to convert date to Gregorian day number sets valid flag to FALSE
 * if date invalid, otherwise returns number of days before or after the day
 * the Gregorian calendar came into effect (15-OCT-1582) */
void date_to_gdn(Date *dt, /* dt is a pointer to a structure */
 char calendar) /* 'G' or 'J' */
{
int day = dt->day;
int month = dt->month;
long year = dt->year;
long gdn;

calendar = (char)toupper((int)calendar);
set_feb_length(year,calendar);
if ( month < 1 || month > 12
 || day < 1 || day > month_length[month] )
 dt->valid = FALSE;
else
 {
 /* calculate number of days before/after October 15, 1582 (Gregorian) */
 gdn = (year-1)*365 + lfloor(year-1,4L);
 if ( calendar == 'G' )
 gdn += lfloor(year-1,400L) - lfloor(year-1,100L);
 while (--month)
 gdn += month_length[month];
 gdn += day - 577736L - 2*(calendar=='J');
 dt->gdn = gdn;
 dt->valid = TRUE;
 }
reset_feb_length();
}
/* function to convert gregorian day number to date */
void gdn_to_date(Date *dt,char calendar)
{
int month, i, exception;
long year, gdn, y4, y100, y400;
calendar = (char)toupper((int)calendar);
gdn = dt->gdn;
gdn += 577735L + 2*(calendar=='J');
y400 = 146100L - 3*(calendar=='G');
y100 = 36525L - (calendar=='G');
y4 = 1461L;
exception = FALSE;
year = 400*lfloor(gdn,y400); /* 400-year periods */
gdn -= y400*lfloor(gdn,y400);
if ( gdn > 0L )
 {
 year += 100*lfloor(gdn,y100); /* 100-year periods */
 gdn -= y100*lfloor(gdn,y100);
 exception = ( gdn == 0L && calendar == 'G' );
 if ( gdn > 0L )
 {
 year += 4*lfloor(gdn,y4); /* 4-year periods */
 gdn -= y4*lfloor(gdn,y4);
 if ( gdn > 0L )
 {
 i = 0;
 while ( gdn > 365 && ++i < 4 )
 {
 year++;
 gdn -= 365L;
 }
 }
 }
 }
if ( exception )
 gdn = 366L;
 /* occurs once every hundred years with Gregorian calendar */
else
 {
 year++;
 gdn++;
 }
set_feb_length(year,calendar);

month = 1;
while ( month < 13 && gdn > month_length[month] )
 gdn -= month_length[month++];
if ( month == 13 )
 {
 month = 1;
 year++;
 }
reset_feb_length();
dt->day = (int)gdn;
dt->month = month;
dt->year = year;
dt->valid = TRUE;
}
long lfloor(long a,long b) /* assumes b positive */
{
return ( a >= 0L ? a/b : ( a%b == 0L ) - 1 - labs(a)/b );
/* labs() returns the absolute value of its long int argument */
}
/* returns day of week for given Gregorian day number; 0=Sunday, 6=Saturday */
int day_of_week(long gdn)
{
return ((int)(((gdn%7)+12)%7));
}





[LISTING THREE]

/* DATETEST.C -- Tests date conversion routines -- Last mod.: 1992-10-10 */
/* Link with DATECONV.OBJ by using CL DATETEST.C DATECONV.C */

#include <STDIO.H>
#include <STDLIB.H> /* for exit() */
#include <CONIO.H>

#include "DATECONV.H"

void main(int argc, char **argv);
void display(long n);
void test(long n, char calendar);

Date date;

#define ESCAPE '\x1B'
#define N 3000
#define MAXIMUM_GDN 2146905911
/* largest Gregorian day number that can be handled */
void main(int argc, char **argv)
{
long n, m;
if ( argc < 3 )
 {
 printf("Syntax: DATETEST start increment\n");
 return;
 }
n = atol(argv[1]); /* number to start with */

m = atol(argv[2]); /* increment */
printf("Quit with Escape.");
while ( TRUE )
 {
 if ( n%(N*m) == 0 )
 {
 display(n);
 display(-n);
 }
 test(n,'G');
 test(n,'J');
 test(-n,'G');
 test(-n,'J');
 if ( n > MAXIMUM_GDN-m )
 {
 printf("\ngdn = %ld",n);
 return; /* since next n is too large */
 }
 n += m;
 if ( kbhit() )
 {
 if ( getch() == ESCAPE )
 break;
 }
 }
}
void display(long n)
{
date.gdn = n;
printf("\ngdn =%12ld",n);
gdn_to_date(&date,'G');
printf(" %02d/%02d/%ld (G)",date.month,date.day,date.year);
gdn_to_date(&date,'J');
printf(" %02d/%02d/%ld (J)",date.month,date.day,date.year);
}
void test(long n,char calendar)
{
date.gdn = n;
gdn_to_date(&date,calendar);
if ( !date.valid )
 {
 printf("\ngdn = %ld %d/%d/%ld (%c) Date invalid!",
 date.gdn,date.month,date.day,date.year,calendar);
 exit(1);
 }
date.gdn = 0L;
date_to_gdn(&date,calendar);
if ( date.gdn != n )
 {
 printf("\ngdn = %ld %d/%d/%ld (%c) n = %ld!",
 date.gdn,date.month,date.day,date.year,calendar,n);
 exit(1);
 }
}




































































March, 1993
OF INTEREST





Now available from Q/Media is Q/Media for Windows, a multimedia presentation
program. Both the regular and professional versions integrate video,
animation, audio, imaging, graphics, and text from Windows, DOS, and Macintosh
applications to create presentations on PCs running Windows 3.1.
Q/Media for Windows supports Microsoft's Video for Windows digital video
software; the professional version also supports Microsoft's Modular Windows.
The MCI device driver lets you embed Q/Media Pro movies into other
applications.
Using Q/Media for Windows, you can create presentations or add video,
animation, and audio to presentations already created in other programs. With
Q/Media Pro, developers can create interactive presentations and multimedia
titles.
The key to Q/Media for Windows' ability to transparently integrate any file
brought onto its screen, regardless of the program in which it was created,
lies in Q/Media's proprietary media player engine, which seamlessly integrates
all major file formats and even increases the imported file's playback
performance.
The package also includes a timeline to synchronize objects, a slide sorter
outliner to organize presentation flow, and over ten Mbytes of animation,
digital video, wave audio, and MIDI and graphic images. Q/Media for Windows
costs $99.00; Q/Media Pro sells for $495.00. Reader service no. 20.
Q/Media Software Corp. 312 E. 5th Avenue Vancouver, BC Canada V5T 1H4
604-879-1190
Lynx Real-Time Systems' LynxOS UNIX/POSIX-compatible operating system for
real-time applications now offers an integrated DOS operating environment:
VP/ix for LynxOS. Using this new option, you can operate in a real-time
environment and still run non-real-time DOS and UNIX apps.
VP/ix lets single or multiple users run DOS and UNIX applications in a paged
virtual-memory environment on a single system without additional hardware.
Each session executes in its own virtual-address space.
You can initiate DOS commands from within the LynxOS environment and run DOS
apps from the LynxOS prompt. The reverse is also possible: UNIX commands can
be run from within DOS. VP/ix integrates DOS files into the UNIX file system,
so there is no need for separate disk partitions; files are easily accessible.
DOS and UNIX apps can directly share files, and peripherals from either
operating environment can be used without additional device drivers.
The price for VP/ix for LynxOS is $695.00. Reader service no. 21.
Lynx Real-Time Systems 16780 Lark Avenue Los Gatos, CA 95030 408-354-7770
Version 2.0 of QuickStart for Windows NT, a DOS-hosted cross-development tool
for NT targets, has been released by Phar Lap. QuickStart runs NT tools under
DOS or in a Windows 3.1 DOS box, so Win32s applications can be created and run
under Windows without the need to reboot.
With the new version of QuickStart, programmers can run NT tools in
conjunction with DOS editors, utilities, source-control systems, and network
software. To use QuickStart with Win32s, you use NT tools under DOS or Windows
to write applications using a subset of the Win32 API calls; these apps then
use Win32s to run on top of 16-bit Windows 3.1. You can use the latest
versions of Microsoft's 32-bit Windows NT C/C++ compiler, linker, librarian,
and resource compiler under DOS or Windows 3.1.
QuickStart is available free of charge for a limited time. Reader service no.
22.
Phar Lap Software Inc. 60 Aberdeen Avenue Cambridge, MA 02138 617-661-1510
STV Software has introduced Menus & Macros, a common macro language that works
across different applications. Menus & Macros consists of: Macro-Workbench, an
integrated development environment for macros and applications; MacroManager,
a database containing the commands of available Windows applications; the
OnTop Scalable Kernel, a DLL containing all functions; ToolPaint, which allows
you to paint and program dialog boxes and button windows; the MacroScript
compiler, which compiles MacroScripts into optimized tokens executed by the
OnTop kernel; and OnTop2, the runtime application for macros.
Contact STV for pricing. Reader service no. 23.
STV Software GmbH Rathausstrasse 15 A-1010 Vienna Austria +43-1-404-4141
Now available from the GE Advanced Concepts Center, OMTool for the PC is an
interactive data-modeling tool that facilitates rapid, reliable development.
GE's object-modeling technique gives programmers point-and-click methods for
creating, defining, and editing classes, attributes, relationships, and class
hierarchies within an application model. OMTool also lets you drag model
elements, maintain connectivity and logical structure, generate
object-oriented code, and diagram output for PostScript, Interleaf, and
FrameMaker. It allows you to create pop-up forms to guide users through
options and also provides a debugging interface for run-time objects.
OMTool sells for $995.00. Reader service no. 24.
GE Advanced Concepts Center 640 Freedom Business Center P.O. Box 1561 King of
Prussia, PA 19406 800-438-7246
Version 2.0 of Visual Basic has been released by Microsoft and is available in
both a standard and a professional edition. The standard edition offers
improved performance and greater capacity--huge arrays, unlimited string
space, and over twice the previous code capacity are built in. New debugging
tools include watch expressions, conditional break expressions, and a calls
window. An ASCII representation is now available for all files. With the new
properties window, you can set all the properties of a control or object from
a single window or set the properties of several controls simultaneously.
There is a new tool bar, color-coded syntax, better grid control, and MDI
support, as well as support for international currency, time, and date
formats.
In addition to all the aforementioned features, the professional edition also
provides the following: two new MAPI controls, open database connectivity
(ODBC) support, a new masked edit control, communications control, two new
Windows for Pens controls, Windows 3.1 support, and a new control development
kit.
Also available from Microsoft is Macro Assembler 6.1. Improvements to this
version include: instruction timings, which facilitate fine tuning of
execution times; a wider selection of high-level directives; Windows
programming documentation; improved compatibility with MASM 5.1 code; the
capability to target Windows, DOS, and Windows NT; and improvements to the
Programmer's Workbench and the Codeview debugger.
Visual Basic 2.0 retails for $199.00; the professional edition costs $495.00.
Upgrades to the professional edition are $199.00 and $99.00, respectively.
MASM 6.1 sells for $199.00; upgrades are $99.00. Reader service no. 25.
Microsoft Corp. One Microsoft Place Redmond, WA 98052-6399 206-882-8080
ESS has added ES488X and ES488Y to the AudioDrive family of speech- and
sound-integrated circuits. These ICs are all microprocessor-based, custom VLSI
chips for PC audio that provide direct interface to the PC bus, and direct
connection to line-in, microphone, and external speakers.
Like its predecessor, the ES488Y can record, compress, store, and play back
voice, sound, and music without employing external ICs. In addition, it
provides a direct interface with FM synthesis chips, allowing users to play
MIDI and synthesized music for multimedia audio implementation.
Also included in the ES488X is a direct interface with additional external
memory.
OEM-quantity prices are under $12.00 per unit for the ES488Y and under $16.00
for the ES488X. Reader service no. 26.
ESS Technology Inc. 2493 Industrial Parkway West Hayward, CA 94545
510-783-3100
SIGNAL++ is a signal-processing library from Sigsoft for simulating, testing,
and generating signal-processing algorithms using C++. It provides discrete
time-signal and noise simulations, digital signal processing, digital and
analog filter designs, time-domain correlations, filtering, and fast Fourier
transforms. SIGNAL++ also offers a library interface for user programs which
can be developed using most C++ compilers.
The library is composed of three main classes: DSP, simulation, and filter
design. The first group includes correlations, transforms, windowing, and
filtering; simulation includes signal simulation and noise simulation; and
filter design includes analog filter design and digital filter design.
Libraries are available for several memory models, including DOS extenders;
source code is available as well. Prices for SIGNAL++ start at $99.00. Reader
service no. 27.
Sigsoft 15865 Lofty Trail Drive San Diego, CA 92127 619-673-0745
ProtoView is shipping ProtoGen Pascal in conjunction with Borland. ProtoGen is
an interface-design and code-generation tool for Borland Pascal with Objects
that lets you create Windows applications by visually creating the user
interface. ProtoGen writes the Pascal Windows code, which allows you to add
application-specific code to the framework.
Important features include: the interactive menu and application designers; a
live test mode for animating the application interface; a regeneration
facility that preserves all the code added to the framework, even if the
interface changes; international character support in code generation and
interface design; and generation of transfer buffers for moving data between
application and screen. ProtoGen takes advantage of many of Borland Pascal
7.0's new object-oriented features, as well as Borland's custom control DLLs.
ProtoGen Pascal sells for $49.95. Reader service no. 28.
ProtoView Development Co. 353 Georges Road Dayton, NJ 08810 908-329-8588
A multimedia white paper that details the requirements for integrating
multimedia is available from IBM. The paper, entitled "Multimedia Distributed
Computing: IBM's Directions for Multimedia Distributed Systems," describes
IBM's strategy for delivering all forms of information through natural media
throughout a distributed enterprise.
To receive a copy of the paper, call 800-426-9402. Reader service no. 29.
IBM Corp. 1133 Westchester Avenue White Plains, NY 10604 800-426-9402
Orion now supports 80C196 16-bit microcontrollers in its 8800 line of
real-time emulators. The 80C196KD and 80C196KR emulators can operate at speeds
of over 20 MHz, which the 8800 emulator/analyzer handles with full-speed,
zero-wait-state support.
The 8800 has several features that contribute to its usefulness in 80C196
applications. The first is Clip-On emulation, an option that lets you clip on
to soldered-in CPUs and debug and analyze SMD-based targets. It also offers
start/stop triggering of trace storage on specific events, a plus for
interrupt-driven applications. The system comes standard with symbolic
debugging and high-level language support and can access all 80C196 special
function registers, including A/Ds, timers, and ports.
The list price is $8800.00. Reader service no. 30.
Orion Instruments Inc. 180 Independence Drive Menlo Park, CA 94025
415-327-8800
Multi Soft is shipping two new communications libraries: CL/W, for
communication between Windows development tools and mainframe programs; and
CL/VB, a version of CL/W that has been integrated with Visual Basic.
Both libraries are based on a Windows DLL that lets you call PC/mainframe
communications functions that support two different modes of client/server
communications. The first mode, "front-ending," allows use of any Windows
development tool to create a graphical interface that runs "in front of" an
existing mainframe program; the second, "program-to-program communications,"
allows you to write a Windows program that interactively calls a partner
application on the mainframe using a message-passing mechanism for exchanging
data and application-specific codes.
CL/W also provides functions for embedding file verification and
software-distribution services in your application, ensuring that the proper
versions of the executable files, database files, and tables are loaded for
your application. CL/W can be called from any development tool that lets you
call a DLL.
CL/VB includes all the functions found in CL/W plus dialog-driven utilities to
support the screen-capture and conversion process.

The package including CL/W and CL/VB costs between $50.00 and $100.00 for a
runtime license, depending on quantity, and $1500.00 for a development
license. Reader service no. 31.
Multi Soft Inc. 123 Franklin Corner Road Lawrenceville, NJ 08648 609-896-4100




























































March, 1993
SWAINE'S FLAMES


The App Killer




Michael Swaine


The death of the application.
I heard the phrase again the other day. I think the first time I came across
the death-of-the-application phrase was in a talk by Ted Nelson a few years
back. Ted's contention was that applications are evil: They put the user in a
conceptual straitjacket, and the user needs to think in terms of the actual
task, not in terms of software-market categories. Applications are in the way
and must go, or at least fade into the background. What does Ted think will be
the death of application software? Thus far that answer has been wrapped up in
a vaporous project called Xanadu, a lifetime in the realization and still
unrealized. Xanadu, the app-less paradigm, is no present danger to application
software. But the app killer may be lurking out there right now.
The second time I heard the phrase was at this year's MacWorld Expo in San
Francisco. I ran into Steve Michel, who writes about scripting for MacWEEK.
Steve had to tell me something that Leonard Buck had said, a phrase that kept
running around in his head: "The death of the application."
I heard it again the next day, but the ultimate source for that third use was
the same person: Leonard Buck. Buck is the author of WindowScript, a nifty
product that I've mentioned here before. WindowScript is a HyperCard add-on
that gives anyone who can master HyperTalk scripting an impressive degree of
control over user-interface design. The first incarnation of WindowScript was
the award-winning tool Dialoger Pro. The second incarnation was enough of an
advance that it merited a new name. The latest version is probably as big a
step forward in capability, because it supports AppleScript.
There are so many thisScripts and thatScripts floating around these days that
unless you develop for the Mac, you are justified in not knowing offhand what
AppleScript is. Briefly, it's Apple's scripting language for controlling
applications and the Finder. Its native syntax is similar to HyperTalk, but
developers can substitute other dialects. It's built on top of Apple's
AppleEvents protocol. Some applications from beta-seeded vendors are now
supporting AppleScript, although an application can give various possible
levels of support to AppleScript. The most interesting--and difficult--is
scriptability, which can require rethinking and rewriting the entire
application in terms of conceptual objects (such as windows and buttons and
other visible components of the user interface, as opposed to OOPS objects)
and common commands that take such objects as parameters. In other words, full
support in an application for AppleScript is costly. What does the application
get from full support? That's an interesting question I don't have an answer
for, but it's clearer what other products (and users) get from applications
supporting AppleScript.
Other tools that are not exactly applications are taking advantage of
AppleScript. Scripter from Main Event is a script editor that attempts to
simplify object specification, probably the toughest part of AppleScript
scripting. And AppleScript's support for dialects will allow products like
HyperCard and UserLand Frontier to provide their own dialects. Tools like
these will let sophisticated users write or edit scripts independently of any
particular applications. But WindowScript--well, WindowScript is something
else.
It wasn't what anyone said that impressed me about Buck's WindowScript; it was
the demo I saw. I had spent the day looking at demos of AppleScript and
attending lectures on AppleScript, and the demo of WindowScript showed me more
than anything else I had seen or heard about the usefulness of AppleScript.
A straightforward description of the demo won't get the idea across. The
straightforward description is that a HyperCard add-on was used to drive
several applications, shunting data among them and performing one coherent
task. But the feeling of the demo was altogether different from, say, piping
UNIX commands or writing JCL files, or, for that matter, building QuicKeys
macros or writing HyperTalk or Frontier scripts. The WindowScript user can
design and create new windows that don't seem to belong to any application,
windows that implement tasks not apparently restricted to any one application.
In fact, these windows belong to HyperCard, with all its overhead, but that
limitation is in the process of disappearing, with some other
HyperCard-related products that have come out. Besides, in the demo one didn't
notice the HyperCard angle.
It's not so much what WindowScript can do that impressed me as the vision of
user development that it gave. Picture a layer of user-created windows, not
associated with any application, floating in front of all application windows
and driving the apps. If you get the picture, you get an extremely visual
impression of a new level of software, a level above the application.
The death of the application? Probably not. But perhaps the birth of something
genuinely new.







































April, 1993
EDITORIAL


Business as Unusual, or You Don't Need a Weatherman to Know Which Way the Wind
Blows




Jonathan Erickson


Whether you consider the winds of change blowing through the nation to be
fresh breezes or just more Foggy Bottom hot air, you have to agree there's
something happening out there. From technology to trade, change is in the air,
and "something different" is more than just the byword of the day.
What better example of this is there than corporate America, which is being
restructured from the inside out? Small is again beautiful, at least when it
comes to today's business climate. Let's face it, when a tiny, three-person
company like HighText Publishing in San Diego can gleefully boast in its press
releases that its "1992 profits were larger than the combined profits of IBM
and General Motors," you have to take notice. Meanwhile, big business, still
reeling from excesses of the '80s, continues to "downsize," the politically
correct way of not having to say, "We're sorry for eliminating your job."
As part of this era of corporate restructuring, there's a rapid move from
relatively expensive permanent positions to hourly paid part-timers, temps,
and contractors. The advantage for the employer is that there's no
company-funded medical insurance, no union hassles, and no severance pay. The
advantage for the employee is...well, give me a few minutes to think about
that one. According to the U.S. Bureau of Labor Statistics, there are about 70
percent more part-time workers in the work force today than 20 years ago --
one in five of today's civilian workers is a part-timer.
But disillusionment is a two-way street. Whether of their own volition or not,
workers are leaving big companies in increasing numbers to start their own
businesses. Go to any city large companies call home and you'll find small
businesses springing up all over town. Although the Labor Department readily
admits that good data is hard to find, its most recent estimates indicate that
about 270,000 people nationwide left large companies to set up shop on their own.
This number was up nearly 100,000 over the previous study in January 1990.
There are, of course, many factors contributing to these statistics, not the
least of which are the massive layoffs since the 1990 study. (It didn't ask why those
people left.) About the only real thing you can say is that, for one reason or
another, workers are finding running small businesses a viable alternative to
working at large corporations.
In addition to restructuring, businesses are also refocusing, especially in
the defense industry, where the end of the Cold War threw open the door to the
cold reality of much smaller markets for military-targeted products.
Consequently, businesses are scrambling to develop new, nonmilitary markets
and products -- they're looking for the Tang or Teflon of the '90s. For
example, Lawrence Livermore National Labs, home to SDI and similar research,
is now producing instructional software and videos for elementary-school
children on subjects such as computer ethics and security. Although better
examples doubtlessly exist, the LLNL's public-service efforts are nonetheless
symbolic of changes occurring on a much wider scale.
In the coming months, high tech may be more affected by change than other
industries. The University of California-Berkeley's Laura D'Andrea Tyson,
President Clinton's choice to head up the Council of Economic Advisers, will
likely see to that. Tyson's appointment has mainstream economists in a tizzy
because she considers silicon chips to be inherently different from and more
important than, say, potato chips. In Tyson's mind, high tech deserves special
favors and support along the lines of subsidies, duties, and market-sharing.
In short, Tyson has no intention of letting what she considers essential new
technologies be overrun by unfair overseas competition. This has free-trade
economists and foreign trading partners alike twisted in all sorts of knots.
While these are a few examples of the changes we're witnessing daily -- and
I'd like to know what you think are the most-significant others -- here's the
sort of change I'd like to see more of: In a recent court case, the Hell's
Angels Motorcycle Corporation sued Marvel Comics for violating the Hell's
Angels registered trademarks, both the friendly "death head" logo and the name
"Hell's Angels." The comic-book publisher had, it seems, used the club's name
and logo without permission in a comic-book series. Instead of a protracted
court battle, however, the parties settled out of court. Under the terms of
the agreement, Marvel is donating $35,000 to the Ronald McDonald House, a
charity for children who have cancer.
It's a pitiful state of affairs when a bunch of scruffy, tattooed, renegade
motorcycle outlaws actually set a good example for the button-down collar,
pin-striped suit-set in the computer industry. What if Microsoft, Borland,
Apple, Stac, Unix Systems Labs, and other industry litigants called a truce,
laid off lawyers instead of engineers, and then divvied up the savings in
legal fees amongst their favorite charities, where the money could do some
real good?
Now that's what I'd call a breath of fresh air.







































April, 1993
LETTERS







Splay Trees


Dear DDJ,
I was pleased to see the article about splay trees in your December 1992
issue. Because of their simplicity and ability to adapt to any pattern of
usage, splay trees have been adopted in a number of important systems,
including the Mach 3.0 kernel and the Microsoft Windows NT kernel.
Dean mentions the existence of a top-down variant of splaying, but does not
give any details. Although slightly less general than bottom-up splaying,
top-down splaying uses less memory, and the code (see Example 1) is
significantly simpler.
The program is available via anonymous FTP from internet host 128.2.209.226 in
/usr/sleator/public.
Daniel D. Sleator
Pittsburgh, Pennsylvania
Example 1

 /*
  * An implementation of top-down splaying.
  * D. Sleator
  *
  * This is adapted from simple top-down splay, at the bottom of page 669
  * of "Self-adjusting Binary Search Trees" by Sleator and Tarjan,
  * JACM, Vol. 32, No. 3, July 1985.
  *
  * The chief modification here is that the splay operation works even if
  * the item being splayed is not in the tree, and even if the root
  * of the tree is NULL. So the line:
  *
  *    t = splay(i, t);
  *
  * causes it to search for the item with key i in the tree rooted at t.
  * If it's there, it is splayed to the root. If it isn't there, then the
  * node put at the root is the last one before NULL that would have been
  * reached in a normal binary tree search for i. (It's a symmetric-order
  * neighbor of i in the tree.) This allows insertion, deletion, split,
  * and join to be easily implemented.
  */

 typedef struct tree_node Tree;
 struct tree_node {
     Tree *left, *right;
     int item;
 };

 Tree *splay (int i, Tree *t) {
     Tree N, *l, *r, *y;
     if (t == NULL) return t;
     N.left = N.right = NULL;
     l = r = &N;

     for (;;) {
         if (i < t->item) {
             if (t->left != NULL && i < t->left->item) {
                 /* rotate right */
                 y = t->left; t->left = y->right; y->right = t; t = y;
             }
             if (t->left == NULL) break;
             /* link right */
             r->left = t; r = t; t = t->left;
         } else if (i > t->item) {
             if (t->right != NULL && i > t->right->item) {
                 /* rotate left */
                 y = t->right; t->right = y->left; y->left = t; t = y;
             }
             if (t->right == NULL) break;
             /* link left */
             l->right = t; l = t; t = t->right;
         } else break;
     }
     /* assemble */
     l->right = t->left; r->left = t->right;
     t->left = N.right; t->right = N.left;
     return t;
 }



Biting the Silver Bullet


Dear DDJ,
In his October 1992 article, "Superdistribution and Electronic Objects," Brad
Cox contends that superdistribution is a "silver bullet." I quite disagree.
First of all, you have the problem of when to call the charging routine: At
program startup (an object's constructor)? When any method is called? When an
important method is called? At regular intervals (time billing)? There is no
solution that fits all programs, so a developer should think carefully about
what portions of code will call the charging routine. (Can you imagine the
ads? "Our program will cost you virtually nothing, while our competitor's
program will cost you $1,000,000!")
Then there is the problem that I don't want to pay each time I use a program.
One of the aspects of spreadsheets that made them popular was the fact that
users could easily run what-if scenarios. You won't be as keen to experiment
if each experiment (recalculation) costs you a dollar.
Developers will have an additional problem when using third-party libraries.
These libraries will add cost to their program and they have no control over
it. This will certainly influence programmers to call functions (create
objects) in a nonoptimal way.
A fourth problem is the tamper-sensitivity of superdistribution. There will be
a huge demand for hardware that charges a fraction of the real cost.
Furthermore, there is no easy way to check if a program costs what its
developer says it costs (and a new category of expensive software that does
almost nothing: Phoneyware). And of course there is the threat of a virus
costing you millions of dollars instead of just one hard disk full of
information (and make no mistake: There will always be a virus threat,
regardless of technological advances).
What advantages has superdistribution to offer? Mr. Cox mentioned software
protection, information marketing, and free distribution. Software protection
can be achieved with additional hardware (keys) and is as easy (or as
difficult) as superdistribution hardware. When software protection is in
place, information marketing is easy: Create information that can only be read
by special programs. Free distribution must be limited to demonstration
versions or versions that run for a limited amount of time. This will not be
as easy as just distributing copies, but it certainly will not cost somebody
money when they test a program that doesn't meet their expectations.
When you pay for usage you expect to reduce the availability of a resource (a
very broad definition). When using a program, you are not actually costing a
developer anything, so why should he get extra money? When I buy an alarm
clock, I expect it to wake me up whenever I choose to use it. When I buy a
compiler, I expect it to compile my programs whenever I ask it to. Have you
ever seen an alarm clock that asks for a quarter each time it has to wake you?
Harald van Woerkom
Eindhoven, The Netherlands


An Alternative Hot-spot Detector


Dear DDJ,
Joseph Newcomer's article, "Profiling for Performance," in the January 1993
issue was accurate, well written, and insightful. The simple distinction
between knowing what a program is doing and why is very important.
I must point out, however, that profiling, subroutine timing, call counting,
etc. are essentially heuristic methods of performance diagnosis. There is
still a large element of educated guesswork involved, and this limits their
effectiveness, especially in very large software. For example, in large
programs the hot spot, if there is one, is often deep inside inaccessible
subroutine libraries.
There is a nonheuristic but little-known method requiring no tool other than a
decent debugger. Two major reasons programs are slow are that they: 1. do
unnecessary calculation; and 2. call subroutines unnecessarily. Either way,
they can be "caught in the act" by just examining the call stack at a few
random points in time. To get a random sample of the call stack, just display
it after hitting the manual interrupt key. You don't need many samples -- if
something is taking 50 percent of your execution time, it will be glaringly
obvious right away.
The call stack tells you both what the program is doing and why. You can see
what subroutine is currently executing, and the line of source code that
called it tells you why. And if you want to know why that was being done, just
look at its caller, and so on. If the reason for a subroutine call isn't very
good, maybe it can be eliminated. This will save time equal to the percentage
of time that it was on the stack. This process can be repeated until the code
is really tight.
This method is by far the most effective I have heard of. However, in
asynchronous or event-driven software, it only does half of the job. It can
locate hot spots and wasteful subroutine calls, but it can't identify needless
event messages, unless they can somehow be backtraced.
Michael R. Dunlavey
Needham, Massachusetts


Fortran Fine Points


Dear DDJ,
Fortran execution times evidently do not compare as unfavorably to the
execution times of C++ with Rogue Wave's Math.h++ library as Thomas Keffer
claims in his article, "Why C++ Will Replace Fortran," in the December 1992
Special Supplement to Dr. Dobb's Journal. The graph in Figure 1 on page 44 of
that issue shows that Microsoft Fortran, as tested, requires more than 35
percent more time for a long vector multiplication than Borland C++ 387 with
Math.h++. Using a 16-MHz 80386 with an 80387 coprocessor, which the figure
specifies, I compiled and ran the Fortran program cited there. However, I used
Watcom Fortran 77/386, Version 8.5 instead of Microsoft Fortran. For a vector
length of 4000, the multiplication required 40.5 milliseconds, only 13 percent
more time than the approximately 35.8 milliseconds that Borland C++ 387 with
Math.h++ required. In a Rogue Wave advertisement for Math.h++ in the November
1992 SIAM News, a nearly identical graph appears, but it shows that Borland
C++ 387 with Math.h++ required approximately 38 milliseconds. Watcom Fortran
needs less than 7 percent more time than that.
For this simple problem, a well-designed Fortran compiler with no special
library produces a program that is almost as fast as that produced by a C++
compiler with such a library. Microsoft Fortran apparently does not provide a
valid comparison. As for C++ alone being faster than Fortran, which seems to
be the thrust of the article, no quantitative comparison is presented.
William J. Clover, Jr.
Chicago, Illinois
Tom responds: Alas, I have failed as a writer if Mr. Clover got the impression
that my article was about how C++ is faster than Fortran. The point of the
article was that C++ is far easier to manage, during both code development and
maintenance, but just as fast. Indeed, real speed improvements come not with
fiddling with compilers and languages, but with the ability to adopt whole new
algorithms, perhaps too complicated to be managed safely and conveniently with
Fortran. Because of its unique combination of manageability and efficiency,
this is where C++ excels.


Clear-cut Corrections to Fuzzy Logic



Dear DDJ,
Thanks for Greg Viot's article, "Fuzzy Logic in C," which appeared in the
February 1993 issue. It was very informative and clear. However, a couple of
things in the listings appear to need some adjustment.
First, the routine rule_evaluation() should include zeroing the values
*(tp->value) before it begins. If this is not done, the max() function ensures that
the values keep increasing every cycle. Perhaps this is done somewhere else in
the program, but I did not find it.
Second, the integer variables sum_of_products and sum_of_areas, as well as
compute_area_of_trapezoid(), can easily overflow. This can be overcome by
changing the ints to floats.
James B. Calvert
Denver, Colorado
Greg responds: James uncovered two important oversights about the listings for
the fuzzy-logic program. First, it is not clear from the listing that the
initialization (zeroing) of the rule-evaluation outputs with each inference
pass is necessary and where it takes place. It is assumed that this occurs in
the get_system_inputs routine. Secondly, overflow of some of the integer data
types could occur on computers with int size less than 32 bits. Both these
issues should have been addressed somewhere in the listings. James's comments
are greatly appreciated.


Neural Nets in the Real World


Dear DDJ,
In his fine article, "Cognitive Computing" (DDJ, February 1993), Colin Johnson
discussed a number of real-world applications in which neural networks are
used. I'd like to add to that list by mentioning Genesis (my company's
PC-based neural-net development environment). In the past it has been used in
applications from servo-control of a robot manipulator in real time to flight
control for high-performance aircraft.
Presently, Genesis is being used to develop neural networks for such purposes
as stand growth and yield prediction for the forestry industry. The network
predictions provide a more efficient technique for forestry management,
increasing forest productivity and protecting our forests. Genesis is also
being used to develop nets that forecast electrical capacity for electric
utilities' long-term planning. This assists hydroelectric companies in
successfully predicting user demand, safeguarding against the threat of
demand surpassing capacity.
Gary Josin
Neural Systems Inc.
Vancouver, British Columbia


April, 1993
ALGORITHMS FOR STEREOSCOPIC IMAGING


True 3-D is easier than it looks


 This article contains the following executables: STEREO.ARC


Victor J. Duvanenko and W.E. Robbins


Victor is a member of the technical staff at Truevision in Indianapolis,
Indiana. He can be contacted at victor@truevision.com.


Humans, as well as many other animals, have two eyes that enable us to detect
a certain range of frequencies reflected off or transmitted from surrounding
objects. In other words, we have two sensors, offset horizontally by about 2.5
inches, that allow us to see and judge distances (depth) fairly precisely.
The ability to precisely judge depth comes in handy when manipulating objects
with our hands. It's also beneficial to have a bit of redundancy in the visual
system. Thus, our brain sees the world through our two eyes from two slightly
offset perspectives. We learn to merge the two disparate views into a single
image with added depth detail via triangulation.
The mechanism for extraction of depth information from two offset inputs is
illustrated in Figure 1. It's easy to see that the further away an object is,
the less disparate the two views (one from each eye) become. This is easy to
verify by placing a finger close to your eyes, and then quickly opening and
closing left and right eyes in alternating fashion. The finger appears to move
a lot. However, when you fully extend your arm to move the finger farther
away and repeat the procedure, the finger doesn't seem to move as much. Our
brain learns to
interpret the amount of disparity between the two views as distance (or
depth). Thus, only objects that are very far away will be perceived to have
little or no disparity. This ability is commonly referred to as "stereoscopic
vision" (or, more formally, "binocular disparity").
The ability to present distance or depth information in a natural fashion is
absent from most current-generation computer-graphics environments, which
usually present only a single view of the world, as if the viewer had only one
eye. Thus, the visual sense is not fully exploited, and stereoscopic skills
are not utilized. The amount of information that can be effectively
communicated is reduced, although many applications would greatly benefit from
an increased bandwidth.
Depth information is especially useful when dealing with three-dimensional
objects. By presenting depth information, a more natural human-computer
interaction occurs, with higher communication bandwidth and improved
understanding. In fact, the need for object rotation and movement becomes less
critical, since it's no longer necessary for understanding of depth
relationships, but is only needed to reveal obscured sections. Depth is an
integral part of the three-dimensional space we live in. Only when depth is
presented to the viewer will the visual experience offered by computers be
indistinguishable from that of the real world.


True 3-D vs. Conventional 3-D


Stereoscopic, or true 3-D, image-generation methods differ from conventional
3-D techniques. The 3-D techniques most computer-graphics books describe are
really less than 3-D, because much depth information is lost when the 3-D
image is projected onto a 2-D screen. The only depth cues that remain are the
relative order of objects from the viewer (that is, one object obscures
another that is behind it), limited depth information in the perspective
sizing (for example, objects and their features get smaller the further they
are from the viewer) and shadows. These cues provide limited depth
information, which in many cases must be augmented by motion to furnish more
detail about the objects. The viewer then gathers depth information from
movement of the objects and constructs a mental picture of the presented
scene.
For example, if you're presented with several nonoverlapping objects of
various sizes, you can't tell whether the smaller ones are further away or are
simply smaller. It's also impossible to tell whether the objects are 20 inches
or 20 feet away from you. This effect is especially pronounced with spherical
objects (planets of the solar system), since there's little, if any, perceived
perspective sizing. The only way to understand such a scene is to view it in
motion from an angle. For complex scenes, motion is difficult to achieve and,
unless real-time motion is used, the results aren't very useful or
interesting.
True 3-D, on the other hand, provides more-detailed depth information,
augmenting perspective sizing and object occlusion. When several objects are
presented, you immediately, and almost instinctively, surmise the depth
relationships. This additional insight is particularly useful when viewing
objects not common in everyday life, or when objects exhibit uncommon, and
possibly confusing, relationships. In other words, if what is being viewed is
common or easily understood, monoscopic techniques may be sufficient. More
complex objects call for more complex viewing techniques to minimize confusion
and maximize comprehension. In any case, monoscopic 3-D techniques are less
than true 3-D, and are referred to as "2 1/2-D" by stereoscopic enthusiasts.
In this article, we'll explore stereoscopic presentation, providing an
overview of both the hardware and software requirements. We'll then present
algorithms--and implement them in C--for generating the left-eye and right-eye
views fundamental to stereoscopic viewing. This article is based on our work
with the scanning-tunneling microscope (STM) data from North Carolina State
University's Precision Engineering Center. The STM is capable of measuring
properly prepared samples of materials at atomic levels and can sense
differences in vertical height of 0.01 angstrom. A sample of material is
scanned by the STM, providing the vertical-height information at every
regularly spaced sample point. The result is an array of 3-tuples (x,y,z) that
represents the surface of a sample at the atomic level.
The application that reads and displays an STM file was written in mid-1990 on
a uVAX under VMS, displayed images on a LEX/90 24-bit/pixel stereo-capable
display/accelerator, and used a menu-driven, text user interface. The complete
source code that implements the system is available electronically; see
"Availability," page 5.


Stereoscopic Hardware


Stereoscopic hardware is necessary for the proper delivery of the two
disparate (left- and right-eye) images to the respective eyes. One approach
time multiplexes the left- and right-eye views (that is, alternates between
them on every frame) on a single display. The views are demultiplexed
(separated) through a pair of shutter glasses worn by the user. While the
left-eye image is shown on the display, the left-eye lens is transparent, and
the right-eye lens is opaque. When the right-eye image is shown, the opposite
occurs. This technique requires the glasses to be synchronized with the
computer display. StereoGraphics (San Rafael, California) manufactures
infrared-controlled LCD shutter glasses that allow several people to view the
display simultaneously.
Another technique is to place a polarizing LCD panel in front of the screen.
This polarizes the right-eye frame in one direction and the left-eye frame 90
degrees off. The user wears passive glasses that have the left-eye lens
polarized 90 degrees from the right-eye lens. Tektronix (Beaverton, Oregon)
manufactures such LCD panels.
With both techniques, the user's left eye sees only the left-eye image,
whereas the right eye sees only the right-eye image. If the images are
displayed fast enough, the human brain fuses the two views, and the depth
information becomes naturally evident.
Both techniques use a single monitor capable of displaying frames at rates of
90 to 120 frames per second. This ensures that each eye is presented with a
refresh rate high enough to avoid flicker, since each eye sees its
corresponding image at half the total rate. However, by running at higher
vertical rates, you lose resolution, because most display hardware has an
upper limit on pixel-output bandwidth. Thus, when the vertical refresh rate
doubles for stereoscopic viewing (from 60 to 120 Hz), the resolution has to be
halved to keep the pixel-output bandwidth constant. Also, the apparent
brightness of the image is reduced due both to passing through the glasses and
to each eye seeing an image for only half of the time and seeing nothing for
the other half.
Another approach uses a head-mounted display (HMD): active LCD glasses that
show the left-eye view on the left lens and the right-eye view on the right
lens.
One low-cost hardware configuration for stereoscopic viewing consists of a
display capable of high vertical-refresh rates (preferably 120 frames/second),
a pair of active LCD shutter glasses with synchronization hardware, and a
stereo-ready graphics card capable of displaying 120 frames per second,
switching between two images during every vertical screen retrace, and sending
a synchronization signal to the controller of the LCD glasses. Stereo-ready
graphics cards are available from Truevision (Indianapolis, Indiana) and other
vendors.


Stereoscopic Software


Many rendering software packages are currently available. As long as the
software is capable of generating a single-perspective view of a scene, it is
sufficient for generating stereoscopic views. However, a particular package
may not provide enough information or have enough flexibility to generate
proper stereoscopic views. Then the process becomes that of trial and error,
which may lead to improper results. There's actually no need to create or use
a full rendering package--wire-frame images produce very dramatic
results--since the addition of depth information enhances the outcome. (If you
succeed at displaying wire-frame images, try closing one eye to see what you
have been missing!)
Once the two views have been rendered, both must be loaded into display
memory, but only one of them should be visible at a time. The software must
then be capable of toggling from one image to the other during the vertical
blanking period. Most graphics cards allow for two images to be loaded at once
and for fast toggling between them. The software must be capable of detecting
the vertical retrace interval. This can be accomplished by either servicing a
vertical retrace interrupt or by polling a line counter in the display
controller. During the vertical blanking period, the software must flip to the
other-eye image and synchronize the glasses. The software must know which is
the left-eye image and which is the right-eye image, and allow you to see only
the left-eye image when the left-eye lens is transparent. If the two images
are displayed backwards, you'll most likely be incapable of fusing them into
one and be very uncomfortable. (The effect is equivalent to swapping the left
and right eye on your head.)


Stereoscopic Algorithms


At this point, we'll discuss the steps, algorithms, and mathematics necessary
for generating stereoscopic images. Most of the methods are based on those
described by Larry Hodges in his PhD thesis, "Technologies, Applications,
Hardcopy and Perspective Transformations for True Three-Dimensional CRT-Based
Display" (North Carolina State University, 1988). Keep in mind that the data
set we're working with is a uniform 2-D 200 x 200 grid with a Z value (height)
given at each grid point.
To generate perspective stereoscopic views of the scene, we'll perform these
steps:
1. Make the viewing transformation matrix.
2. Compute shading values and normals of each triangle.

3. Transform interocular distance from the physical coordinate system to the
viewing coordinate system.
4. Make the left- and right-eye transformation matrices.
5. Concatenate all matrices needed for the left-eye transformation into a
single matrix.
6. Transform, homogenize, clip, and render the left-eye view.
7. Repeat steps #5 and #6 for the right-eye view.
8. Display alternating views during successive frames.


Viewing Transformation


To simplify perspective projection, clipping, depth sorting, and
stereoscopic-image generation, you need a viewing transformation that places
the viewer reference point (VRP) at the origin, viewing down the negative
Z-axis. In other words, the first step is to compute the transformation matrix
necessary to move the viewer down to the origin, looking down the negative
Z-axis. This transformation is explained as a four-step process by Foley and
van Dam in Fundamentals of Interactive Computer Graphics (Addison-Wesley,
1984, pp. 258-261, 274-276). The technique is not efficient, but it is clear
and performed only once per image. (See make_composite_viewing_matrix() in
Listing Two (page 76) for implementation details.) This computation need be
performed only when the viewer moves.
Homogeneous matrix form (4x4) was used for simplicity and efficiency. This
method is beautiful, as it allows for accumulation of a series of
transformations into a single matrix. For example, it's possible to construct
a single matrix that not only performs the viewing transformation, but also
the scaling to the device/display coordinates. Thus, each sample point in the
original image needs to be multiplied by a 4x4 matrix only once to transform
it all the way to the device/display coordinates.
Some computational optimizations were performed here. For instance, instead of
handling each triangle separately, the entire image was transformed. This
reduced the amount of computation by a factor of six, since every sample
point generates
two triangles, and all of the vertices are shared between neighboring
triangles. The multiplication of a vector (the sample point, 1x4) by a matrix
(4x4) is computationally intensive: 16 multiplications and 12 additions per
sample point (a total of 640,000 multiplications and 480,000 additions). Also,
three divisions, necessary to bring a point back to 3-D from the homogeneous
space, have been replaced by one reciprocal and three multiplications. This
should execute faster on most floating-point platforms.


Shading


Every triangle in the uniform STM grid was flat shaded (but other shading
algorithms can easily be used). For every sample/grid point (the grid is
200x200), two triangles are created. The intensity/color of each triangle is
based on the angle between its normalized normal and the normalized negative
light-source direction vector (a normalized vector from the origin to the
light-source position). Since both vectors have been normalized (and thus have
a magnitude of unity), the angle between them is obtained by computing the dot
product. The result will be between -1 and 1. For simplicity, all negative
values were set to 0, implying that all triangles facing away from the light
source were rendered as dark as possible. Thus, the resulting triangle, shade
values were between 0 and 1. The range of possible display intensities was
known to be between 0 and 255. Therefore, the final triangle-shading value was
obtained by multiplying the shade value with 255.
For a 200x200 grid of sample points (or a total of 40,000 samples ), 80,000
triangle normals must be computed. Each triangle normal is obtained by
computing the cross product of two adjacent edge vectors. (This is how a
normal to a plane is found, and triangles are guaranteed to be planar, since
it takes a minimum of three points to define a plane.) For example, for a grid
point P[i,i] the normal to the first triangle is obtained by computing
(P[i,i]P[i+1,i]) X (P[i,i]P[i+1,i+1]), and for the second triangle by computing
(P[i,i]P[i+1,i+1]) X (P[i,i]P[i,i+1]). A general cross-product formula would
require performing three subtractions per edge to compute the edge vectors and
then six multiplications and three subtractions, for a total of nine
subtractions and six multiplications. Thus, for 80,000 triangles 720,000
subtractions and 480,000 multiplications are required, which would take a
noticeable amount of time even on the best of current workstations.
Several computational optimizations are possible due to the restrictive
structure of the STM database. The grid of samples is uniform in both X and Y,
and is X- and Y-axis aligned. Thus, the X and Y components of the vectors are
known in advance, and need not be computed for each triangle. Only the Z
component has to be computed. Also, for vectors between successive-row grid
points (for instance, P[i,i]P[i+1,i]), the X component is 0; for vectors
between successive-column grid points (for example, P[i,i]P[i,i+1]), the Y
component is 0. This leads to several simplifications in the cross-product
computations, as the 0 components do not contribute to the result and can be
eliminated. The result is that only five subtractions and four multiplications
are performed for every pair of triangles (or a total of 200,000 subtractions
and 160,000 multiplications--a 3.6-times reduction in subtractions and
3.0-times reduction in multiplications). See procedure compute_normals() in
Listing Three, page 76. However, these optimizations produce results that are
not general (that is, specific for the STM database structure).
Several floating-point operations per triangle are also necessary to normalize
the normal vectors: three multiplications, two additions, three divisions, and
one square root. To compute the dot product between the normalized
light-source direction vector and the normalized triangle normal requires
three multiplications and two additions. It would be an interesting exercise
in vector mathematics to try to reduce the number of operations required.


Transforming Interocular Distance


Once a scene can be generated from a single viewpoint, the move to
stereoscopic scene generation is not difficult--two scenes must be generated
from two slightly different perspectives. The trick is to decide how far apart
to place the two viewpoints--the two eyes.
On average, the distance between human eyes (interocular distance) is 2.5
inches. This implies that when you look at any object, the views that each eye
sees will differ by no more than 2.5 inches. This interocular distance stays
nearly constant for each of us. We get used to it and derive depth information
based on it. The further away an object is, the more similar the two views
are; the closer an object is, the more disparate the two views.
Most computer-graphics books describe the rendering pipeline as a sequence of
steps that ends with drawing a pixel on the display device. Pixel aspect
ratio, gamma, and the color/shade capabilities of the display device may be
the only other parameters considered. Several additional parameters must be
considered when generating stereoscopic images. These parameters move beyond
the pixels of the display and into your physical world.
Two images (a "stereopair") must be generated, one for each eye. However, your
interocular distance (eye-to-eye spacing) is known only in the
physical-coordinate system (about 2.5 inches). This distance needs to be
transformed into (brought into) the viewing-coordinate system where the object
being viewed lies. The result will be the interocular distance in the
viewing-coordinate system, which will correspond to 2.5 inches in the
physical-coordinate system.
Foley and van Dam refer to the transformation from the viewing-coordinate
system to the device/display-coordinate system as "window-to-viewport
scaling." The transformation from the device-coordinate system to the
physical-coordinate system depends on the pixel size of the physical display.
Since the interocular distance is measured along the X-axis, only the pixel
width is significant (and this is obviously monitor dependent).
To map the interocular distance from the physical-coordinate system to the
viewing-coordinate system, it must first be mapped to the device-coordinate
system (pixels or dots). In other words, inches are mapped to pixels (and dots
per inch, or dot pitch, varies between monitors). For instance, the dot pitch
of the LEX/90 monitor is 0.021 inches/dot. Thus, 2.5 inches are equivalent to
119 dots (pixels). The 119 pixels is the interocular distance in the
device-coordinate system. The 119 pixels must now be mapped to the
viewing-coordinate system. The horizontal scale factor is the window width
divided by the viewport width. Multiplying the 119 pixels by this scale factor
will transform the 2.5 inches in the physical-coordinate system to the
viewing-coordinate system. This computation is shown in
set_per_projection_plane() in Listing Six, page 78.
Of course, your physical distance from the display needs to be transformed
into the viewing-coordinate system as well. This is done in
set_per_projection_plane(). These two transformations bring you from the
physical world into the viewing-coordinate system (once again, where the
object is). The set_per_projection_plane() procedure suggests the distance of
the projection plane in the viewing-coordinate system (which corresponds to 18
inches in the physical world).
Now that we have the interocular distance and your distance from the display
transformed into the viewing-coordinate system, the left- and right-eye view
transformation matrices can be formed. Larry Hodges presents a simple
technique whereby you first look down the Z-axis, and your eyes are centered
around the origin (along the X-axis). Thus, the left eye is at coordinate
(e/2, 0, 0) and the right eye is at (-e/2, 0, 0), where e is the interocular
distance in the viewing-coordinate system. Secondly, the two views are
generated by using the TPT^-1 model (translate, project, translate back). By
using homogeneous (4x4) matrices, a single (4x4) transformation matrix
results. The left-eye view transformation consists of a concatenation of three
operations: translation to the right by e/2 distance, projection, and
translation to the left by the same amount (T[R]PT[L]); see Figure 2. The
display_3D_data() procedure (see Listing One, page 76) follows this model by
creating first a left-eye (4x4) transformation matrix, then a right-eye
matrix.


Concatenating All Matrices into One


The display_3D_data() procedure then combines the left-eye matrix with the
viewing-transformation (4x4) matrix to produce a single (4x4) matrix that
contains the complete transformation that each vertex of the 200x200 grid must
undergo. Once a single (4x4) transformation matrix has been obtained,
transform_and_homogenize_image() (Listing Four, page 78) is called,
transforming the 200x200 grid (or 40,000 vertices). The result is homogenized
(X, Y, and Z are divided by W) to bring the points back into 3-D from
homogeneous space.
At this point, render_image() (Listing Five, page 78) is called; it takes the
transformed grid with shading information for each of the 80,000 triangles,
clips the triangles to the viewport, and renders them (utilizing the LEX/90
polygon command). Clipping is done very simplistically--the triangle is
rendered only if it is entirely inside the viewport.


Displaying Alternating Views


The LEX/90 has two display buffers (double-buffered). It is capable of
displaying either of the two buffers and of alternating between the buffers on
each frame. The left-eye image was rendered into display-buffer one, and the
right-eye image was rendered into display-buffer two. Then LEX/90 is set into
the alternating buffer mode, and--stereo! The LEX/90 displays the left-eye
image in one frame, then displays the right-eye image, then the left-eye
image, and so on, 60 times per second.
To view this image, we used a crusty pair of piezoelectric shutter glasses
synchronized with the vertical sync pulse of the LEX/90 display. Thus, the
left eye saw only the left-eye image and the right eye saw only the right-eye
image. The display flickered a little, since each eye saw a picture at 30 Hz
(frames/second), but the result was sufficient to get quite a few "Wow!"s.


Rolling Your Own Stereopair Images


Barr E. Bauer
Barr is a research scientist at Arris Pharmaceutical and can be contacted at
385 Oyster Point Blvd., South San Francisco, CA 94080.
In my work as a pharmaceutical researcher, I find stereo imaging to be not
only useful, but essential because true 3-D presentation is often the only way
to clearly present results. In other words, when viewed in conventional 3-D or
2-D, DNA structures (such as that on the cover of this magazine) can lose much
of their structural content.

Workstations like Silicon Graphics' (Mountain View, California) Indigo provide
real-time 3-D effects directly on the monitor. Vector objects can be displayed
depth-cued, whereby uniform fading along the screen Z-axis gives a good sense
of front to back. The pseudo-3-D effect is magnified when the object is
rotated or when the fading is dramatic. Interior objects can be highlighted by
making them less bright than objects in front, maintaining a clear
what's-in-front distinction. The depth-cued display sometimes inverts in the
mind's eye, making the bright part of the display appear to be in the rear.
This curious side effect occurs at random and can complicate presentations.
The Indigo can display solid-shaded objects, too, which also gives a good
pseudo-3-D effect.
CrystalEyes from StereoGraphics (San Rafael, California) uses a special
monitor that alternates the display of left- and right-eye views to achieve
real-time stereo. Liquid-crystal goggles synched to the monitor through an
infrared emitter separate the displays so that each eye sees only the image
appropriate for it. The stereo effect is startling: Objects appear to float in
front of the screen and can be rotated in real time. On the downside, the
hardware is costly, significant programming must be done to support the
graphics, the picture quality is degraded due to fewer display lines, and the
manipulation of 2-D menus presents a programming challenge. Despite all this,
CrystalEyes can show stereo to up to four people in front of a monitor and up
to 20 people with projection.
Stereo slides enable stereo presentation to larger crowds or when maximum
picture quality is desired. Separate left- and right-eye views are
photographed, then projected through polarized filters and viewed with
polarized glasses, one eye at 90 degrees with respect to the other. The images
are rotated six degrees with respect to each other along the screen's Y-axis.
Internal coordinate frames must be normalized to the screen frame before the
two images are rotated and photographed, otherwise the resulting projected
stereo images can be very distressing to look at. The result is a static image
that looks good but requires careful image alignment, and it often takes a
moment for the audience to "see" the stereo effect. The polarizing filters
steal light intensity the farther you sit from the central axis of the screen,
making the image appear dim to those sitting in the front corners of the room.
The screen can fool camera exposure control often enough that multiple
exposures and careful intensity matching give the best stereo image. If one of
the pair is noticeably different in exposure, the stereo effect becomes more
difficult to see. Regardless, this method is inexpensive and portable. I
combine depth-cuing with stereo slides for a double effect that accentuates
the stereo.
Finally, stereo images can be put in print as stereo pairs, even with line
art. The simplest picture that minimizes unnecessary overlap provides the
clearest 3-D image. The pairs can be viewed with glasses, or by defocusing
your eyes and merging the two images into one. Figure 3, for instance, is a
stereo triplet from AI software used to recognize 3-D molecular shape
developed at Arris Pharmaceutical Corp. To see in stereo, follow these steps:
1. Hold the page flat under uniform light at arm's length.
2. Look at the center and right images, cross your eyes, and attempt to merge
the new images formed into a single 3-D image. When the structures merge, the
image will snap into 3-D.
3. If you can't see the stereo, relax, have a glass of wine, and try again.
Most people cross their eyes; if you are "wall-eyed" (eyes turned outward),
look instead at the left and center images. Beware, however, that this can
give you a severe headache if you look long enough.


_ALGORITHMS FOR STEREOSCOPIC IMAGES_
by Victor Duvanenko and W.E. Robbins


[LISTING ONE]

display_3D_data( proj )
int proj;
{
 double Tw[4][4], S[4][4], Td[4][4], Ry[4][4], Per[4][4], tmp[4][4];
 double Left[4][4], Right[4][4];
 double e_v, /* interocular distance mapped back to the view coord. */
 e_w; /* interocular distance mapped back to the world coord. */
 printf( "Computing normals for shading. " );
 compute_normals(); /* and the dot products for each triangle */
 printf( "Done.\n" );
 /* Perspective projection must use three steps:
 1) Compute normals and project,
 2) Divide by W (homogenize)
 3) Transform into the device coordinates. */
 if ( proj == PERSPECTIVE )
 {
 /* map physical interocular distance into the view port coordinates. */
 e_v = INTEROCULAR_DISTANCE / PIXEL_WIDTH; /* e_v == pixels */
 /* map from the viewport coordinate system to the world. */
 e_w = e_v / (( x_right_d - x_left_d ) / ( x_right - x_left ));
 set_to_identity( Left, 4 );
 set_to_identity( Right, 4 );

 /* Use the Translate, Project, Translate back model. */

 /* Create the Left eye transformation matrix. */
 set_to_identity( Tw, 4 ); /* translate the left eye to the origin */
 Tw[3][0] = -e_w / 2.0;
 matrix_mult( Left, 4, 4, Tw, 4, 4, tmp );
 /* Create the perspective projection matrix. */
 set_to_identity( Per, 4 );
 Per[2][3] = 1.0 / proj_plane; /* 1/d */
 Per[3][3] = 0.0;
 matrix_mult( tmp, 4, 4, Per, 4, 4, Left );

 Tw[3][0] = e_w / 2.0; /* translate back */
 matrix_mult( Left, 4, 4, Tw, 4, 4, tmp );
 copy_matrix( tmp, Left, 4, 4 );

 /* Create the Right eye transformation matrix. */
 set_to_identity( Tw, 4 ); /* translate the right eye to the origin */
 Tw[3][0] = e_w / 2.0;
 matrix_mult( Right, 4, 4, Tw, 4, 4, tmp );

 /* Create the perspective projection matrix. */
 set_to_identity( Per, 4 );
 Per[2][3] = 1.0 / proj_plane; /* 1/d */
 Per[3][3] = 0.0;
 matrix_mult( tmp, 4, 4, Per, 4, 4, Right );

 Tw[3][0] = -e_w / 2.0; /* translate back */
 matrix_mult( Right, 4, 4, Tw, 4, 4, tmp );
 copy_matrix( tmp, Right, 4, 4 );
#if 0
 printf( "Transforming, projecting and homogenizing the image. " );
 transform_and_homogenize_image( image, tr_image, Tm );
 printf( "Done.\n" );
#endif
 }
 /* Create the world to device transformation matrix. */
 /* Create the translation matrix. Translate only in X and Y. */
 set_to_identity( Tw, 4 );
 Tw[3][0] = -x_left; Tw[3][1] = -y_bottom; Tw[3][2] = 0.0;
 /* Create a uniform scale matrix. */
 set_to_identity( S, 4 );
 S[0][0] = ( x_right_d - x_left_d ) / ( x_right - x_left );
 S[1][1] = ( y_top_d - y_bottom_d ) / ( y_top - y_bottom );
 S[2][2] = ( z_back_d - z_front_d ) / ( z_back - z_front );

 matrix_mult( Tw, 4, 4, S, 4, 4, tmp );
 copy_matrix( tmp, Tw, 4, 4 );

 /* Create the translation matrix. Translate only in X and Y. */
 set_to_identity( Td, 4 );
 Td[3][0] = x_left_d; Td[3][1] = y_bottom_d; Td[3][2] = 0.0;
 matrix_mult( Tw, 4, 4, Td, 4, 4, tmp );
 copy_matrix( tmp, Tw, 4, 4 );

 /* Since device/screen origin on LEX/90 is in upper left, we need to reflect
 Y and translate by screen height to place device origin in bottom left.*/
 set_to_identity( Ry, 4 );
 Ry[1][1] = -1.0;
 matrix_mult( Tw, 4, 4, Ry, 4, 4, tmp );
 copy_matrix( tmp, Tw, 4, 4 );

 set_to_identity( Td, 4 );
 Td[3][1] = SCREEN_HEIGHT;
 matrix_mult( Tw, 4, 4, Td, 4, 4, tmp );
 copy_matrix( tmp, Tw, 4, 4 );
 /* Now, Tw has the world to device/screen transformation matrix. */
 if ( proj == PARALLEL )
 {
 /* Beautiful!!! Perform a single transformation of the image (for
 parallel projection). */
 printf( "Transforming, projecting and mapping the image onto screen." );
 matrix_mult( Tm, 4, 4, Tw, 4, 4, tmp );
 copy_matrix( tmp, Tm, 4, 4 );
 show_matrix( Tm, 4, 4 );
 transform_and_homogenize_image( image, tr_image, Tm );
 printf( "Done.\n" );
 printf( "Rendering the image. " );
 render_image( LEFT_EYE ); printf( "Done.\n" );
 }

 if ( proj == PERSPECTIVE )
 {
 printf( "Transforming, projecting, homogenizing and mapping the " );
 printf( "Left image onto screen. " );
 matrix_mult( Tm, 4, 4, Left, 4, 4, tmp );
 matrix_mult( tmp, 4, 4, Tw, 4, 4, Left );
 printf( "Left eye transformation matrix.\n" );
 show_matrix( Left, 4, 4 );
 transform_and_homogenize_image( image, tr_image, Left );
 printf( "Done.\n" );
 printf( "Rendering the Left eye image. " );
 render_image( LEFT_EYE ); printf( "Done.\n" );

 printf( "Transforming, projecting, homogenizing and mapping the " );
 printf( "Right image onto screen. " );
 matrix_mult( Tm, 4, 4, Right, 4, 4, tmp );
 matrix_mult( tmp, 4, 4, Tw, 4, 4, Right );
 /* Move the right eye view into the lower half of the buffer. */
 set_to_identity( Tw, 4 );
 Tw[3][1] = SCREEN_HEIGHT; /* move in device coord */
 matrix_mult( Right, 4, 4, Tw, 4, 4, tmp );
 copy_matrix( tmp, Right, 4, 4 );
 printf( "Right eye transformation matrix.\n" );
 show_matrix( Right, 4, 4 );
 transform_and_homogenize_image( image, tr_image, Right );
 printf( "Done.\n" );
 { short zx = 0, zy = 512, zf = 1; dszom( &zx, &zy, &zf ); } /* dszom() wants
 pointers; 0, 512, 1 look at the lower half */
 printf( "Rendering the Right eye image. " );
 render_image( RIGHT_EYE ); printf( "Done.\n" );
 }
}





[LISTING TWO]

make_composite_viewing_matrix( proj )
int proj; /* PARALLEL or PERSPECTIVE */
{
 double p1[4], p2[4], p3[4], p_tmp[4];
 double T[4][4], Rx[4][4], Ry[4][4], Rz[4][4], C_tmp1[4][4], C_tmp2[4][4];
 double Per[4][4], d1, d2, d12, cos_ang, sin_ang;
 /* Initialize the three points */
 p1[0] = eye_pt.x; p1[1] = eye_pt.y; p1[2] = eye_pt.z; p1[3] = 1.0;
 p2[0] = p2[1] = p2[2] = 0.0; p2[3] = 1.0;
 p3[0] = p1[0] + vup.x; p3[1] = p1[1] + vup.y; p3[2] = p1[2] + vup.z;
 p3[3] = 1.0;
 /* Magnitude of vector p1->p2 */
 d12 = sqrt( p1[0] * p1[0] + p1[1] * p1[1] + p1[2] * p1[2] );
 /* Create the translation matrix. */
 set_to_identity( T, 4 );
 T[3][0] = -p1[0]; T[3][1] = -p1[1]; T[3][2] = -p1[2];
 /* Translate the three points p1, p2, and p3 to the origin. */
 matrix_mult( p1, 1, 4, T, 4, 4, p_tmp );
 copy_matrix( p_tmp, p1, 1, 4 );
 matrix_mult( p2, 1, 4, T, 4, 4, p_tmp );
 copy_matrix( p_tmp, p2, 1, 4 );

 matrix_mult( p3, 1, 4, T, 4, 4, p_tmp );
 copy_matrix( p_tmp, p3, 1, 4 );

 d1 = sqrt( p2[0] * p2[0] + p2[2] * p2[2] ); /* length of projection */
 cos_ang = -p2[2] / d1;
 sin_ang = p2[0] / d1;
 /* Create the rotation about Y-axis matrix. */
 set_to_identity( Ry, 4 );
 Ry[0][0] = cos_ang; Ry[0][2] = -sin_ang;
 Ry[2][0] = sin_ang; Ry[2][2] = cos_ang;
 /* Rotate the three points p2, and p3 about the Y-axis. */
 /* p1 is at the origin after translation => no need to rotate. */
 matrix_mult( p2, 1, 4, Ry, 4, 4, p_tmp );
 copy_matrix( p_tmp, p2, 1, 4 );
 matrix_mult( p3, 1, 4, Ry, 4, 4, p_tmp );
 copy_matrix( p_tmp, p3, 1, 4 );

 cos_ang = -p2[2] / d12;
 sin_ang = -p2[1] / d12;

 /* Create the rotation about X-axis matrix. */
 set_to_identity( Rx, 4 );
 Rx[1][1] = cos_ang; Rx[1][2] = sin_ang;
 Rx[2][1] = -sin_ang; Rx[2][2] = cos_ang;
 /* Rotate the three points p2, and p3 about the X-axis. */
 matrix_mult( p2, 1, 4, Rx, 4, 4, p_tmp );
 copy_matrix( p_tmp, p2, 1, 4 );
 matrix_mult( p3, 1, 4, Rx, 4, 4, p_tmp );
 copy_matrix( p_tmp, p3, 1, 4 );
 /* Sanity check. */
 printf( "The view vector should be [ %lf %lf %lf %lf ]\n", 0.0, 0.0, -d12,
 1.0 );
 printf( "The view vector is [ %lf %lf %lf %lf ]\n", p2[0], p2[1],
 p2[2], p2[3] );
 d2 = sqrt( p3[0] * p3[0] + p3[1] * p3[1] );
 cos_ang = p3[1] / d2;
 sin_ang = p3[0] / d2;
 /* Create the rotation about Z-axis matrix. */
 set_to_identity( Rz, 4 );
 Rz[0][0] = cos_ang; Rz[0][1] = sin_ang;
 Rz[1][0] = -sin_ang; Rz[1][1] = cos_ang;
 /* At this point the translation and all rotation matrices are known
 and need to be combined into a single transformation matrix. */
 matrix_mult( T, 4, 4, Ry, 4, 4, C_tmp1 );
 matrix_mult( C_tmp1, 4, 4, Rx, 4, 4, C_tmp2 );
 matrix_mult( C_tmp2, 4, 4, Rz, 4, 4, C_tmp1 );
 copy_matrix( C_tmp1, Tm, 4, 4 );
}






[LISTING THREE]

void compute_normals()
{
 register i, j;

 point_3D_t p11_p21, p11_p22, p11_p12, cross_a, cross_b;
 double d, dx; /* stepping distance (dx = dy) */
 double dx_sqrd; /* dx^2 */
 int num_lines;
 point_3D_t na, nb; /* normals to triangle A and B */
 dx = scan_sz / num_samples;

 num_lines = MIN( num_samples, MAX_IMAGE_SIZE );
 num_lines--;
 dx_sqrd = dx * dx;
 for( i = 0; i < num_lines; i++ )
 for( j = 0; j < num_lines; j++ )
 {
#if DEBUG
 printf( "P11 %f %f %f\n", image[i][j].x, image[i][j].y,
 image[i][j].z );
 printf( "P21 %f %f %f\n", image[i+1][j].x, image[i+1][j].y,
 image[i+1][j].z );
 printf( "P12 %f %f %f\n", image[i][j+1].x, image[i][j+1].y,
 image[i][j+1].z );
 printf( "P22 %f %f %f\n", image[i+1][j+1].x, image[i+1][j+1].y,
 image[i+1][j+1].z );
#endif
 p11_p21.z = image[ i + 1 ][ j ].z - image[ i ][ j ].z;
 p11_p22.z = image[ i + 1 ][ j + 1 ].z - image[ i ][ j ].z;
 p11_p12.z = image[ i ][ j + 1 ].z - image[ i ][ j ].z;
#if DEBUG
 printf( "dz11_21 = %f, dz11_22 = %f, dz11_12 = %f\n",
 p11_p21.z, p11_p22.z, p11_p12.z );
#endif
 /* It's possible to eliminate one more multiplication in the
 computations below. */
 na.x = dx * ( p11_p21.z - p11_p22.z );
 na.y = dx * p11_p21.z;
 na.z = dx_sqrd;
#if DEBUG
 printf( "Na %f %f %f\n", na.x, na.y, na.z );
#endif
 nb.x = (-dx) * p11_p12.z;
 nb.y = dx * ( p11_p22.z - p11_p12.z );
 nb.z = dx_sqrd;
#if DEBUG
 printf( "Nb %f %f %f\n", nb.x, nb.y, nb.z );
#endif
 /* Normalize the normal vectors, since the intensity will be
 proportional to the angle between light source and the normal. */
 d = sqrt((double)( na.x * na.x + na.y * na.y + na.z * na.z ));
 na.x = na.x / d;
 na.y = na.y / d;
 na.z = na.z / d;
#if DEBUG
 printf( "Na %f %f %f\n", na.x, na.y, na.z );
#endif
 d = sqrt((double)( nb.x * nb.x + nb.y * nb.y + nb.z * nb.z ));
 nb.x = nb.x / d;
 nb.y = nb.y / d;
 nb.z = nb.z / d;
#if DEBUG
 printf( "Nb %f %f %f\n", nb.x, nb.y, nb.z );

#endif
 /* Compute the dot product between the light source vector and
 the normals (== to the angle between two unit vectors ).
 -1 <= cos( theta ) <= 1, which will be very useful. */
 image[ i ][ j ].sha = light_source.x * na.x + light_source.y * na.y +
 light_source.z * na.z;
 image[ i ][ j ].shb = light_source.x * nb.x + light_source.y * nb.y +
 light_source.z * nb.z;
 }
}






[LISTING FOUR]

transform_and_homogenize_image( s, d, tm )
point_3D_ex_t s[][ MAX_IMAGE_SIZE ], d[][ MAX_IMAGE_SIZE ];
double *tm; /* transformation matrix */
{
 register i, j;
 int num_lines;
 double p[4]; /* the point to be transformed */
 double t[4], inv_W;

 num_lines = MIN( num_samples, MAX_IMAGE_SIZE );
 for( i = 0; i < num_lines; i++ )
 for( j = 0; j < num_lines; j++ )
 {
 p[0] = s[i][j].x; p[1] = s[i][j].y;
 p[2] = s[i][j].z; p[3] = 1.0;
 matrix_mult( p, 1, 4, tm, 4, 4, t );
 if ( t[3] != 1.0 ) /* divide by W (homogenize) */
 {
 inv_W = 1.0 / t[3];
 t[0] *= inv_W; t[1] *= inv_W; t[2] *= inv_W;
 }
 d[i][j].x = t[0]; d[i][j].y = t[1]; d[i][j].z = t[2];
 d[i][j].sha = s[i][j].sha; d[i][j].shb = s[i][j].shb;
 }
}






[LISTING FIVE]

render_image( y )
int y;
{
 register i, j;
 int num_lines;
 short v[6], intensity;

 num_lines = MIN( num_samples, MAX_IMAGE_SIZE );

 num_lines--;
 for( i = 0; i < num_lines; i++ )
 for( j = 0; j < num_lines; j++ )
 {
 v[0] = ROUND( tr_image[ i ][ j ].x );
 v[1] = ROUND( tr_image[ i ][ j ].y );
 v[2] = ROUND( tr_image[ i + 1 ][ j ].x );
 v[3] = ROUND( tr_image[ i + 1 ][ j ].y );
 v[4] = ROUND( tr_image[ i + 1 ][ j + 1 ].x );
 v[5] = ROUND( tr_image[ i + 1 ][ j + 1 ].y );
 /* Render triangle A */
 intensity = ROUND( tr_image[ i ][ j ].sha * (double)(NUM_SHADES - 1));
 if ( intensity > ( NUM_SHADES - 1 ))
 intensity = NUM_SHADES - 1; /* saturate */
 if ( intensity < 0 )
 {
#if 0
 printf( "Triangle A, intensity = %d\n", intensity );
 printf( "v11.x = %f, v11.y = %f, v11.z = %f\n",
 image[i][j].x, image[i][j].y, image[i][j].z );
 printf( "v21.x = %f, v21.y = %f, v21.z = %f\n",
 image[i+1][j].x, image[i+1][j].y, image[i+1][j].z );
 printf( "v22.x = %f, v22.y = %f, v22.z = %f\n",
 image[i+1][j+1].x, image[i+1][j+1].y, image[i+1][j+1].z );
#endif
 intensity = 0;
 }
 if ( clip_to_viewport( v, 6, y ) == ACCEPT )
 { short nv = 6; dspoly( &intensity, v, &nv ); } /* dspoly() wants a
 pointer to the vertex count */

 v[2] = ROUND( tr_image[ i ][ j + 1 ].x );
 v[3] = ROUND( tr_image[ i ][ j + 1 ].y );
 /* Render triangle B */
 intensity = ROUND( tr_image[ i ][ j ].shb * (double)( NUM_SHADES-1));
 if ( intensity > ( NUM_SHADES - 1 ))
 intensity = NUM_SHADES - 1; /* saturate */
 if ( intensity < 0 ) intensity = 0;
 if ( clip_to_viewport( v, 6, y ) == ACCEPT )
 { short nv = 6; dspoly( &intensity, v, &nv ); }
 }
}





[LISTING SIX]

display_3D_data( proj )
int proj;
{
 double Tw[4][4], S[4][4], Td[4][4], Ry[4][4], Per[4][4], tmp[4][4];
 double Left[4][4], Right[4][4];
 double e_v, /* interocular distance mapped back to the view coord. */
 e_w; /* interocular distance mapped back to the world coord. */

 printf( "Computing normals for shading. " );
 compute_normals(); /* and the dot products for each triangle */
 printf( "Done.\n" );































































April, 1993
GENETIC ALGORITHMS AND DATABASE INDEXING


Finding the best set of indexes




Joe Celko


Joe is a columnist for DBMS magazine and a member of the ANSI X3H2 Database
Standards Committee. He can be reached through the DDJ offices.


Imagine you've just declared all the tables for a SQL database, and now have
to declare primary and secondary indexes. Most SQL implementations
automatically create primary indexes (which assure uniqueness of primary keys)
from the table declarations. However, secondary indexes, which optimize
update, delete, insert, and query search time in the database, must be
explicitly created; they can also be unique, although most often they're not.
The easy way out is to not declare secondary indexes because small test
databases run fine without them. But when the test database grows into a large
production system, you're hit with a serious performance problem. At that
point, the tempting solution is to swing to the other extreme and index
everything.
But while query performance will almost certainly improve since the query
optimizer can find an index whenever it needs it, performance of inserts,
updates, and deletes will suffer. This is because every time the database
changes, so do the indexes. Hashed indexes may also have to resolve
collisions, unless the hashing algorithm produces a relatively even
distribution. This means that a change to the database will cost one table
access plus one or more accesses per index, which can slow performance in
databases with millions of rows.
For instance, given a table of (n) columns, there are actually more than (n)
possible indexes on it. An index can be built on more than one column, and the
order of the columns within the index is important. The formula for the number
of possible indexes is the sum of all permutations of (n) columns; see Example
1.
Now consider a table with three columns, T(a,b,c). The 16 possible indexes on
table T are (a,b,c), (a,b), (a,c,b), (a,c), (a), (b,a,c), (b,a), (b,c,a),
(b,c), (b), (c,a,b), (c,a), (c,b,a), (c,b), (c), or none at all. The optimal
solution is to declare only the indexes you absolutely need to balance query
time against change times. This means that, as always, you also need to know
something about the transactions that will run against the database. If the
job is all queries, with no changes, against a single column of a table, that
column needs to be indexed. Any other index would not be used by the queries
and would only take up space. Such a table is commonly called a "lookup table."
However, since the indexed column is probably the primary key of the lookup
table, it's indexed automatically. (Note that many databases have a column
that is used extensively for access and that isn't the primary key.) Likewise,
if no queries or changes use a particular column, then it doesn't need to be
indexed. The secondary index is often used to establish a one-to-many
relationship or many-to-many relationship with another table.
Real database problems lie somewhere between these two extremes.


Finding the Best Set of Indexes


To make life easier, let's consider only queries. Given a database schema and
a set of queries, we want to find the best possible set of indexes no bigger
than some limit, say (i).
Finding the optimal indexing arrangement is known to be NP-complete. This
means that the work needed to guarantee an optimal solution grows faster than
any polynomial function of the problem size (as far as anyone knows,
exponentially). As a rule of thumb, if you have to inspect all possible
combinations of something, the problem will likely be difficult to crack.
This doesn't mean you can't optimize indexing for a particular database schema
and set of input queries, but it does mean you can't write a program which
will do it for all possible relational databases and query sets.
The query information is usually given as a statistical model of the expected
inputs. For example, you might be told that 80 percent of the queries will use
the primary key and 20 percent will use another column picked at random. This
is about what you'd know in real-world situations, since most of the accesses
will be done by production programs with embedded SQL, with only a small
percentage of ad hoc queries.
However, NP-complete problems often admit efficient algorithms that find
near-optimal solutions.
Farshad Fotouhi and Carlos E. Galarce of Wayne State University Computer
Science Department, for instance, have proposed using genetic algorithms to
search for near-optimal indexing.
The idea behind genetic algorithms is to mirror natural selection in order to
evolve better solutions. You create a set of "chromosomes" that can be
modified by a set of operators, based on feedback from the environment. A
group of "genes"
form a chromosome, which is handled as a unit. Each generation, the survival
quotient of a chromosome is measured by some adaptive plan. The starting point
is usually a random mixture of chromosomes, and the experiment is run for a
fixed number of generations.
Fotouhi and Galarce's approach uses a single table of library-book information
(classification, ISBN, title, subject, author). The gene is a binary vector
with a position for each of the five attributes. A 1 means the column is
indexed; 0 means it's not. For example, the gene (01011) would mean that there
are indexes on ISBN, subject, and author only. Notice that the ISBN is the
primary key, but no attempt is made to accept only genes with the ISBN
indexed. The idea is to let the genetic process find a solution without any
help.
This same chromosome pattern can also be used to represent a type of query. A
1 in a "query chromosome" means that the corresponding column is to be
returned; 0 means it's not. For example, the query, "find I, THE JURY by
Mickey Spillane" has the genes (00101) because the title and author are given.
This correspondence makes it simple to simulate query runs. The payoff formula
is based on hitting or missing an index in a query. The optimal score is to
ask for only indexed columns, which makes sense because there's a chance that
a nonindexed column would require a sequential search of the table.
Fotouhi and Galarce ran a series of random queries with a known statistical
distribution against the test database of one million rows. The genes with the
highest scores were saved from ("survived") that test run and used to build
the next test run ("generation"). The performance of the system was measured
in terms of average query-response times. The system leveled out in about ten
generations with a 5-bit chromosome, but took longer with a 10-bit chromosome.
The Fotouhi-Galarce experiment was based on a single table, a rare occurrence
in the real world. Tables are built based on a set of functional dependencies
(FDs). An FD between two data items means that if I know one value, I can
determine the second. This is written "A->B" and read "A determines B." If I
know the part's stock number, I can look up its weight in the inventory file.
The way I combine columns to make tables uses normal forms built from FDs. A
first-normal-form (1NF) table is simply a table; you cannot avoid it in SQL. A
second-normal-form (2NF) table is a 1NF table in which every nonkey column
depends on an entire key, not just part of one. A third-normal-form (3NF)
table is a 2NF table with no transitive dependencies. A transitive dependency
exists in a table with columns A, B, and C such that (A->B) and (B->C), which
implies (A->C).
There are higher normal forms, but most databases have to get to at least 3NF
to work without anomalies. A quick example shows why this is important.
Consider a table T(department, advisor, student) for faculty advisors at a
college. If Dr. Celko in computer science is deleted, so are all his students
(deletion anomaly); we cannot start a department until we have an advisor for
it (insertion anomaly); if Dr. Celko switches to the math department, so do
all his students (update anomaly). What we needed was two tables,
T1(department, advisor) and T2(student, advisor), to get rid of the
transitive dependency.
The bad news is that it's possible to develop more than one 3NF schema from
the same set of FDs.


An Example


Let's say an imaginary airline has a database for scheduling flights and
pilots. Most of the relationships are obvious: Flights have only one departure
time and one destination, and they can get a different pilot and be assigned
to a different gate each day of the week.
The functional dependencies for the database are given in Example 2. Example
3 provides five possible answers, although there can be many more. The query
chromosome structure would have six genes (day, destination, flight, gate,
hour, and pilot). To show each possible 3NF schema, we first build tables
whose primary keys are the left-hand sides of the functional dependencies.
Once the tables are defined, we apply queries against the whole database
schema, not just one table, as we did before.
Example 2: The functional dependencies for the database.

 flight -> destination
 flight -> hour
 (day, flight) -> gate
 (day, flight) -> pilot
 (day, hour, gate) -> destination
 (day, hour, gate) -> flight
 (day, hour, gate) -> pilot
 (day, hour, pilot) -> destination
 (day, hour, pilot) -> flight
 (day, hour, pilot) -> gate

Example 3: Pseudo-SQL notation for creating tables within a database schema.
Type declarations and constraints are not shown, just the table names, column
names, and primary keys.

 CREATE SCHEMA Normal_1;
 CREATE TABLE Departures
 (flight, destination, hour, PRIMARY KEY (flight));
 CREATE TABLE WeeklyRoster
 (day, hour, gate, flight, pilot,
 PRIMARY KEY (day, hour, gate));

 CREATE SCHEMA Normal_2;
 CREATE TABLE Departures
 (flight, destination, hour, PRIMARY KEY (flight));
 CREATE TABLE WeeklyRoster
 (day, hour, pilot, flight, gate,
 PRIMARY KEY (day, hour, pilot));

 CREATE SCHEMA Normal_3;
 CREATE TABLE Departures
 (flight, destination, hour, PRIMARY KEY (flight));
 CREATE TABLE GatePilotSchedule
 (day, flight, gate, pilot, PRIMARY KEY (day, flight));
 CREATE TABLE GateFlightSchedule
 (day, hour, gate, flight, PRIMARY KEY (day, hour, gate));
 CREATE TABLE PilotFlightSchedule
 (day, hour, pilot, flight, PRIMARY KEY (day, hour, pilot));

 CREATE SCHEMA Normal_4;
 CREATE TABLE Departures
 (destination, hour, flight, PRIMARY KEY (flight));
 CREATE TABLE GateFlightSchedule
 (day, flight, gate, PRIMARY KEY (day, flight));
 CREATE TABLE GatePilotSchedule
 (day, hour, gate, pilot, PRIMARY KEY (day, hour, gate));
 CREATE TABLE PilotFlightSchedule
 (day, hour, pilot, flight, PRIMARY KEY (day, hour, pilot));

 CREATE SCHEMA Normal_5;
 CREATE TABLE Departures
 (destination, hour, flight, PRIMARY KEY (flight));
 CREATE TABLE DutyRoster
 (day, flight, pilot, PRIMARY KEY (day, flight));
 CREATE TABLE GateFlightSchedule
 (day, hour, gate, flight, PRIMARY KEY (day, hour, gate));
 CREATE TABLE GatePilotSchedule
 (day, hour, pilot, gate, PRIMARY KEY (day, hour, pilot));

A table chromosome is made up of a subset of the ten original functional
dependencies. Two rules have to be obeyed by the tables. First, no combination
that violates the 3NF condition is allowed; it's a "lethal mutation" and is
rejected at once. Second, all columns must be present in some table in the
schema; this assumes that you won't put information into a database if you
aren't going to look at it.
The database schemas will be made up of more than one table, and we'll be
mutating one or more tables at a time. The payoff function will have to
consider joins between tables (an expensive operation), the number of tables
accessed, and so forth.
A join is how SQL relates data in one table to that in another. A join query
builds a result table, which is made up of columns from other tables. Given
tables X1(x,b,c) and X2(x,d,e), the syntax in SQL for a simple join would look
like this:
 SELECT X1.x, b, c, d, e FROM X1, X2 WHERE (X1.x = X2.x)
This builds a result table with five columns by taking each row in X1 and
concatenating it to each row in X2 (a Cartesian product), then keeping only
those new rows for which the join condition (X1.x = X2.x) is true. The
<table>.<column> notation is used when columns in different tables have the
same name.
Notice that indexes are not mentioned in the SQL SELECT statement, unlike
Xbase and other nonrelational database languages. The SQL engine decides which
indexes (if any) to use and builds a query plan behind the scenes. This means
that indexes can be dropped and created from the schema without changing the
queries; but you do have to recompile the query plans. Indexes can both help
and hurt the performance of joins. Hopefully, the query optimizer will pick
the fastest search, but it can be fooled. It might pick an index when a
sequential read of the table would have been faster.
For example, if I want to use the flight to find both the day and the
destination, I have to join TABLE Departures to TABLE GateFlightSchedule in
the Normal_4 schema.
The operations on the schema will now be more complex than those we used for
indexes on a single table. We have to allow tables to combine or split. The
goal is to have the smallest number of tables used in the queries to avoid the
cost of joins. Once the tables are determined for the set of queries, we can
apply the index genetic algorithm to the tables.


Conclusion



To implement a system like this using existing facilities, start with the
query optimizer: in most mainframe SQL packages, it records the names of the
indexes used in its query plans. The system tracks the CPU usage of each job
on several different parameters--time, resources, working storage used, and so
on. Most shops have repeated workloads which are close to identical in size
and mix on weekly or monthly cycles. The statistics are fairly simple; the
real trick is deciding on a payoff function.
Every large-scale database package has a utility program to trap
statistics--the number of accesses to a table, resources used by a job, and so
on. Next, have the indexes automatically appear or disappear as needed, a
process that's easy in SQL using the commands DROP <index name>; and CREATE
INDEX <index name> ON <table>(<column list>);, usually with vendor extensions
on the CREATE INDEX statement. Remember to recompile the execution plans.
You can't reorganize the database schema itself, because existing queries
would stop working. Of course, as the input changes, the old schema might no
longer be the best possible, so you would have to run the genetic system on a
regular basis, or whenever the performance of the system declined dramatically.


References


Comer, D. "The Difficulty of Optimum Index Selection." ACM Transactions on
Database Systems (vol. 3, 1978).
Fotouhi, Farshad and Carlos E. Galarce. "Genetic Algorithms and the Search for
Optimal Database Index Selection," in Lecture Notes in Computer Science #507.
Berlin: Springer-Verlag, 1991.
Goldberg, D.E. Genetic Algorithms in Search, Optimization, and Machine
Learning. Reading, MA: Addison-Wesley, 1989.
Piatetsky-Shapiro, G. "The Optimal Selection of Secondary Indexes is
NP-Complete." SIGMOD Record (vol. 13, no. 2, 1983).
















































April, 1993
TEXT EDITORS: ALGORITHMS AND ARCHITECTURES


Not much theory, but a lot of practice


 This article contains the following executables: TEXTED.ARC


Ray Valdes


Ray is DDJ's senior technical editor. He can be contacted at
rayval@well.sf.ca.us or 411 Borel Ave., San Mateo, CA 94402.


This article discusses algorithms for implementing text editors. You may
perhaps think that such a subject is passe: "Why do I need to know about this?
My nifty new programming language (or class library or application framework)
lets me write a text-editor application in just 12 lines of code." That may be
true--and if your application's text-editing requirements are fully satisfied
by the built-in edit widget that came with your programming environment, then
party on, dude.
But, in the Windows environment at least, most text-editing chores ultimately
fall on the shoulders of the built-in Edit control, which is poorly equipped
to handle tasks like tabbing, multiple fonts and point sizes, and large files.
Those 12 lines of code using your whizzy class library are often just the
overhead for invoking a class that ultimately invokes the built-in Edit
control.
Ironically, the genesis for this article was a conversation with the designer
of an award-winning app framework, in which I was asked to suggest articles on
text-editor algorithms that could be of help in creating a more powerful
text-edit class for his tool. It turns out there isn't much written on the
subject, despite the dozens of public-domain editors available in source-code
form. Instead, there is an oral history of text-editor algorithms, passed down
from master to novice as part of a rite of passage.


The Simplest Case


Let's consider the simplest example of text editing. Listing One (page 80)
consists of one routine, EditLine(), which allows the user to edit a single
line of text on the screen. Chances are you've written a routine resembling
this. The sample spreadsheet program in Borland's Turbo C++ package, for
example, has a similar routine (and, in fact, this example is an adaptation of
that routine).
EditLine() accomplishes, in scaled-down form, many of the basic tasks
required of a full-fledged text editor:
Display the initial contents of the text buffer on the screen.
Get a keystroke from the user.
If the keystroke is a command (like delete or backspace), dispatch or carry
out that command.
If the keystroke is a character, insert it into the text buffer (or replace
the current character if not in insert mode).
Display the updated contents of the text buffer on the screen.
This sequence repeats until the Return key is pressed, providing us with the
most primitive example of an event-driven program.
In looking at the code, you'll notice that this routine does a lot of
unnecessary work (which, in the case of small amounts of text, does not impact
the user). For example, when only the cursor is moved and no text has changed,
the entire line of text is cleared and redrawn. Further, the same happens in
cases where only one character has changed. Likewise, when a character is
deleted, it's only necessary to display the text on the right-hand side of the
cursor. Not printed here, but included with the electronic version of the
listings (see "Availability," page 5), is an improved EditLine2() routine,
which contains these and other obvious optimizations.


Scaling Up to Spaghetti


Although a full-fledged word processor or commercial desktop-publishing
program is in some sense a beefed-up version of EditLine(), it isn't a trivial
matter to scale up this routine to a real program. Time and again, novice
programmers have followed the garden path of extrapolating from this simple
function into a full-screen text editor via incremental improvements.
Sometimes the result is eminently usable, but just as often the program
collapses under the accumulated weight of haphazard changes. Those who persist
end up rediscovering many of the architectural principles and algorithms
described here.
Because text editors aren't traditional subjects for computer-science
textbooks, written material on implementing them is scarce. By themselves,
the algorithms used in text editors aren't complex or challenging in a formal
mathematical sense. Rather, the challenge is more pragmatic in nature--to make
a disparate collection of simple algorithms work together in a coherent
manner. Still, I did run across one excellent resource, Craig Finseth's The
Craft of Text Editing: Emacs for the Modern World (Springer-Verlag, 1991).
Finseth is coauthor of several Emacs-type editors, including Mince, FinalWord,
and Freyja. I also found a similar, but less complete, discussion in the
electronic archives of the Internet news group comp.editors. The files
editech.[1-4].Z contain short but illuminating comments by Joseph Allen and
Stephen Trier on how to implement a text editor.
However, none of these authors place text-editor algorithms in the larger
context of interactive graphics programs. Implementing a text editor is a
subclass of a more general problem--the same problem faced by implementors of
"draw" programs and other graphical editing tools. Whether manipulating
textual or graphical objects, the fundamental goal is the same: to maintain a
consistent mapping between a data model that exists in an application-specific
data space, and a visual representation of that model that exists in a
graphics coordinate space. The well-known Smalltalk Model/View/Controller
paradigm is an equivalent way of expressing this relationship.
Implementation decisions fall into the following categories:
How text is stored and represented.
How text is formatted for presentation.
How the screen gets updated.
How the program is made customizable, extensible, and modular.
For each category, there are various choices, each reflecting a trade-off
between performance, resource consumption, and ease of implementation.
Over the years, I've written a number of editors for DOS, Sun, Windows, and
Macintosh platforms and have worked with the code of several high-end
electronic publishing systems. I've also studied the code of a number of
publicly available editors, including Richard Stallman's GnuEmacs, Craig
Finseth's Freyja, Dan Lawrence's MicroEmacs, Jonathan Payne's Jove, and on
the DOS platform, Marc Adler's ME, Al Stevens's MemoPad, and Fook Eng's Chi.
These editors span a range of functionality and implementation
strategies, from the monumental 1.5 million lines of code in Interleaf 5.0 and
the 150,000 lines of Lisp and C code that constitute GnuEmacs, to the
medium-weight 25,000 lines of MicroEmacs and 14,000 lines of MemoPad, to the
modest 4000 lines of Chi. Each implementation is a unique mix of choices and
trade-offs.
The topic of text editors is among the most deeply personal of subjects for
many programmers, giving rise to many a flame war on Usenet, CompuServe, and
BBS systems. It's no surprise that one of the Internet newsgroups is called
alt.religion.emacs. The following survey of techniques tries to be as
nondenominational as possible.


Conceptual View of Data


Perhaps the most basic design decision is what conceptual view of text data to
present to the user. Most programming editors regard a text file as a
one-dimensional array of ASCII characters that gets mapped to a
two-dimensional grid of screen positions. An exception is Chi, which treats
text as a one-dimensional array of lines, each of which is a one-dimensional
array of characters. This means that lines do not wrap at the right edge of
the screen, making for a simple and fast implementation--but one that's
cumbersome to use when typing narrative text (as opposed to source code).
To maintain the view of text as an uninterrupted stream of characters, you can
use a number of data structures, such as linked-list structures, the
buffer-gap structure, and virtual-memory blocks.
Plain ASCII text is fine for editing program source code, but other uses
require additional attributes to be associated with the text stream. These
attributes control the way text is formatted or presented on the screen:
typeface, point size, justification, and so on.
One implementation choice is to embed these presentation attributes into the
text stream, thereby mixing formatting commands with text content. This
approach is used in many older-generation typesetting machines, back before
WYSIWYG. In general, this is also the approach that the future version of
GnuEmacs will use. Version 19 of GnuEmacs, to be released later this year,
will view a file as a stream of first-class Lisp objects that can represent
text characters, formatting commands, events such as mouse-clicks, or any
arbitrary Lisp function.

A different way to deal with text attributes is to maintain parallel
streams--one of content, the other of presentation attributes related to
content. This approach is used in Microsoft Word. Likewise, on the Macintosh, the
environment provides an equivalent to the Windows Edit control called TextEdit
that allows for "runs" or sequences of text styles, a "text style" being
defined as a particular combination of font and point-size attributes. Text
styles are implemented as an array of style records that point to locations in
the text stream where a particular "run" begins. There's an elaborate figure
on page 15-38 of Inside Macintosh Volume VI (Addison-Wesley, 1992) that shows
how style records are maintained in relation to the text. In System 7.1, the
WorldScript facility generalizes the notion of runs even further.
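The style-run idea described above can be made concrete with a small sketch. The record layout and function names here are illustrative only, not the actual TextEdit structures; the essential point is that each run record holds the text offset where a new font/size combination takes effect, so finding a character's style is a search over a sorted array:

```c
#include <stddef.h>

/* Hypothetical style-run record: each run points at the offset in the
   text stream where a new font/point-size combination takes effect. */
typedef struct {
    size_t start;       /* offset in the text stream where this run begins */
    int    font_id;     /* font attribute for the run */
    int    point_size;  /* size attribute for the run */
} StyleRun;

/* Return the index of the run covering text position pos.  Runs are
   kept sorted by start offset, so a binary search keeps the lookup
   cheap even when a document accumulates many runs. */
size_t find_run(const StyleRun *runs, size_t n_runs, size_t pos)
{
    size_t lo = 0, hi = n_runs;          /* invariant: answer is in [lo, hi) */
    while (hi - lo > 1) {
        size_t mid = lo + (hi - lo) / 2;
        if (runs[mid].start <= pos)
            lo = mid;
        else
            hi = mid;
    }
    return lo;
}
```

Inserting or deleting text then reduces to adjusting the start offsets of the runs that follow the edit point.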
As you step up from ASCII text editors and simple word-processors to more
sophisticated document-processing and desktop-publishing programs, text can no
longer be regarded as a one-dimensional array of characters (even when
associated with a parallel stream of text attributes). In document processors
such as FrameMaker, Interleaf, or Ventura Publisher, the text content has its
own elaborate structure--words, sentences, paragraphs, subsections,
chapters, appendices, and volumes. This is known as a "semantic" or "logical"
structure, in contrast to the geometrical or visual structure of the
presentation. Document-processing programs have the most complex task of all
graphical editors: maintaining a consistent mapping between two tree-like
structures: the semantic hierarchy of text content and the geometrical
hierarchy of pages, columns, frames, and lines. Dealing with this complexity
in an interactive, optimized manner is what leads to million-line-plus
programs.


The Machine Representation of Text


In the EditLine() example, text is represented in the machine as a single
array of ASCII-encoded bytes. Inserting or deleting a character merely
requires calling movmem() to shift the bytes after the edit point by one
memory location.
Depending on the CPU, this brute-force method can work for even
medium-to-large amounts of text. At some point, of course, this profligate
expense of machine cycles becomes unworkable. Then the "buffer-gap" approach
comes into play.
The buffer-gap approach divides the single array of characters into two parts,
separated by a movable gap. The gap is an internal construct, not visible to
the user. From the user's point of view, text remains in an unbroken stream.
As the user navigates over the text, moving the cursor from one character to
the next, the system updates the corresponding pointer in the text data
structure, skipping over the buffer gap as necessary. When the user types new
text, the system shifts the gap over to the point of insertion,
then shrinks the gap by one character for each keystroke. This method avoids
most of the shuffling and reshuffling of text required by the earlier
approach.
Implementing a buffer-gap manager is not difficult, but requires attention to
detail to avoid fencepost errors. As Finseth points out, three coordinate
systems are in play at the same time. In the user coordinate system, location
0 corresponds to the position before the first character of the text. Note
that coordinates label the positions between the characters, rather than the
characters themselves. This is similar to a 2-D graphical coordinate system,
such as that used by QuickDraw or Windows GDI, which labels the positions
between pixels rather than the pixels themselves.
Second, there's the buffer-gap coordinate system, which is the same as the
user coordinate system, except that the continuum is broken up by the
variable-length gap. The third system is the storage coordinate system, which
labels the memory locations where characters are stored (rather than the
positions between them) and is the one used by pointers to memory. If you
don't scrupulously maintain the distinction between these three coordinate
systems, you'll be plagued by an ongoing cascade of fencepost errors.
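The conversion between the user and storage coordinate systems is where those fencepost errors concentrate, so it is worth isolating in one place. A minimal sketch, using an illustrative gap descriptor (these are not the actual Listing Two variables):

```c
#include <stddef.h>

/* Minimal buffer-gap descriptor (names are illustrative). */
typedef struct {
    char *buffer;      /* start of storage */
    char *gap_start;   /* first byte of the gap */
    char *gap_end;     /* first byte past the gap */
} BufGap;

/* Map a user-coordinate position (0 = before the first character) to
   the storage address of the character that follows it, skipping over
   the gap when necessary. */
char *user_to_storage(const BufGap *bg, size_t pos)
{
    size_t before_gap = (size_t)(bg->gap_start - bg->buffer);
    if (pos < before_gap)
        return bg->buffer + pos;              /* left of the gap */
    return bg->gap_end + (pos - before_gap);  /* right of the gap */
}

/* The inverse: map a storage address back to a user coordinate,
   e.g. for reporting a cursor position. */
size_t storage_to_user(const BufGap *bg, const char *p)
{
    if (p < bg->gap_start)
        return (size_t)(p - bg->buffer);
    return (size_t)(bg->gap_start - bg->buffer) + (size_t)(p - bg->gap_end);
}
```

Keeping every pointer-to-position conversion behind two functions like these means there is exactly one place where the gap can be forgotten.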
Fortunately, the code available electronically contains buf_gap.c, a module
that implements all the basic functions for managing a buffer gap--inserting
and deleting text, moving the gap, expanding the gap, searching the buffer for
a particular string, and so on. Excerpts from buf_gap.c are shown in Listing
Two (page 80). This code is heavily based on an example posted by Joseph Allen
to the Internet on September 10, 1989. The module is not a stand-alone
program, but assumes other modules for input, command dispatching, redisplay,
memory allocation, and screen output.
In contrast to the buffer-gap approach, some text editors use a doubly linked
list of lines to represent a text file. As with the buffer-gap, the lower
levels of the program shield the higher levels from dealing with the
particulars of the machine representation. As regions of text are deleted or
inserted, the lines in the linked list are split or joined as necessary. This
is used by many text editors written in Lisp or Lisp-like languages. Although
written in C, Jove also uses this approach, as does Bill Joy's vi. A variant
was used by the Xerox Bravo editor, which kept the text file in unbroken form
at the memory location where it was initially loaded; as changes occurred
during an editing session, pointers were inserted into the original text to
link in inserted text or to skip over deleted text. Upon finishing the
session, the complete file was written out by traversing the resulting tangle
of pointers.
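The split-and-join bookkeeping that a linked-list-of-lines editor performs can be sketched briefly. This is an illustrative fragment (the types and names are mine, not drawn from Jove or vi): splitting a line is what happens when the user types a newline mid-line, and joining is its inverse when a newline is deleted.

```c
#include <stdlib.h>
#include <string.h>

/* One line in a doubly linked list of lines (illustrative layout). */
typedef struct Line {
    struct Line *prev, *next;
    size_t       length;
    char        *text;        /* line contents, newline not stored */
} Line;

/* Split 'line' at column col, as when the user types a newline in the
   middle of a line.  Returns the newly created second half, or NULL
   on allocation failure. */
Line *split_line(Line *line, size_t col)
{
    Line *tail = malloc(sizeof *tail);
    if (!tail) return NULL;
    tail->length = line->length - col;
    tail->text   = malloc(tail->length + 1);
    if (!tail->text) { free(tail); return NULL; }
    memcpy(tail->text, line->text + col, tail->length);
    tail->text[tail->length] = '\0';

    line->length    = col;            /* truncate the first half */
    line->text[col] = '\0';

    tail->prev = line;                /* link the new node in */
    tail->next = line->next;
    if (line->next) line->next->prev = tail;
    line->next = tail;
    return tail;
}
```

The higher levels of the editor never see this surgery; they only ask the buffer layer to insert a character at a position.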
A limitation of all the above methods is that they don't work well with files
too large to fit into physical memory. Even on systems with hardware-supported
virtual memory, performance can still be poor. To deal with the size
limitation and the lack of hardware virtual-memory support, many PC-based text
editors, such as Mince and Epsilon, implement their own virtual-memory scheme
in software. In this approach, a file is broken up into fixed-size pages,
whose size is a multiple of the physical disk-block size. A working set of
pages--as many as will fit--is loaded into RAM, with the remainder kept in a
swap file on disk. Pages on disk are swapped in as needed; pages in RAM are
swapped out on a least-recently-used basis. This software-only virtual-memory
mechanism exists at a lower level than the rest of the editor routines. As
with the buffer-gap and linked-list approaches, the low-level coordinate
system is rendered invisible to the higher-level routines that do inserts,
deletes, and searches. In this case, the low-level coordinate system requires
a tuple (a page number and an offset within the page) to identify a location
in the virtual text stream.
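The core of such a software paging layer is small. The following sketch is illustrative only (frame count, page size, and names are my own, not Mince's or Epsilon's); real code would also read the incoming page from the swap file and write back a dirty victim where the comments indicate:

```c
#define PAGE_SIZE  2048u   /* a multiple of the 512-byte disk block */
#define NUM_FRAMES 4       /* size of the in-RAM working set */

/* One RAM frame in the software virtual-memory layer (illustrative). */
typedef struct {
    int           page;       /* which file page is resident; -1 if free */
    unsigned long last_used;  /* tick of the most recent access */
    char          data[PAGE_SIZE];
} Frame;

static Frame frames[NUM_FRAMES];
static unsigned long clock_tick;

static void vm_init(void)
{
    int i;
    for (i = 0; i < NUM_FRAMES; i++)
        frames[i].page = -1;
}

/* Return the frame holding 'page', faulting it in if necessary by
   evicting the least-recently-used frame.  Real code would do the
   swap-file I/O at the points noted below. */
Frame *vm_get_page(int page)
{
    int i, victim = 0;
    for (i = 0; i < NUM_FRAMES; i++)
        if (frames[i].page == page) {        /* hit: already resident */
            frames[i].last_used = ++clock_tick;
            return &frames[i];
        }
    for (i = 1; i < NUM_FRAMES; i++)         /* miss: pick the LRU victim */
        if (frames[i].last_used < frames[victim].last_used)
            victim = i;
    frames[victim].page = page;              /* write-back + read-in go here */
    frames[victim].last_used = ++clock_tick;
    return &frames[victim];
}
```

Everything above this layer addresses text by the (page, offset) tuple and never notices whether a page was already resident.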
All these schemes need to be modified in situations where text isn't
represented by 8-bit ASCII encoding. For example, Windows NT uses 16-bit
Unicode internally for all text. Such an adaptation would be straightforward.
More complex is the case in which the document consists of a stream of
variable-length objects or has a tree-like semantic structure. Programs such
as Aldus
PageMaker use their own object-oriented database manager to store and retrieve
these structures.


Incremental Redisplay Algorithms


Given a bunch of stored text, how is the visual representation of it derived?
Again, there's a spread from simple to complex. In the simplest case, where
there's no word-wrap and no typographic formatting, the algorithm for
generating a screen display is trivial: Clear the screen, find the text
position that corresponds to the top of the screen, then step through lines of
text in the buffer one at a time, outputting each one until you reach the
bottom. This algorithm is the 2-D analogue of the one in EditLine().
Unfortunately, it is basically useless for an interactive program, due to
intolerable flashing that results from the constant regeneration of the entire
screen display. However, a version of the algorithm can be used for generating
the initial screen display at program startup.
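That trivial pass can be sketched against an in-memory screen image. This is a minimal illustration; the array-based "screen," fixed dimensions, and names are my own, not from the article's listings:

```c
#include <string.h>

#define ROWS 4
#define COLS 16

/* Trivial full-display generation, the 2-D analogue of EditLine()'s
   repaint: clear the screen image, then copy one buffer line per row,
   starting from the line at the top of the screen. */
void redisplay_full(char screen[ROWS][COLS],
                    char **lines, int n_lines, int top_line)
{
    int r;
    for (r = 0; r < ROWS; r++) {
        memset(screen[r], ' ', COLS - 1);    /* "clear the screen" */
        screen[r][COLS - 1] = '\0';
        if (top_line + r < n_lines) {
            size_t n = strlen(lines[top_line + r]);
            if (n > COLS - 1)                /* clip to the screen width */
                n = COLS - 1;
            memcpy(screen[r], lines[top_line + r], n);
        }
    }
}
```

Run on every keystroke this repaints every cell, which is exactly why it flashes; the incremental versions below exist to avoid calling it.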
The next step taken by many authors is to transform this trivial
display-generation routine into one for incremental redisplay. The changes are
analogous to going from EditLine() to EditLine2(). Instead of doing the entire
task each and every time, only the necessary work is done. A number of data
structures come into play to avoid constant recalculations and unnecessary
output to the screen.
Before deciding which data structures can be used in the time vs. space
trade-off, there is a basic decision to make: How extensible should the editor
be? Because the choice of editor is so deeply felt, many implementors try to
make their editor as customizable as possible. The decision has profound
architectural implications, because intermixing command processing with
redisplay routines (as shown in EditLine2()) often renders the editor
customizable only by the original author.
To allow for greater extensibility, many authors choose a bipartite
architecture in which the low-level routines, including the redisplay, are
implemented in a language such as C or assembler, and the high-level commands
are implemented in a simple interpreted extension language that can only
access the "inner editor" via an API. The extension language of the "outer
editor" is often Lisp-like (in the case of GnuEmacs, Brief, Sine, ZMacs, and
others), but can also resemble C (in the case of Epsilon, CBrief, ME), Forth
(in the case of Final Word), Basic (Microsoft Word), or even Awk (Sage). The
user-visible commands (such as "delete a word" or "move forward one sentence")
are implemented in the extension language, and cannot directly access the
redisplay structures except where allowed by the API.
In this case, the redisplay algorithm needs to be more intelligent--able both
to generate a new image from first principles and to optimally update the
existing image using hints left by low-level buffer-management routines.
In mapping text data to screen display, a common technique is to keep a small
array of records that maintain the correspondence between text-stream
locations and screen coordinates. This text-to-screen map is updated as
changes are made to the text buffer. The buffer-management routines can
invalidate part or all of the mapping structure, depending on the extent of
changes to the text content. The map gets rebuilt as necessary during the
editing session and is discarded upon termination. While this approach works
well for plain ASCII text editors, it becomes inadequate for more
sophisticated word-processing and electronic-publishing programs.
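A minimal form of such a text-to-screen map, with the invalidation protocol between the buffer manager and the redisplay pass, might look like this (record layout and names are hypothetical):

```c
/* One screen row's worth of the text-to-screen map (illustrative). */
typedef struct {
    long start;    /* buffer position of the first character on the row */
    int  length;   /* number of characters shown on the row */
    int  valid;    /* cleared when the buffer manager invalidates the row */
} RowMap;

#define SCREEN_ROWS 25
static RowMap row_map[SCREEN_ROWS];

/* Called by the buffer manager after an edit at buffer position pos:
   any row whose text extends past the change can no longer be trusted,
   so the next redisplay pass must rebuild it from the buffer. */
void map_invalidate_from(long pos)
{
    int r;
    for (r = 0; r < SCREEN_ROWS; r++)
        if (row_map[r].valid && row_map[r].start + row_map[r].length > pos)
            row_map[r].valid = 0;
}

/* Redisplay asks: which screen row holds buffer position pos?
   Returns -1 if the mapping for that region must be rebuilt first. */
int map_position_to_row(long pos)
{
    int r;
    for (r = 0; r < SCREEN_ROWS; r++)
        if (row_map[r].valid &&
            pos >= row_map[r].start &&
            pos <  row_map[r].start + row_map[r].length)
            return r;
    return -1;
}
```

The division of labor is the point: the buffer manager only marks rows stale, and the redisplay pass decides when to pay the cost of recomputing them.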
In addition to the text-to-screen map, another useful data structure is one
that optimizes output to the physical screen. In the early days of text
editors, character-mode terminals were connected via low-bandwidth lines to a
host computer (as was the case for both time-shared systems and CP/M
microcomputers) and much attention was paid to minimizing data transfer to the
screen. For timeshared systems, the situation was complicated by the wide
variety of terminals that could be connected to the host computer. Some
programs dealt with this by keeping track of two screen images: a virtual
image that represented the desired display and a second image that represented
the currently visible terminal screen. One program went to heroic lengths to
minimize the bytes transferred over the communication line, by using a
sophisticated dynamic-programming algorithm that calculated the optimal
sequence of commands to update the screen, tuned to the particular brand of
terminal device. However, users found it somewhat disconcerting, because
portions of the screen would jump up and down seemingly haphazardly as the
system moved snippets of text around to piece together a complete display,
exploiting the terminal's built-in commands for scrolling, insertion, and
deletion. The source code to this module was prefaced with the following
comment: "This routine is rather complicated. If you read this code and think
you understand it, you are very wrong."
Since then, both PCs and workstations have memory-mapped displays, so
communications bandwidth is no longer the problem it was. At the same time, the
formatting process has become much more complicated. Instead of monospaced
fonts and simple word wrap, formatting has become more like typographic
composition. A simple linebreak routine has now become a research project. For
example, Donald Knuth, in TeX: The Program (Addison-Wesley, 1986) considers
his line-breaking algorithm the "most interesting algorithm of TeX," and
devotes 40 pages of his book, as well as a journal article, to describing it.
Knuth writes: "The line-breaking problem can be regarded as a special case of
the problem of computing the shortest path in an acyclic network."
TeX's typographic capabilities are indeed impressive, and Knuth's
boxes-and-glue metaphor for page-level formatting is elegant. Nevertheless,
the task of formatting in TeX is eased by the fact that it's a noninteractive
batch program. This explains why an interactive electronic-publishing program
with similar capabilities is an order of magnitude larger than the 60,000
lines of TeX.
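For contrast, the "simple word wrap" that typographic composition replaced really does fit in a few lines. This first-fit (greedy) sketch uses a hypothetical interface of my own, taking word widths in characters; TeX's shortest-path formulation instead considers every feasible break point before committing to any line:

```c
/* First-fit ("greedy") word wrap: put as many words on each line as
   fit, then break.  words[] holds each word's width in characters;
   break indices (the word that starts each new line) are written to
   breaks[].  Returns the number of lines produced. */
int greedy_wrap(const int *words, int n_words, int max_width, int *breaks)
{
    int lines = 0, width = 0, i;
    for (i = 0; i < n_words; i++) {
        /* width if word i joins the current line (plus one space) */
        int need = (width == 0) ? words[i] : width + 1 + words[i];
        if (width > 0 && need > max_width) {
            breaks[lines++] = i;    /* break before word i */
            width = words[i];
        } else {
            width = need;
        }
    }
    if (width > 0)
        lines++;                    /* the last, partially filled line */
    return lines;
}
```

Greedy wrap is locally optimal and can leave a badly loose last line, which is precisely the defect Knuth's global algorithm removes.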
Delving into such complexity is beyond the scope of this article, but some
general observations can be made. As the formatting requirements become more
complex, deriving the text-to-screen map is no longer cheap. The mapping
structures are therefore no longer blithely discarded once they have been
computed, but are instead kept around on a permanent basis, stored on disk
along with the text content (vastly increasing resource requirements). To
reduce the lag between the time a keystroke is entered and the time the screen
gets updated, some systems take advantage of multithreading, if it is
available. For example, in the OS/2 version of PageMaker, the formatting
process is a separate thread from the input process. This dynamic thread
architecture is the logical extension of the static module decomposition used
by singly threaded implementations.


Conclusion


There are so many degrees of freedom in implementing text editors, it's no
wonder that there are so many instances, each one unique. As a concrete
illustration of this discussion, the electronic version of the listings
includes a multifont, mouse-aware text editor I wrote some time ago for
Windows. I hope the context presented here will make the 4000 lines of C code
in my implementation more understandable.

_TEXT EDITORS: ALGORITHMS AND ARCHITECTURES_
by Ray Valdes


[LISTING ONE]

/* EditLine() -- The simplest text editing routine */
void EditLine(char* buffer, int max_length, int curr_row)
{
 int c, str_length = strlen(buffer), curr_column = str_length,
 insert_mode = TRUE;
 ChangeCursorShape(insert_mode);
 do
 { vt_ClearLineAt(curr_row,0);
 vt_OutputStringAt(buffer, curr_row,0);
 vt_SetCursorPositionAt(curr_row,curr_column);
 switch(c = vt_GetKeystroke()) /* dispatch on user's keystroke */
 {
 /*-------- keystrokes that terminate the editing session-----*/
 case ESCAPE_KEY:
 case ENTER_KEY: break;
 /*--------- keystrokes that merely change the cursor position---*/


 case HOME_KEY: curr_column = 0; break;
 case END_KEY: curr_column = str_length; break;
 case LEFT_KEY: if (curr_column > 0) curr_column--; break;
 case RIGHT_KEY: if (curr_column < str_length) curr_column++;
 break;
 case INSERT_KEY: insert_mode = !insert_mode;
 ChangeCursorShape(insert_mode);
 break;
 /*------ keystrokes that alter the contents of the buffer----*/
 case BACKSPACE_KEY: if (curr_column > 0)
 {
 movmem( &buffer[curr_column], /*source*/
 &buffer[curr_column-1], /*dest*/
 str_length - curr_column + 1);
 curr_column--;
 str_length--;
 }
 break;
 case DELETE_KEY: if (curr_column < str_length)
 {
 movmem( &buffer[curr_column+1], /*source*/
 &buffer[curr_column], /*dest*/
 str_length - curr_column);
 str_length--;
 }
 break;
 default: if (((c >= ' ') && (c <= '~')) &&
 (str_length < max_length))
 {
 if (insert_mode)
 {
 movmem(
 &buffer[curr_column],
 &buffer[curr_column + 1],
 str_length - curr_column + 1);
 str_length++;
 }
 else if (curr_column >= str_length)
 str_length++;
 buffer[curr_column] = c;
 curr_column++;
 }
 break;
 }
 buffer[str_length] = '\0';
 }
 while ((c != ENTER_KEY) && (c != ESCAPE_KEY));
}





[LISTING TWO]
/****************************************************************************
Excerpts from BUF_GAP.C--buffer gap manager module. Derived by Ray Valdes from
code by Joseph H. Allen, who wrote in his post to the comp.editors newsgroup
on 9/10/89: "Do whatever you like with this, just leave my name on it."

*****************************************************************************/

private unsigned sizeofBuffer; /* The size of theBuffer */
private char* thePoint; /* The point */
private char* theBuffer; /* The buffer */
private char* theEndOfBuffer; /* First character not in buffer */
private char* theStartOfGap; /* Beginning of the gap */
private char* theEndOfGap; /* First character past the gap */
private bool isBufferChanged; /* Set when file has been changed */

#define SIZEOF_GAP_INCREMENT 16384 /* Amount that the buffgap grows by */

/****************************************************************/
public bool
bg_InitializeModule(void)
{ sizeofBuffer = SIZEOF_GAP_INCREMENT;
 theBuffer = (char* ) mem_AllocMem(sizeofBuffer);
 if(!theBuffer) return FALSE;
 thePoint = theBuffer;
 theStartOfGap = theBuffer;
 theEndOfGap = theBuffer + SIZEOF_GAP_INCREMENT;
 theEndOfBuffer= theEndOfGap;
 return TRUE;
}
/****************************************************************/
public void
bg_ExpandBuffer(unsigned amount)
{
 if( (theEndOfBuffer + amount - theBuffer) > sizeofBuffer)
 { char* old = theBuffer;
 sizeofBuffer = theEndOfBuffer + amount
 + SIZEOF_GAP_INCREMENT - theBuffer;
 theBuffer = (char* ) mem_ReallocMem(theBuffer, sizeofBuffer);
 if(!theBuffer) ProgramError("ReallocMem failed!");
 thePoint += theBuffer - old;
 theEndOfBuffer += theBuffer - old;
 theStartOfGap += theBuffer - old;
 theEndOfGap += theBuffer - old;
 }
}
/****************************************************************/
public void
bg_MoveGapToPoint(void)
{
 if(thePoint==theStartOfGap) return;
 if(thePoint==theEndOfGap) { thePoint = theStartOfGap; return; }
 /*else*/
 if(thePoint < theStartOfGap)
 { bg_MoveBytes( theEndOfGap - (theStartOfGap-thePoint),
 thePoint, theStartOfGap - thePoint);
 theEndOfGap = theEndOfGap-(theStartOfGap-thePoint);
 theStartOfGap = thePoint;
 }
 else
 { bg_MoveBytes(theStartOfGap,theEndOfGap,thePoint-theEndOfGap);
 theStartOfGap += thePoint-theEndOfGap;
 theEndOfGap = thePoint;
 thePoint = theStartOfGap;
 }

}
/****************************************************************/
public void
bg_ExpandGap(unsigned size)
{ if(size > bg_SizeofGap())
 {
 size += SIZEOF_GAP_INCREMENT;
 bg_ExpandBuffer(size);
 bg_MoveBytes(theEndOfGap+size,
 theEndOfGap, theEndOfBuffer - theEndOfGap);
 theEndOfGap += size;
 theEndOfBuffer += size;
 }
}
/****************************************************************/
public bool
bg_FindNextNewline(void)
{ while(((thePoint==theStartOfGap) ? (thePoint=theEndOfGap) : (thePoint))
 != theEndOfBuffer)
 {
 if(*thePoint==NEWLINE_CH) return TRUE;
 else thePoint++;
 }
 return FALSE;
}
/****************************************************************/
public void
bg_InsertStringAtPoint(char* string, unsigned size)
{ bg_MoveGapToPoint();
 if(size > bg_SizeofGap())
 bg_ExpandGap(size);
 bg_MoveBytes(theStartOfGap,string,size);
 theStartOfGap += size;
 isBufferChanged = TRUE;
}
/****************************************************************/
public bool
bg_CompareString(char* string, unsigned size)
{ char* x;
 if(thePoint==theStartOfGap) thePoint=theEndOfGap;
 if( (theStartOfGap > thePoint )
 && (theStartOfGap < thePoint + size )
 && (theStartOfGap != theEndOfGap) )
 {
 if(bg_CompareString(string,theStartOfGap-thePoint)) return TRUE;
 else
 {
 x = thePoint;
 thePoint = theEndOfGap;
 if(bg_CompareString( string + (theStartOfGap-x)
 , size - (theStartOfGap-x)))
 { thePoint=x; return TRUE; }
 else { thePoint=x; return FALSE; }
 }
 }
 else
 {
 x = thePoint;
 do { if(*(x++) != *(string++)) return TRUE; } while(--size);

 return FALSE;
 }
}
/****************************************************************/
public bool /*this routine assumes file is already open*/
bg_InsertFile(FILE* file)
{ unsigned amount;
 long file_size = filelength(fileno(file));

 if(file_size==0L) return TRUE;
 if(file_size > 32767L) return FALSE;

 isBufferChanged = TRUE;

 bg_MoveGapToPoint();
 bg_ExpandGap((int)file_size);

 amount = fread(theStartOfGap, 1, file_size, file);
 if(amount != file_size)
 {
 ProgramError("I/O Error on reading file.");
 return FALSE;
 }
 theStartOfGap += amount;
 return TRUE;
}


April, 1993
A WAVELET ANALYZER


An alternative to the FFT-based spectrum analyzer


 This article contains the following executables: WAVELET.ARC


Mac A. Cody


Mac is an engineering specialist for a Dallas, Texas defense contractor. He
can be contacted at 214-205-6452.


The ubiquitous (and indispensable) oscilloscope and its cousin, the spectrum
analyzer, are the basic tools of the trade for electronic engineers. The
oscilloscope is used for viewing signals; its display shows the signal as a
function of voltage vs. time. The spectrum analyzer allows you to view the
signal as a function of voltage vs. frequency. Some versions of the spectrum
analyzer use a frequency-adjustable, narrow-band filter to extract the
spectral content of the signal. As the filter's center frequency is swept from
lower to higher frequencies, the output-signal energy is traced upon an
oscilloscope screen. The resulting plot is the time-averaged spectrum of the
input signal. More modern spectrum analyzers use the fast Fourier transform
(FFT) to directly calculate the spectrum of the sampled signal.
An alternative to the Fourier transform is the wavelet transform (discussed in
my article "The Fast Wavelet Transform," Dr. Dobb's Journal, April 1992), which
is capable of simultaneous analysis of both time and frequency in signals. In
addition, an infinite variety of wavelets enables you to tailor that analysis
according to the wavelet chosen. The fast wavelet transform provides an
efficient implementation of the wavelet transform for use on digital
computers.
This article presents an implementation of a wavelet analyzer (a variation of
the spectrum analyzer), which uses the fast wavelet transform rather than the
FFT to analyze the signal. The fast wavelet transform is implemented as a
recursive routine on a digital signal processing (DSP) board based on an AT&T
DSP32 or DSP32C DSP IC. The input is a voice-grade audio signal sampled and
digitized by a "codec" (a type of analog-to-digital converter) on the DSP
board. The wavelet analyzer uses the features and capabilities of a standard
VGA controller to display the input signal and wavelet transform coefficients
as they are generated in real time.


No Recourse But to Recurse


My previous article provided a complete description of the fast wavelet
transform algorithm; here, I'll focus on one variation. The tree or pyramid
structure of the fast wavelet-transform algorithm naturally leads to a
recursive implementation; see Figure 1. The input of each level of the tree or
pyramid algorithm depends upon the output of the previous approximation level.
This is true even for the first level, since the input samples are themselves
an approximation of the continuous function that they represent. Note,
also, that the filter pairs for all levels are identical. Each level of
"filtering" is followed by down-sampling (or decimation) by a factor of 2
prior to sending a sample to the next-lower level.
Therefore, the recursion for each level acts upon the same filter pair with
the same set of wavelet-filter coefficients. Each level of recursion occurs
half as often as each higher level. Recursion terminates under two conditions.
The first (and most common) condition is that the current level has not yet
accumulated two new samples, so decimation cannot occur. The decimation operation
occurs as a result of sliding the samples through the filters two at a time,
generating one output. Before proceeding to the next-lower level, recursion
must occur twice, generating two approximation coefficient samples. The other
condition for termination of recursion is execution of the lowest (and last)
level of approximation and detail filters. When this happens, the
approximation and detail-coefficient samples are both output. Depending upon
the level of recursion performed, there can be from three to nine signal,
detail-coefficient, and approximation-coefficient samples output from each
recursion for a six-level fast wavelet transform; see Table 1.
Table 1: The recursive fast wavelet transform generates a varying number of
detail samples for every two signal samples it receives.

  Signal      Detail-level sample       Approx.   Samples
  samples     5   4   3   2   1   0     sample    to plot

   0, 1       0                                      3
   2, 3       1   0                                  4
   4, 5       2                                      3
   6, 7       3   1   0                              5
   8, 9       4                                      3
  10, 11      5   2                                  4
  12, 13      6                                      3
  14, 15      7   3   1   0                          6
  16, 17      8                                      3
  18, 19      9   4                                  4
  20, 21     10                                      3
  22, 23     11   5   2                              5
  24, 25     12                                      3
  26, 27     13   6                                  4
  28, 29     14                                      3
  30, 31     15   7   3   1   0                      7
  32, 33     16                                      3
  34, 35     17   8                                  4
  36, 37     18                                      3
  38, 39     19   9   4                              5
  40, 41     20                                      3
  42, 43     21  10                                  4
  44, 45     22                                      3
  46, 47     23  11   5   2                          6
  48, 49     24                                      3
  50, 51     25  12                                  4
  52, 53     26                                      3
  54, 55     27  13   6                              5
  56, 57     28                                      3
  58, 59     29  14                                  4
  60, 61     30                                      3
  62, 63     31  15   7   3   1   0      0           9

In my previous article, the program listings for the transform contain control
statements that handle the ends of data arrays. This is necessary because the
data arrays have finite length in both directions. In the case of real
signals, however, once sampling of data begins, it can go on endlessly (or as
long as the hardware and power supply hold out). As a result, the algorithm
must now deal with a nonterminating data sequence, which actually makes the
algorithm and code implementation simpler, since new data will always be
available.
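The recursive shape just described can be sketched in plain C (the article's actual implementation, in Listing One, is DSP32 assembly). The filter values, level count, and names below are illustrative stand-ins, not the article's wavelet coefficients: each level convolves its input-approximation window with a low-pass and a high-pass filter, emits the detail sample for display, deposits the approximation sample one level down, and recurses only on every second deposit, which is what realizes the decimation by 2.

```c
#define TAPS   4
#define LEVELS 3   /* small, for illustration; the article uses six */

/* Stand-in filter pair: h[] low-pass, g[] high-pass (not real wavelet
   coefficients; h sums to 1 and g sums to 0). */
static const float h[TAPS] = { 0.25f,  0.25f, 0.25f,  0.25f };
static const float g[TAPS] = { 0.25f, -0.25f, 0.25f, -0.25f };

/* approx[level] holds the most recent TAPS input-approximation samples
   for that level; count[level] counts samples deposited at that level. */
static float approx[LEVELS + 1][TAPS];
static int   count[LEVELS + 1];

void decomp(int level, float *detail_out)
{
    float lo = 0.0f, hi = 0.0f;
    int i;
    for (i = 0; i < TAPS; i++) {        /* the two data convolutions */
        lo += h[i] * approx[level][i];
        hi += g[i] * approx[level][i];
    }
    detail_out[level] = hi;             /* detail coefficient: display it */
    if (level == 0)
        return;                         /* lowest level: recursion stops */
    for (i = 0; i < TAPS - 1; i++)      /* slide the lower level's window */
        approx[level - 1][i] = approx[level - 1][i + 1];
    approx[level - 1][TAPS - 1] = lo;   /* deposit the approximation */
    if (++count[level - 1] % 2 == 0)    /* recurse on every 2nd deposit */
        decomp(level - 1, detail_out);
}

/* Feed one input sample from the codec; the transform runs whenever a
   new pair has arrived at the top level. */
void decomp_feed(float sample, float *detail_out)
{
    int i;
    for (i = 0; i < TAPS - 1; i++)
        approx[LEVELS][i] = approx[LEVELS][i + 1];
    approx[LEVELS][TAPS - 1] = sample;
    if (++count[LEVELS] % 2 == 0)
        decomp(LEVELS, detail_out);
}
```

Because the input stream never terminates, there is no end-of-array handling at all: a window simply keeps sliding at every level, half as often per level of descent.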


DSP Implementation


The recursive fast wavelet transform is implemented on the AT&T DSP32 and
DSP32C digital signal processors. I chose the DSP32 and DSP32C because of
their support for floating-point math at high speeds (8 and 24 MFLOPs,
respectively), pointer addressing, and C-like assembly language. These DSPs
are described more fully in Jim Bittman's article, "Adding the Power of DSP to
Your Application" (Dr. Dobb's Journal, May 1991).
The fast wavelet-transform algorithm borrows heavily from the field of DSP, so
it can easily be taken to a full DSP implementation. Four source-code modules,
written in AT&T DSP32 and DSP32C assembly languages, form the DSP component of
the wavelet analyzer. Listing One (page 82) is the recursive transform;
Listing Two (page 82) is the pixel-plotting routine; Listing Three (page 87),
the main loop; Listing Four (page 90), the storage allocator for variables and
constants; and Listing Five (page 90), the make and header files.


Recursive Fast Wavelet Transform


The routine DECOMP (see Listing One) implements the recursive fast wavelet
transform, or signal decomposition. The routine first retrieves pointers to
the input- and output-approximation storage arrays. The input-approximation
coefficients are supplied by the next-higher level of the transform (or
recursion). In the case of the first level of recursion, the input consists of
the signal samples from the codec. The output-approximation coefficients are
those generated by the current level of the transform.
The routine then performs two data convolutions between the
input-approximation coefficients and the high-pass and low-pass wavelet
filters (the wavelet coefficients). This operation is coded inline to avoid
the execution-time penalty of control loops. A jump vector supplied by the
calling function causes DECOMP to execute the necessary number of convolution
instructions for the length of the wavelet filters.
The first convolution produces one wavelet-detail coefficient sample. This
sample is stored in an array for eventual display on the host system's VGA.
The second convolution produces one wavelet-approximation coefficient sample.
At the same time, the input-approximation coefficients are shifted by two
storage locations, thereby producing the decimation factor of 2 and opening up
two free storage locations for new approximation samples from the next-higher
transform level. What happens to the newly calculated
approximation-coefficient sample depends on the current level of recursion of
DECOMP and the number of samples previously generated at this level of
recursion.
As DECOMP executes each level of recursion, a recursion counter is
decremented. When the count reaches 0, DECOMP has reached the desired lowest
level of the wavelet transform, level 0. At this point the new
approximation-coefficient sample is placed into the array with the other
detail-coefficient samples for display. This coefficient is the level-0
approximation coefficient for the wavelet transform.
If the recursion count has not reached 0, the approximation-coefficient sample
is destined for input into the next lower level of the wavelet transform. If
the sample is the first of a pair of samples for the next lower level, then it
is stored in the appropriate input-approximation coefficient array, and DECOMP
then returns to the next higher level of recursion. If this sample is the
second sample of the pair, then it is stored and DECOMP calls itself to
proceed down to the next level of the transform (and the recursion). At the
same time, the output pointer is realigned to allow two more approximation
samples to be placed into the output array during future recursions.


Pixel-plotting Routine


The DRAWIMAG routine plots each sample generated by the codec or DECOMP into
one of two sets of image bitmaps in the DSP memory; see Listing Two. Each set
consists of eight bitmaps corresponding to the displays of the input signal,
the six detail levels, and the level-0 approximation. Each map is designed to
hold eight unit intervals of sample data or wavelet coefficients. A unit
interval is the number of samples required to generate a wavelet transform to
a desired number of levels. In this application, a six-level wavelet transform
has a unit interval of 2^6, or 64, samples at the input.
Each map consists of 50 rows of bytes, which correspond to the 50 rows
assigned for each signal trace on the VGA display. The number of bytes per row
depends upon the number of samples in eight unit intervals for the input
signal or wavelet-coefficient level displayed. For example, the level-2 detail
bitmap in this application has four samples per unit interval. With eight unit
intervals, this bitmap needs four bytes per row. Exceptions to this general
rule are the level-0 detail and approximation-display bitmaps, which require
two bytes, rather than one, per row. This is necessary because the DSP32 and
DSP32C are restricted to 16- and 32-bit DMA transfers through the parallel
data register. The additional byte in each row is unused and set to 0.
Plotting a sample value into the bitmap involves translating the sample value
into a count (either up or down) from the baseline in the bitmap; see Figure
1. The count determines the length of a line of pixels drawn from the
baseline. The byte-column pointer and bit pointer determine in which pixel
column the line will be drawn. There is a bit pointer, byte-column pointer,
and baseline offset value for each of the eight bitmaps of the image.
Depending upon the value of the bit pointer, the drawing operation is either a
direct write of the pixel values to memory or a logical OR with the current
contents of the bitmap prior to the write. When the bit pointer selects the
most significant bit in the byte, all 50 bytes in the currently selected byte
column are written to so that the previously displayed image is overwritten
while the new image is written. The write operation proceeds from the
upper-blank portion of the image to the trace pixels, and then to the
lower-blank portion of the image. If the sample value is full scale in the
positive direction, the upper-blank fill operation is skipped; if it is full
scale in the negative direction, the lower-blank fill operation is skipped.
Trace pixels are not drawn if the signal level is below the scale of the
least-significant pixel.
When the bit pointer points to a bit other than the most significant, the
pixel-drawing routine uses logical ORing of the bit pointer and the bytes in
the column to turn on the trace pixels. This process takes longer than the
direct write operation described previously, but it is performed only on the
bytes where the trace image will reside. The baseline offset determines the
starting byte for the line of pixels drawn. While calculating the pixel draw
count, the direction of the draw operation from the baseline is established. A
draw count of 0 results in no pixels being drawn on that bit column.
Once the column of pixels is drawn, the bit pointer is shifted down one bit
and saved for the next call of DRAWIMAG for that bitmap. If the bit pointer is
0 after the shift, then it is set to point to the most-significant bit, and
the byte column pointer is incremented and saved. The pointers move through
all the bit columns of their respective bitmaps until all samples of the
eight unit intervals are plotted.


Main DSP Control Loop


ANALYZER is the main routine for the DSP portion of the wavelet-analyzer
program; see Listing Three. The routine initializes the DSP registers, memory,
and codec prior to entering the main processor loop. Within the main processor
loop, signal samples are taken from the codec and passed to DECOMP for
processing to generate the next group of wavelet coefficients. After returning
from DECOMP, DRAWIMAG is called several times to plot the group of signal
samples and wavelet-coefficient samples into the various image bitmaps for
eventual display on the VGA screen. The plotting operation can exceed the time
of arrival of the next sample from the codec. To prevent loss of a sample
during plotting, the serial-input buffer flag is checked after each plot until
the sample arrives. Once the sample arrives, it is acquired and the remainder
of the plotting operations are completed.
The ANALYZER main loop processes eight unit intervals (512 data samples) to
fill the bitmaps with new information to display. Then the signal processor
flags the host processor and informs the host program which set of bitmaps
(IMAGE_0 or IMAGE_1) holds the latest set of display data. The signal
processor then reinitializes the pointers for DRAWIMAG to plot to the other
set of bitmaps. Before proceeding with processing the next pair of signal
samples, the DSP checks that the host processor has acknowledged the flag and
waits if it hasn't.


Watchin' the Waves Roll In...


Once the DSP is running, the host PC essentially acts as an I/O server for the
DSP board. Whenever a new block of image data is plotted and ready for
display, the DSP board flags the host PC to retrieve the image and display it
on the VGA screen. This process is extremely time critical, since new image
data is ready to be displayed every 64 milliseconds. The PC must shift the
pixels of previously displayed images to make room for the new image and then
transfer the image from the DSP memory to the display memory. Thus, the
display device must be able to receive, manipulate, and display the image data
as quickly as it becomes available. A standard VGA controller and display are
used for this task. The VGA provides the graphics resolution, color, and
control features necessary to display and manipulate images at fairly rapid
rates. It is also a well-established display standard and is now quite
affordable.
Smooth update of the display requires page flipping. Michael Abrash ("Graphics
Programming," DDJ, December 1991) suggested that this is possible on a 640x480
VGA display. However, Michael's technique allows only the upper two-thirds of
the display area to be flipped; more area is needed for the wavelet-analyzer
display. Since only 512 columns are needed to display data, a compromise
between horizontal display resolution and page size can be made.
A display with 592 columns and 480 rows is created, with the top 400 rows
capable of being page flipped; see Figure 3. The horizontal display dimension
is altered by setting the Horizontal Display Enable End and the Offset
registers in the VGA's CRTC to display 74 bytes (or 592 pixels) on each
horizontal scan. The size of the flip page is determined by setting the Line
Compare Count register in the VGA's CRTC to reset the line count after 400
lines have been displayed. Page flipping is controlled by setting the CRTC
Start Address High and Start Address Low registers to point to the specific
page. The routine Set592x480, see Listing Six (page 90), performs the setup of
the VGA display by initially setting the display to mode 12h (640x480 pixels,
16 colors) and then modifying the registers described earlier.
The image data currently displayed on the VGA must be shifted to make room for
new image data coming from the DSP memory. The backfields of the latest image
alternate between dark and light gray to delineate the eight unit
intervals. These backfields need to be changed to a blue background to
indicate past wavelet coefficients when they are shifted. Therefore, a trick
is played with the VGA color palette to produce multiple mappings of signal
and backfield colors; see Table 2. Write Mode 1 and the Map Mask register are
used to enable pixel-color changes to occur simultaneously with the movement
of image data; see Figure 4.
Table 2: The colors of the VGA palette registers can be selected to remap the
colors of the signal traces during a data-shift operation so that the
background colors are changed while the signal itself appears unaffected.

 Palette     DAC palette          Shift-and-mask operation
 register    register             maps pixel to register

    0         0 (black)                0
    1        63 (white)                1
    2        60 (light red)            2
    3         3 (cyan)                 3
    4         4 (red)                  0
    5         5 (magenta)              1
    6         6 (olive)                2
    7        55 (pale yellow)          3
    8         1 (blue)                 8
    9        62 (yellow)               9
   10         1 (blue)                10
   11        62 (yellow)              11
   12        56 (dark gray)            8
   13        62 (yellow)               9
   14         7 (light gray)          10
   15        62 (yellow)              11

When the CPU reads the VGA memory, all four bytes in the pixel planes are
placed in latch registers. When the CPU writes to the VGA memory with Write
Mode 1 enabled, the contents of the latch registers are written back into the
pixel planes. Writing latch contents to memory can be blocked by setting the
corresponding bits of the Map Mask register to 0. Setting bit 2 of the Map
Mask register to 0 and all others to 1 causes the pixels' color-register
values to be remapped to other values at the time of writing. The remapping
causes the light gray and dark gray backfields to be changed to blue while the
traces remain yellow.
The routine ShiftWaveTraces in Listing Six performs the operations of image
shifting and color modification. The routine moves eight rectangles of the
screen image representing the input signal and wavelet-transform coefficient
traces. Each rectangle consists of 50 rows of pixel bytes. Although the
routine is written in C, an assembly language loop (written using Borland
Turbo C 2.0 __emit__ statements) is used to move each block of pixel bytes a
row at a time. The REP MOVSB instruction is used to move the bytes of each row
quickly. The data in the array shiftblocks describes the number of bytes to
move in each rectangle and the screen-wrap index to keep the source and
destination pointers aligned with the block boundaries. The source and
destination addresses for the shift are chosen so that the image blocks are
moved from the page being displayed to the other page. In this fashion, the
screen can be updated with the new image from the DSP without disturbing the
screen image.
With the pixel images shifted in the VGA's memory, the routine GetDSPimage can
now transfer the pixel image from the DSP's memory to that of the VGA; see
Listing Six. The design of the image bitmaps in the DSP memory allows for
rapid transfer of the images. The bytes in the DSP image bitmaps correspond
directly to the bytes in pixel plane 0 memory of the VGA.
To transfer the pixel data, the Write Mode register is set to mode 0 to allow
the external data to be written into the VGA memory. The Bit Mask register is
set so that all bits in a byte of VGA memory can be written to. The Map Mask
register is set so that only pixel-plane 0 (and the least-significant bit of
the color-register value) is affected by the write. Transfers can now proceed
by performing word transfers from the DSP's memory directly into the VGA
memory using the DMA capabilities of the DSP32 and DSP32C; see Figure 5. As in
ShiftWaveTraces, GetDSPimage uses an assembly language loop to optimize the
data transfers. The REP INSW instruction is used to move individual rows of
pixel bytes.


Making Waves, Again...


The code presented here is only a portion of a complete wavelet-analyzer
package called ANALYZER. The user interface of ANALYZER allows the operator to
select the coefficients to be used by the analyzer and view their values and
the recursive scaling and wavelet functions defined by those coefficients. The
user can also run and halt the wave-transform operation on the incoming signal
and pause the screen display at any time. The vertical scale of the input
signal and wavelet-transform coefficient traces can be independently adjusted
to allow detailed viewing of the data.
The code for the host PC is written in Borland's Turbo C 2.0 with the small
memory model. Although the assembly language components of the VGA graphics
code are written using __emit__ statements, it should not be too difficult to
convert the code over to full assembly language. Assembling the DSP code
requires the AT&T WE DSP32-SL Support Software Library. The batch files
supplied with the analyzer source code are sufficient to control the library's
make utility. Because of its size, the program is available electronically;
see "Availability" on page 5. The electronic version includes the ANALYZER
executable and source, the DSP source and executables, and documentation.
A PC equipped with an 80286 (or higher) processor running at 12 MHz or higher
and a VGA card is required. To run the DSP software, the user must supply
either a Burr-Brown ZPB32 (DSP32) or ZPB34 (DSP32C) DSP board and a ZPB100
codec board. Other DSP boards can be supported by modifying the DSP I/O
control software.


Wave Goodbye (Again)!


The ANALYZER program turns the PC and a DSP board into a test instrument that
uses the recursive fast wavelet transform (much as the spectrum analyzer uses
the Fourier transform) to analyze signals in real time. With this tool, you
can study interactively how different wavelet and scaling functions can
interpret a signal. Someday, the wavelet-transform analyzer may even reside
alongside the oscilloscope and spectrum analyzer as an essential tool in every
engineer's toolbox!


References


Abrash, Michael. "Catching Up." Dr. Dobb's Journal (December, 1991).
Bittman, Jim. "Adding the Power of DSP to Your Application." Dr. Dobb's
Journal (May, 1991).
Cody, Mac A. "The Fast Wavelet Transform." Dr. Dobb's Journal (April, 1992).
iAPX 286 Programmer's Reference Manual, Intel Corporation, 1983.
Kliewer, Bradley Dyck. EGA/VGA: A Programmer's Reference Guide, second
edition. Berkeley, CA: McGraw-Hill, 1990.

_A WAVELET ANALYZER_
by Mac A. Cody


[LISTING ONE]

#include "dsp_type.h"
#if DSP32
#include "dspregs.h"
#endif

.global DECOMP
 /* DECOMP
 perform recursive decomposition on non-terminating data sequence
 registers used: r1 r2 r3 r4 r5 r6 r11 r12 r13 r14 r15 r16
 accumulators used: a0 a1

 input: r1 - pointer to wavelet high-pass filter coefficients
 r2 - pointer to wavelet low-pass filter coefficients
 r6 - pointer to wavelet output data list
 r11 - jump address for proper filter length
 r12 - pointer to data pointer array
 r13 - pointer to stack
 r14 - return stack register, i.e. "TOP OF STACK"
 r15 - recursion counter
 r16 - filter coefficient pointer wrap back index
 */
DECOMP: r3e = *r12++; /* load pointer to approx. input array source */
 r5e = *r12; /* load pointer to approx. output array destination */
 goto r11; /* jump to appropriate filter processing for length */
 r4e = r3 + 8; /* set pointer to approx. input array destination */
DT6: a0 = a0 + *r3++ * *r1++; /* pass data through high-pass wavelet */
 a0 = a0 + *r3++ * *r1++; /* filter (a0 = 0.0 initially) */
DT4: a0 = a0 + *r3++ * *r1++; /* the destination of the jump depends */
 a0 = a0 + *r3++ * *r1++; /* upon the length of the wavelet */
DT2: a0 = a0 + *r3++ * *r1++; /* filter */
 goto r11 + 28; /* jump to appropriate filter processing for length */
 *r6++ = a0 = a0 + *r3++r16 * *r1++r16; /* output is detail point */
AT6: a1 = a1 + (*r4++ = *r3++) * *r2++; /* pass data through low-pass */
 a1 = a1 + (*r4++ = *r3++) * *r2++; /* filter (a1 = 0.0 initially) */
AT4: a1 = a1 + (*r4++ = *r3++) * *r2++;
 a1 = a1 + (*r4++ = *r3++) * *r2++;
AT2: a1 = a1 + *r3++ * *r2++;
 a1 = a1 + *r3++ * *r2++r16;
 r15 = r15 - 1; /* check the recursion count */
 if (eq) goto NO_RECUR; /* if true, recursion at bottom of tree */
 r5e & 0x0004; /* check for even/odd status */
 *r5-- = a1 = a1; /* save approx. data for next level */
 if (ne) goto CLEAN_UP; /* if true, await another data point */
 r4e = r5 + 8; /* wrap output pointer back */
 *r12++ = r4e; /* save wrapped pointer to approx. O/P */
 *r13-- = r14e; /* save return address to stack */
 a0 = a0 - a0; /* clear the accumulators */
 call DECOMP (r14); /* recurse down the tree */
 a1 = a0;
#if DSP32C
 r13e = r13 + 4; /* align stack pointer to return address */
#else
 r13 = r13 + 2; /* align stack pointer to return address */
#endif
 r14e = *r13; /* pop the return address */
 nop;
 return (r14);
 nop;

NO_RECUR: *r6 = a1 = a1; /* save approx. coeff. as next value */
CLEAN_UP: return (r14);
 *r12++ = r5e; /* save unwrapped pointer to approx. output */
 /* END OF DECOMP */






[LISTING TWO]


#include "dsp_type.h"
#if DSP32
#include "dspregs.h"
#endif

.global DRAWIMAG
 /* DRAWIMAG -- draw image of supplied point and connect to previous point
 registers used: r1 r2 r3 r4 r5 r6 r7 r12 r13 r14
 accumulators used: a0 a1
 input: r12 - pointer to data pointer array
 r13 - pointer to stack
 r14 - return stack register, i.e. "TOP OF STACK"
 a0 - data point to draw
 */
DRAWIMAG: a0 = *r12++ - a0 * *r12++; /* multiply by scaling coefficient */
 a1 = -a0 + *r12; /* determine if value above upper threshold */
 a0 = ifalt(*r12++); /* if true, limit data to threshold */
 a1 = a0 - *r12; /* determine if value below lower threshold */
 a0 = ifalt(*r12++); /* if true, limit data to threshold */
 *r12++ = a0 = int(a0); /* convert data to integer format */
 nop;
 r6e = r12 - 2; /* point to temporary storage */
 r1 = *r12++; /* load the row increment value */
 r4 = *r12++; /* load the row baseline offset value */
 r2 = *r12++; /* load the bit pointer */
 r3e = *r12; /* load the byte column pointer */
 r6 = *r6; /* read the new byte column pointer offset */
 r2 - 0x80; /* check if first bit of new byte column */
 if (ne) goto ADD_BAR; /* if true, add new bar to byte column */
 nop;
#if DSP32C
 r6 - 23; /* check if datum point is above baseline */
 if (le) goto ABOVE_BL; /* if true, it is above the baseline */
 nop;
 r4 = 25; /* top counter for clearing of pixel over bar */
 r5 = r6 - 24; /* middle counter for draw of bar pixels */
 goto DRAW_COL; /* go draw the column of pixels */
 r6 = 48 - r6; /* bottom counter for clearing of pixel under bar */

ABOVE_BL: r4 = r6; /* top counter for clearing of pixel over bar */
 r5 = r6;
 r5 = 24 - r5; /* middle counter for draw of bar pixels */
 r6 = 25; /* bottom counter for clearing of pixel under bar */
DRAW_COL: r4 = r4 - 1; /* check if no top pixels are to be cleared */
 if (lt) goto NO_TOP; /* if true, skip clearing bytes above bar */
 r7 = r7 - r7; /* force the register to zero */
 do 0, r4; /* repeat next instruction r4+1 times */
 *r3++r1 = r7l; /* zero the bytes above the bar */
NO_TOP: r5 = r5 - 1; /* check if no bar pixels are to be set */
 if (lt) goto NO_MID; /* if true, skip setting bytes of bar */
 nop;
 do 0, r5; /* repeat next instruction r5+1 times */
 *r3++r1 = r2l; /* MSB of byte is bar, others cleared */
NO_MID: r6 = r6 - 1; /* check if no bottom pixels are to be cleared */
 if (lt) goto NO_BOT; /* if true, skip clearing bytes below bar */
 nop;
 do 0, r6; /* repeat next instruction r6+1 times */
 *r3++r1 = r7l; /* zero the bytes below the bar */

NO_BOT: goto SHIFTBIT; /* go shift the bit pointer and clear up */
 r3e = *r12; /* reload the byte pointer */

ADD_BAR: r6 - 24; /* check if datum below baseline */
 if (gt) goto BELOW_BL; /* if true, datum is below baseline */
 r6 = r6 - 24; /* calculate length of bar */
 r6 = -r6; /* datum above baseline length was negative */
 r1e = -r1; /* datum above baseline, increment is decrement */
BELOW_BL: r4e = r4 + r3; /* add baseline offset to byte column pointer */
 r5e = r4 + r1; /* move pointer away from the baseline */
 r7 = 0; /* zero initial storage value */
 do 2, r6; /* repeat the next 3 instructions r6+1 times */
 r6l = *r5++r1; /* load the byte */
 *r4++r1 = r7l; /* store the byte and move the pointer */
 r7 = r6 r2; /* OR byte with bit pointer to set the bit */
#else
 r6 = r6 * 2; /* multiply offset by four to account for ... */
 r6 = r6 * 2; /* four bytes per instruction in MEMSET */
 r6 - 92; /* check if datum point is above baseline */
 if (le) goto ABOVE_BL; /* if true, it is above the baseline */
 r5 = -r6; /* middle counter for draw of bar pixels */
 r4 = 0; /* top counter for clearing of pixel over bar */
 r5 = r5 + 196; /* middle counter for draw of bar pixels */
 goto DRAW_COL; /* go draw the column of pixels */
 r6 = r6 - 92; /* bottom counter for clearing of pixel under bar */

ABOVE_BL: r4 = -r6; /* top counter for clearing of pixel over bar */
 r4 = r4 + 100; /* top counter for clearing of pixel over bar */
 r5 = r6 + 4; /* middle counter for draw of bar pixels */
 r6 = 0; /* bottom counter for clearing of pixel under bar */
DRAW_COL: *r13-- = r14; /* save return address to the stack */
 call r4+MEMSET (r14); /* zero the bytes above the bar */
 r7 = r7 - r7; /* force the register to zero */
 call r5+MEMSET (r14); /* set the bytes of the bar */
 r7 = r2; /* MSB of byte is bar, others cleared */
 call r6+MEMSET (r14); /* zero the bytes below the bar */
 r7 = r7 - r7; /* force the register to zero */
 r13 = r13 + 2; /* point to return address on the stack */
 r14 = *r13; /* load the return address from the stack */
NO_BOT: goto SHIFTBIT; /* go shift the bit pointer and clear up */
 r3 = *r12; /* reload the byte pointer */

ADD_BAR: r6 = r6 * 2; /* multiply offset by twelve to account for ... */
 r6 = r6 * 2; /* four bytes per instruction and ... */
 r7 = r6 * 2; /* three instructions per byte in MEM_OR */
 r6 = r6 + r7;
 r6 - 288; /* check if datum below baseline */
 if (gt) goto BELOW_BL; /* if true, datum is below baseline */
 r6 = r6 - 288; /* calculate length of bar */
 r6 = -r6; /* datum above baseline length was negative */
 r1 = -r1; /* datum above baseline, increment is decrement */
BELOW_BL: r6 = 288 - r6; /* align the counter for proper call */
 r4 = r4 + r3; /* add baseline offset to byte column pointer */
 r5 = r4;
 r5 = r5 + r1; /* move pointer away from the baseline */
 *r13-- = r14; /* save return address to the stack */
 r7 = 0; /* zero initial storage value */
 call r6+MEM_OR (r14);
 r6 = r7;

 r13 = r13 + 2; /* point to return address on the stack */
 r14 = *r13; /* load the return address from the stack */
#endif

SHIFTBIT: r2 = r2 >> 1; /* shift the pixel pointed to right */
 if (ne) goto NO_WRAP; /* if true, don't wrap the pixel pointer */
 r12e = r12 - 2; /* point to bit pointer storage */
 r2 = 0x0080; /* new bit mask if pixel wraps */
 r3e = r3 + 1; /* increment base to next byte upon wrap */
NO_WRAP: *r12++ = r2; /* save the next bit pointer */
 return (r14);
 *r12++ = r3e; /* save the next byte pointer start value */
 /* END OF DRAWIMAG */

#if DSP32
 /* MEMSET -- set column of image bytes to a given value
 registers used: r1 r3 r7 r14
 input: r1 - postincrement value
 r3 - byte column pointer
 r7 - storage value
 r14 - return stack register, i.e. "TOP OF STACK"
 */
MEMSET: *r3++r1 = r7l; *r3++r1 = r7l; *r3++r1 = r7l; *r3++r1 = r7l;
 *r3++r1 = r7l; *r3++r1 = r7l; *r3++r1 = r7l; *r3++r1 = r7l;
 *r3++r1 = r7l; *r3++r1 = r7l; *r3++r1 = r7l; *r3++r1 = r7l;
 *r3++r1 = r7l; *r3++r1 = r7l; *r3++r1 = r7l; *r3++r1 = r7l;
 *r3++r1 = r7l; *r3++r1 = r7l; *r3++r1 = r7l; *r3++r1 = r7l;
 *r3++r1 = r7l; *r3++r1 = r7l; *r3++r1 = r7l; *r3++r1 = r7l;
 *r3++r1 = r7l;

 return (r14);
 nop;
 /* END OF MEMSET */

 /* MEM_OR -- logical OR column of image bytes with a given value
 registers used: r1 r2 r4 r5 r6 r7 r14
 input: r1 - postincrement value
 r2 - 'OR' value
 r4 - lagging byte column pointer
 r5 - leading byte column pointer
 r14 - return stack register, i.e. "TOP OF STACK"
 */
MEM_OR: r6l = *r5++r1; /* load column byte */
 *r4++r1 = r7l; /* save initial 'dummy' value */
 r6 = r6 r2; /* 'OR' column byte with byte value */
 r7l = *r5++r1; /* load the next column byte */
 *r4++r1 = r6l; /* save the new column byte value */
 r7 = r7 r2; /* 'OR' the next column byte with the byte value */
 r6l = *r5++r1; *r4++r1 = r7l; r6 = r6 r2;
 r7l = *r5++r1; *r4++r1 = r6l; r7 = r7 r2;
 r6l = *r5++r1; *r4++r1 = r7l; r6 = r6 r2;
 r7l = *r5++r1; *r4++r1 = r6l; r7 = r7 r2;
 r6l = *r5++r1; *r4++r1 = r7l; r6 = r6 r2;
 r7l = *r5++r1; *r4++r1 = r6l; r7 = r7 r2;
 r6l = *r5++r1; *r4++r1 = r7l; r6 = r6 r2;
 r7l = *r5++r1; *r4++r1 = r6l; r7 = r7 r2;
 r6l = *r5++r1; *r4++r1 = r7l; r6 = r6 r2;
 r7l = *r5++r1; *r4++r1 = r6l; r7 = r7 r2;
 r6l = *r5++r1; *r4++r1 = r7l; r6 = r6 r2;

 r7l = *r5++r1; *r4++r1 = r6l; r7 = r7 r2;
 r6l = *r5++r1; *r4++r1 = r7l; r6 = r6 r2;
 r7l = *r5++r1; *r4++r1 = r6l; r7 = r7 r2;
 r6l = *r5++r1; *r4++r1 = r7l; r6 = r6 r2;
 r7l = *r5++r1; *r4++r1 = r6l; r7 = r7 r2;
 r6l = *r5++r1; *r4++r1 = r7l; r6 = r6 r2;
 r7l = *r5++r1; *r4++r1 = r6l; r7 = r7 r2;
 r6l = *r5++r1; *r4++r1 = r7l; r6 = r6 r2;
 r7l = *r5++r1; *r4++r1 = r6l; r7 = r7 r2;
 r6l = *r5++r1; *r4++r1 = r7l; r6 = r6 r2;
 r7l = *r5++r1; *r4++r1 = r6l; r7 = r7 r2;
 r6l = *r5++r1; *r4++r1 = r7l;
 return (r14);
 nop;
 /* END OF MEM_OR */
#endif





[LISTING THREE]

#include "dsp_type.h"
#if DSP32
#include "dspregs.h"
#endif

.extern DECOMP, DRAWIMAG
.extern WAVEADRS, WAVELVLS, IMAGSHOW, LVLADDRS
.extern SIG_DRAW, DRAW_CNT, H_FILTER, L_FILTER
.extern RST_DATA, IM0_PTRS, IM1_PTRS, IMAGE_0
.extern STACKEND, SIGNALIN, DATA_OUT

 /* ANALYZER
 main control program for the wavelet analyzer
 registers used: r1 r2 r3 r4 r5 r6 r8 r9
 r11 r12 r13 r14 r15 r16 r17
 accumulators used: a0 a1 a2 a3
 */
.rsect ".bank0"
 dauc = 0x0000; /* initialize DAU formats */
 r1e = WAVELVLS; /* point to jump address for wavelet filter size */
 r2e = SIGNALIN; /* point to approximation data storage */
 r3 = *r1++; /* load number of levels in wavelet transform */
 r4 = 8; /* load number of unit intervals per image */
 r1e = r1 + 2; /* point to the unit interval count storage */
 *r1++ = r4; /* store the unit interval count value */
 r4 = r4 - r4; /* zero the register */
 *r1++ = r4; /* set active image flag storage to 0 (IMAGE 0) */
 a3 = a3 - a3; /* intial data value is zero */
#if DSP32C
 r3 = r3 - 1; /* number of levels in wavelet transform - 1 */
 do 8, r3; /* repeat next nine instructions r3+1 times */
#else
 r3 = r3 - 2; /* number of levels in wavelet transform - 2 */
#endif
INITAPPX: *r1++ = r2e; /* store base address of each approximation level */
 *r2++ = a3 = a3; /* zero storage for approximation level */

 *r2++ = a3 = a3; /* r2 ends up pointing to first location .... */
 *r2++ = a3 = a3; /* in the next approximation level */
 *r2++ = a3 = a3;
 *r2++ = a3 = a3;
 *r2++ = a3 = a3;
#if DSP32C
 nop;
 *r1++ = r2e; /* store the next location */
 ioc = 0x40987; /* initialize the serial I/O to the codec */
 r1e = RST_DATA; /* point to reset pointer initialization storage */
 r2e = SIG_DRAW; /* point to signal drawing data sets */
 r3 = 26; /* integer pointer index, 4 bytes per pointer */
 r2e = r2 + 22; /* point to bit pointer storage */
 do 3, 7; /* perform next 4 instructions eight times */
 r4e = *r1++; /* load initial bit pointer */
 r5e = *r1++; /* load initial byte column pointer */
 *r2++ = r4; /* store initial bit pointer */
 *r2++r3 = r5e; /* store initial byte column pointer */
#else
 if (r3-- >=0) goto INITAPPX; /* repeat for all arrays */
 *r1++ = r2e; /* store the next location */
 ioc = 0x0987; /* initialize the serial I/O to the codec */
 r1 = RST_DATA; /* point to reset pointer initialization storage */
 r2 = SIG_DRAW; /* point to signal drawing data storage */
 r3 = 26; /* integer pointer index, 2 bytes per pointer */
 r2 = r2 + 22; /* point to bit pointer storage */
 r4 = 6; /* initialize loop counter for eight loops */
INIT_PTR: r5 = *r1++; /* load initial bit pointer */
 r6 = *r1++; /* load initial byte column pointer */
 *r2++ = r5; /* store initial bit pointer */
 if (r4-- >=0) goto INIT_PTR; /* repeat for all pointers */
 *r2++r3 = r6; /* store initial byte column pointer */
#endif
 r12e = WAVEADRS; /* load pointer to filter jump address */
 r13e = STACKEND; /* load pointer to top of stack memory */
 r11e = *r12++; /* load jump address for filter size */
 r16e = *r12++; /* load coefficient pointer wrap back index */
 r17 = *r12++; /* load number of levels */
 r15 = *r12; /* load image data space size / 4 */
 r3e = IMAGE_0; /* point to the detail level float output */
#if DSP32C
 r15 = r15 - 1; /* number of data points minus one */
 do 0, r15; /* repeat next instruction r15+1 times */
 *r3++ = a3 = a3; /* clear four bytes in IMAGE_0 array */
#else
 r15 = r15 - 2; /* number of data points minus two */
CLR_IMAG: if (r15-- >=0) goto CLR_IMAG; /* repeat until all image cleared */
 *r3++ = a3 = a3; /* clear four bytes in IMAGE_0 array */
#endif
 goto ENTRY_PT; /* jump to the entry point */
 nop;

MAINLOOP: r8 = DATA_OUT; /* point to output data array */
 r12e = SIG_DRAW; /* point to signal drawing data sets */
 r9e = DRAW_CNT; /* point to draw count array */
 a0 = *r8++; /* load first signal data point */
 r9e = r9 + r15; /* index into draw count array */
 call DRAWIMAG (r14); /* draw the first input sample point */
 r9l = *r9; /* load the draw loop counter */

 r12e = SIG_DRAW; /* point to signal drawing data sets */
DRWLOOP1: if (ibf) goto SAMPL_IN; /* if true, next sample available */
 nop;
 a0 = *r8++; /* load data point */
 call DRAWIMAG (r14); /* draw data point */
 nop;
 if (r9-- >= 0) goto DRWLOOP1; /* repeat for all levels of xfrm */
#if DSP32C
 nop;
#else
 r12 = r12 + 2; /* point to data set for next level */
#endif
WAITIBF1: if (ibe) goto WAITIBF1; /* wait until next data sample arrives */
 nop;
 goto TEST_LVL;
SAMPL_IN: a3 = float(ibuf); /* load the new signal sample */
DRWLOOP2: a0 = *r8++; /* load data point */
 call DRAWIMAG (r14); /* draw data point */
 nop;
 if (r9-- >= 0) goto DRWLOOP2;
#if DSP32C
 nop;
#else
 r12 = r12 + 2; /* point to data set for next level */
#endif
TEST_LVL: r15 - 0; /* test for process of all levels */
 if (ne) goto ENTRY_PT; /* if true, all levels have not been done */
 r12e = IMAGSHOW; /* point to the unit interval count down */
 nop;
 r3 = *r12++; /* load the unit interval count down */
 r4 = *r12--; /* load the image draw flag */
 r3 = r3 - 1; /* decrement the unit interval count */
 if (ne) goto ENTRY_PT; /* if true, not done with new data set */
 *r12++ = r3; /* save the new count */
 r3 = 8; /* reset count to eight unit intervals */
 r12 = r12 - 2; /* point to unit interval count storage */
 *r12++ = r3; /* save the reset count */
 pir = r4; /* interrupt host processor for new image */
 r4 = r4; /* tickle CAU flags for image set */
 if (ne) goto IM0_NEXT; /* if true, set up for drawing on IMAGE 0 */
 nop;
 r4 = 1; /* next image displayed is IMAGE 1 */
 goto PNTRINIT; /* go to image pointer initialization */
 r2e = IM1_PTRS; /* set up for drawing on IMAGE 1 */

IM0_NEXT: r4 = 0; /* next image displayed is IMAGE 0 */
 r2e = IM0_PTRS; /* set up for drawing on IMAGE 0 */
PNTRINIT: r1 = SIG_DRAW; /* point to signal drawing data sets */
 *r12++ = r4; /* save the image draw flag */
 r1e = r1 + 24; /* point to byte column pointer storage */
#if DSP32C
 r15e = 28; /* set up post increment value */
 do 0, 7; /* perform next instruction eight times */
 a0 = (*r1++r15 = *r2++) + a0; /* moves four bytes at once! */
#else
 r3 = 28; /* set up post increment value */
 r4 = 6; /* set up loop counter for eight iterations */
REINIPTR: r5 = *r2++; /* load image array pointer */
 if (r4-- >= 0) goto REINIPTR; /* repeat for all pointers */

 *r1++r3 = r5; /* store image array pointer */
#endif
WAIT_PIE: if (pif) goto WAIT_PIE; /* wait for pif flag to be cleared */
 nop;
ENTRY_PT: r3e = SIGNALIN; /* point to input signal storage array */
WAITIBF2: if (ibe) goto WAITIBF2; /* wait until next data sample arrives */
 r15 = r17; /* initialize the recursion counter */
 *r3++ = a2 = float(ibuf); /* output second data sample first */
 r6e = DATA_OUT; /* point to output data array */
 *r3 = a3 = a3; /* output first data sample last */
 *r6++ = a3 = a3; /* place first sample in data output array */
 *r6++ = a2 = a2; /* place second sample in data output array */
 r1e = H_FILTER; /* point to detail filter coefficients */
 r2e = L_FILTER; /* point to approx. filter coefficients */
 r12e = LVLADDRS; /* point to data level address pointers */
 a0 = a0 - a0; /* zero the accumulators */
 call DECOMP (r14); /* start the recursive decomposition */
 a1 = a0;
 goto MAINLOOP;
 nop;






[LISTING FOUR]

/* WAVEDATA.S */
#include "dsp_type.h"
#if DSP32
#include "dspregs.h"
#endif

.global WAVEADRS, WAVELVLS, IMAGSHOW, LVLADDRS
.global SIG_DRAW, DRAW_CNT, H_FILTER, L_FILTER
.global RST_DATA, IM0_PTRS, IM1_PTRS, IMAGE_0
.global STACKEND, SIGNALIN, DATA_OUT

.align 4
WAVEADRS: int24 0; /* jump address for wavelet filter length */
WAVEINDX: int24 0; /* wrap back index for wavelet filter length */
WAVELVLS: int 6, 1625; /* number of levels, clear loop counter */
IMAGSHOW: int 8, 0; /* unit interval count, active image pointer */
LVLADDRS: int24 SIGNALIN, APPROX_5; /* data pointer storage for level 5 */

 int24 APPROX_5, APPROX_4; /* data pointer storage for level 4 */
 int24 APPROX_4, APPROX_3; /* data pointer storage for level 3 */
 int24 APPROX_3, APPROX_2; /* data pointer storage for level 2 */
 int24 APPROX_2, APPROX_1; /* data pointer storage for level 1 */
 int24 APPROX_1, 0; /* data pointer storage for level 0 */
.align 4
SIG_DRAW: float 0.0, 24.0, 48.0, 0.0; /* scaling and offset coefficients */
 int 0, 64; /* temp storage, row increment */
 int 1536, 0; /* baseline value, bit pointer */
 int24 0; /* byte column pointer */
.align 4
 float 0.0, 24.0, 48.0, 0.0;
 int 0, 32;

 int 768, 0;
 int24 0;
.align 4
 float 0.0, 24.0, 48.0, 0.0;
 int 0, 16;
 int 384, 0;
 int24 0;
.align 4
 float 0.0, 24.0, 48.0, 0.0;
 int 0, 8;
 int 192, 0;
 int24 0;
.align 4
 float 0.0, 24.0, 48.0, 0.0;
 int 0, 4;
 int 96, 0;
 int24 0;
.align 4
 float 0.0, 24.0, 48.0, 0.0;
 int 0, 2;
 int 48, 0;
 int24 0;
.align 4
 float 0.0, 24.0, 48.0, 0.0;
 int 0, 2;
 int 48, 0;
 int24 0;
.align 4
 float 0.0, 24.0, 48.0, 0.0;
 int 0, 2;
 int 48, 0;
 int24 0;

DRAW_CNT: byte 6, 4, 3, 2, 1, 0;

.align 4
H_FILTER: 6*float 0.0; /* highpass wavelet filter storage allocation */
L_FILTER: 6*float 0.0; /* lowpass wavelet filter storage allocation */

/* image pointer reset initialization data */
RST_DATA: int24 0x02; /* position of first pixel in unit interval at reset */
 int24 IM0INITS; /* 00 00 00 00 00 00 00 01 */

 int24 0x01, IM0INIT5; /* 00 00 00 01 */

 int24 0x01, IM0INIT4; /* 00 01 */

 int24 0x01, IM0_LVL3; /* 01 */

 int24 0x10, IM0_LVL2; /* 10 */

 int24 0x40, IM0_LVL1; /* 40 */

 int24 0x80, IM0_LVL0; /* 80 */

 int24 0x80, IM0_LVLA; /* 80 */

/* image pointer switch initialization data */
IM0_PTRS: int24 IM0SIGNL, IM0_LVL5, IM0_LVL4, IM0_LVL3;

 int24 IM0_LVL2, IM0_LVL1, IM0_LVL0, IM0_LVLA;
IM1_PTRS: int24 IM1SIGNL, IM1_LVL5, IM1_LVL4, IM1_LVL3;
 int24 IM1_LVL2, IM1_LVL1, IM1_LVL0, IM1_LVLA;

/* IMAGE 0 storage allocation */
IMAGE_0:
IM0SIGNL: 7*byte 0;
IM0INITS: 3193*byte 0;

IM0_LVL5: 3*byte 0;
IM0INIT5: 1597*byte 0;

IM0_LVL4: byte 0;
IM0INIT4: 799*byte 0;

IM0_LVL3: 400*byte 0;

IM0_LVL2: 200*byte 0;

IM0_LVL1: 100*byte 0;

IM0_LVL0: 100*byte 0;

IM0_LVLA: 100*byte 0;

/* IMAGE 1 storage allocation */
IM1SIGNL: 3200*byte 0;
IM1_LVL5: 1600*byte 0;
IM1_LVL4: 800*byte 0;
IM1_LVL3: 400*byte 0;
IM1_LVL2: 200*byte 0;
IM1_LVL1: 100*byte 0;
IM1_LVL0: 100*byte 0;
IM1_LVLA: 100*byte 0;

.align 2
STACKBSE: 31*int24 0; /* subroutine stack storage allocation */
STACKEND: int24 0;

.rsect ".hi_ram"
SIGNALIN: 6*float 0.0; /* approximation data storage allocation */
APPROX_5: 6*float 0.0;
APPROX_4: 6*float 0.0;
APPROX_3: 6*float 0.0;
APPROX_2: 6*float 0.0;
APPROX_1: 6*float 0.0;
DATA_OUT: 9*float 0.0; /* output data storage allocation */






[LISTING FIVE]

DSP_REGS.H

/* register file redefinition */
#define r1e r1

#define r2e r2
#define r3e r3
#define r4e r4
#define r5e r5
#define r6e r6
#define r7e r7
#define r8e r8
#define r9e r9
#define r10e r10
#define r11e r11
#define r12e r12
#define r13e r13
#define r14e r14
#define r15e r15
#define r16e r16
#define r17e r17
#define r18e r18
#define r19e r19
#define r20e r20
#define r21e r21

/* integer and float redefinition */
#define int24 int
#define float24 float

DSP_TYPE.32
#define DSP32 1

DSP_TYPE.32C
#define DSP32C 1

MAKE32.BAT
copy dsp_type.32 dsp_type.h
d3make -M2 -N -l analyzer.s decomp.s drawimag.s wavedata.s -o anlyzr32.dsp

MAKE32C.BAT
copy dsp_type.32c dsp_type.h
d3make -M6 -Q -O -l analyzer.s decomp.s drawimag.s wavedata.s -o anlyz32c.dsp





[LISTING SIX]

V592x480.h
void Set592x480(void);
void ShiftWaveTraces(unsigned int src, unsigned int dest);
void GetDSPimage(unsigned int dest, unsigned int io_addr);

V592x480.c
/* Mode set routine for VGA 592x480 16-color mode. Tested with Borland C 2.0
*/
#include <stdlib.h>
#include <dos.h>

unsigned char palette_set[17]={0,64,60,3,4,5,6,55,1,62,1,62,56,62,7,62,0};
void Set592x480(void)
{
 union REGS regset;

 struct SREGS sregset;

 /* First, set to standard 640x480 mode (mode 12h) */
 regset.x.ax = 0x0012;
 int86(0x10, &regset, &regset);

 /* Next, set up the new color palette */
 regset.x.ax = 0x1002;
 regset.x.dx = (unsigned int) palette_set;
 sregset.es = _DS;
 int86x(0x10, &regset, &regset, &sregset);

 /* Now, tweak the registers needed to convert the horizontal character
 count from 80 to 74 characters (640 to 592 pixels) per line */
 outportb(0x3D4, 0x11); /* allow access to CRTC registers 0 - 7 */
 outportb(0x3D5, inportb(0x3D5) & 0x7f);
 outport(0x3D4, 0x4901); /* adjust the Horizontal Display Enable End
 register for 74 byte display area width */
 outport(0x3D4, 0x2513); /* adjust the Offset register for 74 byte
 (37 word) display area width */
 /* adjust Line Compare register to start display of non-flipping area at
 line 400 (row scan number 399, or 0x18f) */
 outportb(0x3D4, 9); /* clear tenth bit of Line Compare count */
 outportb(0x3D5, inportb(0x3D5) & 0xbf);
 outportb(0x3D4, 7); /* set ninth bit of Line Compare count */
 outportb(0x3D5, inportb(0x3D5) | 0x10);
 outport(0x3D4, 0x8f18); /* remaining eight bits of Line Compare */

 /* adjust the Start Address High and Start Address Low registers
 to start screen display on page 0 */
 outport(0x3D4, 0x170c);
 outport(0x3D4, 0x200d);

 outportb(0x3D4, 0x11); /* block access to CRTC registers 0 - 7 */
 outportb(0x3D5, inportb(0x3D5) | 0x80);
}
#define CLD 0xfc
#define PUSH_DS 0x1e
#define PUSH_CX 0x51
#define POP_DS 0x1f
#define POP_CX 0x59
#define REP 0xf3
#define MOVSB 0xa4
#define STOSB 0xaa
#define INSW 0x6d
#define DEC_AX 0X48
#define DEC_CX 0X49
#define DEC_DX 0X4a
#define INC_DX 0X42
#define DEC_SI 0X4e
#define JNE 0X75
#define OUT_DX_AL 0xee
#define USE_ES 0x26
#define MOV_AL_DI 0x058a
#define MOV_CX_AX 0xc88b
#define MOV_CX_DX 0xca8b
#define ADD_DI_BX 0xfb03
#define ADD_SI_BX 0xf303
#define ADD_SI_CX 0xf103

#define SUB_SI_CX 0xf12b

unsigned char colorarray[74];
unsigned char leftedges[8]={0xff, 0x7f, 0x3f, 0x1f, 0x0f, 0x07, 0x03, 0x01};
unsigned char rightedges[8]={0x80, 0xc0, 0xe0, 0xf0, 0xf8, 0xfc, 0xfe, 0xff };
static unsigned int shift_blocks[7][2] = { {42, 32}, {26, 48}, {18, 56},
 {14, 60}, {12, 62}, {11, 63}, {11, 63}};
void ShiftWaveTraces(unsigned int src, unsigned int dest)
{
 int i;
 /* set the Mode Register Write Mode to 1 */
 outportb(0x3ce, 5);
 outportb(0x3cf, (inportb(0x3cf) & 0xfc) | 0x01);
 /* set the Map Mask Register to enable writes to pixel planes
 0, 1, and 3 and disable writes to pixel plane 2 */
 outport(0x3c4, 0x0b02);
 _SI = src;
 _DI = dest + 10;
 _ES = 0xa000;
 __emit__(CLD); /* assure that MOVSB increments SI and DI */
 for (i = 0; i < 7; i++)
 {
 _DX = shift_blocks[i][1]; /* load the length of each line for block */
 _BX = shift_blocks[i][0]; /* init middle loop for wrap value for block */

 __emit__(PUSH_DS);

 _DS = 0xa000;
 _AX = 50; /* init middle loop for number of lines in block */

 __emit__(ADD_SI_BX, MOV_CX_DX, REP, MOVSB);
 __emit__(ADD_DI_BX, DEC_AX, JNE, 0xf5, POP_DS);
 }
 /* set the Mode Register Write Mode to 0 */
 outportb(0x3ce, 5);
 outportb(0x3cf, inportb(0x3cf) & 0xfc);
 /* set the Map Mask Register to enable writes to all pixel planes */
 outport(0x3c4, 0x0f02);
}
static unsigned int imag_blocks[6][2]={{10,32},{42,16},{58,8},
 {66,4},{70,2},{72,1}};
void GetDSPimage(unsigned int dest, unsigned int io_addr)
{
 int i;
 /* Map Mask register - set pixel planes 1, 2, and 3 to "0",
 pixel plane 0 to "1" */
 outport(0x3c4, 0x0102);
 _DX = io_addr;
 _DI = dest;
 _ES = 0xa000;
 __emit__(CLD); /* assure that INSW increments DI */
 for (i = 0; i < 6; i++)
 {
 /* point to fill element offset */
 _AX = imag_blocks[i][1]; /* load the length of each line for block */
 _BX = imag_blocks[i][0]; /* init middle loop for wrap value for block */
 _SI = 50; /* init middle loop for number of lines in image block */

 __emit__(ADD_DI_BX); /* wrap destination pointer to next row of display */

 __emit__(MOV_CX_AX, REP, INSW, DEC_SI, JNE, 0xf7);
 }
 _DI += 1; /* offset pointer by one */
 _CX = imag_blocks[5][0]; /* init middle loop for wrap value for block */
 _SI = 100; /* init middle loop for number of lines in image block */
 __emit__(ADD_DI_BX, INSW, DEC_SI, JNE, 0xfa);
 /* Map Mask register - set pixel planes 0 - 3 to "1" */
 outport(0x3c4, 0x0f02);
}





















































April, 1993
ROUTING ALGORITHMS FOR INTERNETWORKING


Getting from here to there over networks, bridges, and routers




William Stallings


William is an independent consultant and president of Comp-Comm Consulting of
Brewster, Massachusetts. He is a frequent lecturer on networking topics and
the author of over a dozen books on data communications and computer
networking. This article is based on material in Bill's latest book,
Networking Standards: A Guide to OSI, ISDN, LAN, and MAN Standards
(Addison-Wesley, 1993). Bill can be reached at 72500.3562@compuserve.com.


There is a trend within many organizations toward the use of an increasing
number of LANs and other networks joined together in an internet to support
distributed processing needs. For relatively simple configurations of LANs,
interconnection is usually provided by bridges. For more complex internets,
and for ones that include metropolitan- and wide-area networks, the more
sophisticated router is used for interconnection.
A major task in an internet is the routing of packets from source to
destination through a sequence of networks and bridges or routers. For this
purpose, an algorithm is needed to determine a route for each
source-destination pair, and a protocol is needed by which the bridges or
routers can exchange configuration and traffic information in order to
calculate the routes.
For internets based on bridges, routing is usually accomplished with source
routing or spanning-tree routing, both of which have been standardized by the
IEEE 802 LAN standards group. However, neither of these techniques scales up
well to a complex, highly interconnected internet supported by routers.
Three different routing protocols have emerged in recent years to answer the
need for dynamic routing that provides load balancing and response to
failures.
The Open Shortest Path First (OSPF) protocol (RFC 1247, July 1991) is designed
for routers in a TCP/IP environment.
The international standard is the Intermediate System (IS) to IS Intra-Domain
Routing Information Exchange Protocol (ISO 10589, January 1992), which is
designed for an OSI environment.
Finally, the Integrated IS-IS protocol modifies the IS-IS protocol to enable
its use to support both TCP/IP and OSI (RFC 1195, December 1990).
All three protocols use the same underlying routing algorithm, known as
Dijkstra's algorithm, and the three protocols are quite similar. This article
provides an overview of the routing algorithm and describes how its use is
supported by the routing protocols.


Statement of the Problem


For internets that use the internet protocol (IP) or the ISO connectionless
network protocol, the key function of a router is to make a routing decision
to forward network-level protocol-data units, called "datagrams," on their
next hop. The information required by a router to perform the routing function
is of three types: topology, end-system (ES) reachability, and hop cost. For
topology information, a router needs to know about the existence of the other
routers and how they are interconnected. We can abstract the topology into a
graph consisting of nodes connected by edges. Each node is a router, and each
edge is either a point-to-point link or a subnetwork.
The second type of information needed by the router is ES reachability. For
each end system, the router needs to know the identity of the subnetwork that
contains that ES.
Finally, each hop must be assigned a cost in each direction. The cost
associated with each hop, in each direction, is generally referred to as a
"routing metric." The routing metrics used in IS-IS are arranged in four
levels, with a lower value indicating a more optimum (lower-cost) choice. In
order, the routing metrics are:
Default, which is assigned by routing administrators to satisfy any
administrative policies. The default metric is understood by every router.
Each hop has a positive integer value assigned to it. The values may be
assigned arbitrarily, but the intent is that the metric should be a function
of throughput or capacity; higher values indicate lower capacity.
Delay, which is the measure of the transit time or delay through a particular
hop. This is made up of propagation delay plus queueing delay at the router,
and is measured dynamically by each router for each hop element to which it is
connected.
Expense, which is related to the monetary cost of moving internet traffic
through a particular subnetwork.
Error, which is a measure of the probability of error over this hop.
Routing metrics are applied in a cumulative fashion, so that the cost of a
particular hop is equal to the sum of all applicable metrics. Figure 1 is a
logical diagram of an internet that indicates these costs. Each circle
represents a router, and the two arrowed lines between each pair of circles
represents a subnetwork or link connecting the two routers; the numbers on the
lines indicate the current hop cost in each direction.


The Least-cost Algorithm


The hop costs are used as input to the path-calculation routine. Each router
maintains an information base containing the topology and hop costs of each
hop in the internet. This information is used to perform what is referred to
as a "least-cost" routing algorithm, which can be simply stated as:
Given a network of nodes connected by bidirectional links, where each link has
a cost associated with it in each direction, define the cost of a path between
two nodes as the sum of the costs of the links traversed. For each pair of
nodes, find the path with the least cost.
The algorithm used in OSPF, IS-IS, and Integrated IS-IS was originally
proposed by Dijkstra. It enables each router to find the least-cost route to
every other router. Dijkstra's algorithm can be stated thus: Find the shortest
paths from a given source node to all other nodes by developing the paths in
order of increasing path length. The algorithm proceeds in stages. By the kth
stage, the shortest paths to the k nodes closest to (least cost away from) the
source node have been determined; these nodes are in a set M. At the (k + 1)
stage, the node not in M that has the shortest path from the source is added
to M. As nodes are added to M, their path from the source is defined. The
algorithm is formally described in Example 1. Table 1 and Figure 2 show the
results of applying this algorithm to Figure 1, using s = 1. Note that at each
step, the path to each node plus the total cost of that path is generated.
After the final iteration, the least-cost path to each node and the cost of
that path has been developed. The same procedure can be used with node 2 as
source node, and so on.
Example 1: Dijkstra's algorithm.

 Define:
 N = set of nodes in the network
 s = source node
 M = set of nodes so far incorporated by the algorithm
 d[i,j] = hop cost from node i to node j; d[i,i] = 0, and d[i,j] = infinity
  if the nodes are not directly connected
 D[n] = cost of the least-cost path from node s to node n that is
  currently known to the algorithm

 The algorithm has three steps; steps 2 and 3 are repeated until M = N:

 1. Initialize:

  M = {s} (set of nodes incorporated is only the source node)
  D[n] = d[s,n] for n not equal to s (initial path costs to neighboring
  nodes are simply the hop costs)

 2. Find the neighboring node not in M that has the least-cost path from
  node s and incorporate that node into M:

  Find w not in M such that D[w] = min of D[j] over all j not in M
  Add w to M

 3. Update least-cost paths:

  D[n] = min[D[n], D[w] + d[w,n]] for all n not in M
  If the latter term is the minimum, the path from s to n is now the
  path from s to w, concatenated with the hop from w to n.

Table 1: Least-cost routing calculation.

 Iteration M D[2] Path D[3] Path D[4] Path D[5] Path D[6] Path

------------------------------------------------------------------------------------

 1 {1} 2 1-2 5 1-3 1 1-4 infinity -- infinity --
 2 {1,4} 2 1-2 4 1-4-3 1 1-4 2 1-4-5 infinity --
 3 {1,2,4} 2 1-2 4 1-4-3 1 1-4 2 1-4-5 infinity --
 4 {1,2,4,5} 2 1-2 3 1-4-5-3 1 1-4 2 1-4-5 4 1-4-5-6
 5 {1,2,3,4,5} 2 1-2 3 1-4-5-3 1 1-4 2 1-4-5 4 1-4-5-6
 6 {1,2,3,4,5,6} 2 1-2 3 1-4-5-3 1 1-4 2 1-4-5 4 1-4-5-6

Dijkstra's algorithm is known to converge under static conditions of topology
and hop costs. If the hop costs change over time, the algorithm will attempt
to catch up with these changes. However, if the hop cost depends on traffic,
which in turn depends on the routes chosen, then a feedback condition exists,
and instabilities may result.
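Dijkstra's algorithm as stated in Example 1 translates almost line for line into C. In the sketch below, the article's nodes 1 through 6 are renumbered 0 through 5, and the hop-cost matrix is reconstructed from the entries in Table 1 (Figure 1 itself is not reproduced here), so treat those particular values as illustrative:

```c
#include <limits.h>

#define NNODES 6
#define INF (INT_MAX / 2)   /* "infinity": nodes not directly connected */

/* Hop costs inferred from Table 1, with the article's nodes 1..6
   renumbered 0..5; e.g. d[0][3] = 1 is the cost-1 hop 1-4. */
const int demo_costs[NNODES][NNODES] = {
    {   0,   2,   5,   1, INF, INF },
    {   2,   0,   3,   2, INF, INF },
    {   5,   3,   0,   3,   1,   5 },
    {   1,   2,   3,   0,   1, INF },
    { INF, INF,   1,   1,   0,   2 },
    { INF, INF,   5, INF,   2,   0 },
};

/* Example 1, steps 1-3: least-cost path from source s to every node.
   prev[n] records the node preceding n on that path, so routes such
   as 1-4-5-3 can be read back from the result. */
void dijkstra(const int d[NNODES][NNODES], int s,
              int D[NNODES], int prev[NNODES])
{
    int inM[NNODES] = { 0 };
    int k, j, n;

    inM[s] = 1;                         /* step 1: M = {s} */
    for (n = 0; n < NNODES; n++) {
        D[n] = d[s][n];                 /* initial costs are the hop costs */
        prev[n] = s;
    }
    for (k = 1; k < NNODES; k++) {      /* repeat steps 2 and 3 until M = N */
        int w = -1;
        for (j = 0; j < NNODES; j++)    /* step 2: least-cost node not in M */
            if (!inM[j] && (w < 0 || D[j] < D[w]))
                w = j;
        inM[w] = 1;
        for (n = 0; n < NNODES; n++)    /* step 3: update least-cost paths */
            if (!inM[n] && D[w] + d[w][n] < D[n]) {
                D[n] = D[w] + d[w][n];
                prev[n] = w;            /* route to n now passes through w */
            }
    }
}
```

Run from source node 0 (the article's node 1), the final costs come out D = {0, 2, 3, 1, 2, 4}, matching the last row of Table 1.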


The Routing Protocol


The validity of the algorithm will, of course, depend on the validity of the
information used as input. The routing information may change over time. A
router or subnetwork failure can alter the topology, and some of the costs,
especially delay, are variable. Thus, some sort of information-exchange
discipline is needed to govern the frequency with which routers exchange
routing information. This is the purpose of all three routing protocols, and
all three use essentially the same strategy for distributing information. The
general strategy has the following elements:
1. Each router generates and maintains information about its local
environment, consisting of:
The identity of each of its router neighbors and the cost to reach each
neighbor, using some or all of the routing metrics. Thus, there must be some
type of discovery function by which each router learns of its neighbors. This
is part of the protocol, consisting of discovery-type PDUs.
The identity of each ES directly reachable from this router.
2. The information generated in step #1 is referred to as "link-status"
information. Each router distributes this information to all peer routers,
using the routing protocol. The information to be distributed is contained in
link-status PDUs (LSPs).
3. Based on steps #1 and #2, each router can maintain the following
information:
The topology, consisting of a set of routers and the links connecting them.
The cost of each link in each direction.
A list of ESs that are neighbors of each router.
This information is sufficient to make a routing decision for each datagram.
Since the point of distributing LSPs is to enable each router to build up a
picture of the topology of its area or domain, how can a router know to whom
and how to deliver the LSPs? There appears to be a circular dilemma: A router
needs to know the topology in order to directly address its LSPs to each other
router, yet the router learns the topology from LSPs. The way out of this
dilemma is flooding: The LSP is sent by the source router to every one of its
neighbor routers. At each router, an incoming LSP is retransmitted on all
outgoing links except for the link that it arrived from. Eventually, all
routers will receive a copy of the LSP.
Flooding solves the delivery problem, but raises some new problems:
1. How is the flooding stopped? That is, how do we prevent the indefinite
retransmission of an LSP by every router that receives it?
2. How do we prevent obsolescence? That is, if a router fails, how will other
routers know that the last LSP that they received from the failed router is
obsolete?
3. How do we ensure currency? Since LSPs are being transmitted in a
connectionless fashion, an older LSP may arrive after a younger LSP from the
same source.
Let us examine each of these problems in turn. First, flooding is easily
controlled since each router stores a copy of each LSP that it receives. Each
LSP includes a sequence number so that two different LSPs from the same source
can be distinguished. When a router receives an LSP from a given source, it
checks to see whether it already has a copy of that particular LSP (same
source, same sequence number). If so, it does not retransmit the incoming LSP.
The result is that each LSP traverses each link in the topology only once.
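That duplicate check amounts to a few lines of C. The structure below is illustrative (the field names are not taken from any of the three protocols' PDU formats), and it folds in the sequence-number ordering the article describes later, so an older LSP from the same source is also dropped:

```c
#define MAX_ROUTERS 64

/* One entry per possible source router: the highest sequence number
   seen so far (0 means no LSP received from that source yet). */
static unsigned newest_seq[MAX_ROUTERS];

struct lsp {
    int source;       /* originating router, 0..MAX_ROUTERS-1 */
    unsigned seq;     /* sequence number; a router's first LSP uses 1 */
};

/* Returns 1 if the LSP is new: store it and retransmit it on every
   link except the one it arrived on. Returns 0 for a duplicate or
   older LSP, which is not retransmitted; this is what stops the
   flood once each LSP has crossed each link. */
int accept_lsp(const struct lsp *p)
{
    if (p->seq <= newest_seq[p->source])
        return 0;
    newest_seq[p->source] = p->seq;
    return 1;
}
```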
Next, consider the problem of obsolescence. Each LSP includes a parameter
called "remaining lifetime." This parameter is set to some default maximum
value by the source router. The parameter is decremented each time it is
retransmitted by a router. Once a router stores an LSP in its database, it
periodically decrements the lifetime parameter until it reaches 0. If the
lifetime of an LSP reaches 0 (no new LSP from the same source arrives
in time), the LSP is purged from the database.
Finally, a router can determine which of two LSPs from the same source is the
more recent on the basis of the sequence number. When a router is initialized,
it issues its first LSP with a sequence number of 1 and increments the
sequence number for each subsequent LSP it creates. With a 32-bit sequence
number, it is unlikely that all of the sequence numbers will be used up.
However, if a router reaches the maximum sequence number (2^32 - 1), the
routing function at the router is disabled for a period sufficient to ensure
that all of the prior LSPs issued by this router have expired. The router then
begins again with a sequence number of 1.


Hierarchical Routing


The routing discipline can actually be more complex than that just described.
All of the routing protocols we have been describing are designed to work in a
multilevel, hierarchical routing environment. We will describe the environment
designed for OSI configurations; the one for TCP/IP configurations is similar.
In an OSI configuration, an internet can be divided into a number of routing
domains. A domain is a large-scale portion of an internet, generally organized
along geographical or organizational lines. For example, all of the local area
networks at a site, such as a military base or campus, could be linked by ISs
to form a routing domain. This complex might be linked through a wide area
network to other routing domains. Domains can be further subdivided into
areas. This hierarchical approach has a number of advantages:

It minimizes the amount of information exchanged by ISs, thus simplifying the
operation of ISs at all levels.
It allows different routing optimizations within each level of the hierarchy.
It protects the entire routing environment from inaccurate information
generated by any intermediate system.
It constructs "firewalls" between different portions (areas, domains) to
provide access control and other mechanisms to protect and secure the
environment.
It simplifies routing protocol evolution, since ISs at one level need not know
the protocol or topology at other levels.
Four levels of routing can be defined:
Level-0 routing: routing of traffic between ESs and ISs on the same
sub-network.
Level-1 routing: routing of traffic between ISs within the same area.
Level-2 routing: routing of traffic between different areas within the same
routing domain.
Level-3 routing: routing of traffic between different domains.
Level-0 routing is covered by routing protocols that operate between ESs and
ISs; these tend to be quite simple. Levels 1 and 2 are covered by the IS-IS
routing protocols discussed in this article. Currently, there is no standard
for level-3 routing. At this level, the gross topology will generally be
rather simple, and static routing based on manual configuration will usually
suffice.
The information required by an IS to perform the routing function depends on
its role. For topology information, a level-1 IS only needs to know the
existence of the other level-1 ISs in its area and at least one level-2 IS in
its area, plus the way in which these ISs are interconnected. Similarly, a
level-2 IS only needs to know the identity of the other level-2 ISs in its
routing domain and how they are interconnected. In either case, we can
abstract the topology into a graph consisting of nodes connected by edges.
Each node is an IS, and each edge is either a point-to-point link or a
subnetwork.
The second type of information needed by the IS is ES reachability. In the
case of a level-1 IS, it needs to know, for each ES in its area, the identity
of the subnetwork that contains that ES. In the case of a level-2 IS, it needs
to know, for each ES, the area that contains that ES and a level-2 IS in that
area.
Finally, for either level-1 or level-2 routing, each hop must be assigned a
cost in each direction. In the case of level-1 routing, a hop is a subnetwork
or point-to-point link connecting ISs. For level-2 routing, a hop is a
point-to-point link between level-2 ISs.
















































April, 1993
MEASURING FRAGMENTATION


Are malloc and free fragmenting your heap?




James Harrington


Jim is a consultant on memory-management issues and author of a C
memory-management library. He may be reached at Library Technologies, P.O. Box
56031, Madison, WI 53705-9331, or on CompuServe at 71333,30.


As a program dynamically allocates memory, available memory is subdivided into
blocks by an allocation function. In C, that function is malloc(). Usually,
all allocated blocks wind up at contiguous addresses until some memory is
freed. When some (but not all) blocks are freed, the result is usually
fragmented memory--free memory not in a single contiguous block. In such
cases, free memory is broken into less-optimally sized, smaller blocks,
separated from each other by allocated blocks. Depending on the implementation
of the allocation and free functions, adjacent freed blocks may not
necessarily be combined into single larger free blocks; they may remain as
separate free blocks--another manifestation of fragmentation.
When free memory becomes fragmented and an allocation request is made, the
allocation function may return NULL ("not enough memory") even if enough total
free memory is available, because no single contiguous free block is large
enough to satisfy the memory request. Furthermore, if memory is allocated at
the base of the higher of two adjacent free blocks, there's free memory below
and above the allocated block. This free memory would have existed as a single
larger free block if adjacent free blocks had been previously combined.
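A toy free list makes the coalescing step concrete. This sketch keeps free blocks sorted by address and merges physical neighbors; it is an illustration only, not the block layout any particular malloc() implementation uses:

```c
#include <stddef.h>

/* One node per free block, kept sorted by ascending address. */
struct fblock {
    size_t addr;            /* start address of the free block */
    size_t size;            /* length in bytes */
    struct fblock *next;    /* next free block at a higher address */
};

/* Merge physically adjacent free blocks into single larger blocks.
   An allocator that skips this step leaves free memory split into
   more, smaller pieces than necessary. */
void coalesce(struct fblock *head)
{
    while (head && head->next) {
        if (head->addr + head->size == head->next->addr) {
            head->size += head->next->size;  /* absorb the neighbor... */
            head->next = head->next->next;   /* ...and unlink its node */
        } else {
            head = head->next;  /* gap: an allocated block lies between */
        }
    }
}
```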
Fortunately, once you've determined that fragmentation is a problem, you can
often avoid it. I'll present a method for quantifying the degree of
fragmentation, and give code that implements this method for Borland C++ 3.1
and Microsoft C/C++ 7.0. The tests presented here were performed under DOS 5.0
on a 20-MHz 80386 with 8 Mbytes of RAM (although only the first 640K of RAM is
relevant).


A Fragmentation Index


The heap is the expanse of memory that dynamic memory-allocation functions
utilize as the pool of allocatable memory. All addresses returned by malloc()
or other standard allocation functions point into the heap. The current
configuration of your program's heap consists of the sizes and locations of
allocated and free blocks at the particular moment in question. You can assign
a value called the "fragmentation index" to any heap configuration according
to the formula in Figure 1, which gives larger values as fragmentation
increases. The size of the largest allocatable block is the largest number of
bytes you can request from the allocation function without having it return
NULL. Be aware that the fragmentation index applies to the memory
configuration only at a particular moment in your program's execution. You can
get the index a number of times as your program runs and see the progression
of heap fragmentation as it occurs. You can use values of the index to
determine which portions of code are causing the greatest fragmentation
problems, and/or whether fragmentation is a problem at all. Using this formula
requires you to access the entire memory configuration. (On some PCs or
compilers, this isn't possible.)
Figure 1: Formula to calculate the fragmentation index. The index produces
larger values as fragmentation increases.

 breakup factor
fragmentation index = -----------------
 large free factor

where

 number of free blocks - 1
breakup factor = ---------------------------
 number of used blocks + 1

and

 size of largest allocatable block
large free factor = -----------------------------------
 total amount of free memory
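The formula drops directly into a small C function. Note that the breakup numerator needs to be (number of free blocks - 1) for the ideal one-free-block heap to produce an index of 0, as described under "Interpreting Index Values," and for the maximally fragmented 80-Kbyte Borland heap to produce the quoted value of approximately 3331:

```c
/* Fragmentation index: larger values mean a more fragmented heap.
   Returns -1.0 where the index is undefined (no free blocks or no
   allocatable memory), following the article's convention. */
double frag_index(long free_blocks, long used_blocks,
                  long largest_alloc, long total_free)
{
    double breakup, large_free;

    if (free_blocks <= 0 || largest_alloc <= 0 || total_free <= 0)
        return -1.0;
    breakup = (double)(free_blocks - 1) / (double)(used_blocks + 1);
    large_free = (double)largest_alloc / (double)total_free;
    return breakup / large_free;
}
```

With 2500 used and 2500 free 16-byte blocks (an 80,000-byte heap), a 12-byte largest allocatable block, and 40,000 bytes of free memory in total, the function returns roughly 3331.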



Logic Behind the Formula


Several issues relate to the fragmentation index, the first of which is the
number of free blocks. The ideal is to have only one free block, containing
all free memory. The more individual free blocks there are, the worse memory
is fragmented.
The number of used blocks is relevant because the more used blocks there are,
the more free blocks could potentially be separating them. For example, if
your heap contains 20 used blocks and 15 free blocks, there is a potential
maximum of only 21 separate free blocks, assuming no adjacent free blocks are
allowed. You have 15 free blocks, so free memory is broken into about 71 percent as
many pieces as are possible. Conversely, if your heap contains 2000 used
blocks but still has only 15 free blocks, there could be as many as 2001 free
blocks. Therefore, free memory is fragmented into only 0.7 percent as many
pieces as it could be. The fragmentation index included here reflects this
subtlety, and incorporates both the absolute amount of fragmentation present
and a measure of how well the allocation/free implementation keeps
fragmentation to a minimum.
The total amount of free memory is also relevant. If the largest allocatable
block contains nearly all of free memory, the heap is nearly optimally
configured, regardless of the number of small free blocks. On the other hand,
if the largest allocatable block is only a small portion of total free memory,
the heap is badly fragmented.
The size of the largest allocatable block is the focal point of any interest
in fragmentation. The larger this block is, the less effect fragmentation has
on your program. One element that is not factored into the index, however, is
the size of blocks you need to allocate. Ideally, that would be a part of the
formula, expressed relative to the size of the largest allocatable block.
Unfortunately, that value is of less general use.


Interpreting Index Values


In considering the interpretation of index values, recognize that the index
equation involves a compound fraction and that no fraction can have 0 as the
denominator. Thus, if you have no free blocks, the index will give an
undefined value. In such a case, the code presented here will return a
fragmentation index of -1. It will also return -1 if the heap has been
corrupted such that the index can't be properly calculated.
In normal situations, the index equation returns a value from 0 to some very
large number. If there's exactly one free block in the heap (the ideal
situation), the numerator is 0, and the fragmentation index is at its minimum
value.

There is a maximum value for the index, but the exact value of that maximum
depends on the implementation of the allocation and free functions and on the
amount of memory available to the heap. For example, consider the Borland
compiler's far-heap allocation function with an available heap of, say, 80
Kbytes. A maximally fragmented heap will always consist of alternating used
and free blocks, which are of the absolute minimum size allowable by the
allocator. With Borland C++, each block would have the minimum size of 16
bytes, and the size of the largest allocatable block would be 12 bytes. (Each
block contains four bytes of overhead.) The corresponding value of the
fragmentation index according to Figure 1 is the maximum of approximately
3331.
What sorts of fragmentation index values can you expect in a real program?
Given an 80-Kbyte heap and Borland C++, assume there are 500 allocated blocks
in the heap with 49 free blocks interspersed amongst the used blocks. Assume
one free block above the highest used block, for a total of 50 free blocks.
The total free memory is 20 Kbytes and the largest allocatable block is 5
Kbytes. The average size of the used blocks is 120 bytes, which is reasonable
for some programs with many small blocks and a few large ones. The average
size of the free blocks is 400 bytes. The value of the fragmentation index is
0.4. Whether the 0.4 value is high enough to represent a problem depends on
the size of the block(s) you need to allocate relative to the size of the
largest allocatable block (which is only one of the factors in the equation).
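Assuming the index is computed exactly as Figure 1 describes, the worked example can be checked with a standalone sketch (raw_index() is a hypothetical stand-in for the fragindex() machinery provided electronically; sizes are in bytes):

```c
/* Raw fragmentation index:
   ((nfree - 1) / (nused + 1)) divided by (largest / totalfree).
   Returns -1.0 when there are no free blocks, as in the article. */
double raw_index(long nfree, long nused, long largest, long totalfree)
{
    double breakup, largefree;

    if (nfree <= 0 || totalfree <= 0)
        return -1.0;                    /* undefined case */

    breakup   = (double)(nfree - 1) / (double)(nused + 1);
    largefree = (double)largest / (double)totalfree;

    return largefree != 0.0 ? breakup / largefree : 0.0;
}
```

With the numbers above (50 free blocks, 500 used blocks, a 5-Kbyte largest block, 20 Kbytes free), this yields 49/501 divided by 5/20, or about 0.39, which rounds to the 0.4 quoted.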


Using the Index


If your allocation function returns NULL, the fragmentation index has an
excessively high value, and the block being allocated is not particularly
large, then fragmentation is probably responsible for the NULL return.
If you determine that fragmentation rather than lack of memory is the problem,
you have several options: implement or purchase a better malloc() and free();
allocate and free memory in a different order; subdivide large blocks and dole
out smaller chunks of memory; or use static buffers for some blocks whose
allocation leads to increased fragmentation. Use the fragmentation index to
help locate the sections of code where fragmentation increases, and focus your
memory-management efforts on blocks allocated or freed there.


Comparing Index Values


You might think that if fragmentation-index value A is twice that of value B,
fragmentation in configuration A is twice that of configuration B. However,
you can't directly compare the index values like this. For instance, imagine a
heap configuration that contains exactly three free blocks of equal size. Now
imagine that allocating and freeing several blocks results in each free block
being bisected--there are now six free blocks, each half as large as before.
Intuitively, you could say that the level of fragmentation has exactly
doubled. However, two factors in the equation have been altered: increasing
the number of free blocks and decreasing the size of the largest allocatable
block. As a result, the original fragmentation index is multiplied by a factor
of four rather than two.
To intuitively compare two indexes, each should be transformed into a relative
fragmentation index using the equation in Figure 2. First, identify the
smallest value from the set of fragmentation-index values to be compared. That
value, which I'll call Z, is assigned a relative fragmentation index of 1.0.
Determine the relative fragmentation indexes corresponding to other values
according to the formula in Figure 2. Of course, you can choose an arbitrarily
small, positive, nonzero raw index as a base "Z value," and compute all other
relative fragmentation indexes against it.


Compiler-specific Computation


C source code for a function fragindex() that computes the raw fragmentation
index I've described is provided electronically; see "Availability," page 5.
This code is not specific to any memory manager. However, the functions
external to the module must be written to be compatible with the particular
allocator being used. Additional code implements these external functions
for the Borland C++ and Microsoft C/C++ memory managers.
This code generates results for the far heap, which is the default heap for
far data-memory models. (malloc() always allocates from the default heap.)
Code for the near heap would be very similar.
The total free memory in a large-model, Borland-compiled program consists of
the return value of farcoreleft() plus the total sizes of all free blocks as
reported by farheapwalk(), plus four bytes for every free block. The size of
the largest allocatable block is the size of the largest free block reported
through farheapwalk(), or the return value of farcoreleft() minus 4, whichever
is larger. The number of free blocks is the number of free blocks reported by
farheapwalk() plus 1, for the "coreleft" memory.
Collecting the required information for Microsoft's heap is more complicated,
since malloc() allocates chunks of memory from MS-DOS and then subdivides
these large allocated blocks as required. The _heapwalk() function reports
free memory within the DOS blocks allocated by malloc(), but does not report
any existing free DOS memory blocks (even though those memory blocks are part
of the heap). Determining the total amount of free memory therefore requires
summing the sizes of all free DOS blocks, adding 16 bytes per DOS block for
the arena headers, and then adding the sizes of the free blocks reported by
_heapwalk() plus two bytes per free block. The size of the largest allocatable
block is the size of the largest free block reported by _heapwalk() or the
size of the largest free DOS block (not including the bytes in the arena
header), whichever is larger.
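The bookkeeping just described amounts to simple arithmetic; this illustrative helper (not part of the article's listings) spells it out:

```c
/* Total free memory for the Microsoft heap: free DOS blocks carry a
   16-byte arena header each, and free heap blocks reported by
   _heapwalk() carry 2 bytes of overhead each. */
long ms_total_free(long dos_free_bytes, int n_dos_blocks,
                   long heap_free_bytes, int n_heap_blocks)
{
    return dos_free_bytes + 16L * n_dos_blocks
         + heap_free_bytes + 2L * n_heap_blocks;
}
```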
Another complication is that the Microsoft allocator does not combine adjacent
free blocks, and _heapwalk() reports them separately, which inflates the
fragmentation index. However, Microsoft's malloc() will combine adjacent free
blocks if that is required to satisfy an allocation request. Thus, to
determine the size of the largest allocatable block, add the sizes of adjacent
free blocks, plus 2n bytes (where n is the number of combined blocks minus 1)
to account for per-block overhead. If you do this, adjacent free blocks should
not be counted separately, and the number of free blocks must be reduced
accordingly. Additionally, free DOS blocks must be counted as separate free
blocks.
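The coalescing rule can be sketched as follows (coalesced_size() is an illustrative name; it assumes the 2-byte per-block overhead described above):

```c
/* Merge a run of n adjacent free blocks, with sizes as _heapwalk()
   reports them, into one block: the sum of the sizes plus 2*(n - 1)
   bytes, since each absorbed block's 2-byte header becomes usable. */
long coalesced_size(const long *sizes, int n)
{
    long total = 0;
    int i;

    if (n <= 0)
        return 0;
    for (i = 0; i < n; i++)
        total += sizes[i];
    return total + 2L * (n - 1);
}
```

For example, three adjacent free blocks of 100, 50, and 30 bytes coalesce into a single block of 184 bytes.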


Comparing Memory Managers


Given code that is able to compute fragmentation indexes for different memory
managers, it's tempting to compare the memory managers to see how well they do
at avoiding fragmentation. Do this with caution. For example, Borland's memory
manager is able to use only the free memory immediately above the program
block after loading. Microsoft's memory manager, on the other hand, can use
any allocatable DOS block (even those that might be below the program block or
in UMBs if they are linked in). Thus, the actual size of the heap may be
different for the two allocators, which affects the total amount of free
memory and presumably the size of the largest allocatable block. The size of
the heap should be the same in each test run in order to fairly and directly
compare how well the allocators avoid fragmentation.
Also, any test program must place the allocators under stress without forcing
early failure of either allocator. One extreme would be a program that simply
allocates 500 blocks of memory without freeing any. Clearly, any
halfway-decent memory manager would yield a fragmentation index of 0 after
this allocation/free pattern. At the other extreme would be a pattern of
allocations and frees that quickly fragments the heap. When comparing
different allocators, the fragmentation index must be measured with each
allocator having gone through exactly the same pattern of allocations and
frees, and excessive fragmentation can prevent that. Finally, the pattern of
memory allocations and frees should be analogous to that of a real-world
program. Thus, more small blocks should be allocated than large ones.
Listing One, page 92, presents a test program that may be used to test any
memory allocator that doesn't fail before the call to fragindex(). (Some
third-party allocators require initialization/cleanup calls that are not
included, and may require a redefinition of the allocation function's name.) I
used Listing One to compare the Borland and Microsoft allocators. The early
call to setdos() is used to equalize the playing field by allocating all free
DOS blocks that the Microsoft memory manager could use except the one large
block immediately above the program block at startup. This avoids one
uncontrolled program difference. In addition, the Borland modules rand.obj and
n_lxmul.obj were linked into programs with both allocators so that the pattern
of allocations and frees would be identical in both compiled versions. When
the programs were run, no TSRs or other memory managers (such as QEMM or
386-MAX) were loaded. One difference between the two test programs was that
the base of the heap (the top of DGROUP), and thus its size, was not identical
for the two programs, since the compiled code size was different. However, the
difference was small relative to the total amount of memory available in the
heap, and should not have a great effect on the results. Table 1 shows the
results of 7200 random allocations and frees. max size is the size of the
largest allocatable block; nfree is the number of free blocks; rawidx is the
raw fragmentation index; and relidx is the relative fragmentation index.
Table 1: Results from Listing One, which tests the Borland and Microsoft
memory allocators.

            max size   nfree   rawidx   relidx
 ----------------------------------------------
 Borland     156732     515    .6366     1.00
 Microsoft    57105     745    2.180     1.89

Though the raw and relative fragmentation indexes for the allocators will be
higher or lower, depending on the test program and the amount of memory
available, the Borland allocator will always yield better results with respect
to fragmentation, due to several factors. First, the Microsoft allocator does
not combine adjacent free blocks as they are created, immediately leading to
increased fragmentation and probably more with subsequent allocations. Also,
the Microsoft allocator often doesn't reuse previously allocated and freed
memory unless it has to. Additionally, the Microsoft allocator breaks DOS
memory into chunks as part of the basic allocation algorithm, inherently
creating some fragmentation. The Borland allocator behaves differently in all
three areas. Periodic calls to Microsoft's _heapmin() function will reduce
the Microsoft allocator's fragmentation index, but it will still be higher
than Borland's.
The implementation of the allocation and free functions often greatly affects
fragmentation. In an evaluation of seven commercially available C memory
managers using the source code associated with this article, the raw
fragmentation indexes ranged at termination from 0.1556 to 34.79,
corresponding to relative fragmentation indexes from 1.0 to 3.9. These numbers
demonstrate a tremendous range in effectiveness at avoiding fragmentation. The
two best memory managers in that particular test used a best-fit allocation
algorithm instead of the first-fit algorithm used by the compilers' malloc()
functions. Using a best-fit algorithm alone does not guarantee good memory
management, however, as another manager that purportedly used a best-fit
algorithm gave the worst results of any of the seven. (Before condemning any
memory manager out of hand, recall that a number of other factors should be
considered when evaluating a memory manager.)


Summary


It's important to be aware of the negative effects fragmentation can have on
your program. If it fails due to lack of memory, check whether reducing
fragmentation will allow your program to run successfully without adding
memory resources. The fragmentation index presented here should help you make
that determination, and serve as a tool to help you manage your program's
memory usage more effectively. Using an appropriate memory manager is always
an important step in achieving good memory management in your program, but it
is especially critical when memory resources are limited and fragmentation
becomes a problem.

_MEASURING FRAGMENTATION_
by James Harrington


[LISTING ONE]

/* This module contains a generic function that calculates a
 fragmentation index. This index depends on the numbers of used and
 free blocks, the largest number of bytes that can currently be
 allocated in a contiguous block, and the total amount of free memory.


 Several functions declared as "extern" will be found in other
 modules, where each module calculates values as appropriate for
 a particular memory manager.
*/

#include <stdio.h>

double fragindex( void );
double getHeapSizeF( void );
double getBreakupF( void );
double getLargeFreeF( void );

extern double getWorstCase( void ); /* In manager-specific module */
extern long getnfreeblks( void ); /* In manager-specific module */
extern long getnusedblks( void ); /* In manager-specific module */
extern long getlargest( void ); /* In manager-specific module */
extern long gettotalfree( void );

/*********************************************************************
*
* double fragindex( void )
*
* This procedure returns a relative value that reflects the degree of
* heap fragmentation. It is calculated as follows:
*
* frag_index = BreakupFactor / LargeFreeFactor
*
* BreakupFactor = (number of free blocks - 1) / (number of used
* blocks + 1), where the free memory above the utilized
* heap is counted as a free block.
* LargeFreeFactor = (size of largest block that can be successfully
* allocated) / (total amount of unused memory,
* including overhead).
*
* Maximal fragmentation occurs when every other block is free, all
* blocks are of the minimal size, and all of memory has been filled
* with used:free block pairs.
*
* If adjacent free blocks are allowed to exist, maximal fragmentation
* occurs when every block is free (except perhaps for one, and not
* counting any blocks allocated by startup code), all blocks are of
* the minimal size, and all of memory has been filled with free
* blocks.
*
* Returns -1 in case of an error condition - a corrupt heap, or
* setdos() not called in an MSC program. setdos() is required only
* when comparing with other memory managers.
*
*********************************************************************/

double fragindex( void )
{
 double breakupF, largeFreeF;

 breakupF = getBreakupF();
 largeFreeF = getLargeFreeF();


 if( breakupF<0 || largeFreeF<0 )
 return( -1 );

 if( !largeFreeF )
 return( 0 );
 else
 return ( breakupF/largeFreeF );
}

/*********************************************************************
*
* getBreakupF
*
* Returns the BreakupFactor for use in the fragindex function.
*
*********************************************************************/

double getBreakupF( void )
{
 long nfree, nused;

 nfree = getnfreeblks();
 nused = getnusedblks();

 if( nfree<=0 )
 return(nfree);

 return (double)(--nfree) / (double)(++nused);
}

/*********************************************************************
*
* getLargeFreeF
*
* Returns the LargeFreeFactor for use in the fragindex function.
*
* Returns -1 on heap-walk error, 0 if there is no free memory
*
*********************************************************************/

double getLargeFreeF( void )
{
 double sizelargest;

 sizelargest = (double)getlargest();

 if( sizelargest < 0 )
 return( -1 );

 if( !sizelargest )
 return( 0 );
 else
 return (sizelargest / (double)gettotalfree());
}







[LISTING TWO]

/* Provides Borland-specific versions of functions called by fragindex() */

/* Assumes no UMBs, etc., are available to malloc() */

#include <alloc.h>
#include <stdlib.h>
#include <stdio.h>

long gettotalfree( void );
long getnusedblks( void );
long getnfreeblks( void );
long getlargest( void );
void setdos( void );

/*********************************************************************
*
* getlargest( void );
*
* Returns a number of bytes, the size of the largest block of
* contiguous free memory that can be successfully allocated from the
* far heap with farmalloc() or malloc() (in far data memory models).
*
*********************************************************************/

long getlargest( void )
{
 struct farheapinfo hinfo;
 long size = 0;

 hinfo.ptr = NULL;

 while( farheapwalk( &hinfo ) == _HEAPOK ){

 if( !hinfo.in_use )
 size = max( size, hinfo.size );
 }

 size = max( size, farcoreleft()-4 );

 return( size );
}

/*********************************************************************
*
* getnfreeblks( void );
*
* Returns the number of free blocks in the heap. The free memory
* above the utilized heap is counted as one free block.
*
* Returns -1 if heap is corrupt
*
*********************************************************************/

long getnfreeblks( void )
{
 struct farheapinfo hinfo;

 register int count = 0;

 if( farheapcheck() < 0 )
 return( -1 );

 hinfo.ptr = NULL;

 while( farheapwalk( &hinfo ) == _HEAPOK ){

 if( !hinfo.in_use )
 count++;
 }
 /* return count+1 to include the farcoreleft() memory */
 return (long) ++count;
}

/*********************************************************************
*
* getnusedblks( void );
*
* Returns the number of allocated blocks in the heap.
*
*********************************************************************/

long getnusedblks( void )
{
 struct farheapinfo hinfo;
 register int count = 0;

 hinfo.ptr = NULL;

 while( farheapwalk( &hinfo ) == _HEAPOK ){

 if( hinfo.in_use )
 count++;
 }

 return (long) count;
}

/*********************************************************************
*
* gettotalfree( void );
*
* Returns the total amount of free memory.
*
*********************************************************************/

long gettotalfree( void )
{
 struct farheapinfo hinfo;
 long size=0;

 hinfo.ptr = NULL;

 while( farheapwalk( &hinfo ) == _HEAPOK ){

 if( !hinfo.in_use )
 size+=(hinfo.size+4);

 }
 size += farcoreleft();

 return (long)size;
}


/*********************************************************************
*
* setdos( void );
*
* For sake of compatibility with fragmic.c
*
*********************************************************************/

void setdos( void )
{
}





[LISTING THREE]

/* Provides Microsoft C-specific versions of functions used in
 calculating a fragmentation index. */

/* Assumes no UMBs, etc., are available to malloc() */

#include <malloc.h>
#include <stdlib.h>
#include <stdio.h>
#include <dos.h>

#define LOWSEGS 100
 /* signed int, must be > 0 */
 /* represents the max number of
 free DOS blocks expected at
 time of program startup */
#define NDOS 200
 /* signed int, must be > 0 */
 /* affected by value of the
 _amblksiz variable - a smaller
 value means a larger NDOS
 is required */
 /* represents the max number of
 free DOS blocks expected */

int nsegs, _callflag;
unsigned _far dosseg_array[ LOWSEGS ]; /* saves segs for findfrag() */
unsigned _far seg_array[ NDOS ]; /* saves segs for setdos() */

void limitheap( void );
void freesegs( void );
long countdos( void );
long bigdos( void );
long dossize( void );
long gettotalfree( void );

long getnusedblks( void );
long getnfreeblks( void );
long getlargest( void );
void setdos( void );
void freedos( int count );

/* Note: _dos_allocmem() is a compiler runtime library function that
 allocates a block of memory directly from MS-DOS through int 21h
 fxn. 48h. You request a number of paragraphs (blocks of 16 bytes).
 Requesting 0xffff paragraphs will result in an error return, with
 the size of the largest allocable block, in paragraphs, being
 returned in the variable whose address is passed as the second
 parameter to _dos_allocmem(). _dos_freemem() frees the blocks
 allocated through _dos_allocmem(), via int 21h fxn 49h.
*/

/*********************************************************************
*
* getlargest( void );
*
* Returns a number of bytes, the size of the largest block of
* contiguous free memory that can be successfully allocated from the
* heap with _fmalloc(), malloc() (in far data memory models), or
* halloc().
*
* Returns -1 if error
*
*********************************************************************/

long getlargest( void )
{
 struct _heapinfo hinfo;
 int herr;
 long maxsize = 0;
 long tsize = 0;
 char first = 0;
 unsigned seg = 0;

 maxsize = bigdos();

 /* Get largest free */

 hinfo._pentry = NULL;

 while( (herr=_heapwalk( &hinfo )) != _HEAPEND && herr==_HEAPOK){

 /* if this is a used block... */
 if( hinfo._useflag ){

 first = 0;

 seg = (unsigned)(((unsigned long)hinfo._pentry)>>16);
 }

 /* but if this is a free block... */
 else{

 /* if last block was a used block */
 if( !first ){


 tsize = hinfo._size;

 /* If switch DOS blocks */
 if( (unsigned)(((unsigned long)hinfo._pentry)>>16)
 != seg )
 seg =
 (unsigned)(((unsigned long)hinfo._pentry)>>16);

 maxsize = max( maxsize, tsize );

 first=1;
 }

 /* if last block was a free block too */
 else{

 /* If switch DOS blocks */
 if( (unsigned)(((unsigned long)hinfo._pentry)>>16)
 != seg ) {
 seg =
 (unsigned)(((unsigned long)hinfo._pentry)>>16);
 tsize = hinfo._size;
 }
 else
 tsize += (hinfo._size+2);

 maxsize = max( maxsize, tsize );
 }
 }
 }

 return( herr==_HEAPEND?maxsize:-1 );
}

/*********************************************************************
*
* long bigdos( void )
*
* Returns the size of the largest block of available DOS memory.
*
*********************************************************************/

long bigdos( void )
{
 unsigned size;

 _dos_allocmem( 0xffff, &size );

 return( (long)(size-1) * 16L ); /* Size of largest allocable
 block */
}


/*********************************************************************
*
* getnfreeblks( void );
*
* Returns the number of free blocks in the heap, and assumes adjacent
* free blocks in the same DOS block can be combined.
*
* Returns -1 in case of error
*
*********************************************************************/

long getnfreeblks( void )
{
 struct _heapinfo hinfo;
 int herr;
 unsigned seg;
 long count = 0;
 char first = 0;

 /* error if didn't call setdos() */
 if( !_callflag )
 return( -1 );

 hinfo._pentry = NULL;

 while( (herr=_heapwalk( &hinfo )) != _HEAPEND && herr==_HEAPOK){

 /* if this is a used block... */
 if( hinfo._useflag ){

 first = 0;

 seg = (unsigned)(((unsigned long)hinfo._pentry)>>16);
 }

 /* but if this is a free block... */
 else{

 /* if last block was a used block */
 if( !first ){

 count++;

 /* If switch DOS blocks */
 if( (unsigned)(((unsigned long)hinfo._pentry)>>16)
 != seg )
 seg =
 (unsigned)(((unsigned long)hinfo._pentry)>>16);
 }

 /* if last block was a free block too */
 else{

 /* If switch DOS blocks */
 if( (unsigned)(((unsigned long)hinfo._pentry)>>16)
 != seg ){
 seg =
 (unsigned)(((unsigned long)hinfo._pentry)>>16);
 count++;
 }
 }
 }
 }

 return( herr==_HEAPEND?(count+countdos()):-1 );

}

/*********************************************************************
*
* int countdos( void )
*
* Counts the number of free DOS blocks, up to NDOS blocks
*
*********************************************************************/

long countdos( void )
{
 int count;

 for( count=0; count<NDOS; count++ ){

 _dos_allocmem( 0xffff, &seg_array[count] );

 if( seg_array[count] )
 _dos_allocmem( seg_array[count], &seg_array[count] );
 else
 break;
 }

 /* count is the number of allocated blocks */

 freedos(count);

 if( count == NDOS ){
 printf("\nToo many DOS blocks; increase value of NDOS.");
 exit(0);
 }

 return count;
}

/*********************************************************************
*
* getnusedblks( void );
*
* Returns the number of allocated blocks in the heap.
*
*********************************************************************/

long getnusedblks( void )
{
 struct _heapinfo hinfo;
 long count = 0;

 hinfo._pentry = NULL;

 while( _heapwalk( &hinfo ) != _HEAPEND ){

 if( hinfo._useflag )
 count++;
 }
 return count;
}


/*********************************************************************
*
* gettotalfree( void );
*
* Returns the total amount of free memory.
*
*********************************************************************/

long gettotalfree( void )
{
 struct _heapinfo hinfo;
 long size=0;

 hinfo._pentry = NULL;

 while( _heapwalk( &hinfo ) != _HEAPEND ){

 if( !hinfo._useflag )
 size+=(hinfo._size+2);
 }
 return size + dossize();
}

/*********************************************************************
*
* dossize( void )
*
* Counts the number of bytes in free DOS blocks, up to NDOS DOS
* blocks. Includes 16 bytes overhead for each block.
*
*********************************************************************/
long dossize( void )
{
 /* note: uses the global seg_array[], so freedos() frees these blocks */
 int count;
 long size = 0;

 for( count=0; count<NDOS; count++ ){

 _dos_allocmem( 0xffff, &seg_array[count] );

 if( seg_array[count] ){
 size += ((long)seg_array[count]+1)*16L;
 _dos_allocmem( seg_array[count], &seg_array[count] );
 }
 else
 break;
 }

 /* count is the number of allocated blocks */

 freedos(count);

 if( count == NDOS ){
 printf("\nToo many DOS blocks; increase value of NDOS.");
 exit(0);
 }

 return( size );

}

/*********************************************************************
*
* void freedos( int count )
*
* Frees segments stored in seg_array[].
*
* Parm "count" is number of stored segments.
*
*********************************************************************/

void freedos( int count )
{
 int i;

 if( !count )
 return;

 for( i=0; i<count; i++ )
 _dos_freemem( seg_array[i] );
}

/*********************************************************************
*
* setdos( void );
*
*********************************************************************/

void setdos( void )
{
 atexit( freesegs );
 limitheap();
 _callflag++;
}

/*********************************************************************
*
* limitheap()
*
* This function allocates all the DOS blocks besides the one
* immediately above the program block. Assumes that the one above the
* program block is the biggest free DOS block.
*
*********************************************************************/
void limitheap(void)
{
 /* nsegs = 0 at start */

 while( nsegs<LOWSEGS ){

 _dos_allocmem( 0xffff, &dosseg_array[nsegs] );

 if( dosseg_array[nsegs] ){
 _dos_allocmem( dosseg_array[nsegs],&dosseg_array[nsegs] );
 nsegs++;
 }
 else
 break;

 }

 if( nsegs == LOWSEGS ){

 _dos_freemem( dosseg_array[0] );

 printf("\nToo many free DOS blocks at start; ");
 printf("\nincrease value of LOWSEGS.");
 exit(0);
 }

 if( nsegs )
 _dos_freemem( dosseg_array[0] );
}

void freesegs(void)
{
 int i;

 if( !_callflag )
 printf("\nerror: didn't call setdos()");
 else{
 if( nsegs>1 ){

 for( i=1; i<nsegs; i++ )
 _dos_freemem( dosseg_array[i] );
 }
 }
}



April, 1993
 PROGRAMMING FOR THE OS/2 2.0 WORKPLACE SHELL


Turn data into objects using OS/2's system object model


 This article contains the following executables: OS2WPS.ARC


Joel Barnum


Joel is a programming instructor at Descriptor Systems, P.O. Box 461, Marion,
IA 52302. He can be reached on CompuServe at 70047,442.


If you're familiar with OS/2 Version 1.x but new to 2.0, you're in for an
immediate surprise. That's because 2.0 sports a new user interface called the
Workplace Shell (WPS) that replaces the 1.x Desktop Manager. This new shell is
data oriented, rather than application oriented. In other words, users
manipulate data directly, rather than first launching an executable and then
opening a data file. This datacentric approach is more natural for novice
computer users--and for experienced users, once they get used to it.
For example, to modify a spreadsheet under Workplace Shell, simply open the
folder containing the spreadsheet data and double-click on the data-file
object. WPS will launch the associated .EXE file, passing the data-file name
as an argument. Of course, earlier versions of OS/2 (and Windows) also let the
user associate data files with executable files by file extension, but WPS
takes this concept much further. For example, it is possible to print data
files via drag and drop without first starting the .EXE file. For this to
work, however, the application developer must do some extra work. In this
article, I'll detail what developers must do to convert OS/2 1.x programs to
fit in the WPS. I'll approach this in a stepwise fashion, starting with the
minimum changes required, and working up to a full-blown WPS-compliant
program.


Launching by File Extension


Associating an executable with data by the data file's extension is a simple
process--open the Settings notebook by clicking the right mouse button on the
appropriate .EXE file and set up an association between that .EXE file and a
list of file extensions. WPS will write entries in the OS2.INI profile that
specify the associations. When the user double-clicks on a data file, WPS
launches the appropriate .EXE and passes the data-file name as argv[1].
While this technique works, it's not very robust. First of all, the file
extension is not a great way to specify a file type. After all, you've only
three characters to work with, and it's quite possible that several
application programs might use the same extension for their data files.
Second, since users can give their files any name, there's no guarantee that a
.TXT really contains text, for example. Finally, under the high-performance
file system (HPFS), filenames aren't required to have extensions at all. In
that case, the system completely breaks down. Fortunately, OS/2 provides a
better way to associate file types.


.TYPE Extended Attributes


Beginning with OS/2 1.2, both the operating system and applications can assign
information, called "extended attributes" (EAs), to a file. OS/2's two file
systems--file allocation table (FAT) and high-performance file system
(HPFS)--store extended attributes differently. In neither case, however, are
the EAs stored in the file itself. HPFS stores EAs in sectors near the file
data, while the FAT system stores them in a hidden system file in the volume
root directory called EA DATA. SF. Therefore, an application can easily set
and query EAs under either file system.
OS/2 supports a text-string extended attribute called .TYPE. Examples of .TYPE
are Plain Text (a system-defined type) and Brand X Spreadsheet, which
spreadsheet vendor X defines. Once the application programmer associates a
program's .TYPE with its data files, WPS will use this .TYPE instead of (or in
addition to) the file extension to launch an associated .EXE file. To attach
the .TYPE extended attribute to the data file, the application must call
DosSetFileInfo and pass a variable-length data structure containing the
extended attribute. The settype function in Listing One (page 94) shows how to
assign a .TYPE EA to a file.
An application must also tell WPS what .TYPEs it recognizes by defining an
ASSOCTABLE resource in its resource file; see Listing Two, page 95. You can
list several .TYPEs and extension filters if you wish. The icon file isn't
used in OS/2 2.0, but the syntax still requires it. The first time the user
displays the .EXE file in a Drives folder (or in file manager under OS/2 1.3),
the system writes an entry into OS2.INI. This entry indicates that files of
the given type "belong" to the executable. When the user double-clicks on a
data file, WPS (or file manager) searches OS2.INI and then launches the .EXE
file, passing the data file's name to the application as the first argument.
WPS takes one more very important action when the user displays the .EXE--it
adds a template for the data file to the templates folder. Templates are an
integral part of the WPS. They let users create new data files without first
running the application file. Instead, users "tear off" an entry from the
template and drop it in the folder of their choice. WPS then creates a new
file in the target folder, complete with the .TYPE extended attribute. A user
can then launch the .EXE with the new data file by double-clicking on it.
However, templates can cause problems for existing programs that write
identifying headers (a file signature, copyright notice, and so on) to their
data files. Many such programs refuse to open any file that lacks the required
header, and that's a problem because WPS creates template files without any
data. You may therefore need to modify existing applications so that they
successfully open zero-length files.


The System Object Model


In a completely datacentric environment, the user should be able to print a
file by dragging and dropping it onto a printer object. But with what we've
covered so far, drag-and-drop printing won't work. After all, only your
program knows how to print its own data: You can't expect WPS to print a
spreadsheet or a chart, for example. The catch is that when the user drops a
data file on a printer object, WPS does not launch the .EXE. Instead, it's up
to the data file to print itself. Of course, a simple data file doesn't know
how to print itself. Resolving this problem means that we must effectively
turn the data file into an object, which is exactly what the System Object
Model (SOM) lets you do.
The SOM is a set of tools, header files, and macros that let you define
object-oriented classes and objects. Its goal is to let developers design
object classes (encapsulated functions and data) and then implement the class
in any supported programming language. Theoretically, a developer could define
a class in SOM, then write the code for the class in Smalltalk. Another
developer could create instances of the class using, say, C++. Currently,
however, the only supported language is C, even though C isn't an OOP
language. (C++ is in beta.)
Listing Three (page 95) shows a class definition for WPSpreadFile: a data file
subclassed from the predefined WPDataFile class. WPDataFile contains methods
that support delete and copy operations, drag and drop, plain text printing,
and so on. WPSpreadFile overrides six methods from its parent class, most of
which are class methods, meaning that they apply to all instances of the
class. (wpPrintObject, on the other hand, is referred to as an "instance
method.") WPS calls the class methods when the object class is registered and
will call the wpPrintObject method only when the user drags and drops an
instance of the class on a printer-object icon. We override the class methods
to define the icon for our data files and their .TYPE extended attribute. Once
the class definition is complete, run the SOM compiler (which comes with the
OS/2 Programmer's Toolkit) to generate a stubbed-out .C file
that, when completed, will implement the class.
Listing Four (page 96) shows a completed class implementation. Most of the
code in Listing Four was generated by the SOM compiler, including the function
definitions and the first two lines of each method. The functions were
completed according to Table 1. I've included a make script file (see
"Availability," page 5) to compile the resulting listings.
Table 1: Functions used to complete the listing generated by the SOM compiler.

 Function                  Description
 ------------------------  ---------------------------------------------
 wpclsQueryTitle           Returns a string that WPS uses as the default
                           name for new objects.
 wpclsQueryInstanceType    Returns the string that WPS assigns as the
                           .TYPE EA for a new object's file. This
                           matches the ASSOCTABLE in the .EXE's
                           resource file.
 wpclsQueryInstanceFilter  Returns a string that WPS uses to associate
                           the .EXE by file extension.
 wpclsInitData             Loads the icon for our class objects as a
                           resource. Note that we reference the class
                           data item with _hicon.
 wpclsQueryIcon            Returns the icon's handle.
 wpPrintObject             Calls a function that actually prints the
                           data file.
The installation program shown in Listing Five (page 97) can be used to create
a folder on the desktop for our program, register our new subclass, and create
a reference object (WPProgram) to the application file. When you run the
installation program, WPS automatically creates a template for the data files
and connects it to the object DLL. When the user manipulates an object created
from the template, WPS calls methods in the DLL. For example, when the user
drops a data-file object on a printer object, WPS calls the wpPrintObject
method. The DLL then prints the data file that the object represents. This
example prints the file directly in the method; a real-world program would
start a background process to print so that WPS isn't frozen while the job
runs. Also, to keep the code simple I copied the openfile function from
Listing One (the .EXE) to Listing Four (the DLL). In a real-life program, you
could put such shared code in another DLL that both the .EXE and DLL could
call.
Finally, the WPS automatically sets up an association between the data files
and the .EXE file so that the user can launch the .EXE by double-clicking on
the data-file object. Note that you will have to modify the installation
program to reflect the directory in which the object's DLL resides.


Conclusions


While all existing 1.x programs should run under OS/2 2.0, you can make minor
changes that will allow users to put the Workplace Shell to its greatest use.
The good news is that you don't have to convert the program all at once -- you
can make incremental changes and increase WPS compliance as you go.


Acknowledgments


I'd like to thank Peter Magid in Shell Development at IBM Boca Raton for help
above and beyond the call of duty.

_PROGRAMMING FOR THE OS/2 2.0 WORKPLACE SHELL_
by Joel Barnum


[LISTING ONE]

// spread.c -- a sample WPS application

#define INCL_WIN
#define INCL_GPI
#include <os2.h>
#include <stdlib.h>
#include <string.h>
#include "spread.h"

#define WM_INIT WM_USER

// Internal function prototypes
int main ( int argc, char *argv[] );
BOOL savefile ( PSZ szFname, LONG alValues[] );
BOOL openfile ( PSZ szFname, LONG alValues[] );
BOOL settype ( HFILE hf, PSZ pszType );
MRESULT EXPENTRY ValueDlgProc ( HWND hwnd, ULONG msg, MPARAM mp1, MPARAM mp2 );
MRESULT EXPENTRY ClientWinProc ( HWND hwnd, ULONG msg, MPARAM mp1, MPARAM mp2 );
// global variables
 HAB hab; // Anchor block handle
int main ( int argc, char *argv[] )
{
 HMQ hmq; // Message queue handle
 HWND hwndFrame; // Frame window handle
 HWND hwndClient; // Client window handle
 QMSG qmsg; // Message from queue
 ULONG flCreate; // Window creation flags
 BOOL fSuccess; // return from API
 hab = WinInitialize ( 0 );
 hmq = WinCreateMsgQueue ( hab, 0 );
 fSuccess = WinRegisterClass (hab,"spread",ClientWinProc,CS_SIZEREDRAW,0);
 flCreate = FCF_SYSMENU | FCF_SIZEBORDER | FCF_TITLEBAR |
 FCF_MINMAX | FCF_SHELLPOSITION | FCF_TASKLIST | FCF_ICON;
 hwndFrame = WinCreateStdWindow ( HWND_DESKTOP, WS_VISIBLE
 , &flCreate, "spread", NULL, 0L, 0 , ID_WINDOW, &hwndClient );
 if ( hwndFrame == NULLHANDLE )
 DosExit ( 1, 1 );
 // send the client a message passing arg count and arguments
 WinSendMsg ( hwndClient, WM_INIT, MPFROMSHORT ( argc ) ,MPFROMP ( argv ));
 while ( WinGetMsg ( hab, &qmsg, NULLHANDLE, 0, 0 ) != FALSE )
 WinDispatchMsg ( hab, &qmsg );
 fSuccess = WinDestroyWindow ( hwndFrame );
 fSuccess = WinDestroyMsgQueue ( hmq );
 fSuccess = WinTerminate ( hab );
 return 0;
}
//************************************************************
MRESULT EXPENTRY ClientWinProc ( HWND hwnd, ULONG msg, MPARAM mp1, MPARAM mp2 )
{
 BOOL fSuccess; // return from API
static LONG alValues[2]; // spreadsheet values
static HWND hwndMenu; // popup menu handle
static CHAR szFname[255]; // file name
 switch( msg )
 {
 case WM_BUTTON2DOWN:
 {
 POINTL ptl;
 // display a popup menu at the coordinates the user clicked
 ptl.x = SHORT1FROMMP ( mp1 );
 ptl.y = SHORT2FROMMP ( mp1 );
 WinMapWindowPoints ( hwnd, HWND_DESKTOP, &ptl, 1 );
 fSuccess = WinPopupMenu (
 HWND_DESKTOP , hwnd , hwndMenu , ptl.x , ptl.y
 , 0 , PU_KEYBOARD | PU_NONE | PU_MOUSEBUTTON1 );
 }
 return (MRESULT)FALSE;
 case WM_CLOSE:
 savefile ( szFname, alValues );
 WinPostMsg( hwnd, WM_QUIT, 0L, 0L );
 return (MRESULT) NULL;
 case WM_COMMAND:
 switch ( SHORT1FROMMP ( mp1 ) )
 {
 case IDM_CHANGE:
 {
 ULONG result;
 // display a modal dialog to let user enter in new values
 result = WinDlgBox ( HWND_DESKTOP
 , WinQueryWindow ( hwnd, QW_PARENT ), ValueDlgProc
 , NULLHANDLE , DLG_VALUES , alValues );
 if ( result == DID_OK )
 WinInvalidateRect ( hwnd, NULL, TRUE );
 }
 break;
 }
 return (MRESULT)NULL;
 case WM_CREATE:
 //load our popup menu
 hwndMenu = WinLoadMenu ( HWND_DESKTOP , NULLHANDLE, ID_MENU );
 return (MRESULT)FALSE;
 case WM_INIT: // user-defined message
 {
 int argc; // argument count
 CHAR **argv; // input arguments
 CHAR szTitle[255]; // titlebar text
 // extract argument count and strings
 argc = SHORT1FROMMP ( mp1 );
 argv = PVOIDFROMMP ( mp2 );
 // if there were no input arguments, exit
 if ( argc < 2 )
 {
 WinMessageBox (
 HWND_DESKTOP , WinQueryWindow ( hwnd, QW_PARENT )
 , "You must specify an input file"
 , "Error", 0 , MB_OK | MB_ERROR );
 DosExit ( 1, 1 );
 }
 // attempt to open input file
 strcpy ( szFname, argv[1] );
 if ( openfile ( argv[1], alValues ) == FALSE )
 {
 WinMessageBox (
 HWND_DESKTOP
 , WinQueryWindow ( hwnd, QW_PARENT ) , argv[1]
 , "Unable to open file", 0 , MB_OK | MB_ERROR );
 DosExit ( 1, 1 );
 }
 // update the titlebar text
 strcpy ( szTitle, "Spreadsheet - " );
 strcat ( szTitle, szFname );
 WinSetWindowText ( WinQueryWindow ( hwnd, QW_PARENT ),szTitle);
 }
 return (MRESULT)NULL;
 case WM_PAINT:
 {
 LONG lSuccess; // return from API
 HPS hps; // cached PS
 POINTL ptl; // coordinates for draw
 CHAR sz[50]; // temp string
 hps = WinBeginPaint ( hwnd , NULLHANDLE, NULL );
 fSuccess = GpiErase ( hps );
 // draw the values and their sum
 ptl.x = 100; ptl.y = 125;
 _itoa ( alValues[0], sz, 10 );
 lSuccess = GpiCharStringAt ( hps, &ptl, strlen ( sz ) , sz );
 ptl.y = 100;
 _itoa ( alValues[1], sz, 10 );
 lSuccess = GpiCharStringAt ( hps, &ptl, strlen ( sz ) , sz );
 ptl.y = 75;
 lSuccess = GpiMove ( hps, &ptl );
 ptl.x = 200;
 lSuccess = GpiLine ( hps, &ptl );
 ptl.x = 100; ptl.y = 50;
 _itoa ( alValues[0] + alValues[1], sz, 10 );
 lSuccess = GpiCharStringAt ( hps, &ptl, strlen ( sz ) , sz );
 ptl.x = 50; ptl.y = 25;
 strcpy ( sz, "Press the right mouse button to change values" );
 lSuccess = GpiCharStringAt ( hps, &ptl, strlen ( sz ) , sz );
 fSuccess = WinEndPaint ( hps );
 }
 return (MRESULT) NULL;

 default:
 return
 WinDefWindowProc( hwnd, msg, mp1, mp2 );
 }
 return WinDefWindowProc( hwnd, msg, mp1, mp2 );
}
// savefile: saves the current values to a file
// RETURNS: TRUE if successful, FALSE if not
BOOL savefile ( PSZ szFname, LONG alValues[] )
{
 HFILE h; // file handle
 ULONG ulAction; // action taken by OPEN
 ULONG ulActualWritten; // count written to file
 APIRET rc; // return code
 // open the current file
 rc = DosOpen ( szFname, &h, &ulAction, 0L
 , 0, OPEN_ACTION_OPEN_IF_EXISTS | OPEN_ACTION_CREATE_IF_NEW
 , OPEN_SHARE_DENYREADWRITE | OPEN_ACCESS_READWRITE, NULL );
 if ( rc != 0 )
 return FALSE;
 // write the two values
 rc = DosWrite ( h, alValues, 8, &ulActualWritten );
 if ( ( rc != 0 ) || ( ulActualWritten != 8 ) )
 {
 DosClose ( h );
 return FALSE;
 }
 // write our .TYPE EA on the file
 settype ( h, "XX Company Spreadsheet" );
 // close the file
 DosClose ( h );
 return TRUE;
}
// openfile: reads spreadsheet values from the specified file
// RETURNS: TRUE if successful, FALSE if not
BOOL openfile ( PSZ szFname, LONG alValues[] )
{
 HFILE h; // file handle
 ULONG ulAction; // action taken by OPEN
 ULONG ulActualRead; // count read from file
 APIRET rc; // return code
 // open the file
 rc = DosOpen ( szFname, &h, &ulAction, 0L , 0
 , OPEN_ACTION_OPEN_IF_EXISTS | OPEN_ACTION_FAIL_IF_NEW
 , OPEN_SHARE_DENYNONE | OPEN_ACCESS_READWRITE , NULL );
 if ( rc != 0 )
 return FALSE;
 // read the values
 rc = DosRead ( h, alValues, 8, &ulActualRead );
 if ( rc != 0 )
 {
 DosClose ( h );
 return FALSE;
 }
 // zero length files are OK, but otherwise less than 8 bytes means bad file
 if ( ulActualRead < 8 )
 if ( ulActualRead != 0 )
 {
 DosClose ( h );
 return FALSE;
 }
 // close the file
 DosClose ( h );
 return TRUE;
}
// settype: sets the .TYPE ea for a data file
BOOL settype ( HFILE hf, PSZ pszType )
{
// define a .TYPE EA structure
typedef struct _TYPEEALIST
{
 ULONG cbList; // length of all EAs in list
 ULONG ulNextEa; // offset of next EA
 BYTE bFlags; // EA flags
 BYTE cbName; // length of name
 USHORT cbEA; // sizeof EA
 CHAR szName[6]; // ".TYPE"
 USHORT usType; // EA data type
 USHORT cbValue; // sizeof value
 CHAR achValue[1]; // placeholder for EA value
} TEALIST, *PTEALIST;
 EAOP2 eaop; // extended attributes structure
 PTEALIST ptea; // points to TYPE EA list
 PSZ psz1, psz2; // temp pointers
 USHORT cb; // structure length
 APIRET rc; // return from API
 // allocate memory for the TYPE EA list
 // -1 because the structure itself defines 1 char
 cb = strlen ( pszType ) - 1 +
 sizeof ( TEALIST );
 ptea = (PTEALIST)malloc ( cb );
 // initialize the EA structures
 // fill in the EA value itself: for .TYPE it's the file type
 // can't use strcpy!! (needs to add '\0')
 psz1 = pszType;
 psz2 = ptea->achValue;
 while ( *psz1 != '\0' )
 *psz2++ = *psz1++;
 // fill in length of the EA value
 ptea->cbValue = strlen ( pszType );
 // fill type of EA value
 ptea->usType = 0xfffd; // length-preceded ASCII
 // length of EA (includes value + type and length fields)
 ptea->cbEA = ptea->cbValue + sizeof(ptea->usType) + sizeof (ptea->cbValue);
 // fill in the EA name (it's a null terminated string so strcpy is OK)
 strcpy ( ptea->szName, ".TYPE" );
 // fill in EA name length
 ptea->cbName = (BYTE)strlen ( ".TYPE" );
 // point to the TYPE EA list structure
 eaop.fpFEA2List = (PFEA2LIST)ptea;
 eaop.fpGEA2List = NULL;
 ptea->cbList = cb; // structure length
 ptea->ulNextEa = 0; // no more EAs
 ptea->bFlags = 0; // noncritical
 // attach the .TYPE extended attribute to the file
 rc = DosSetFileInfo ( hf, 2, (PBYTE)&eaop , sizeof (EAOP2) );
 free ( ptea );
 return (BOOL)( rc == 0 ); // TRUE if successful
}
MRESULT EXPENTRY ValueDlgProc ( HWND hwnd, ULONG msg , MPARAM mp1, MPARAM mp2 )
{
static PLONG alValues;
 switch ( msg )
 {
 case WM_INITDLG:
 // retrieve a pointer to values array
 alValues = PVOIDFROMMP ( mp2 );
 // write current values into entry fields
 WinSetDlgItemShort ( hwnd, DLG_VALUE1, (SHORT)alValues[0], TRUE );
 WinSetDlgItemShort ( hwnd, DLG_VALUE2, (SHORT)alValues[1], TRUE );
 // set the focus to the first entryfield
 WinSetFocus ( HWND_DESKTOP, WinWindowFromID ( hwnd, DLG_VALUE1 ) );
 return (MRESULT)TRUE;
 case WM_COMMAND:
 switch ( SHORT1FROMMP ( mp1 ) )
 {
 case DID_CANCEL:
 WinDismissDlg ( hwnd, DID_CANCEL );
 break;
 case DID_OK:
 // retrieve values from entry fields
 WinQueryDlgItemShort ( hwnd, DLG_VALUE1
 , (PSHORT)&alValues[0], TRUE );
 WinQueryDlgItemShort ( hwnd, DLG_VALUE2
 , (PSHORT)&alValues[1], TRUE );
 WinDismissDlg ( hwnd, DID_OK );
 }
 return (MRESULT)NULL;
 default:
 return
 WinDefDlgProc( hwnd, msg, mp1, mp2 );
 }
 return WinDefDlgProc( hwnd, msg, mp1, mp2 );
}






[LISTING TWO]

#include <os2.h>
#include "spread.h"
rcinclude spread.dlg

ICON ID_WINDOW spread.ico
MENU ID_MENU
BEGIN
 MENUITEM "~Change values", IDM_CHANGE
END
ASSOCTABLE 1
{
 "XX Company Spreadsheet", "*.SPR", spread.ico
}







[LISTING THREE]

# Subclass of WPDataFile for Spreadsheet sample WPS application
#include <wpdataf.sc>
class: WPSpread,
 external stem = wpspread, local,
 external prefix = wpspread_,
 classprefix = wpspreadM_,
 major version = 1,
 minor version = 2;
parent: WPDataFile;
passthru: C.ih;
 #define INCL_WIN
 #define INCL_DOS
 #define INCL_DEV
 #define INCL_GPI
 #define INCL_WPCLASS
 #define INCL_WPFOLDER
 #include <os2.h>
 #include <stdlib.h>
 #include <string.h>
endpassthru; /* .ih */
data:
 HPOINTER hicon, class; // class icon
methods:
 override wpclsQueryTitle, class;
 override wpclsInitData, class;
 override wpclsQueryIcon, class;
 override wpclsQueryInstanceFilter, class;
 override wpclsQueryInstanceType, class;
 override wpPrintObject;






[LISTING FOUR]

/* This file was generated by the SOM Compiler. FileName: wpspread.c.
 * Generated using: SOM Precompiler spc: 1.22 SOM Emitter emitc: 1.24 */
#define WPSpread_Class_Source
#include "wpspread.ih"

BOOL openfile ( PSZ szFname, LONG alValues[] );
BOOL printspread ( WPSpread *somSelf, PPRINTDEST pPrintDest );

#undef SOM_CurrentClass
#define SOM_CurrentClass SOMMeta
SOM_Scope PSZ SOMLINK wpspreadM_wpclsQueryTitle(M_WPSpread *somSelf)
{
 M_WPSpreadData *somThis = M_WPSpreadGetData(somSelf);
 M_WPSpreadMethodDebug("M_WPSpread","wpspreadM_wpclsQueryTitle");
 return _wpclsQueryInstanceType( somSelf );
}

SOM_Scope void SOMLINK wpspreadM_wpclsInitData(M_WPSpread *somSelf)
{
 HMODULE hmod; // module handle
 PSZ psz; // module file name
 APIRET rc; // return from API
 M_WPSpreadData *somThis = M_WPSpreadGetData(somSelf);
 M_WPSpreadMethodDebug("M_WPSpread","wpspreadM_wpclsInitData");
 // initialize the parent classes first
 parent_wpclsInitData ( somSelf );
 // query our module name
 psz = _somLocateClassFile( SOMClassMgrObject, SOM_IdFromString( "WPSpread" )
 , WPSpread_MajorVersion , WPSpread_MinorVersion );
 // query our module handle
 if ( psz != NULL )
 rc = DosQueryModuleHandle ( psz, &hmod );
 // load the icon (same as pointer) and store in class data
 _hicon = WinLoadPointer ( HWND_DESKTOP, hmod, 1 );
}
SOM_Scope HPOINTER SOMLINK wpspreadM_wpclsQueryIcon(M_WPSpread *somSelf)
{
 M_WPSpreadData *somThis = M_WPSpreadGetData(somSelf);
 M_WPSpreadMethodDebug("M_WPSpread","wpspreadM_wpclsQueryIcon");
 return _hicon;
}
SOM_Scope PSZ SOMLINK wpspreadM_wpclsQueryInstanceFilter(M_WPSpread *somSelf)
{
 M_WPSpreadData *somThis = M_WPSpreadGetData(somSelf);
 M_WPSpreadMethodDebug("M_WPSpread","wpspreadM_wpclsQueryInstanceFilter");
 return "*.SPR";
}
SOM_Scope PSZ SOMLINK wpspreadM_wpclsQueryInstanceType(M_WPSpread *somSelf)
{
 M_WPSpreadData *somThis = M_WPSpreadGetData(somSelf);
 M_WPSpreadMethodDebug("M_WPSpread","wpspreadM_wpclsQueryInstanceType");
 return "XX Company Spreadsheet";
}
#undef SOM_CurrentClass
#define SOM_CurrentClass SOMInstance
SOM_Scope BOOL SOMLINK wpspread_wpPrintObject(WPSpread *somSelf,
 PPRINTDEST pPrintDest,
 ULONG ulReserved)
{
 /* WPSpreadData *somThis = WPSpreadGetData(somSelf); */
 WPSpreadMethodDebug("WPSpread","wpspread_wpPrintObject");
 return printspread ( somSelf, pPrintDest );
}
//************* Following code NOT generated by SOM compiler
// printspread: prints the file this object represents
// RETURNS: TRUE if successful, FALSE if not
BOOL printspread ( WPSpread *somSelf, PPRINTDEST pPrintDest )
{
 HDC hdcPrinter; // printer DC
 HAB hab; // anchor block handle
 HPS hps; // micro PS
 SIZEL sizel; // presentation page size
 LONG alValues[2]; // spreadsheet values
 BOOL fSuccess; // return from API
 BOOL lSuccess; // return from API
 CHAR szFname[255]; // file name
 ULONG cb; // filename length
 POINTL ptl; // drawing coordinates
 CHAR sz[50]; // temporary string
 // create a printer device context
 hab = WinQueryAnchorBlock ( HWND_DESKTOP );
 hdcPrinter = DevOpenDC ( hab , pPrintDest->lType
 , pPrintDest->pszToken , pPrintDest->lCount
 , pPrintDest->pdopData , NULLHANDLE );
 // create a micro PS associated with the printer DC
 sizel.cx = sizel.cy = 0;
 hps = GpiCreatePS ( hab, hdcPrinter, &sizel
 , GPIT_MICRO | PU_LOENGLISH | GPIA_ASSOC );
 // open the file associated with this object
 cb = 255;
 _wpQueryRealName ( somSelf, szFname, &cb, TRUE );
 fSuccess = openfile ( szFname , alValues );
 // start a printer job
 lSuccess = DevEscape ( hdcPrinter, DEVESC_STARTDOC
 , strlen ( _wpQueryTitle ( somSelf ) )
 , _wpQueryTitle ( somSelf ) , NULL, NULL );
 // print the values
 ptl.x = 100; ptl.y = 125;
 _itoa ( alValues[0], sz, 10 );
 lSuccess = GpiCharStringAt ( hps, &ptl, strlen ( sz ) , sz );
 ptl.y = 100;
 _itoa ( alValues[1], sz, 10 );
 lSuccess = GpiCharStringAt ( hps, &ptl, strlen ( sz ) , sz );
 ptl.y = 75;
 lSuccess = GpiMove ( hps, &ptl );
 ptl.x = 200;
 lSuccess = GpiLine ( hps, &ptl );
 ptl.x = 100; ptl.y = 50;
 _itoa ( alValues[0] + alValues[1], sz, 10 );
 lSuccess = GpiCharStringAt ( hps, &ptl, strlen ( sz ) , sz );
 // end a printer job
 lSuccess = DevEscape ( hdcPrinter, DEVESC_ENDDOC , 0 , NULL, NULL, NULL );
 // clean up
 DevCloseDC ( hdcPrinter );
 GpiDestroyPS ( hps );
 return TRUE;
}
// openfile: reads spreadsheet values from the specified file
// RETURNS: TRUE if successful, FALSE if not
BOOL openfile ( PSZ szFname, LONG alValues[] )
{
 HFILE h; // file handle
 ULONG ulAction; // action taken by OPEN
 ULONG ulActualRead; // count read from file
 APIRET rc; // return code
 // open the file
 rc = DosOpen ( szFname, &h, &ulAction, 0L
 , 0, OPEN_ACTION_OPEN_IF_EXISTS | OPEN_ACTION_FAIL_IF_NEW
 , OPEN_SHARE_DENYNONE | OPEN_ACCESS_READWRITE, NULL );
 if ( rc != 0 )
 return FALSE;
 // read the values
 rc = DosRead ( h, alValues, 8, &ulActualRead );
 if ( ( rc != 0 ) || ( ulActualRead != 8 ) )
 {
 DosClose ( h );
 return FALSE;
 }
 // close the file
 DosClose ( h );
 return TRUE;
}





[LISTING FIVE]

// install.c -- installation program for spreadsheet app
// compile and link with: icc /Ss /Ti install.c

#define INCL_WINWORKPLACE
#include <os2.h>
#include <stdio.h> // for printf
int main ( int argc, char *argv[] )
{
 HAB hab; // anchor block
 HOBJECT hobjFolder; // folder object
 HOBJECT hobjProg; // program object
 BOOL fSuccess; // return from API
 // create an anchor block so we can retrieve errors
 hab = WinInitialize ( 0 );
 // create a folder on the desktop for our program object
 hobjFolder = WinCreateObject ( "WPFolder", "My Folder"
 , "OBJECTID=<MY_FOLDER>" , "<WP_DESKTOP>" , CO_REPLACEIFEXISTS );
 if ( hobjFolder == NULLHANDLE )
 {
 ULONG ul; // error code
 ul = WinGetLastError ( hab );
 printf ("Unable to create folder, error = %x\n", ERRORIDERROR ( ul ) );
 }
 // register our object class for our data files
 fSuccess = WinRegisterObjectClass ( "WPSpread" ,
"c:\\book\\wpsart\\wpspread.dll" );
 if ( fSuccess == FALSE )
 {
 ULONG ul; // error code
 ul = WinGetLastError ( hab );
 printf ("Unable to register class, error = %x\n", ERRORIDERROR ( ul ) );
 }
 // create a program object for our EXE file
 hobjProg = WinCreateObject ( "WPProgram", "Spreadsheet App"
 , "EXENAME=c:\\book\\wpsart\\spread.exe;"
 "ASSOCTYPE=XX Company Spreadsheet,,;"
 "ASSOCFILTER=*.SPR,," , "<MY_FOLDER>"
 , CO_REPLACEIFEXISTS );
 if ( hobjProg == NULLHANDLE )
 {
 ULONG ul; // error code
 ul = WinGetLastError ( hab );
 printf ("Unable to create program object, error = %x\n", ERRORIDERROR ( ul ) );
 }
 WinTerminate ( hab );
 return 0;
}



April, 1993
PROGRAMMING PARADIGMS


Psychedelic Technology




Michael Swaine


I have no trouble believing that young Bill Clinton smoked marijuana and
didn't inhale. I'm the President's age, and I remember one party back in the
'60s when...ahem.
Confession may be good for the soul, but confession of innocence can be bad
for the image. Let's put it this way: I admit to about as much as the
President admits to.
Why do I burden you with this dubious confession? As a disclaimer. This
month's column takes as its central metaphor a theme on which I, let us say,
am not an expert. Drugs.
I'm going to talk about virtual reality, which, I contend, is a psychedelic
technology. What I mean by that is that I suspect that some of the interest in
this technology comes from the same motives that cause people to use
psychedelic drugs. It's not just my opinion; a lot of people who are intrigued
by virtual reality draw the parallel. On the other hand, Steve Aukstakalnis
and David Blatner, authors of Silicon Mirage (Peachpit Press, 1992), an
excellent book on virtual reality (VR), caution that, whatever people's
motives, their expectations will be disappointed if they think VR is a drug.
You can always pull the plug, they say.
But I'm getting ahead of myself. First we need a definition, popular though
the term may be. Virtual reality means "a computer-generated, interactive,
three-dimensional environment in which a person is immersed." VPL founder
Jaron Lanier coined the term in 1989.
I'm going to look at how real the technology is, what has been accomplished to
date, what the practical applications are, and what technical problems remain
unsolved. And is there some reality to these alternate realities, or will
those expecting to be transported to some other place necessarily be
disappointed?
Virtual-reality systems are interactive, which means that they need to respond
to what the user does. So does a spreadsheet, but VR is usually charged to
respond to things like head orientation, hand movement in three dimensions,
and sometimes body movement or orientation. When you move your head in a VR
system, the view is supposed to change, and change realistically. Not only
that, the system may be required to provide feedback in other modalities, like
sound and tactile sensations. It's not necessary that the virtual reality
delivered by such a system feel exactly like "real" reality--virtual-reality
systems typically don't worry too much about keeping users from walking
through walls, for example--but it must be responsive and internally
consistent, or it just won't work. When you consider what it takes to make
this kind of feedback consistent and to make it responsive to subtle movements
of the user, you can see that VR isn't easy.
Before looking at some of the tools and the successes of VR, let's look a
little more deeply at the challenges.


Virtual Vision


You can always tell when someone is using a virtual-reality system. The
goggles are a dead giveaway.
Virtual-reality systems may or may not present their users with auditory or
tactile feedback, but they invariably give them a picture. That picture is by
definition an explorable 3-D space. The user gets the visual impression of,
for example, moving down virtual hallways, around rooms, and sometimes,
unsettlingly, right through furniture and other seemingly solid objects. As
you move your head or your hand or your entire body, you walk or fly through
virtual spaces.
Implementing a system like this requires more than just presenting a
succession of 3-D images. Goggles, or some such headgear, are a requirement
because the sense of immersion that virtual reality tries to give demands that
the image wrap around the user. The monitor that you probably spend too much
of your waking hours staring at cuts maybe a 30-degree wedge out of the center
of your field of view. A good virtual-reality system will wrap its picture
around you to the full extent of your visual field, perhaps 240 degrees
horizontally and 120 degrees vertically.
This wraparound view is crucial to giving the sense of immersion that virtual
reality demands. Two other things should be obvious about it, too. It demands
some sort of wraparound hardware; hence the goggles, helmets, and other
headgear. And it makes big demands on processing power. Of course, the
virtual-reality designer wants those 3-D images to move at a reasonably fast
frame rate and with decent resolution (and color would be nice, and realistic
color would be nicer). Now enlarge the field of view by a factor of 32 (240 x
120 degrees of visual field against a monitor window of roughly 30 x 30
degrees) without lowering the resolution. Just in terms of bits in the image,
the large field of view required by virtual reality can be as demanding as the
jump from black-and-white to realistic color.
The human visual system, interestingly enough, has a technique for dealing
with this problem. Your visual system keeps in focus only what's directly in
your line of sight; everything peripheral is out of focus, fuzzy. Something
similar has been proposed for virtual-reality systems: The idea is to display
most of the picture in low resolution, using higher resolution for what's
directly in the center of the user's field of view.
Another reason for the goggles is to provide 3-D. We speak of 3-D modeling
programs, but what these programs really do is represent 3-D objects
internally and let us have whatever 2-D views of the objects we like. To
really see three dimensions, our two eyes need to get different information,
and your monitor screen can't provide different pictures for your two eyes. To
deliver the two separate images that our visual systems integrate to produce a
sense of depth, it's necessary somehow to channel the images to the individual
eyes using techniques like those described in Duvanenko and Robbins's article,
"Algorithms for Stereoscopic Imaging" on page 18 of this issue. Hence, again,
the goggles. 3-D movies and comic books use differently colored lenses, but
this doesn't work if you want to use colored images.


Virtual Hearing


Hearing might seem simpler to virtualize than vision, until you start to think
about the complexities of concert-hall acoustics. Objects cast acoustic
shadows, but they don't block the sound in the simple way in which objects
block light. (Virtual-reality systems are not generally charged with having to
model the relativistic bending of light around masses, or the wave nature of
light.) In fact, obstructions can alter the character of a sound, adding
overtones and echoes. As anyone who remembers quadraphonic sound systems
knows, the experts are just now starting to understand what's involved in
reproducing sound with a sense of realism.
Virtual-reality systems don't all provide sound feedback, but stereo sound is
the basic minimum for those that do. One of the familiar sound phenomena that
just doesn't happen without stereo sound is the cocktail-party effect.
Standing in the middle of a living room full of people engaged in a dozen
different conversations, you can pick a conversation and listen to it,
ignoring the others. (An automatic version of this attention phenomenon is the
name effect: If anyone anywhere in the room uses your name, you'll probably
hear it, whether you were paying attention to their conversation or not.) The
cocktail-party effect doesn't happen without stereo hearing; plug one ear and
you can't pick a conversation out of the general hubbub.
So you need stereo. But stereo isn't enough; virtual-reality systems that
support sound really need highly responsive 3-D sound. The reason is that, if
the sound isn't real enough, it will provide cues that are just plain wrong,
and conflict with what the visual feedback is saying. We use sound a lot,
mostly unconsciously, for orienting ourselves. All of the world around us that
isn't in our field of view is ear space. We have some faint consciousness of
what's behind us, and where, and it's our ears that give us the data. A
virtual-reality system with no sound can work, but one with inadequate sound
will at some point break the illusion, providing feedback that seems just
plain wrong.
Not breaking the illusion is more than just a matter of providing a seamless
interface. In some VR applications, bad sound feedback or a mismatch between
sound and visual feedback can have an unsettling effect: nausea. Bad data can
make you sick.


Virtual Touch


Touch is really two sensory systems: mechanoreceptors, the nerves in the skin
that respond to contact; and proprioception, which is feedback from our
muscles. Although the two systems are sometimes hard to distinguish, it's
worth doing so, because they are not equally virtualizable.
Mechanoreception is really hard to virtualize. Virtual-reality systems don't
try, generally, and it probably isn't important that they do, although it
might be nice to be able to slide your hand across the virtual hardwood floor
in the virtual house to see how smooth it is.
Proprioception is different. Robots have been responding to proprioceptive
feedback for decades. Proprioceptive feedback is what your hand, a robot
vacuum cleaner, or a virtual hand senses when it encounters resistance. In the case of the
hand or the robot, it's real resistance. If you're using a VR system to pull
molecules apart with your fingers, it's all artificial, but what you feel
should still match what you see. That feel must be constructed and fed to you
somehow.
There seem to be some basic limitations to the possible realism of
proprioceptive feedback. Even if a VR system could somehow make the visitor to
a virtual office feel the solidity of the office walls, how could it stop the
user from walking through the wall anyway? You can make a VR system look real,
sound real, and even to some extent feel real, but you can't make it be real;
you can't make it be composed of solid objects. That's where the illusion
breaks down in every VR system I know of today. You can test if it's real by
poking it.


Reality


So what's the reality of virtual reality?
The tools of VR have improved a lot since Ivan Sutherland made the first
head-mounted display back in 1968. It didn't block your "real" vision, but
superimposed virtual wire-frame models on reality, and was familiarly known as
the "sword of Damocles" because it hung from the ceiling on a heavy arm that
provided the mechanics for tracking head position.
Today, the latest versions of the VPL EyePhone spread a 422x238 or 720x480
pixel display across a 108-degree field of view, weigh only a couple of
pounds, and use magnetic position sensors and Fresnel lenses to spread the
picture across the field of view. The ARVIS from Concept Vision Systems
(Conway, Washington) uses contact lenses and curved screens and fills a
240x120 field of view. The BOOM from Fake Space Labs (Menlo Park, California)
frees the user from the helmet; you just rest your head on it when you want to
dive into its virtual realities. It uses CRTs to give higher resolution than
the other systems mentioned. The Sutherland approach, overlaying the virtual
reality transparently onto reality, can be seen in the Terminator movies when
Arnold looks at people and gets data about them superimposed on their faces.
That display technology exists today, although the database it implies does
not.
Aukstakalnis and Blatner say that 3-D interactive sound is farther along than
3-D interactive visuals, but in no area are all the problems solved, even in
expensive NASA systems.

The input devices are more impressive to look at. VPL's DataGlove uses
fiber-optic cables running along the fingers and deduces hand movements from
the quantity of light traversing the cables. Light escapes through cuts in the
cable as fingers are bent. It is one standard for capturing hand movements;
the DataSuit generalizes the concept. The Dextrous Hand Master from Exos
(Burlington, Massachusetts) is a higher-precision device that wraps an
exoskeleton of mechanical sensors around the hand; ugly, but precise. A less
imposing but important tool is the ubiquitous Polhemus tracker, developed by
Polhemus Navigation Sciences (Colchester, Vermont), which tells the system
where your head is and how it's oriented. It uses two sets of three magnetic
coils, which define the 6 degrees of freedom of 3-space motion: positions on
the x, y, and z axes as well as roll, pitch, and yaw. Alternatives to magnetic
orientation include optical systems (you wear cameras on your head) and image
extraction. Image extraction is the "right" answer; remote cameras observe you
and software figures out what you're doing. It's also the hardest.
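The six degrees of freedom such a tracker reports can be modeled as a simple pose structure. This is a generic sketch of the idea, not the Polhemus driver's actual data format or API:

```cpp
#include <cassert>

// A 6-DOF pose as a head tracker might report it: position on the
// x, y, and z axes plus roll, pitch, and yaw. (A generic sketch,
// not the Polhemus device's actual data format.)
struct Pose6DOF {
    double x, y, z;          // position in tracker coordinates
    double roll, pitch, yaw; // orientation angles in degrees
};

// Compose a per-frame delta onto a pose, as a tracking loop would.
Pose6DOF apply(Pose6DOF p, Pose6DOF delta)
{
    p.x    += delta.x;    p.y     += delta.y;     p.z   += delta.z;
    p.roll += delta.roll; p.pitch += delta.pitch; p.yaw += delta.yaw;
    return p;
}
```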
Virtuality has some success stories to tell. There's Sitterson Hall at the
University of North Carolina, which now houses the computer-science
department. It was designed using architectural walk-throughs, VR models that
let the user virtually walk through the building before ground has been
broken. Researchers built a treadmill-and-helmet system that lets the user
walk naturally to move forward, turn handlebars to turn, and tilt his or her
head to orient the view. The walk-through actually allowed the users,
including the computer scientists who would be occupying the building, to
discover design flaws. The architect balked at one change the computer
scientists thought was needed, until he took the virtual tour himself.
NASA researchers have developed systems that can simulate in real time
acoustic factors such as reflecting surfaces and various degrees of
surface-sound absorption. This sound can then be delivered through headphones
as a 3-D sound field, and can be correlated with visual feedback.
VR systems are also available commercially (or will be soon) for things like
observing how radiation will pass through a patient's organs, setting up
virtual physics labs in schools, and letting astronomers get a sense of the
3-D structure of the universe by flying through a virtual universe. Arcade
games that use the technology of VR to wrap the player in a virtual world are
already in use, and more are likely to come out this year.
Possibly the most useful application of VR is also the most abstract, the
least real. Often analysts struggle with summary statistics and exploratory
data-analysis techniques, trying to get a feeling for how a complex,
multidimensional database is structured. VR lets them fly through the data,
switching dimensions with a wave of the hand. This, it seems to me, is very
useful.
So does virtual reality work as a psychedelic drug? Does it take the user to
another place, inaccessible to the unaltered consciousness? Does it bring one
face-to-face with the howling Tao? Do you care?
The answer is probably no to the howling Tao bit. But does VR take the user
elsewhere? That question is probably useless in that form, but here are some
alternatives that may be more useful: Does virtual reality have the potential
to give us a sense of being somewhere else? Yes. Will there be things we can
do in the somewhere elses of virtual reality that we can't do elsewhere? Yes.
Is it possible that some of those things will become so important to us that
we can't imagine getting along without them? Yes. Is that what the seeker
after transcendent experience is looking for? No, probably not. Does that
matter? No, probably not. The real question about this stuff is perhaps, How
important will it be to me? Time will tell, but some of the applications I've
described may become, or may already be, very important to you.
Another issue is, does virtual reality have the potential to be addictive? If
that means psychologically addictive, rather than physiologically, then I
think the answer is, of course. Why should virtual reality, skillfully done,
be any less addictive than television, the premier electronic drug?


April, 1993
C PROGRAMMING


Of Keyboards and Menus


 This article contains the following executables: DFPP01.ARC


Al Stevens


I've been messing around with MIDI, something I used to avoid. MIDI is the
Musical Instrument Digital Interface, a standard protocol that electronic
instruments use to communicate among themselves and with sequencing computers.
A few months ago I got a CD-ROM drive because SDKs are coming out now on
CD-ROM disks. The hardware interface to the drive is also a PC sound card that
includes a MIDI adapter. It's called a Sound Blaster Pro.
I am an old jazz pianist with a mild aversion to Yuppie jazz -- sterile,
synthesized elevator stuff with saxophone players who sound alike and repeat
themselves more often than an AK-47. I don't like being called a "keyboard
player." I play the piano -- the real thing, the wooden kind with felt and
strings. A couple of years ago in a weak moment, I bought a Yamaha Clavinova
electric piano. It almost feels and sounds like a real one, and I can clamp on
the headset and practice scales into the wee hours without jeopardizing my
marriage or violating any neighborhood covenants. It's not the real thing, but
it served its purpose.
The Yamaha has two MIDI sockets, in and out. I never paid much attention to
them, but since my new Sound Blaster Pro has a cable that fits those sockets,
I decided to see what it would do. The result is sometimes fun but mostly
frustrating. It's almost the way I want it, but not quite right. If I could
write a C program that would read and write that MIDI port, I could make it do
my bidding.
With that in mind I bought a book that has C and Pascal code for programming
the Sound Blaster. The book is called The Sound Blaster Book, by Axel Stolz
(Abacus, 1992). It's a really good book with clear explanations of many of the
technical aspects of music and sound reproduction. It explains everything
about programming the Sound Blaster Pro except how to read and write the MIDI
ports, the very thing I want to know. I have a number of other books that
teach MIDI in a C environment, but everyone seems to assume that if you can
copy a MIDI file to the instrument, that's all you need, and since the
hardware comes with a couple of programs that will do that, there's no need to
tell you how to write your own.
Using the sequencer software that came with the Sound Blaster, I got my system
to do the following: I "record" a synthesized string-bass line for a given
song and store it in a MIDI file. I record it by playing the bass notes on the
lower part of the keyboard. When I play the song back, the thing uses the bass
synthesizer and somehow kicks in the drum machine built into the keyboard
device, from which I can select a tempo style. Music minus one, they used to
call it. Karaoke, today. I can tell the sequencer to smooth out any erratic
tempos or notes in what I recorded. I can record at a lower tempo to get it
right and then play it back at the normal tempo. I can edit the notes on the
screen to correct any clams. When I play the bass and drum parts back, I can
manually play the piano along with them and have the tightest, most precise
rhythm section a jazz pianist ever worked with, and it doesn't get stoned on
the job, hit on the waitresses, come in late, or ask for more pay or better
rooms.
That's great, and I'm having fun with it, but I want to control the program in
real time just like I can with a live, breathing, drinking rhythm section. I
want to decide on-the-fly to play another chorus, do a turnaround, put a tag
on the end, change keys, play a substitution, or return to the bridge--all the
things that improvising musicians can do with the structure of a song whenever
the spirit moves them. To do that, I need to write a program that fetches
different parts of the MIDI stream on command and writes the output port. I
also have some ideas about improvising bass lines from a stream of chord
symbols, perhaps even reading the keyboard input in real time and doing a
translation. I've seen sequencers do that, but they never seem to understand
all the chord inversions or contemporary voicings. To do those things the way
I want, I need to read and write those MIDI ports. The documentation doesn't
provide the port numbers, much less the protocols.
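The wire format of the messages themselves, at least, is public: the MIDI standard defines a Note On as a status byte of 0x90 OR'd with the channel number, followed by a note number and a velocity, each 7 bits. How those bytes reach the Sound Blaster's MIDI port is exactly the part the documentation leaves out, so this sketch only builds the message:

```cpp
#include <cassert>
#include <cstdint>

// A MIDI channel voice message is up to three bytes. For Note On,
// the status byte is 0x90 | channel (channels 0-15); the two data
// bytes are the note number and velocity, 0-127 each. Delivering
// these bytes to the card's MIDI port is the undocumented part.
struct MidiMsg { uint8_t status, data1, data2; };

MidiMsg noteOn(uint8_t channel, uint8_t note, uint8_t velocity)
{
    MidiMsg m;
    m.status = 0x90 | (channel & 0x0F); // Note On on the given channel
    m.data1  = note & 0x7F;             // data bytes keep the high bit clear
    m.data2  = velocity & 0x7F;
    return m;
}
```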
No doubt there is a subculture of MIDI mavens out there who already have these
problems under their belts. Maybe some of them will fill in the holes created
by my ignorance. I'll be researching this stuff further in the months to come
and will report on its progress. Back to work.


D-Flat++ Menus


This month fills out the D-Flat++ application environment with a menu system.
The CUA menu architecture uses a menu bar across the top of the screen with
labels to represent each of several pop-down menus. DF++ uses classes to
define the menu bar and the pop-down menus. The applications program does not
interact closely with these classes. The Application window takes care of
that. The applications program must, however, define the characteristics of
the menu system in a series of tables.
You will recall from code in the February "C Programming" column that the
Application class includes a pointer to a MenuBar object. You build an
application program by deriving a custom application window from the generic
one. When you declare the custom application object, you include as a
parameter a pointer to an array of MenuBarItem objects. The array defines the
menu-bar items, which define the pop-down menus.
A D-Flat++ menu system is a hierarchy of objects--not a class inheritance
hierarchy, but a hierarchical organization of related classes. At the top is
the Application window, which points to a MenuBar object. The MenuBar object
points to a list of MenuBarItem objects. Each MenuBarItem object builds a
PopDown object. Each PopDown object contains a list of MenuSelection objects.
Each MenuSelection object represents an action, which is usually a function
call, a menu toggle, or a cascaded pop-down menu.


A Sample Application Program


Listing One, page 130, is test.cpp, a very small D-Flat++ application that
shows how an application program defines a menu and associates it with the
Application window. The program builds an Application window with a single
pop-down File menu. The first two commands, New and Open, do nothing. The Exit
command terminates the program.
This 45-line program is self-contained. It fully describes its menus and
application environment. The myAppl class defines the application, which has a
menu, a title, and three functions executed by menu commands.
Observe the way the program defines menus. It declares three MenuSelection
objects. These are the selections that users are given. They will be
distributed among the pop-down menus. Their proximity to one another here is
coincidental. You can change their distribution on the menus without touching
these declarations.
Next comes an array of pointers to the MenuSelection objects. Each such array
defines a pop-down menu, in this case the only one in the program. The address
of the SelectionSeparator object specifies that DF++ should display a
horizontal separator line between selections. The array is NULL terminated.
The array of MenuBarItem objects defines the menu bar. Each entry in the array
specifies a menu-bar label and a pointer to the MenuSelection array that
defines the associated pop-down menu. The array is terminated by an entry
constructed with a NULL pointer argument.
You can see that the menu definition uses two different techniques for
defining the menu bar and the pop-down menus. The array of MenuBarItems is an
array of objects. The arrays of MenuSelection items are arrays of pointers to
externally declared MenuSelection objects. By declaring those objects outside
of the actual menu definitions, you give each command its own identity and can
send messages to it, such as to tell it to disable itself, report its current
toggle state, and so on. If the objects were arrayed themselves in the menus,
their identities would be through subscripts into the specific menus and would
not be as easy to move around when you modified the design.
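The payoff of the pointer technique can be seen in miniature. This simplified sketch (stand-in types, not the real D-Flat++ classes) shows how an externally declared command object keeps its own identity and can be messaged directly, no matter where it sits in a menu:

```cpp
#include <cassert>

// Simplified stand-in for MenuSelection (not the real D-Flat++ class).
struct Selection {
    bool enabled = true;
    void Disable() { enabled = false; }
};

// Externally declared objects: each command has its own identity...
Selection NewCmd, OpenCmd, ExitCmd;

// ...and the menu is just a NULL-terminated array of pointers to them.
Selection *FileMenu[] = { &NewCmd, &OpenCmd, &ExitCmd, nullptr };
```

Because OpenCmd is a named object, `OpenCmd.Disable();` works without knowing which menu, or which slot in a menu, the command currently occupies.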


The MenuBar Class


The constructor for the Application window class uses the pointer to the
MenuBarItem array to construct an instance of the MenuBar class defined in
menubar.h (Listing Two, page 130). Listing Two also defines the MenuBarItem
class. The array of MenuBarItem objects defines the contents of the MenuBar
object. Listing Three, page 130, is menubar.cpp, which contains the member
functions for the MenuBar and MenuBarItem classes.
When the Application window displays, so does the menu bar. When the
Application window gets resized, the MenuBar window receives the Parent-Sized
message and adjusts its own size to fit the new size of the Application
window.


MenuBar Keystrokes


When any document window in an application does not intercept and process a
keystroke, the base DFWindow class passes the keystroke to the parent of the
window that rejected the key. The top ancestor of all windows is the reigning
Application window, so it ultimately gets any unintercepted keystrokes. When
the Application window receives a keystroke that it does not want to process,
it passes the keystroke to the MenuBar object. The MenuBar object, therefore,
gets all the keystrokes that no other window wants.
The MenuBar class's Keyboard function looks first to see if the keystroke is
any of the accelerator keys assigned to menu commands within the PopDown
objects. An accelerator key is a keystroke that the user may employ to execute
a menu command even though the menu is not popped down. You'll see later how
PopDown objects are built. Next, the MenuBar object tests to see if the
keystroke is one of the menu bar's shortcut keys. A shortcut key is a
highlighted letter or number in a menu selection's label. The user may press
Alt plus the highlighted character to select the shortcut key on a menu bar.
(Alt is not needed to select the shortcut key on a pop-down menu.) Selecting
the menu's shortcut key in this fashion selects and pops down the associated
menu.
The MenuBar class's LeftButton message can select a pop-down menu, too. If the
mouse coordinates are within the menu label's display, the MenuBar object will
pop down the menu.


The MenuBarItem Class



Each MenuBarItem object represents one pop-down menu and includes the address
of an array of MenuSelection objects that define the characteristics of each
selection on the menu. Most selections will consist of a label and the address
of a function to be executed when the user chooses the selection. Those
functions are member functions of the custom application class that you derive
from the D-Flat++ Application class.


The MenuSelection Class


Listing Four (page 132) is menusel.h, the header file that defines the
MenuSelection class, and Listing Five (page 132) is menusel.cpp. Menu
definitions for an application consist of arrays of pointers to MenuSelection
objects, as shown in Listing One. You can construct a MenuSelection object
with a label and a function, or with various combinations of label, function,
accelerator key, and active/inactive state conditions. MenuSelection objects
can also be separator tokens to display a line between the normal selections
on the menu. One constructor allows you to specify a toggle value of On or
Off, which identifies the selection as one that can be toggled. A toggled menu
selection maintains an On/Off state depending on the most recent selection by
the user. It displays a check mark next to its label when it is in the On
state. The application program can query the current condition of a toggled
menu selection's On/Off state even when the pop-down menu is not active.
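The toggle mechanics amount to a two-state flag with invert and query operations. Here is a minimal standalone sketch that mirrors, rather than reproduces, the MenuSelection interface shown in Listing Four:

```cpp
#include <cassert>

enum Toggle { Off, On };

// A stripped-down toggle selection. The real MenuSelection in
// Listing Four also carries a label, command function, accelerator
// key, and enabled state.
class ToggleSelection {
    Toggle toggle = Off;
public:
    void InvertToggle() { toggle = (toggle == On) ? Off : On; }
    bool isToggled()    { return toggle == On; }
};
```

A word-wrap-style command would start Off; each user selection inverts the state, and the application can query it at any time, whether or not the menu is popped down.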
One constructor for the MenuSelection class allows you to specify the address
of another array of MenuSelection objects. This format identifies a cascading
menu selection. When the user selects the item, DF++ pops down a cascading
menu as represented by the second array. Cascading menus may themselves
contain cascading menus.


Building the Menu Bar


The MenuBar constructor iterates through the list of MenuBarItem objects and
builds the menu bar by building a string of menu labels. This string, which is
the menu-bar display, has a text label for each pop-down menu. The constructor
also instantiates each of the pop-down menus by building PopDown objects,
passing the address of the array of MenuSelection objects associated with each
MenuBarItem object. The pop-down windows do not display at this time--they are
simply constructed. The MenuBar object will display each of the pop-down
windows only when the user selects them.


The PopDown Class


Listing Six (page 132) and Listing Seven (page 132) are popdown.h and
popdown.cpp, respectively. They define the PopDown class, which is derived
from the ListBox class I described last month. A PopDown object has some
unique behavior, however. Each pop-down window adjusts its size and position
to match its contents and the position of its parent. A noncascading pop-down
menu is positioned just below the menu-bar label that selects it. If that
position puts the menu offscreen, the menu will self-adjust to stay onscreen.
A cascading menu positions itself adjacent to the menu selection that selected
the cascade, also adjusting itself to stay onscreen.
A pop-down menu's borders are constant. They do not change to reflect an
in-focus state like other list boxes do. The Paint message recognizes toggled
selections and displays or clears the check mark depending on the state of the
toggle.
The Keyboard message exhibits unique behavior, too. The text of a pop-down
menu never exceeds the height of the window, so scrolling never occurs. The
selection cursor wraps from bottom to top and vice versa. The forward and back
arrow keys close the current menu and cause the next right or left pop-down
menu to be opened.
Selection of a menu item occurs when the user selects the item and presses the
Enter key, a matching accelerator key, or a matching shortcut key while the
menu is open. Selection also occurs if the user presses and releases the left
mouse button on a menu selection. The button release makes the selection.
A menu selection--implemented by the Choose method--does one of the following
things: if the selection is not enabled, Choose sounds the audible alarm; if
the selection is a toggle, it inverts the selection's On/Off state; if the
selection is a cascading selection, it opens the cascaded pop-down menu; and
if a function is assigned to the selection, it calls the Application window's
member function.
If a function is assigned, the program closes the pop-down menu and any higher
cascading menus. MenuSelection application functions execute with the menus
all closed down. A toggle selection may include a command function as well.
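That dispatch reduces to a four-way branch. This is a condensed sketch of the behavior just described, not the actual popdown.cpp code; it returns a tag naming the action purely for illustration:

```cpp
#include <cassert>
#include <cstring>

enum MenuType { NORMAL, TOGGLE, CASCADER, SEPARATOR };

// Condensed sketch of the Choose dispatch (not the real popdown.cpp).
const char *Choose(bool enabled, MenuType type, bool hasFunction)
{
    if (!enabled)
        return "alarm";    // disabled selection: sound the audible alarm
    if (type == TOGGLE)
        return "toggle";   // invert the On/Off state (a command function
                           // may run as well for toggle selections)
    if (type == CASCADER)
        return "cascade";  // open the cascaded pop-down menu
    if (hasFunction)
        return "execute";  // close the menus, call the member function
    return "none";
}
```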


The Control Menu


The last part of the menu system is the control menu, implemented in Listing
Eight, page 138, ctlmenu.cpp. The CUA control menu, sometimes called the
"system menu," contains one or more of a fixed set of generic window commands.
These commands are: Restore, Move, Size, Minimize, Maximize, and Close. The
user can use this menu to perform these operations. The user opens the menu
by clicking the control box in the upper-left corner of the window or by
pressing Alt+Spacebar for an Application window or dialog box and Alt+Hyphen
for a document window. The DFWindow class's Keyboard method intercepts these
keystrokes and calls the OpenCtlMenu function, which customizes the control
menu according to the attributes of the window. For example, if the window may
be minimized or maximized, the control menu will have a Restore command. If
the window is currently in either of those conditions, the Restore command
will be enabled. The other commands are similarly customized.


How to Get the Source Code


D-Flat++ is moving along. The current version is not a full CUA package but it
is growing, and there is enough of an implementation to give you an idea of
how it will work and how it will differ from D-Flat. You can download DF++
from the CompuServe DDJ forum or from M&T Online. You can also get it by
sending a stamped, self-addressed diskette mailer and a formatted diskette to
me at Dr. Dobb's Journal, 411 Borel Avenue, San Mateo, CA 94402. I'll include
the latest version of D-Flat as well. The software is free, but if you wish,
include a dollar for my Careware charity, the Brevard County Food Bank.

_C PROGRAMMING COLUMN_
by Al Stevens


[LISTING ONE]

// ------------- test.cpp
#include "dflatpp.h"

extern MenuBarItem aMenu[];

// ------- application definition
class myAppl : public Application {
public:
 myAppl() : Application("Hello",aMenu) {}
 // ----- menu command functions
 void DoNew() {}
 void DoOpen() {}
 void DoExit() { CloseWindow(); }
};
// --------- MenuSelection objects

MenuSelection
 NewCmd ("~New", (void (DFWindow::*)()) &myAppl::DoNew ),
 OpenCmd ("~Open", (void (DFWindow::*)()) &myAppl::DoOpen ),
 ExitCmd ("E~xit Alt+F4", (void (DFWindow::*)()) &myAppl::DoExit, ALT_F4);

// --------- File menu definition
MenuSelection *File[] = {
 &NewCmd,
 &OpenCmd,
 &SelectionSeparator,
 &ExitCmd,
 NULL
};
// --------- menu bar definition
MenuBarItem aMenu[] = {
 MenuBarItem( "~File", File ),
 MenuBarItem( NULL )
};
void main()
{
 myAppl *aWnd = new myAppl;
 while (desktop.DispatchEvents())
 ;
 delete aWnd;
}







[LISTING TWO]

// -------- menubar.h
#ifndef MENUBAR_H
#define MENUBAR_H

#include "textbox.h"
#include "popdown.h"

class MenuBarItem {
public:
 String *title; // menu bar selection label
 int x1; // 1st label position on bar
 int x2; // last label position on bar
 MenuSelection **ms; // popdown selection list
 PopDown *popdown; // popdown window
 void (*menuprep)(); // menu prep function
 MenuBarItem(char *Title, MenuSelection **Ms = NULL,
 void (*MenuPrep)() = NULL);
 ~MenuBarItem() { if (title) delete title; }
};
class MenuBar : public TextBox {
 MenuBarItem *menuitems; // list of popdowns
 int menucount; // count of popdowns
 int selection; // current selection on the bar
 Bool ispoppeddown; // True = a menu is down
 DFWindow *oldfocus; // previous focus

 void SetColors();
 void Select();
 Bool AcceleratorKey(int key);
 Bool ShortCutKey(int key);
public:
 MenuBar(MenuBarItem *MenuItems, DFWindow *par);
 ~MenuBar();
 // -------- menubar API messages
 void Keyboard(int key);
 void LeftButton(int mx, int my);
 Bool SetFocus();
 void ResetFocus();
 void Paint();
 void Select(int sel);
 void SetSelection(int sel);
 void ParentSized(int xdif, int ydif);
};
#endif







[LISTING THREE]

// ------------- menubar.cpp
#include <ctype.h>
#include "desktop.h"
#include "menubar.h"
#include "menusel.h"

// -------- construct a menubar item
MenuBarItem::MenuBarItem(char *Title, MenuSelection **Ms,void (*MenuPrep)())
{
 if (Title != NULL)
 title = new String(Title);
 else
 title = NULL;
 ms = Ms;
 menuprep = MenuPrep;
 popdown = NULL;
}
// -------- construct a menubar
MenuBar::MenuBar( MenuBarItem *MenuItems, DFWindow *par) :
 TextBox( par->ClientLeft(), par->ClientTop()-1,
 1, par->ClientWidth(), par)
{
 windowtype = MenubarWindow;
 menuitems = MenuItems;
 SetAttribute(NOCLIP);
 SetColors();
 selection = -1;
 ispoppeddown = False;
 MenuBarItem *menu = menuitems;
 menucount = 0;
 oldfocus = NULL;
 int off = 2;

 SetTextLength(desktop.screen().Width()*2);
 while (menu->title != NULL) {
 int len = menu->title->Strlen()-1;
 menu->x1 = off;
 menu->x2 = off+len-1;
 off += len+2;
 String ttl(" ");
 ttl += *(menu->title);
 AddText(ttl);
 int n = text->Strlen()-1;
 (*text)[n] = '\0';
 menu->popdown = new PopDown(this, menu->ms);
 menu++;
 menucount++;
 }
}
// -------- menubar destructor
MenuBar::~MenuBar()
{
 MenuBarItem *menu = menuitems;
 while (menu->title != NULL) {
 delete menu->popdown;
 menu++;
 }
 TextBox::CloseWindow();
}
// -------- set the fg/bg colors for the window
void MenuBar::SetColors()
{
 colors.fg = BLACK;
 colors.bg = LIGHTGRAY;
 colors.sfg = BLACK;
 colors.sbg = CYAN;
 colors.ffg = BLACK;
 colors.fbg = LIGHTGRAY;
 colors.hfg = BLACK;
 colors.hbg = LIGHTGRAY;
 shortcutfg = RED;
}
// ---- menubar gets the focus
Bool MenuBar::SetFocus()
{
 if (oldfocus == NULL)
 if (desktop.InFocus() != NULL)
 if (desktop.InFocus()->State() != ISCLOSING)
 oldfocus = desktop.InFocus();
 return TextBox::SetFocus();
}
// ---- menubar loses the focus
void MenuBar::ResetFocus()
{
 if (!ispoppeddown) {
 SetSelection(-1);
 oldfocus = NULL;
 }
 TextBox::ResetFocus();
}
// -------- paint the menubar
void MenuBar::Paint()
{
 WriteShortcutLine(0, colors.fg, colors.bg);
 if (selection != -1) {
 int x = menuitems[selection].x1;
 int len = menuitems[selection].x2-x+2;
 String sel = text->mid(len, x+selection);
 DisplayShortcutField(sel,x,0,colors.sfg,colors.sbg);
 }
}
// ------- left mouse button is pressed
void MenuBar::LeftButton(int mx, int)
{
 mx -= Left();
 MenuBarItem *menu = menuitems;
 int sel = 0;
 while (menu->title != NULL) {
 if (mx >= menu->x1 && mx <= menu->x2) {
 if (selection != sel || !ispoppeddown) {
 if (ispoppeddown) {
 PopDown *pd = menuitems[selection].popdown;
 if (pd->isOpen())
 pd->CloseMenu(False);
 }
 Select(sel);
 }
 return;
 }
 sel++;
 menu++;
 }
 if (selection == -1)
 SetSelection(0);
}
void MenuBar::SetSelection(int sel)
{
 selection = sel;
 Paint();
}
// ----- programmed selection
void MenuBar::Select(int sel) {
 selection = sel;
 Select();
}
// ------- user selection
void MenuBar::Select()
{
 Paint();
 ispoppeddown = True;
 MenuBarItem &mb = *(menuitems+selection);
 int lf = Left() + mb.x1;
 int tp = Top()+1;
 if (mb.menuprep != NULL)
 (*mb.menuprep)();
 mb.popdown->OpenMenu(lf, tp);
}
// ------ test for popdown accelerator key
Bool MenuBar::AcceleratorKey(int key)
{
 MenuBarItem *menu = menuitems;

 while (menu->title != NULL) {
 PopDown *pd = menu->popdown;
 if (pd->AcceleratorKey(key))
 return True;
 menu++;
 }
 return False;
}
// ------ test for menubar shortcut key
Bool MenuBar::ShortCutKey(int key)
{
 int altkey = desktop.keyboard().AltConvert(key);
 MenuBarItem *menu = menuitems;
 int sel = 0;
 while (menu->title != NULL) {
 int off = menu->title->FindChar(SHORTCUTCHAR);
 if (off != -1) {
 String &cp = *(menu->title);
 int c = cp[off+1];
 if (tolower(c) == altkey) {
 SetFocus();
 Select(sel);
 return True;
 }
 }
 sel++;
 menu++;
 }
 return False;
}
// -------- keystroke while menubar has the focus
void MenuBar::Keyboard(int key)
{
 if (AcceleratorKey(key))
 return;
 if (!ispoppeddown && ShortCutKey(key))
 return;
 switch (key) {
 case F10:
 if (ispoppeddown)
 break;
 if (this != desktop.InFocus()) {
 if (selection == -1)
 selection = 0;
 SetFocus();
 break;
 }
 // ------ fall through
 case ESC:
 ispoppeddown = False;
 SetSelection(-1);
 if (oldfocus != NULL)
 oldfocus->SetFocus();
 else
 parent->SetFocus();
 break;
 case FWD:
 selection++;
 if (selection == menucount)
 selection = 0;
 if (ispoppeddown)
 Select();
 else
 Paint();
 break;
 case BS:
 if (selection == 0)
 selection = menucount;
 --selection;
 if (ispoppeddown)
 Select();
 else
 Paint();
 break;
 case '\r':
 if (selection != -1)
 Select();
 break;
 case ALT_F6:
 TextBox::Keyboard(key);
 break;
 default:
 break;
 }
}
// ---- resize the menubar when the application window resizes
void MenuBar::ParentSized(int xdif, int)
{
 Size(Right()+xdif, Bottom());
}







[LISTING FOUR]

// ------- menusel.h
#ifndef MENUSEL_H
#define MENUSEL_H

#include <stdio.h>
#include "strings.h"
#include "dfwindow.h"

enum MenuType {
 NORMAL,
 TOGGLE,
 CASCADER,
 SEPARATOR
};
enum Toggle { Off, On };

class DFWindow;
class PopDown;


#define NULLFUNC (void (DFWindow::*)())NULL
class MenuSelection {
 String *label; // selection label
 void (DFWindow::*cmdfunction)(); // selection function
 MenuType type; // NORMAL, TOGGLE,
 // CASCADER, SEPARATOR
 Bool isenabled; // True = enabled, False = disabled
 int accelerator; // accelerator key
 PopDown *cascade; // cascaded menu window
 MenuSelection **cascaders; // cascaded menu selection list
 Toggle toggle; // On or Off
 void NullSelection(); // Build a null selection
 friend PopDown;
 void CommonConstructor( char *Label,
 int Accelerator,
 void (DFWindow::*CmdFunction)(),
 Bool Active,
 MenuType Type,
 Toggle Tgl,
 MenuSelection **Cascaders = NULL);
public:
 MenuSelection( char *Label,
 void (DFWindow::*CmdFunction)() = 0,
 int Accelerator=0,
 Bool Active=True );
 MenuSelection( char *Label,
 void (DFWindow::*CmdFunction)(),
 Toggle Tgl,
 int Accelerator=0,
 Bool Active=True );
 MenuSelection( char *Label,
 MenuSelection **Cascaders,
 int Accelerator=0,
 Bool Active=True );
 MenuSelection(MenuType Type);

 Bool isEnabled() { return isenabled; }
 void Enable() { isenabled = True; }
 void Disable() { isenabled = False; }
 void SetToggle() { toggle = On; }
 void ClearToggle() { toggle = Off; }
 void InvertToggle() { toggle = toggle == On ? Off : On; }
 Bool isToggled() { return (Bool) (toggle == On); }
 MenuType Type() { return type; }
};
extern MenuSelection SelectionSeparator;
extern MenuSelection SelectionTerminator;

#endif








[LISTING FIVE]


// ------- menusel.cpp
#include <string.h>
#include "menusel.h"

MenuSelection SelectionSeparator(SEPARATOR);
void MenuSelection::NullSelection()
{
 label = NULL;
 cmdfunction = NULL;
 type = NORMAL;
 cascade = NULL;
 isenabled = True;
 accelerator = 0;
 cascaders = NULL;
 toggle = Off;
}
void MenuSelection::CommonConstructor(
 char *Label,
 int Accelerator,
 void (DFWindow::*CmdFunction)(),
 Bool Active,
 MenuType Type,
 Toggle Tgl,
 MenuSelection **Cascaders)
{
 NullSelection();
 if (Label != NULL)
 label = new String(Label);
 accelerator = Accelerator;
 cmdfunction = CmdFunction;
 isenabled = Active;
 type = Type;
 toggle = Tgl;
 cascaders = Cascaders;
}
MenuSelection::MenuSelection( char *Label,
 void (DFWindow::*CmdFunction)(),
 int Accelerator, Bool Active )
{
 CommonConstructor(Label, Accelerator, CmdFunction, Active, NORMAL, Off);
}
MenuSelection::MenuSelection( char *Label,
 void (DFWindow::*CmdFunction)(),
 Toggle Tgl, int Accelerator, Bool Active)
{
 CommonConstructor(Label, Accelerator, CmdFunction, Active, TOGGLE, Tgl);
}

MenuSelection::MenuSelection(char *Label,
 MenuSelection **Cascaders,
 int Accelerator, Bool Active )
{
 CommonConstructor(Label, Accelerator, NULL,
 Active, CASCADER, Off, Cascaders);
}
MenuSelection::MenuSelection(MenuType Type)
{
 NullSelection();
 type = Type;
}







[LISTING SIX]

// -------- popdown.h
#ifndef POPDOWN_H
#define POPDOWN_H

#include "desktop.h"
#include "listbox.h"

const unsigned char LEDGE = '\xc3';
const unsigned char REDGE = '\xb4';
const unsigned char CASCADEPOINTER = '\x10';

inline unsigned char CheckMark()
{
 return desktop.screen().Height() == 25 ? 251 : 4;
}
class MenuSelection;
class MenuBar;

class PopDown : public ListBox {
 MenuSelection **selections; // array of selections
 Bool isopen; // True = menu is open
 Bool iscascaded; // True = menu is cascaded
 int menuwidth; // width of menu
 int menuheight; // height of menu

 void BuildMenuLine(int sel);
 void MenuDimensions();
 void SetColors();
 void DisplayMenuLine(int lno);
 Bool ShortCutKey(int key);
protected:
 void ClearSelection();
public:
 PopDown(DFWindow *par, MenuSelection **Selections = NULL)
 : ListBox(5, 5, par)
 { selections = Selections; OpenWindow(); }
 virtual ~PopDown()
 { if (windowstate != CLOSED) CloseWindow(); }
 // -------- listbox API messages
 void OpenWindow();
 void CloseWindow();
 void OpenMenu(int left, int top);
 void CloseMenu(Bool SendESC = False);
 void Show();
 void Paint();
 void Border();
 void Keyboard(int key);
 void ShiftChanged(int sk);
 void ButtonReleased(int mx, int my);

 void LeftButton(int mx, int my);
 void DoubleClick(int mx, int my);
 void Choose();
 void SetSelection(int sel);
 Bool isOpen() { return isopen; }
 Bool &isCascaded() { return iscascaded; }
 Bool AcceleratorKey(int key);
 Bool ParentisMenu(DFWindow &wnd);
 Bool ParentisMenu() { return ParentisMenu(*this); }
};
#endif







[LISTING SEVEN]

// ------------- popdown.cpp
#include <ctype.h>
#include "desktop.h"
#include "popdown.h"
#include "menusel.h"

// --------- create a popdown menu
void PopDown::OpenWindow()
{
 windowtype = PopdownWindow;
 if (windowstate == CLOSED)
 ListBox::OpenWindow();
 SetAttribute(BORDER | SHADOW | SAVESELF | NOCLIP);
 selection = 0;
 DblBorder = False;
 isopen = False;
 SetColors();
 iscascaded = False;
 if (selections != NULL) {
 MenuDimensions();
 SetTextLength(menuwidth * menuheight);
 for (int i = 0; i < menuheight; i++) {
 MenuSelection &ms = **(selections+i);
 BuildMenuLine(i);
 if (ms.type == CASCADER) {
 ms.cascade = new PopDown(this, ms.cascaders);
 ms.cascade->isCascaded() = True;
 }
 }
 rect.Right() = rect.Left() + menuwidth;
 rect.Bottom() = rect.Top() + menuheight + 1;
 }
}
// ---- shut down a popdown menu
void PopDown::CloseWindow()
{
 if (selections != NULL) {
 // --- delete all cascader popdowns
 for (int i = 0; selections[i]; i++) {

 MenuSelection &ms = *selections[i];
 if (ms.type == CASCADER && ms.cascade != NULL)
 delete ms.cascade;
 }
 }
 ListBox::CloseWindow();
}

// ------- pop down the menu
void PopDown::OpenMenu(int left, int top)
{
 Rect rc(0, 0, desktop.screen().Width()-1,
 desktop.screen().Height()-1);
 DFWindow *Wnd = parent;
 while (Wnd != NULL && Wnd->WindowType() == PopdownWindow)
 Wnd = Wnd->Parent();
 if (Wnd != NULL && (Wnd = Wnd->Parent()) != NULL) {
 Rect rc = Wnd->ClientRect();
 left = min(max(left, rc.Left()), rc.Right() - ClientWidth());
 top = min(max(top, rc.Top()), rc.Bottom() - ClientHeight());
 }
 left = min(max(left, rc.Left()), rc.Right()-ClientWidth()-1);
 top = min(max(top, rc.Top()), rc.Bottom()-ClientHeight()-1);
 isopen = True;
 Move(left, top);
 CaptureFocus();
 Paint(); // in case a command attribute changed
}
// ---------- deactivate the popdown menu
void PopDown::CloseMenu(Bool SendESC)
{
 if (isopen) {
 // ------- close any open cascaded menus
 PopDown *Wnd = (PopDown *)first;
 while (Wnd != NULL) {
 Wnd->CloseMenu();
 Wnd = (PopDown *) (Wnd->next);
 }
 Hide();
 isopen = False;
 ReleaseFocus();
 if (parent && !iscascaded && SendESC)
 parent->Keyboard(ESC);
 }
}
void PopDown::Show()
{
 if (isopen)
 ListBox::Show();
}
// -------- build a menu line
void PopDown::BuildMenuLine(int sel)
{
 int wd = menuwidth;
 String ln;
 if (selections[sel]->type == SEPARATOR)
 ln = String(--wd, LINE);
 else {
 ln = String(" ");

 ln += *(selections[sel]->label);
 int r = wd-ln.Strlen();
 ln += String(r, ' ');
 if (selections[sel]->type == CASCADER)
 ln[wd-1] = CASCADEPOINTER;
 }
 AddText(ln);
}
// -------- compute menu width
void PopDown::MenuDimensions()
{
 int txlen = 0;
 int i;
 for (i = 0; selections[i] != NULL; i++) {
 if (selections[i]->type != SEPARATOR) {
 int lblen = (selections[i]->label)->Strlen()-1;
 txlen = max(txlen, lblen);
 }
 }
 menuwidth = txlen+4;
 menuheight = i;
}
// -------- set the fg/bg colors for the window
void PopDown::SetColors()
{
 colors.fg = BLACK;
 colors.bg = CYAN;
 colors.sfg = BLACK;
 colors.sbg = LIGHTGRAY;
 colors.ffg = BLACK;
 colors.fbg = CYAN;
 colors.hfg = DARKGRAY; // Inactive FG
 colors.hbg = CYAN; // Inactive BG
 shortcutfg = RED;
}
// ------ display a menu line
void PopDown::DisplayMenuLine(int lno)
{
 if (isopen) {
 int fg, bg;
 int isActive = selections[lno]->isEnabled();
 int sfg = shortcutfg;
 if (lno == selection) {
 fg = colors.sfg;
 bg = colors.sbg;
 }
 else if (isActive) {
 fg = colors.fg;
 bg = colors.bg;
 }
 else {
 fg = colors.hfg;
 bg = colors.hbg;
 }
 if (!isActive)
 shortcutfg = fg;
 WriteShortcutLine(lno, fg, bg);
 shortcutfg = sfg;
 }
}

// ------ set no selection current
void PopDown::ClearSelection()
{
 if (selection != -1) {
 int sel = selection;
 selection = -1;
 DisplayMenuLine(sel);
 }
}
// ------ set a current menu selection
void PopDown::SetSelection(int sel)
{
 ClearSelection();
 if (sel >= 0 && sel < wlines) {
 selection = sel;
 DisplayMenuLine(sel);
 }
}
// ---------- paint the menu
void PopDown::Paint()
{
 if (text == NULL)
 ListBox::Paint();
 else {
 for (int i = 0; i < wlines; i++) {
 if (selections[i]->type == TOGGLE) {
 char *cp = TextLine(i);
 if (selections[i]->toggle == On)
 *cp = CheckMark();
 else
 *cp = ' ';
 }
 DisplayMenuLine(i);
 }
 }
}
// --------- paint the menu's border
void PopDown::Border()
{
 if (isopen && isVisible()) {
 int fg = colors.ffg;
 int bg = colors.fbg;
 int rt = Width()-1;
 ListBox::Border();
 for (int i = 0; i < wlines; i++) {
 if (selections[i]->type == SEPARATOR) {
 WriteWindowChar(LEDGE, 0, i+1, fg, bg);
 WriteWindowChar(REDGE, rt, i+1, fg, bg);
 }
 }
 }
}
// ------- test for a menu selection accelerator key
Bool PopDown::AcceleratorKey(int key)
{
 for (int i = 0; i < wlines; i++) {
 MenuSelection &ms = **(selections+i);
 if (key == ms.accelerator) {
 SetSelection(i);

 Choose();
 return True;
 }
 }
 return False;
}
// ------- test for a menu selection shortcut key
Bool PopDown::ShortCutKey(int key)
{
 key = tolower(key);
 for (int i = 0; i < wlines; i++) {
 MenuSelection &ms = **(selections+i);
 int off = ms.label->FindChar(SHORTCUTCHAR);
 if (off != -1) {
 String &cp = *ms.label;
 int c = cp[off+1];
 if (key == tolower(c)) {
 SetSelection(i);
 Choose();
 return True;
 }
 }
 }
 return False;
}
// ----- keystroke while menu is popped down
void PopDown::Keyboard(int key)
{
 if (AcceleratorKey(key))
 return;
 if (ShortCutKey(key))
 return;
 switch (key) {
 case UP:
 if (selection == 0) {
 SetSelection(wlines-1);
 return;
 }
 if (selections[selection-1]->type == SEPARATOR) {
 SetSelection(selection-2);
 return;
 }
 break;
 case DN:
 if (selection == wlines-1) {
 SetSelection(0);
 return;
 }
 if (selections[selection+1]->type == SEPARATOR) {
 SetSelection(selection+2);
 return;
 }
 break;
 case ESC:
 CloseMenu(ParentisMenu());
 return;
 case FWD:
 case BS:
 CloseMenu();

 if (parent != NULL) {
 parent->Keyboard(key);
 return;
 }
 break;
 default:
 break;
 }
 ListBox::Keyboard(key);
}
// ----- shift key status changed
void PopDown::ShiftChanged(int sk)
{
 if (sk & ALTKEY)
 CloseMenu(ParentisMenu());
}
// ---------- Left mouse button was clicked
void PopDown::LeftButton(int mx, int my)
{
 if (ClientRect().Inside(mx, my)) {
 if (my != prevmouseline) {
 int y = my - ClientTop();
 if (selections[y]->type != SEPARATOR)
 SetSelection(y);
 }
 }
 else if (!rect.Inside(mx, my)) {
 if (parent && my == parent->Bottom())
 parent->LeftButton(mx, my);
 }
 prevmouseline = my;
 prevmousecol = mx;
}
// ---------- Left mouse button was double-clicked
void PopDown::DoubleClick(int mx, int my)
{
 if (!rect.Inside(mx, my)) {
 CloseMenu();
 if (parent)
 parent->DoubleClick(mx, my);
 }
}
// ---------- Left mouse button was released
void PopDown::ButtonReleased(int mx, int my)
{
 if (ClientRect().Inside(mx, my)) {
 if (prevmouseline == my && prevmousecol == mx)
 if (selections[my-ClientTop()]->type != SEPARATOR)
 Choose();
 }
 else if (!rect.Inside(mx, my)) {
 DFWindow *Wnd = desktop.inWindow(mx, my);
 if (!(Wnd == parent && my == Top()-1 &&
 mx >= Left() && mx <= Right())) {
 CloseMenu(ParentisMenu());
 if (Wnd != NULL && Wnd != desktop.InFocus())
 Wnd->SetFocus();
 }
 }

}
// --------- user chose a menu selection
void PopDown::Choose()
{
 MenuSelection &ms = *selections[selection];
 if (ms.isEnabled()) {
 if (ms.type == CASCADER && ms.cascade != NULL)
 // -------- cascaded menu
 ms.cascade->OpenMenu(Right(), Top()+selection);
 else {
 if (ms.type == TOGGLE) {
 // ---- toggle selection
 ms.InvertToggle();
 char *cp = TextLine(selection);
 if (*cp == CheckMark())
 *cp = ' ';
 else
 *cp = CheckMark();
 DisplayMenuLine(selection);
 }
 if (ms.cmdfunction != NULL) {
 // ---- there is a function associated
 DFWindow *wnd = (DFWindow *)this;
 // --- close all menus
 while (wnd &&
 wnd->WindowType() == PopdownWindow) {
 ((PopDown *)wnd)->CloseMenu();
 wnd = wnd->Parent();
 }
 if (wnd && wnd->WindowType() == MenubarWindow){
 wnd->Keyboard(ESC);
 wnd = wnd->Parent();
 }
 if (wnd)
 // ---- execute the function
 (wnd->*ms.cmdfunction)();
 }
 }
 }
 else
 desktop.speaker().Beep(); // disabled selection
}
inline Bool isMenu(DFWindow *wnd)
{
 if (wnd != NULL) {
 WndType wt = wnd->WindowType();
 return (Bool) (wt==MenubarWindow || wt==PopdownWindow);
 }
 return False;
}
// ----- test for the parent as menu or menubar
Bool PopDown::ParentisMenu(DFWindow &wnd)
{
 return isMenu(wnd.Parent());
}
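
The shortcut search in PopDown::ShortCutKey above relies on a marker character embedded in each label (SHORTCUTCHAR, defined elsewhere in the library; the "~Restore"-style labels in Listing Eight suggest a tilde). Here is a self-contained sketch of that convention. The '~' marker and the helper names are assumptions of the sketch, not part of the listings:

```cpp
#include <cctype>
#include <cstring>

// Find the marker that flags a menu item's shortcut letter,
// e.g. "Mi~nimize" marks 'n'. Returns the marker's offset, or -1.
int ShortcutOffset(const char *label, char marker = '~')
{
    const char *p = std::strchr(label, marker);
    return p ? (int)(p - label) : -1;
}

// Case-insensitive test of a keystroke against the marked letter,
// mirroring the tolower() comparison in PopDown::ShortCutKey.
bool MatchesShortcut(const char *label, int key, char marker = '~')
{
    int off = ShortcutOffset(label, marker);
    if (off == -1 || label[off + 1] == '\0')
        return false;
    return std::tolower(key) == std::tolower((unsigned char)label[off + 1]);
}
```

With the Listing Eight labels, MatchesShortcut("Mi~nimize", 'N') matches while MatchesShortcut("~Restore", 'z') does not, which is the behavior the Keyboard handlers depend on.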






[LISTING EIGHT]

// --------------- ctlmenu.cpp
#include "dflatpp.h"
#include "frame.h"

MenuSelection RestoreCmd ("~Restore", &DFWindow::Restore);
MenuSelection MoveCmd ("~Move", &DFWindow::CtlMenuMove);
MenuSelection SizeCmd ("~Size", &DFWindow::CtlMenuSize);
MenuSelection MinimizeCmd ("Mi~nimize", &DFWindow::Minimize);
MenuSelection MaximizeCmd ("Ma~ximize", &DFWindow::Maximize);
MenuSelection CloseDocCmd ("~Close [Ctrl+F4]",&DFWindow::CloseWindow,
CTRL_F4);
MenuSelection CloseApCmd ("~Close [Alt+F4]",&DFWindow::CloseWindow, ALT_F4);

MenuSelection *ControlMenu[8];

MenuBarItem CtlMenu[] = {
 MenuBarItem( "", ControlMenu ),
 MenuBarItem( NULL, NULL )
};
void DFWindow::OpenCtlMenu()
{
 if (ctlmenu != NULL)
 delete ctlmenu;
 int mn = 0;
 if (attrib & (MINBOX | MAXBOX))
 ControlMenu[mn++] = &RestoreCmd;
 if (attrib & MOVEABLE)
 ControlMenu[mn++] = &MoveCmd;
 if (attrib & SIZEABLE)
 ControlMenu[mn++] = &SizeCmd;
 if (attrib & MINBOX)
 ControlMenu[mn++] = &MinimizeCmd;
 if (attrib & MAXBOX)
 ControlMenu[mn++] = &MaximizeCmd;
 if (mn != 0)
 ControlMenu[mn++] = &SelectionSeparator;
 if (Parent())
 ControlMenu[mn++] = &CloseDocCmd;
 else
 ControlMenu[mn++] = &CloseApCmd;
 ControlMenu[mn] = NULL;

 MinimizeCmd.Disable();
 MaximizeCmd.Disable();
 RestoreCmd.Disable();
 SizeCmd.Disable();
 MoveCmd.Disable();

 switch (windowstate) {
 case ISRESTORED:
 if (attrib & MINBOX)
 MinimizeCmd.Enable();
 if (attrib & MAXBOX)
 MaximizeCmd.Enable();
 MoveCmd.Enable();
 SizeCmd.Enable();
 break;

 case ISMINIMIZED:
 RestoreCmd.Enable();
 MoveCmd.Enable();
 break;
 case ISMAXIMIZED:
 RestoreCmd.Enable();
 if (attrib & MINBOX)
 MinimizeCmd.Enable();
 break;
 }
 ctlmenu = new PopDown(this, ControlMenu);
 ctlmenu->OpenMenu(Left()+1, Top()+1);
}
void DFWindow::DeleteCtlMenu()
{
 if (ctlmenu != NULL)
 delete ctlmenu;
 ctlmenu = NULL;
}
void DFWindow::CtlMenuMove()
{
 desktop.mouse().SetPosition(Left(), Top());
 new Frame(this, Left());
}
void DFWindow::CtlMenuSize()
{
 desktop.mouse().SetPosition(Right(), Bottom());
 new Frame(this);
}






April, 1993
STRUCTURED PROGRAMMING


The (Shower) Curtain Falls




Jeff Duntemann KG7JF


There may be 50 ways to leave your lover (as the song says) but there's no
good one; nor is there any good way to leave a group of readers whom you've
been with as long as I have. So I might as well just admit up front that this
column brings the curtain down on "Structured Programming," Jeff
Duntemann-style, after 51 installments. It wouldn't be quite right to spring
it on you in the final paragraph.
Part of the reason I'm leaving the hallowed pages of DDJ is that I own my own
publishing company, which has now grown big enough to be several simultaneous
full-time jobs. Certain people are hinting that I have some focusing to do if
I intend to see my 50th birthday, and an unexpected tussle with pneumonia last
fall indicated that they may be onto something.


No Holes in the Bucket


But it's more than just a matter of overwork. The technology has changed
enormously over the last four years, to the extent that a "structured
programming" column doesn't make anything like the sense it used to. If I were
writing in Bucket-Makers Monthly, it would be the equivalent of hosting a
column called "Buckets That Don't Leak." Hey, we've won that war.
The column has always been defined more by what it isn't than by what it is.
"Structured Programming" is the Un-C column; and with the traditional
languages, Un-C today generally means Turbo Pascal. Early on I paid serious
attention to Modula-2, but over time Modula withered in the American market.
I've touched on other languages now and then (Smalltalk being my favorite, old
Xeroxer that I am), but DDJ's demographics indicate that Turbo Pascal
predominates in the non-C realm.
Over the last year, some disturbing trends have begun to surface. The number
of people citing Pascal as their major language on magazine surveys has begun
to drop. Books on Pascal are not selling very well, I've heard from several
authors. (I'm bucking that trend nonetheless, and my Borland Pascal 7 From
Square One should be very close to publication by the time you read this.)
Where are the Pascal people going? And why? I'd love to ask those specific
questions on a survey, and I may in time. I've gotten some hints in the mail
and on the networks. (If you'd like to tell me why you've moved away from
Pascal, and to what, please write.) I've been able to distill two major forces
at work in the market that may provide an answer.
The bulk of new PC development is happening for Windows. Many Turbo Pascal
people bought Turbo Pascal for Windows, assuming a smooth transition from
their DOS work, and hit a very high wall. Working in Turbo Pascal for Windows
is a lot like working in Turbo Pascal for DOS and Turbo Vision, with the
Object Windows library (OWL) in place of Turbo Vision. Learning Turbo Vision
is brutally difficult, and learning OWL is no easier. Furthermore, OWL is not
optional: using OWL is, in fact, easier than using Turbo Pascal for Windows
"naked" and making raw Windows API calls. Both Turbo Vision and OWL rely
pervasively and inescapably on pointers, and developing for them feels a great
deal like developing in C. I think a lot of Pascal people felt that if they
really wanted to work in C and be up to their eyebrows in pointers all the
time, they'd just work in C.
Commercial development is under intense productivity pressure. At the same
time that understanding and using the preferred platform (Windows) has become
enormously more difficult, market pressures are requiring that products come
to market faster and with less labor. Even the C people are hurting in this
area, though they're not likely to openly admit it to me. The pressure is
especially intense in vertical-market shops, where vendors are finding it
harder and harder to coerce $10,000 from users for a market-specific
application. If the price of the box goes down, the stuff in the box had
better come together more cheaply. Pascal development has traditionally been
fairly fast, but it may no longer be fast enough, especially when the
spaghetti-tangle that is Windows must be catered to.
Caught between the twin pincers of market pressures and the hassles of working
in Windows at a low level, I think a lot of Pascal people have simply jumped
ship and taken up with one of the visual programming languages. The one whose
name I hear far more than any other these days is Visual Basic.


Drag-and-drop Until You Drop


I confess I didn't care much for Visual Basic in its first incarnation, for
reasons that didn't come clear right away. In the interim I completed Mortgage
Vision, a modest application with TP6/Turbo Vision, and used Blaise's Turbo
Vision Development Toolkit (TVDT) to ease the way. When I came back to Visual
Basic to check out its 2.0 release, it felt like home in a way it hadn't
before. In fact, what had bothered me about VB1.0 was not VB at all, but the
idea of drag-and-drop, which I hadn't really used at length before. (Even
Smalltalk, for all its graphic style, remains a typing-intensive environment.)
Once I had drawn all my menus and dialogs with TVDT and gotten used to the
feeling (and once I had suffered through enough event-driven programming to
make it seem natural and easy), Visual Basic abruptly became very compelling.
I have to make a point here that Microsoft won't like: The Basic-ness of
Visual Basic is close to irrelevant. And this too: In Microsoft's capable
hands, Basic has evolved syntactically to look a great deal like--gasp!--Turbo
Pascal. Visual Basic is a fully structured language, with procedures,
functions, and every control-flow structure present in Pascal. They even write
it now using the same standard indentation conventions that Pascal people have
been using since Year One. What's significant in Visual Basic has nothing to
do with any individual programming language at all: It has placed the bulk of
the Windows UI machinery behind a curtain, from which it emerges in its naked
terror only in dire need. The language behind Visual Basic's forms and
controls could be any language at all. Rumor has it we'll see a Visual C++
from Microsoft in time. Meridian Data is about to present its own
drag-and-drop C++ called C++ Designer, and we've had VZ Programmer (an early
but excellent visual dialect of C++) for some time. There's no reason at all
that somebody couldn't present a visual Pascal, and if a certain Somebody
would just get off their collective butts and do it, Visual Basic wouldn't be
bleeding the Turbo Pascal market dry.
Some people may well enjoy fencing with Windows at a low level, and we'll
always need a certain number of such people. But more and more people are
finding it un-fun in a big way, and a lot of companies are coming to feel that
working cheek-by-jowl with Windows costs far more than it is worth, when other
roads are rapidly opening up.
Let's talk about yet another one.


The Big Blur


I've been hoping for years that the traditional languages would extend
themselves in the direction of database development by incorporating verbs
implementing relational concepts. Clarion did exactly that, and reached
something close to the ideal mix of interactive tools and underlying language
expressiveness, but Clarion is a proprietary language from a small company and
thus won't even be considered by a great many shops. (Besides, the 3.0
release, with the globally optimized TopSpeed code generator, is now so late
that some folks are beginning to wonder if it will ever happen.) What I'm
seeing instead is that the major databases are extending themselves toward the
traditional programming languages by allowing users to write traditional
structured code "behind" the usual interactive features for maintaining tables
and generating queries and reports. The Big Blur between traditional languages
and database managers has begun, and if it's coming from the direction
opposite that which I expected, well, hey, I reserve the right to be
surprised.
The best example of the Big Blur I've encountered recently is Microsoft
Access. I sure hope you all took them up on their $99 intro market-share grab;
you're unlikely to net a development tool of that magnitude so cheap for a
while. Yes, it's a database, and while I haven't used all that many database
managers on the PC, it's as good as any I've ever had--certainly better than
Paradox 4.0, which, while quite fast, has this ugly habit of generating
cross-links on every hard disk I've installed it on. I doubt I will ever use
it again.
Access is a Windows-specific database manager, fully relational by my
standards, and beautifully designed for interactive use by advanced end users.
It has the traditional macros that users have come to expect in products like
this--but behind the macros is another and still more powerful layer that is
essentially Visual Basic with access to all the power of the database manager.
Access Basic can manipulate the very high-level objects (databases, tables,
forms, queries, and reports) you create interactively with Access's
interactive tools, much as Visual Basic manipulates the forms and controls you
draw. (I should point out that while Access refers to things like tables and
forms as "objects," they are not true objects in the OOP sense, but simply
large data structures with documented interfaces and internal layouts mostly
hidden from the programmer.)
If Access has any severe limitations from a coding perspective, I haven't
encountered them yet. It can act as a DDE client and make DLL calls. This
allows you to add system-level code like a serial terminal window without much
difficulty, assuming you know how to write such code in a language like Turbo
Pascal for Windows or a DLL-capable C/C++. Access Basic has a richer suite of
numeric types than Turbo Pascal, and has great little functions like Weekday
(which returns a day-of-the-week code for any date passed as a parameter) that
I've always had to cobble up on my own.
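The Weekday function is a good example of the sort of helper a Turbo Pascal or C programmer had to write by hand. A minimal C++ sketch using Sakamoto's day-of-week algorithm follows; note that the 0-for-Sunday mapping is a choice of this sketch (Basic's Weekday counts from 1 = Sunday by default), so treat the numbering as an assumption of the example, not the Basic function's contract:

```cpp
// A hand-rolled day-of-week helper of the kind the column describes.
// Sakamoto's algorithm, valid for Gregorian calendar dates.
// Returns 0 = Sunday through 6 = Saturday.
int Weekday(int y, int m, int d)
{
    static const int t[] = { 0, 3, 2, 5, 0, 3, 5, 1, 4, 6, 2, 4 };
    if (m < 3)
        y -= 1;   // January and February count as part of the prior year
    return (y + y/4 - y/100 + y/400 + t[m - 1] + d) % 7;
}
```

For example, Weekday(1993, 4, 1) returns 4, a Thursday: the first day of the month this column ran.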
Obviously, you're not going to create little hacker utilities in something as
massive as Access. But for creating business applications, I'm finding it
brutally effective. By building what amounts to Visual Basic into Access,
Microsoft is very intelligently recruiting traditional programmers into its
database corner. At this writing, Borland had not yet released Paradox for
Windows, so I haven't seen it. However, Paradox for Windows has at its heart
an object-oriented extension of PAL (Paradox Application Language) that may be
solid and useful, but looks nothing at all like either C++ or Pascal. Over the
next few years, this may come to be seen as an extraordinary mistake.


Who Shall Rule?


I don't want you to miss my point, so let me take a big, long roundhouse swing
and hit you over the head with it: The language wars are over. The war now, if
there is to be one, will be between visual development and traditional
development. There's a lot more at stake in such a war than you might first
perceive. Here's a point that may not have occurred to you: In visual
development, the highest-level design issues (and hence the shape of the
created application) are defined and controlled by the shape of the tool being
used.
In other words, if you create an app in Visual Basic, the code you write is
limited to procedures and functions, each with a fairly narrow scope. The
shape of the application is the shape of Visual Basic itself.
This may seem obvious, since Basic code has traditionally had all the shape of
a cup of water thrown into a spring breeze. But consider the situation of C++.
Many of the gnarliest C++ features are in fact very high-level "shape
definers" that impose a design vision upon the application being written.
Templates, multiple inheritance, operator overloading, and things like that
have an influence that permeates every corner of the eventual application.
These are the sorts of things expressed by the nature of the underlying kernel
within a visual language. I don't think they'll be easily changed. The design
vision of a visual language will become the design vision of the applications
it creates. The code that the programmer writes will almost invariably be
local in nature.
What about the role of CASE tools? Well, there have been CASE tools for years.
Why haven't they come to dominate the development world as they probably
should? The answer to that is simple, and it should make all of us squirm:
Because programmers much prefer coding to design. A tool that produces a live,
working, wonderful-looking application will always win the war against a dry
design tool that produces nothing more compelling than reams and reams of
bubble charts and structure diagrams. The tool that produces the code will
rule.
Count on it.


The Last Days of Steam



These developments will have a profound effect on most traditional languages,
and on C++ far more than any other. I hope you all read Scott Guthery's
slightly grouchy complaint published in the December 1992 issue of DDJ,
entitled "A Curmudgery on Programming Language Trends." Scott is concerned
that C++ will be the death of structured programming, for reasons he can
express far better than I. I can respond simply by saying, Not to sweat. These
effects are to some extent self-limiting.
Bjarne and his gang are having loads of fun piling every conceivable
expression of program logic and abstraction into C++, making it easily the
largest, most complex, and least-understood programming language in history.
Remember that for them, the design of C++ is art, and basically a good time.
They don't really have to use the damned thing--not the way working
programmers have to. Only a handful of C++ programmers probably understand the
language well enough to use all of it. That handful frequently creates code so
arcane that fellow C++ programmers have a hard time understanding just what it
was they did, or even how. Recently, I'm hearing whispers of shops where you
have to get management permission to use operator overloading or multiple
inheritance--what a concept!
And how long, long, overdue. It's way too easy to forget that the most
important part of wielding power is knowing when not to use it.
Scott's article expresses justifiable concern, but I take the long view: We
are currently seeing the beginning of the last days of steam. Early in the
development of the steam locomotive, there was an incredible parade of mutant
locomotives working the High Iron. Locos called crabs and camels, humpbacked
monstrosities with walking-beam linkages; all manner of weird and wonderful
machines that expressed what each designer thought was the current state of
the steam art. Some ideas worked better than others. Many ideas worked well
but were too fragile to maintain. (Locomotive fanciers may recall the cranky
poppet valves on the Pennsylvania Railroad's wedge-nosed behemoths, the T1s.)
Some ideas were mere decoration that earned no mechanical advantage. However,
each idea that came to mind had to be tried.
That's what's happening right now: We're trying every programming language
concept that comes to mind, especially in languages like C++ that spring from
restlessly creative teams of brilliant human beings with too little to do. And
C++ may well be becoming what the ill-fated T1 was: an awesome engine with
tremendous power that was simply too fragile to maintain. We should know in a
few years.
While the bizarre and brilliant locomotive ideas were being tried and
discarded, the design of conservative, workaday engines was converging on a
near-standard shape: that of the familiar, undecorated 4-8-4 that worked the
rails until the end came. I'd like to think that, away from the excesses of
C++, most workaday programming languages are converging to what ANSI C,
Microsoft Basic, and Turbo Pascal all are: structured programming languages
with simple but robust designs. Ample power without too much fragile
complexity--that's structured programming in a nutshell. A little object
orientation doesn't hurt, and may help a lot. Too much takes a language right
over the horizon of comprehension and beyond the bounds of maintainability.
In the last days of steam (back in the '30s), a new idea appeared: the
diesel-electric locomotive, incorporating 20th-century rather than
19th-century technology, and in another 20 years it swept the rails clear of
steam. I won't go so far out on a limb as to say that visual programming is to
traditional programming what the diesel was to steam, but lordy, I get these
feelings sometimes.
The people who love steam locomotives (and I remain one of them) sometimes
forget that the whole idea is not locomotives but transportation, and that
what gets the load where it's going fastest and cheapest wins. Admiring steam
as art doesn't pay the bills. As programmers we have to remember that creating
a working, reliable application the fastest way possible is the goal of the
game. Programming as we know it is indeed the most intoxicating kind of fun
(or maybe the second most), but if something better comes along, we're doing
no one any favors by clinging to the old ways.
Obviously, the tough one is knowing what is better, and what is merely more of
the same old...steam.


A Peek at the Future


I don't see any discontinuities in the programming business for the next
several years. (That doesn't mean they aren't there.) Intel CPUs and their
clones will continue to rule, expanding to own perhaps 95 percent of the CPU
market for desktop machines. Workstations will suffer terribly once Pentium is
out there and cheap, and may vanish entirely once Pentium's successor (Hexium?
Why can't we just call a 686 a 686?) hits the streets. The Power PC will come
out in the classic IBM/Apple position--overpriced and underpowered--and will
elicit the Big Yawn. It may eventually come to own 0.08 percent of the desktop
market. Windows will utterly dominate the platform game, especially once
Microsoft releases a credible midmarket embodiment of Win32s. Bill Gates will
eventually become rich enough to get his million-dollar car out of hock.
I've long hoped that Apple would go out of business before 2000. It may take a
few more years than that, but I will continue to hope.
This is kind of a dull future; most changes will be changes in scale and not
in shape. It may well be the ideal future for the small software developer,
however, especially one who knows how to work fast and keep a small turning
radius. The world market for English-language PC software will be in the
500-million range by the year 2000, and if you can't pay the rent on that, you
belong in a government job.
Small will be important for another reason: We've just elected a goodhearted
but weak president who has all but invited Congress to walk all over him, and
they're rubbing their hands in anticipation. Congress is dominated by a party
that hates business above all else, and they will be taking more and more of
your money and throwing more and more roadblocks in your way as the '90s
progress. The only way to duck some of these roadblocks is to stay small and
get the hell away from the big cities and their dark nimbus of murder, drugs,
welfare despair, and trial lawyers searching for something, anything to sue.
The good news is that a software development firm with eight or ten employees
can work very well and keep all of those people in jobs. Forget any thought of
getting rich, relocate to a small town in a temperate climate, and rediscover
what it's like to simply live.


Remember Bath Fashions


That's pretty much all I wanted to say. People who haven't been with me since
the beginning may wonder what all this "shower-curtain" business is about, and
I think I'll close by recounting my confusion on getting out of school in 1974
with a degree in English and no favorable prospects to speak of. In scanning
the papers, I saw a lot of ads looking for people to do peculiar-sounding
things, and one that made me smile was an ad for "manufacturer's rep in bath
fashions." I had an older cousin who worked hard running around pushing shower
curtains and peel-off stick-down cartoons of fish and bubbles to discount
stores in the Midwest. He went to shower-curtain conventions and probably read
shower-curtain trade magazines. He was in on the scam, and it didn't seem to
bother him: Call it representing bath fashions if you want, but what it
amounts to is selling shower curtains.
Things got dark enough so that once I thought I might call Cousin Sammy and
see if there was an opening in the Illinois territory. Instead, I got a job
fixing Xerox machines, and with time and luck faked my way into a job as a
self-taught Cobol programmer.
A couple of years in, it occurred to me that there was no hokey euphemism for
programming. The work was good; why gild the lily? It's worth remembering that
garbage men are called sanitation engineers, and shower-curtain salesmen are
called bath-fashion reps, but programmers are called...programmers.
The next time the bugs are biting and the deadlines are looming large in the
shadows, keep in mind that you, too, could be selling shower curtains. Of all
the work in the world, ours is among the best. Keep doing it. Do it well.
I'll be out here with the coyotes if you need me.


End Note


This is Jeff Duntemann's final column in Dr. Dobb's Journal. Write to Jeff at
PC TECHNIQUES, 7721 E. Gray Road #204, Scottsdale, Arizona, 85260.


April, 1993
UNDOCUMENTED CORNER


The Undocumented LAN Manager and Named Pipe APIs for DOS and Windows




Michael Shiels


Michael has been programming in the networking field for the last six years.
He is currently working at IMARA Research in Toronto, doing device interfaces
for desktop document/image-management software. He can be reached at MaS
Network Software and Consulting, 65-2380 Bromsgrove Road, Mississauga, ON L5J
4E6, Canada or at mshiels@masnet.uucp or mshiels@tmsoftware.ca.


When programmers talk about MS-DOS, they are generally referring not to the
C:\> prompt, but to the INT 21h programming interface. When you see a piece of
code with an INT 21h call, you know the program is making a DOS call. For
programmers, DOS is INT 21h and INT 21h is DOS.
But things aren't quite so simple. DOS has a function (INT 21h AH=25h) to
install a handler for an interrupt, and one of the interrupts you can install
a handler for is...INT 21h! Any DOS program can legitimately take over INT 21h
and masquerade as DOS. If the program becomes sufficiently popular, the word
"masquerade" is no longer accurate: The program makes its way into the de
facto DOS specification. For example, Novell's Netware hooks INT 21h and
inserts its own functions ahead of many of those supplied by Microsoft. If
you're writing software for the DOS environment, it's important to be at least
vaguely aware that any DOS calls your program makes will end up getting
serviced, not by code written in Redmond, but by code written in Utah.
Sometimes INT 21h is patched or extended by a third-party vendor that happens
to be another group within Microsoft. As Mike Shiels shows in this month's
"Undocumented Corner," Microsoft LAN Manager extends INT 21h to provide
functions such as DosCallNmPipe (INT 21h AX=5F37h) and NetRemoteCopy (INT 21h
AX=5F4Ah). For example, if you were to disassemble NETAPI.DLL from Windows,
you would see that many of the functions it provides are nothing more than
wrappers around these strange INT 21h calls.
If you've seen the LAN Manager documentation, you may be surprised to see
DosCallNmPipe and so on identified here with INT 21h functions and subfunctions.
The LAN Manager documentation just provides C-function prototypes. More and
more, documentation tells you about high-level C or C++ calls, without telling
you anything about the underlying low-level interface for which they are often
just a thin wrapper. While high-level interfaces are very nice, this type of
"leave-the-driving-to-us" documentation unfortunately forces the programmer to
use whatever library the vendor supplies. Often these libraries are not
suitable for inclusion in device drivers, TSRs, or other programs with special
requirements.
Mike's article corrects this problem for LAN Manager on DOS and Windows,
showing the actual low-level interface that Microsoft uses. Most of this
interface involves INT 21h AH=5Fh and INT 2Fh AH=11h. This is consistent with
the DOS interface: Several documented network-related functions use INT 21h
AH=5Fh (for example, INT 21h AX=5F03h is the Make Network Connection
function), and INT 2Fh AH=11h is a large collection of subfunctions known as
the "network redirector" (see Undocumented DOS, Chapter 4). Any INT 21h AH=5Fh
functions that DOS itself doesn't handle get passed through to INT 2Fh
AX=111Eh, with AX on the stack.
Even though these functions use INT 21h and INT 2Fh and are undocumented, they
do not appear in the book Undocumented DOS, which I wrote with Ray Michels,
Jim Kyle, Tim Paterson, David Maxey, and Ralf Brown. A question we
constantly faced while writing this book was: What is part of DOS? It was
tempting to consider DOS as anything that comes out of Redmond wearing an INT
21h or INT 2Fh interface. But this ignores the crucial role in the
industry-standard de facto DOS "spec" of third-party vendors such as Novell,
Qualitas, and Quarterdeck.
In the world of networks, Novell Netware is much more part of this de facto
DOS spec than any networking software that Microsoft produces. LAN Manager is
so far behind Netware that rumor has it that the world's largest LAN Manager
site (running OS/2, of course) is in Redmond! On the other hand, one of the
lessons from the success of Windows 3.x is that Microsoft is very persistent.
The API implemented by LAN Manager will survive and grow in importance, even
if LAN Manager itself doesn't. Microsoft will probably just keep at it until
its API becomes the standard in PC networking. A book I recently edited,
Windows Network Programming: How to Survive in a World of Windows, DOS, and
Networks (Addison-Wesley, 1993), by Ralph Davis, clarifies how all this LAN
Manager stuff is being carried over into the next generation of Microsoft
network software in Windows for Workgroups (WFW) and in Win32. Interestingly,
according to Davis, much of WFW's support for LAN Manager is currently
undocumented.
While this month's column is devoted to yet another interface that Microsoft
has left undocumented, it should be noted that Microsoft's competitors are no
better. If anything, Novell's documentation appears to be even more incomplete
than Microsoft's: I have been hearing a lot recently from programmers who need
to know about the structure of the file system on Novell servers, undocumented
NLM calls, the Netware Core Protocol (NCP), undocumented interfaces to IPX
used by Novell's NETX, and so on. Novell also appears to have never documented
what INT 21h looks like when running under Netware.
That all sounds like good material for a future "Undocumented Corner." Now,
does anyone want to write it? Please send me (CIS ID 76320,302 or
andrew@pharlap.com) your questions, comments, and ideas for future columns.
--Andrew Schulman
Ever wanted to interface to a network-independent, named-pipe interface? Ever
wondered about the underlying interface for the Microsoft LAN Manager
application programming interface (API)? In this "Undocumented Corner," we'll
discuss the underlying interface currently used by Microsoft to implement the
resident LAN Manager API, named pipes, and mailslot interfaces.
LAN Manager was introduced as Microsoft's entry into the growing local area
network arena. It boasts all sorts of features, including a very extensive API
to interact with the client/server network components. Along with the LAN
Manager proprietary API, Microsoft introduced a network-independent API called
Named Pipes for connection-oriented network programming, and a partner
interface called Mailslots for doing connectionless (datagram) programming.
Microsoft's API is portable between all the existing LAN Manager platforms:
DOS, Windows, and OS/2 1.x. The DOS API interface is implemented as a library
that gets linked into your program, while the Windows and OS/2 1.x versions
naturally use dynamic link libraries (DLLs) to implement shared API code.
Microsoft's latest entry into the LAN arena, Windows for Workgroups (WFW),
supports the same LAN Manager API as LAN Manager for Windows. As a
late-breaking note, it appears that the interface described here may also be
available as part of the LAN Manager 2.1a (or 2.2) support for IBM OS/2 2.x
Virtual DOS Boxes. Windows NT will also have this API available as part of LAN
Manager for NT. Portions of this API are supported in the IBM LAN Server
product. With the proliferation of products incorporating these APIs, they
will definitely be with us for quite a while!
The API works quite well if you are using Windows and/or OS/2. Since they use
DLLs, the size of your code does not increase when you include the LAN Manager
API. Once you compile some small utilities for DOS, however, you will discover
a problem: Using Microsoft's DOSNET.LIB file from the LAN Manager Programmers
Toolkit increases your program's size quite a bit when you are just calling a
simple API function to get some basic workstation information. For example, a
small C program with no LAN Manager API calls is typically just over 3K, but
once a call to NetWkstaGetInfo is placed into the code, the program grows to
over 20K. The library will add even more code to your program if you call more
of the API functions.
These API function names are structured as Net<category><verb> or
Dos<verb><category>; for example, NetUserAdd, NetGroupEnum, DosConnectNmPipe,
and DosDisconnectNmPipe. Most of the Net* functions are also structured such that
the first parameter to the function is either a NULL pointer to signify
execution of the API call on the local machine (for instance, NetUserAdd(NULL,
...)), or a server name in the uniform naming convention (UNC) form
\\servername (for example, NetUserAdd("\\BIGSERVER", ... )), to signify remote
execution of the API call on another machine (usually a server).
Here, we'll explore the undocumented interrupt 21h and interrupt 2Fh function
calls that implement some of the LAN Manager API when it is called in local
mode with a NULL pointer for the server name. This code is much smaller than
the Microsoft code and won't impact the size of your program significantly.
Table 1 (page 139) lists the functions in the LAN Manager API, and Table 2
(page 139) lists the functions from Table 1 that provide an INT 21h or INT 2Fh
interface that can easily be called directly via the interrupt.
The LAN Manager for Windows NETAPI.DLL and PMSPL.DLL files are to a certain
extent just wrappers around the undocumented, interrupt-based API. The
components of the LAN Manager API that have no associated undocumented
functions are implemented in the DLLs themselves. Inside the DLL, the
undocumented functions are accessed as straight INT 21h calls when running
under the now-defunct Real-mode Windows; under Standard and Enhanced 386 mode,
the DOS Protected Mode Interface (DPMI) Simulate Real Mode Interrupt function
(INT 31h AX=0300h) is used to make the LAN Manager INT 21h calls.
Microsoft chose to extend the INT 21h function 5Fh network interface to
support the extended local API. Most of the functions use various register
combinations to pass in parameters, but a few functions actually just pass in
the address of the parameters on the stack, usually because there are too many
parameters to safely (and sanely!) use registers. Other functions create a
temporary buffer to store all the parameters and insert the address of this
buffer into the registers before making the INT 21h call.
The undocumented LAN Manager API consists of two major groupings. The first is
the "true" LAN Manager-specific API, which is implemented by LAN Manager (and
probably most LAN Manager-based networks), and will probably be implemented in
any future Microsoft networking software. The second group is the Named Pipe
interface implemented by Microsoft LAN Manager, IBM LAN Server, Novell
Netware, and Banyan Vines, as well as other LAN Manager-based networks such as
DEC Pathworks. The Named Pipes interface is becoming an industry-standard
interface (similar to Berkeley Software Distribution -- BSD -- sockets) for
network-independent programming, being implemented not only by Microsoft but
also by Novell and Banyan.
Of course, in dealing with an "undocumented" interface it is sensible to have
some reservations. However, this API has not changed in any of the current
releases of LAN Manager (2.0, 2.1, 2.1a, and 2.2). As noted earlier, it is
also supported in WFW, so this interface is very unlikely to be drastically
changed by Microsoft. But as with any undocumented interface, it could change
at some point in the future, so be forewarned.
Source code and header files for Microsoft C 6.0 that implement this interface
are available from MaS Network Software and Consulting and from DDJ (see
"Availability," page 5).


Named Pipes


The most interesting section of this undocumented API, and probably the most
useful, is the Named Pipe API, which originated in Microsoft LAN Manager and
has now been added to Novell Netware and Banyan Vines. This API implements a
connection-oriented protocol, and is used in particular by the Microsoft SQL
server to do all communications between the clients and the server. Named Pipe
programming is much too complicated to go into in this article, but the
Microsoft LAN Manager book in "References" covers Named Pipe programming very
well. Just substitute the interrupt-based calls shown here for the C wrappers
shown in the official Microsoft documentation. What we've done here,
basically, is get rid of the C wrapper.
Programming for Named Pipes is comparable to programming for the Berkeley UNIX
socket interface. The APIs are fairly straightforward and easy to use. In
contrast, NetBIOS, probably the biggest standard for PC networking, requires
structures to be filled in for each function to be executed (Call, Listen,
Send, Receive, and so on), which makes it more complicated.
In addition to Microsoft SQL Server (see next section), a very useful program
that uses Named Pipes is the OS2YOU program by M Wahlgren Software Development
in Sweden (OS2YOU*.ZIP on most BBSs). This program uses Named Pipes to talk
between a DOS workstation and an OS/2 workstation, to allow the DOS
workstation to run interactive, full-screen OS/2 programs on the OS/2
workstation. With this program, you can run full-screen LAN Manager NET ADMIN
while running DOS, which is otherwise not currently possible with LAN Manager.
LAN Manager's Named Pipe protocol is implemented as part of the workstation
service for the LAN Manager workstation. It uses the Server Message Block
(SMB) protocol defined by Microsoft and documented in the X/Open manual on SMB
(see "References"). This protocol is also supported in various UNIX LAN
Manager ports, DEC Pathworks, and other LAN Manager-based products.
SMBs were originally documented in some joint Intel/Microsoft documents in
early 1987. These documents define the packet headers for file-sharing and
print-sharing communications. Since then Microsoft has enhanced the protocol
for LAN Manager 1.x and 2.x extensions (see the 3COM section on CompuServe).
These extensions include Named Pipes, Mailslots, and the Remote Procedure Call
(RPC) API (see below).
Access to Named Pipes from Windows is provided by NETAPI.DLL. Under LAN
Manager for Windows, this DLL also implements the rest of the proprietary LAN
Manager API. WFW has essentially the same DLL as would a LAN Manager Windows
client. Novell and Banyan each implement their own NETAPI.DLL since they do
not want the rest of the proprietary LAN Manager API. It is unfortunate that
these different files all have the same name.


Microsoft SQL Server: The Biggest Named Pipes User


The biggest user of the Named Pipe protocol is Microsoft's SQL Server. Novell
and Banyan probably implemented Named Pipe modules for their own networks just
in order to run SQL Server. Microsoft SQL Server is based upon the Sybase SQL
engine and uses the Sybase DBLIBRARY interface at the client end. This
interface allows a client program for Sybase products to talk to any of the
databases they offer using different lower-level interfaces to implement the
DBLIBRARY interface.
A Microsoft SQL Server or Sybase SQL Server Client program will interface to
either a DBNMPIPE.EXE TSR for normal DOS programs or the DBNMP3.DLL for
Windows programs. The DBNMPIPE.EXE TSR calls the undocumented INT 21h Named
Pipe functions directly to interface between the INT 62h Sybase interface for
DOS and the underlying network interface for Named Pipes. DBNMP3.DLL calls the
NETAPI.DLL functions directly to interface between the W3DBLIB.DLL Sybase
interface and the underlying network interface for Named Pipes. The NETAPI.DLL
will then turn around and call the INT 21h Named Pipe functions.


Novell's Named Pipes



Netware's Named Pipes protocol is implemented using Novell's Sequenced Packet
Exchange (SPX) and Internetwork Packet Exchange (IPX) protocols and is loaded
as an optional TSR called DOSNP.EXE. DOSNP hooks into the INT 21h chain and
watches for any file-related INT 21h calls, to catch the open/read/write/close
of the pipe. It does not support the DosReadAsynchNmPipe or
DosWriteAsynchNmPipe calls that are part of an INT 2Fh interface. As far as I
can tell, Novell has implemented all the basic calls used by the DBNMPIPE.EXE
program, which are the ones necessary to get Microsoft SQL Server running. INT 2Fh
functions are a high-performance improvement, since they do not block the
machine while waiting to complete, but they are not universally implemented.
Novell also does not support the NetHandleGetInfo or NetHandleSetInfo
functions.
The Novell Netware Windows NETAPI.DLL just calls the DOSNP.EXE TSR to do each
function call. Similar to LAN Manager's NETAPI.DLL, it uses DPMI to execute
real-mode INT 21h from protected mode. Care must be taken in environments
where LAN Manager and Novell coexist, since each one provides a NETAPI.DLL,
and Windows will only allow one to be loaded. There seems to be no way to
allow the coexistence of Named Pipes under two different networks.
Novell Netware implements most of the standard Named Pipe calls as well as
NetServerEnum, which is part of the LAN Manager 1.x API. This function is used
to show a list of available servers. NetServerEnum is actually a LAN Manager
1.x function call obsoleted by NetServerEnum2 in LAN Manager 2.x, but Novell
needed to implement this function to allow Microsoft SQL Server programs to
find a list of database servers.


Mailslots


The third biggest section of this undocumented API is for Mailslots, which
provide connectionless (datagram) communications. NetBIOS connectionless
datagrams are used to implement the Mailslot protocol. This API has functions
for making a Mailslot and for reading from, writing to, and deleting a
Mailslot. Mailslots are used internally in the MESSENGER service to send and
receive messages.
Servers also announce their presence on the network periodically using
Mailslots, but the WORKSTATION service already has this Mailslot open so it is
not possible to access it at this level. The LocalServerEnum and
LocalServerEnum2 functions return information gathered from these periodic
server Mailslot broadcasts.
This API is implemented in Microsoft LAN Manager and LAN Manager-based
networks, but vendors such as Novell and Banyan did not bother to implement
it when they implemented the Named Pipe interface. WFW does
implement this interface, so it will still be a factor in the network
programming world.
Mailslots have the same advantage over direct NetBIOS programming as the Named
Pipes interface: simplicity! The interface is based entirely on function calls
with no complicated structures to fill in for each operation.


Local LAN Manager API


Finally, we come to the various functions available in the LAN Manager native
API, which I describe below using new Local* names I have assigned to each Net
or Dos function.
For example, LocalMessageBufferSend (INT 21h AX=5F40h) is the local version of
the LAN Manager NetMessageBufferSend function. One nice feature of LAN Manager
is its built-in messaging facility. With the undocumented API entry, you can
now send a message directly to a single user, to a group of people in a
certain domain name, or to everyone as a broadcast. This is a useful addition
to programs that need to notify someone of a conflict or error. No need to
spawn the very large "NET.EXE MESSAGE" command just to send simple messages
around.
LocalServiceEnum (INT 21h AX=5F41h) can tell which network services are
running and what state they are currently in. It can be used to tell users if
they will be able to receive incoming messages at their machines, since the
MESSENGER service is necessary to receive any messages. LocalServiceControl
(INT 21h AX=5F42h) can be used to temporarily pause the disk or printer
redirector so that local resources can be accessed. This is similar to the
documented DOS Get and Set Redirection Mode (INT 21h AX=5F00h and AX=5F01h)
functions, in that it allows you to pause and unpause network redirections of
drives and printers.
Once you have a remote printer connected to an LPTn port, you may forget where
your print job will eventually be printed. With LocalPrintJobGetId (INT 21h
AX=5F43h), you can get the server name, queue name, and job ID number to
report to the user any time after you open the LPTn port and before you close
the job to start it printing. These strings can be displayed to remind the
user where they need to go to pick up their output.
With the amount of information people are trying to move around on a LAN, bulk
file moving and/or copying can end up being quite slow, since the data ends up
moving across the network at least twice, even when the actual source and
destination are both on the same server. Now with the LocalRemoteMove (INT 21h
AX=5F4Bh) and LocalRemoteCopy (INT 21h AX=5F4Ah) functions (note the strange
naming!), files can be quickly moved and copied without the data ever having
to leave the server. This will greatly reduce the strain on the network when
large files have to be moved between locations on a server.
If applications need to do any audit logging with information such as the
user's workstation name and/or the logged-in user name, it is now quite easy
with LocalWkstaGetInfo (INT 21h AX=5F44h). This function can return three
different sets of information, depending on how much detail the programmer is
interested in. LocalWkstaSetInfo (INT 21h AX=5F45h) can be used to change
various system parameters for the current network session.
With LocalServerEnum (INT 21h AX=5F4Ch) and LocalServerEnum2 (INT 21h
AX=5F53h), it is now possible to display lists of available servers for users
to choose from when connecting resources or choosing servers to talk to via
Named Pipes. The enumeration will also provide information about the types of
servers available. This information is handy if you want to list the available
SQL Servers or Print servers.
All these functions add new possibilities for programs when they know they are
executing on a LAN Manager network. Default user names can be based on the
network computer name, messages can be sent, and servers can be listed for the
user's convenience.


The Struggle Continues


I've been unable to find out any details on the implementation of Named Pipes
in Banyan Vines or DEC Pathworks. I know that Microsoft SQL Server can run on
these platforms, so they must support the undocumented Named Pipe interface. I
just don't know which network protocol each uses to implement the Named Pipe
protocol.
This interface is not documented completely here, since the implementation of
some functions is unknown to me at this time. DosRawReadNmPipe and
DosRawWriteNmPipe are not documented in any books or literature that I can
find, so those functions are listed but the details of their parameters remain
unknown.
For a more detailed explanation of the LAN Manager API, Named Pipes, and
Mailslots, I recommend the two Microsoft Press books noted in the references.
Table 1 has a list of the LAN Manager API functions and some notes on specific
version support and the limitations associated with each function when used
under MS-DOS. As you can see from the table, a lot of the functions are only
applicable when executed on another machine, usually a server.


References


Brown, Ralf and Jim Kyle. PC Interrupts. Reading, MA: Addison-Wesley, 1991.
Davis, Ralph. Windows Network Programming. Reading, MA: Addison-Wesley, 1993.
Developers' Specification: Protocols for X/Open PC Interworking: SMB. U.K.:
X/Open Company Ltd., 1991.
Dunsmuir, Martin. "OS/2 to UNIX LAN," in Stephen Kochan and Patrick Wood, eds.
UNIX Networking. Indianapolis, IN: Hayden, 1989.
Microsoft LAN Manager Programmer's Reference (from LAN Manager 1.0), 1988.
Microsoft LAN Manager Programmer's Reference. Redmond, WA: Microsoft Press,
1990.
Microsoft Networks/OpenNET File Sharing Protocol, Intel Part No. 138446.
Ryan, Ralph. Microsoft LAN Manager, A Programmer's Guide. Redmond, WA:
Microsoft Press, 1990.


April, 1993
PROGRAMMER'S BOOKSHELF


Roaming the Internet, Part 2




Ray Duncan


In his remarkable book, Marooned in Realtime, science fiction author Vernor
Vinge postulated a world where universal networking and a rapid evolution in
human-computer interfaces led to a "discontinuity." The people who were online
when the discontinuity occurred--nearly the entire population of the
earth--simply vanished into another plane of existence, while the few who
happened to be isolated from the network for one reason or another at the
crucial instant were left behind to ponder the mystery of an empty world.
Vinge's latest book, A Fire Upon the Deep, foretells a much different future:
a future where the Internet grows to embrace the galaxy, but, as Figure 1
illustrates, the limited bandwidths across interstellar distances and the
difficulties of communication between alien species perpetuate text-based
newsgroups little different from the ones we know today.
Figure 1: From A Fire Upon the Deep.

 Crypto: 0
 As received by: Transceiver Relay03 at Relay
 Language path: Samnorks -> Triskeweline, SjK: Relay
 Units
 From: Straumli Main
 Subject: Archive opened in the Low Transcend!
 Summary: Our links to the Known Net will be
 down temporarily
 Key phrases: transcend, good news, business
 opportunities, new archive, communications problems
 Distribution:
 Where Are They Now Interest Group
 Homo Sapiens Interest Group

 Motley Hatch Interest Group
 Transceiver Relay03 at Relay
 Transceiver Windsong at Debley Down
 Transceiver Not-For-Long at Shortstop
 Date: 11:45:20 Docks Time, 01/09 of Org year 52089
 Text of message:
 We are proud to announce that a human exploration
 company from Straumli Realm has discovered an
 accessible archive in the Low Transcend... We have
 postponed this announcement until we were sure of our
 property rights and the safety of the archive. We
 have installed interfaces which should make the
 archive interoperable with standard syntax queries
 from the Net. In a few days this access will be made
 commercially available...

Of course, when news postings are shipped across gazillions of light-years,
filtered through multiple AI translators, and stored in archives that outlast
the races that created them, one is never quite sure what to believe:
The Known Net had existed in some form for billions of years in the Beyond. It
was not a civilization, few civilizations lasted longer than a million years.
But the records of the past were quite complete. Sometimes they were
intelligible. More often, reading them involved translations of translations
of translations, passed down from one defunct race to another with no one to
corroborate--worse than any multihop net message could ever be.
Small wonder that the Known Net of the distant future is also referred to by
its users as "The Net of a Million Lies."
In our own era, the Internet has not yet fallen completely under the sway of
the politicians, lawyers, and media moguls, so rather than The Net of a
Million Lies it's more like The Net of a Million Banalities. True enough, the
Internet is a treasure trove of information with its USENET newsgroups, FTP
archives, Archies, Gophers, World-Wide-Webs, and all the rest. But the
Internet faithfully follows the 90/10 rule, and for every USENET posting that
provides some valuable morsel of information or worthwhile insight, there are
at least nine trivial questions or inane remarks by people too lazy to pick up
a manual or read through a message thread to its end before adding their two
cents worth.
Two of the most aggravating characteristics of Internet News and e-mail are
the senseless squandering of readers' time and network resources by the misuse
of "signature blocks" and "included text." Signature blocks are ritualized
appendages to news postings that minimally include the author's full name,
place of work, and various network addresses for e-mail. Over the years, a
certain percentage of net denizens have adopted the signature block as a
vehicle for pretentious display, bulking it out with one or more hackneyed
aphorisms, so-called "ASCII Art," or elaborate disclaimers. The one useful
aspect of such bloated signature blocks is that they are a reliable predictor
of content; I've empirically determined that the number of characters in a
signature block is inversely proportional to the value of the entire message.
Included text, on the other hand, is material quoted from a previous message
to provide a context for the new material in an e-mail message or news
posting. When used sparingly, included text is extremely helpful in making
sense out of a message thread that has dozens of participants over a period of
weeks. Just as often, however, you see lengthy messages that consist almost
entirely of included text with only a few lines of original material. The
archetype of this genre has the mandatory header, then a hundred lines or so
of quoted material, followed by a one-line comment such as "Yes, I agree" or
"What utter nonsense," all terminated with two copies of a signature block
boasting half a dozen e-mail addresses and a couple of pompous aphorisms. (The
two signature blocks occur when the author includes one copy manually,
oblivious to the fact that his software is going to include another copy
automatically.)
In any event, whatever its features and faults, the Internet has been growing
exponentially for the last ten years and is expected to continue to do so for
years to come. It seems quite certain that, within three years, virtually
every computer user in North America and Europe will have some sort of
connectivity to the Internet. This opens up a huge market opportunity for
books that can help new users make some sense of the Internet along with the
fundamental networking facilities and utilities. I surveyed three such books
in my last review, and I've picked three more user-oriented books to discuss
in this month's column.
The Internet Companion: A Beginner's Guide to Global Networking is probably
the first Internet book ever to be aimed specifically at computer illiterates
and technophobes. It explains electronic mail, news readers, FTP file access,
and "netiquette" in terms that even a sixth-grade Valley Girl can easily
understand. The book is published in a handy pocket-sized quick-reference
format and includes a foreword by Al Gore, a bibliography, and a list of
network resources and providers. The book's one flaw is that the authors have
cluttered it up with "human interest" sidebars that add no value whatsoever
and are written in a breathless sound-bite style reminiscent of USA Today. For
example:
The most important piece of information for potential users to know is that
the resource is gigantic and is growing larger. If it were an eggplant, we'd
be in real danger.
--Steve Cavrak, University of Vermont
Some of the other sidebars are entitled "Enough of White Man's ASCII," "Elvis
Sighted on Internet," "Geeks in Paradise," and "From Russia with Byte." I
guess we should just be grateful that the authors didn't choose to call the
book Bill and Ted's Excellent Network Adventure.
The second book on this month's list, Exploring the Internet: A Technical
Travelogue, defies classification. Carl Malamud, a columnist for
Communications Week and author of a number of networking reference books, was
subsidized by The Interop Company to spend six months traveling around the
world meeting Internet gurus, rogues, loose cannons, and entrepreneurs of
every description. Whether or not The Interop Company got what it was after I
cannot guess, but the book is incredibly entertaining. You'll learn about the
Tokyo Akihabara shopping district, the Royal Hong Kong Jockey Club, the ITU's
tyranny over ISO standards documents, the gated daemon, Steve Roberts's famous
Winnebiko, the Wellington City Net, the Bombay train-reservation system, toll
roads in Kuala Lumpur, and other loosely network-related topics too numerous
to mention here. You'll also learn about many new foods; Malamud apparently
fancies himself a gourmand, and his devotion to TCP/IP networking is rivaled
only by his affinity for exotic victuals and restaurants.
My flight wasn't until evening, so I had time to have lunch with Bob. We went
to an old hangout, the Italian restaurant Pan Pan. Normally, I try to stick to
Asian food while in Asia, but Pan Pan is an exception. The Italian food is
excellent, and more importantly, they have excellent durian ice cream.

Durian is one of those mysteries of the East. On the outside, it's about twice
the size of a pineapple, with very sharp spikes about a half-inch tall
sticking out all over, making it advisable not to fall asleep under a durian
tree at harvest time. Inside, there are a half dozen segments of creamy, pale
flesh that looks sort of like a banana. The durian's most famous feature,
however, is its powerful, distinctive smell.
The taste is great, but the smell does tend to dissuade many westerners from
taking an immediate liking to the "king of fruits." I describe it as tasting
something like a cross between a mushy banana and brie, but one Englishman I
know refers to it as "a bit like eating strawberries and cream in a public
lavatory."
Internet: Mailing Lists is a hardcopy version of an Internet file maintained
at SRI that is sometimes affectionately referred to as the "list of lists."
Mailing lists are somewhat like USENET newsgroups, but the message postings
are propagated to interested readers through the e-mail system rather than
through the USENET news servers and readers. Many USENET newsgroups are
reflected into mailing lists and vice versa, so if you have full Internet
access you can obtain the information in the way most convenient for you. The
important point about mailing lists is that they make information available to
a great many users who do not have complete connectivity to the Internet; for
example, people with CompuServe, MCI Mail, or BITNET accounts supporting
e-mail connection to the Internet only. And the abundance and diversity of the
existing mailing lists are guaranteed to amaze: whether your interests center
on prisons, ham radio, Kate Bush's music, or the Romanov dynasty, there is
already a mailing list of your kindred spirits waiting for you.


April, 1993
DIFFERENTIAL COMPRESSION ALGORITHMS


Not just for graphics anymore


 This article contains the following executables: DIFFCOMP.ARC


James H. Sylvester


James recently graduated from California State University, Fullerton with a
master's in computer science. He can be contacted at P.O. Box 1097, Yorba
Linda, CA 92686-1097.


At first glance, differential image compression suggests some degree of
graphics dependence--after all, the word "image" in the title implies that
graphics must be used. However, a close look at source-code implementations of
differential image-compression algorithms--like that presented by John Bridges
in his article "Differential Image Compression," DDJ, February 1991--reveals
that data need not be graphics at all.
Differential image compression keeps track of only the information necessary
to change the currently active screen of graphics into the next screen; it
then repeats this process for subsequent screens. In John's coding, for
example, differential image compression uses the active screen of graphics as
a fixed-size buffer for data and the compression/decompression algorithms
simply modify the data in the buffer. That the buffer in the algorithms is
actually a graphics screen (commonly a VGA screen for its elegant
pixel-to-memory map) is simply a dependency of the algorithms presented.
There's nothing inherent in this approach to prevent or inhibit the buffer
from being any portion of random-access memory containing any type of data.
Still, you should make one minor modification when the buffer makes this leap
into nongraphics memory. The category for the compression and decompression
algorithms should be changed from differential image compression to simply
differential compression to signify that the data in the buffer may be used
for any general purpose.


Implementation Details


Implementation of differential compression begins with resolving at least the
following three issues in the design phase:
Is the data buffer fixed or variable in length? Regardless of whether it's
fixed or variable, what is the maximum size of the data buffer when full? Can
it all be kept in memory or must portions be accessed from disk?
How does the data buffer get used and possibly updated so that the size of the
encoded differential information can be kept as small as possible?
What type of initial information should be put into the data buffer at the
start of both the compression and decompression programs to get the best
overall compression results?
In my implementation of differential compression, I resolved the three issues
this way:
I keep coding simple and allow the entire data buffer to be kept in memory.
The buffer contains exactly 256 blocks, each having 8 bytes of storage data.
Thus, the data buffer is 256 x 8 bytes = 2 Kbytes in size.
I encode source data by comparing successive 8-byte blocks with all 256 blocks
in the data buffer. The index of the best-matching block is then output as one
byte, and differential information is output to update the best-matching block
with the new data. Overall compression is optimized when the source data
contains similar, ideally repetitive, blocks of data.
To guarantee at least one matching byte for the first block of source data,
block #0 is filled with all 0 bytes; block #1 is filled with all 1 bytes;
block #2 is filled with all 2 bytes; and so on to block #255, which is filled
with all 255 bytes.


Coding Details


Figure 1 presents the encoding algorithm, and Listing One (page 146) lists
PACK.C, its C implementation. Figure 2 shows the decoding technique, while
Listing Two (page 146) lists the UNPACK.C decompression program. The listings
resemble the algorithm structures with one exception: special handling is
required for the last byte(s) of source data, which may form a block smaller
than the block size of the data buffer. According to the pack and unpack
programs, any source file whose size MOD eight does not equal 0 requires this
special handling. Specific details are documented in the source code of these
two programs.
Figure 1: Encoding algorithm.

 initialize all blocks in data buffer
 while not end of file
 {
 load one block of source data
 find best matching block in data buffer
 output index of best matching block

 output a bit map (one byte) for the ordered bytes in
 the data buffer which need to be changed
 (0 bit ==> matching bytes)
 (1 bit ==> new byte in loaded block of source data)

 output all new bytes in loaded block of source data
 update best matching block with loaded block of data
 }

Figure 2: Decoding algorithm.


 initialize all blocks in data buffer exactly as was done
 in the encoding algorithm

 while not end of file
 {
 load best matching block index
 load bit map (one byte) for the ordered bytes in
 the data buffer which need to be changed

 directly load all new (i.e. changed) bytes into the
 best matching block of data (automatic update)

 output the best matching block of data which now contains an exact
 copy of the originally loaded block
 }

In these programs, compression is optimized whenever the data in the source
file contains highly similar or exactly repetitive data, as determined by the
order of data in each block. For example, "abcdefgh" as a block of 8 bytes is
similar to "accddfgf," and consequently would be compressed into 5 bytes
(block index+01001001 bit map byte+c+d+f). However, "abcdefgh" is not similar
to "habcdefg," which would be compressed into 10 bytes (block index+11111111
bit map byte+h+a+c+d+e+f+g). You can get around this problem in one of three
ways:
Add a shift index just after the block index to signal how many bytes the data
in the key block needs to be shifted before applying the comparison tests. In
this case, the "habcdefg" block gets compressed into 2.375 bytes (block
index+001 shift index+00000000 bit map byte).
Increase the number of blocks in the data buffer.
Prescan the source file, adding highly repetitive blocks to a list and then
save this list and its parameters to the beginning of the encoded data file.
This will keep the number of blocks in the data buffer to a minimum, resulting
in the fewest number of bits required for the block index. The data buffer
should then be static during the encoding and decoding process since the
optimization work was already done during the prescanning process. If
"habcdefg" is a popular block, it will be encoded efficiently. If not, any
loss of compression should be negligible.


Suggested Improvements


There are a number of possible improvements you can make to the programs. For
one thing, you might increase the number of bytes in each block of the data
buffer; specifically, increase BLOCKFACTOR to an integer greater than 1.
Likewise, you can delete code which updates the data buffer. For instance, you
can make the necessary deletions in the pack program. At the end of the unpack
program, use the replacement code shown in Listing Three (page 146).
Another improvement might be to write your own code to prescan the source
data, find the ideal initial information to be put in the data buffer, and
then save this information at the start of the encoded data file.
You can also rework the code entirely to handle N blocks rather than the fixed
256 count and then find the optimal size of N for any given source data.
Finally, consider allowing a less-than-exact (lossy) copy of the original and
cutting back on the number of changed bytes output during the encoding
process.

_DIFFERENTIAL COMPRESSION ALGORITHMS_
by James H. Sylvester


[LISTING ONE]

/****************************************************************************/
/* PACK.C -- by James Sylvester */
/****************************************************************************/

#include <stdio.h>
#include <stdlib.h> /* for exit() */

void main(int argc, char *argv[])
{
 const blockfactor = 1; /* adjust as desired */
 const blocksize = blockfactor * 8;
 char buffer [257] [8]; /* enter blocksize for second index */

 int i, j, currentsize;
 FILE *sf, *tf; /* sourcefile & targetfile respectively */
 int bestblock; /* best matching block in buffer */
 int bestcount, matchcount;
 int changeindex, bitvalue;

/* Verify and perform error handling for opening both the input file and */
/* the output file as binary files. Note, the input file has the original */
/* data and the output file will contain the encoded/compressed data. */

 if (argc != 3)
 {
 printf("Correct usage is >pack source_filename target_filename\n");
 exit(1);
 }
 if ((sf = fopen(argv[1], "rb"))==NULL) /* read only mode for input file */
 {
 printf("Unable to open source file %s\n", argv[1]);
 exit(2);
 }
 if ((tf = fopen(argv[2], "wb"))==NULL) /* write only mode for output file */
 {
 printf("Unable to open target file %s\n", argv[2]);
 exit(3);
 }
/* Initialize buffer with all zeros in the first block, all ones in the */
/* second block, and so forth so that there will be at least one matching */
/* byte in the first block of input data regardless what it might be. */
 for (i = 0; i < 256; i++)
 for (j = 0; j < blocksize; j++)
 buffer [i] [j] = i;
 while (1) /* while true ==> stay in loop until internal exit */
 {
/* Load the next block of data from the sourcefile into the last slot of */
/* the buffer. Also, keep track of how many bytes were read to signify */
/* proper handling for the unfull block when at the end of file. */
 currentsize = blocksize;
 for (j = 0; j < blocksize; j++)
 {
 i = getc(sf);
 if (i == EOF)
 {
 currentsize = j; /* reset correct currentsize */
 break; /* exit for loop */
 }
 buffer [256] [j] = i; /* put character into last block */
 }
 if (currentsize == 0) /* input data ended with previous full block */
 {
 printf("%s now contains encoded/compressed data!\n", argv[2]);
 exit(4);
 }
/* Find the best matching block in buffer for the recently loaded block */
/* of input data. Afterwards, send the bestblock index to the output */
/* file and process the changes between the two blocks. */
 bestcount = -1;
 for (i = 0; i < 256; i++)
 {
 matchcount = 0;
 for (j = 0; j < currentsize; j++)
 if (buffer [i] [j] == buffer [256] [j])
 matchcount++;
 if (matchcount > bestcount)
 {
 bestcount = matchcount;
 bestblock = i;
 if (bestcount == currentsize)
 break; /* exit for loop */
 }

 }
 putc (bestblock, tf);
 for (i = 0; i < blockfactor; i++)
 {
 changeindex = 0;
 bitvalue = 1;
 for (j = i*8; j < i*8+8; j++)
 {
 if (j >= currentsize)
 break;
 if (buffer [bestblock] [j] != buffer [256] [j])
 changeindex += bitvalue;
 bitvalue *= 2;
 }
 putc(changeindex, tf);
 for (j = i*8; j < i*8+8; j++)
 {
 if (changeindex % 2 == 1)
 putc(buffer [256] [j], tf);
 changeindex /= 2;
 }
 }
 if (currentsize < blocksize) /* input data ended with unfull block */
 {
 putc(currentsize, tf);
 printf("%s now contains encoded/compressed data!\n", argv[2]);
 exit(5);
 }
 /* Update the best matching block in buffer with new data. Compression */
 /* should improve as further loaded blocks match exact copies in buffer */
 for (j = 0; j < blocksize; j++)
 buffer [bestblock] [j] = buffer [256] [j];
 }
}






[LISTING TWO]
/****************************************************************************/
/* UNPACK.C -- by James Sylvester */
/****************************************************************************/

#include <stdio.h>
#include <stdlib.h> /* for exit() */

void main(int argc, char *argv[])
{
 const blockfactor = 1; /* adjust as desired */
 const blocksize = blockfactor * 8;
 char buffer [257] [8]; /* enter blocksize for second index */

 int i, j;
 FILE *sf, *tf; /* sourcefile & targetfile respectively */
 int bestblock = -1; /* best matching block in buffer */
 int changeindex;

/* Verify and perform error handling for opening both the input file and */
/* the output file as binary files. Input file contains encoded/compressed */
/* data and output file should contain an exact copy of the original data. */
 if (argc != 3)
 {
 printf("Correct usage is >unpack source_filename target_filename\n");
 exit(1);
 }
 if ((sf = fopen(argv[1], "rb"))==NULL) /* read only mode for input file */
 {
 printf("Unable to open source file %s\n", argv[1]);
 exit(2);
 }
 if ((tf = fopen(argv[2], "wb"))==NULL) /* write only mode for output file */
 {
 printf("Unable to open target file %s\n", argv[2]);
 exit(3);
 }
/* Initialize buffer with exactly same information as in PACK.C program. */
 for (i = 0; i < 256; i++)
 for (j = 0; j < blocksize; j++)
 buffer [i] [j] = i;
/* Reconstruct original data from encoded data in sourcefile. */
 while (1) /* while true ==> stay in loop until internal exit */
 {
 if (bestblock == -1) /* input data yet to be loaded */
 {
 bestblock = getc(sf);
 if (bestblock == EOF) /* original and encoded files had 0 bytes */
 {
 printf("%s now contains reconstructed original data!\n", argv[2]);
 exit(4);
 }
 changeindex = getc(sf);
 }
 else
 {
 bestblock = getc(sf);
 if (bestblock == EOF) /* input data ended with previous full block */
 {
 for (j = 0; j < blocksize; j++) /* output full block */
 putc(buffer [256] [j], tf);
 printf("%s now contains reconstructed original data!\n", argv[2]);
 exit(5);
 }
 changeindex = getc(sf);
 if (changeindex == EOF) /* input data ended with unfull block */
 {
 for (j = 0; j < bestblock; j++) /* reinterpret bestblock as */
 /* last blocksize and output */
 /* this last, partial block */
 putc(buffer [256] [j], tf);
 printf("%s now contains reconstructed original data!\n", argv[2]);
 exit(6);
 }
 for (j = 0; j < blocksize; j++) /* output full block */
 putc(buffer [256] [j], tf);
 }
 for (i = 0; i < blockfactor; i++)
 {

 if (i > 0)
 changeindex = getc(sf);
 for (j = i*8; j < i*8+8; j++)
 {
 if (changeindex % 2 == 1)
 buffer [bestblock] [j] = getc(sf); /* directly load changes */
 /* into buffer bestblock */
 changeindex /= 2;
 buffer [256] [j] = buffer [bestblock] [j]; /* copy block info */
 }
 }
 }
}





[LISTING THREE]


 if (changeindex % 2 == 1)
 { buffer [256] [j] = getc(sf); }
 else
 { buffer [256] [j] = buffer [bestblock] [j]; }
 changeindex /= 2;
 } }
 }
}


April, 1993
OF INTEREST





Symmetric Research has announced three new high-performance digital signal
processing (DSP) boards with 32-bit floating-point performance. Each includes
software for developing DSP applications.
The DSP_400 and DSP_MOD coprocessor boards feature a 25-Mflop AT&T DSP32C CPU.
They have up to eight Mbytes of zero wait state on-board memory, ideal for
applications that require large amounts of high-speed memory. The DSP_400 uses
individual memory chips, and the DSP_MOD board uses memory modules. Both
boards come with 12.5-MHz 32-bit parallel ports that allow direct data
transfers to on-board memory. The DSP_MUL multiple-DSP32C board has four
50-MHz 32-bit DSP32C processors and 100-Mflop performance, combining
multichannel data acquisition and number-crunching power.
The software included in each package includes an assembler/compiler, symbolic
debugger, C and Fortran utility and math libraries, and a 1024x768, 256-color
graphics library for displaying time series and 2-D color images.
The DSP_400 and DSP_MOD cost $950.00; the DSP_MUL sells for $1400.00. All
packages include source code. Reader service no. 20.
Symmetric Research 15 Central Way, Suite 9 Kirkland, WA 98033 206-828-6560
New from Sound Horizons are the SpeakEz C++ class libraries for adding sound
capabilities to Windows applications. SpeakEz C++ classes provide transparent
access to the Windows Multimedia Extensions. Among the classes provided are:
the Media Control Interface (MCI) class, which affords support for
manipulating WAVE, MIDI, and CD-Audio sound; the Librarian class, which
manages the storage, data compression, and retrieval of media files using the
RIFF file format; the Sound Controller class, which gives you an interface for
playing and recording sound files; the Composition class, which supports
construction of complex sound function sequences; and Timer and Joystick
classes.
SpeakEz is compatible with Borland C++ 3.1 and Microsoft C7. The price is
$99.00; source-code licenses are $249.00. Reader service no. 21.
Sound Horizons P.O. Box 6625 Holliston, MA 01746 508-643-2882
A catalog of over 1200 public-domain computer programs created by NASA is now
available from COSMIC, NASA's software-technology transfer center. The
programs cover areas such as artificial intelligence, structural analysis,
thermodynamics, and project management.
The catalog is available online at cosline.cosmic.uga.edu (128.192.14.11) or
by calling 706-542-7354; or on diskette, including a search engine and
interactive front end for execution on your PC. It can also be used on most
LANs. Diskettes cost $30.00 each. Reader service no. 22.
COSMIC The University of Georgia 382 East Broad Street Athens, GA 30602-4272
706-542-3265
The "Object Oriented Numerics Conference," sponsored by Rogue Wave Software in
cooperation with SIAM, will be held in Sunriver, Oregon, April 25-27, 1993.
This conference will concern itself with the use of modern object-oriented
techniques in the design of software solutions to numerical problems. Among
the topics to be explored are reusable software components, applications,
parallelism, and OO compiler technology. For registration information,
contact:
Margaret Chapman, Program Coordinator Rogue Wave Software P.O. Box 2328
Corvallis, OR 97339 e-mail: amc@roguewave.com 503-754-3010
Tartan is offering two C libraries that target Texas Instruments' 320C3x and
320C4x DSP processors. The first is FloTar, an extended-precision
floating-point library that supplements the hardware floating-point
capabilities of TI digital signal processors with 64-bit arithmetic operations
and elementary math functions.
The library comes with C and assembly language interfaces for all calling
conventions and memory models and provides 16 digits of floating-point
precision. Basic functions include: format conversions, round, truncate,
floor, ceiling, compose, and the like. Math functions include: add, subtract,
multiply, divide, comparisons, 1/x, x^2, square root of x, 1/square root of
x, ln, log10, e^x, x^y, sin, cos, tan, cot, asin, acos, and more. Register
and stack parameters and small and large memory models are supported.
FasTar, the math library, provides 14 routines not supported by TI's library,
such as inverse of square root, powers of ten, cotangents and inverse
cotangents, inverse hyperbolic functions, and faster sine and cosine routines
for restricted input ranges.
FloTar costs $695.00 for PCs and $895.00 for SPARC stations. FasTar sells for
$495.00 (PC) and $695.00 (SPARC). Reader service no. 23.
Tartan 300 Oxford Drive Monroeville, PA 15146 412-856-3600
The International Standards Organization has granted Unicomp rights to
distribute electronic versions of the Fortran 90 standard, ISO/IEC 1539:1991,
Information technology--Programming languages--Fortran. You can obtain the
document as an ASCII file ($125.00), a PostScript file with a license that
allows you to print paper copies ($125.00 + $10.00 per copy), or the complete
source in ditroff, including macros and software to extract and create the
annexes ($1000.00).
Special arrangements, such as distributing a copy with each version of a
compiler or using the source to write documentation can be made subject to ISO
approval. For more information, contact Walt Brainerd at Unicomp
(walt@netcom.com). Reader service no. 24.
Unicomp Inc. 235 Mt. Hamilton Avenue Los Altos, CA 94022 415-949-4052
Visual C++, available in a Standard and Professional Edition, has been
released by Microsoft. Visual C++ is a Windows-hosted development environment
for editing, resource building, class/resource mapping, browsing, and
debugging Windows or DOS targets.
Both versions include the Windows-hosted Visual Workbench development
environment; AppWizard, which provides "skeleton" applications; App Studio, a
new interactive visual program for UI design/editing, resource creation, and
Visual Basic custom control manipulation; ClassWizard, a "programmer's
assistant" that manages messages, DDE, and so on; and the MFC 2.0 application
framework/class library.
The Standard Edition is capable of Windows, DLL, and VBX development. The
Professional Edition includes CodeView, the 3.1 SDK, a source profiler, and
support for DOS target development.
Microsoft simultaneously announced the Visual Control Pack, 19 custom controls
for use with Visual C++ or Visual Basic. This toolset includes 3-D controls
and controls for graphs, communications, multimedia, pens, animation, and so
forth.
Visual C++ Standard Edition costs $199.00 and the Professional Edition is
$499.00. The Visual Control Pack retails for $149.00. Reader service no. 25.
Microsoft Corp. One Microsoft Way Redmond, WA 98052-6399 206-882-8080
John Wiley & Sons has published Windows 3.1 Insider, by Keith Weiskamp and Ron
Pronk. The book illuminates topics covered incompletely or cryptically by the
Windows User's Guide. Topics covered include Windows architecture, network
installation, optimal memory configurations, alternate file-management
techniques, TrueType font usage, desktop management, and third-party memory
managers.
There are many undocumented hints and techniques as well as two chapters on
trouble-shooting hardware and software problems that provide work-arounds and
alternative techniques to common Windows jams. A set of diskettes is available
separately that includes Windows-enhancement tools and utilities. Windows 3.1
Insider costs $26.95 for the paperback edition. ISBN 0-471-5794-X. Reader
service no. 26.
John Wiley & Sons Inc. 605 Third Avenue New York, NY 10158-0012 212-850-6000
or 800-CALL-WILEY
The Microsoft Windows Sound System, a set of software applications and an
add-in sound board, allows you to add audio features such as voice annotation,
proofreading, and voice recognition to your applications using Windows' Object
Linking and Embedding (OLE). The system consists of several utilities and
three applications: Quick Recorder, ProofReader, and Voice Pilot.
Quick Recorder generates and adds voice annotations to documents and files.
Using OLE, it inserts sound objects into a file. To add voice annotation to a
document, you record a message, then drag and drop its icon into the document.
You can customize the message icon and edit the annotations. Quick Recorder
automatically selects a recording sample rate and compression appropriate to
voice recordings to help conserve harddisk space. It also works with external
devices such as audio CD players and cassette decks, which let you add other
sounds to your recordings.
ProofReader provides audible proofing of numbers and common spreadsheet terms
with a high-quality human voice. ProofReader works with Microsoft Excel and
Lotus 1-2-3 for Windows and can be customized in several ways, including
adjusting the speed and adding your own dictionary of terms.
Voice Pilot lets you execute commands by voice, navigating through the Windows
OS via limited voice recognition. It controls customized commands, allowing
you to insert standard text into a document upon command. A training mode is
also available to adapt to different accents or pronunciations. Voice Pilot
comes with a predefined vocabulary for 15 Windows-based applications.
Additional utilities include SoundFinder (for locating and modifying audio
files in various formats), recording and volume controls, a sound-control
panel, a sound and icon library, and more.
The Sound System costs $289.00; bundled with Windows, it's $349.00. Reader
service no. 27.
Microsoft Corp. One Microsoft Way Redmond, WA 98052-6399 206-882-8080
The Btrv++ class library from Classic Software is an access library for use
with the Novell Btrieve Record Manager that provides C++ class encapsulation
of Novell Btrieve. Included with Btrv++ is Btrvgen++, its database code
generation system. Btrv++ gives you access to the complete functionality of
Btrieve 5.1 and incorporates many extensions to Btrieve such as memory
management and event-logging facilities.
Btrvgen++ is a Windows-hosted program for building a data dictionary that
describes the tables, columns, keys, and relationships in a database and
then generates source code for C++ classes that encapsulate the tables and the
database schema. The generated classes for a typical small database contain
hundreds of functions that support the database. For each table, Btrvgen++
generates a class containing record and column access functions, keyed
get-data functions, utility functions, file open and close functions, and
functions that support the table's relationships. Utility functions are
included for cascading delete, record-count, and full-file iteration, and
relationship functions are generated for most common relationship access
needs.
Also new from Classic is the VBtrv Custom Control Library, an access library
for use under Visual Basic. VBtrv's features are accessible either through a
custom Toolbox control or through library function calls. The custom control
interface and the library interface can be used in the same VB program.
With VBtrv, commonly used structures and constants are predefined in the
declarations file, startup and shutdown protocols required by Btrieve under
Windows are automated, and Btrieve's single-function-call interface is
replaced by an easy-to-understand interface that defines a separate event
handler for each operation. All take standard VB data types as arguments.
VBtrv provides controls on the Visual Basic toolbar that represent Btrieve
itself and a Btrieve file. Btrieve access is through predefined methods
associated with the control.
Btrv++ sells for $349.00, Btrvgen++ for $229.00. VBtrv costs $249.00. Reader
service no. 28.
Classic Software Inc. 3542 Pheasant Run Circle, Suite 8 Ann Arbor, MI 48108
313-677-0732
Now shipping from Reasoning Systems is version 1.2 of REFINE/FORTRAN, a
reverse-engineering workbench for UNIX workstations. REFINE/FORTRAN can
analyze Fortran code and generate design information for StP, an analysis and
design toolset from IDE. With REFINE/FORTRAN you can use StP to document and
maintain Fortran systems as if they had been developed using StP. StP's C
development environment (CDE) uses the design information from REFINE/FORTRAN
to automatically generate StP design diagrams and populate the StP shared
repository. You can then use CDE's design editors to browse and edit the
design and navigate between design and code. This simplifies system
maintenance and makes the code more understandable. You can also use CDE to
generate complete documentation for FrameMaker and Interleaf5 automatically
and to synchronize the Fortran source with StP's design information. This lets
work be done incrementally, ensuring that all changes or additions to the code
are automatically reflected in StP and that the previous design is preserved.
CDE also lets you search part or all of StP's shared repository and generate C
code templates.
The price is $4900.00 per user on Sun SPARC stations and servers. You'll also
need IDE's Structured Design, C Navigator, and Design Generator for C that
sell for $5000.00, $2000.00, and $6000.00, respectively. Reader service no.
29.
Reasoning Systems 3260 Hillview Avenue Palo Alto, CA 96201 415-494-8053








April, 1993
SWAINE'S FLAMES


Home, But Not Alone




Michael Swaine


Hey, guys! I'm home!
It's me, HyperCard! I'm back at Apple! Hyper days are here again! Is this my
desk? Whoa, the work's really been stacking up. Stacking up, get it? Is there
any coffee?
Tell Danny Goodman to call off the telethon. Tell Bill Atkinson it's okay to
come home now. The old Wild Card is back and the good times are gonna scroll.
Boogaloo down Bandley. Yow, lotta messages here. Right off the top of the
screen, even with my new large card size. Later for that. Can somebody show me
around? I know there have been some changes since I went to Claris. I want to
get up to speed.
So let me see if I've got this straight. Centris and Quadra and Performa and
PowerBook are lines of Macintoshes. A Quadra 800, say, is a model in a line. A
Centris 650 8/230 E CD is a configuration of a model in a line. It's sort of
like species, genus, family, order, and so forth, isn't it? So where does the
Apple II fit in? Is it sort of like a fungus?
This is Newton? Are you telling me this is the Newton everybody's been talking
about? Hey, I know this kid! We were in adjacent cribs back in the nursery!
Atkinson thought we'd get together when we grew up. Maybe we will, if he ever
does grow up. What a shrimp. Hey, homeboy!
Doesn't say much, does he? Still in beta?
System software, right. I heard that Apple was turning into a system software
company. No, wait. That was Next. Sorry. I've been a little out of touch over
at Claris. So show me some system software. I'm stoked. Let's book.
Could you run that one by me again? What you said about AppleScript? I didn't
quite get it.
When you say, "integrated with AppleScript," just exactly what do you have in
mind? I mean, don't you think that's a little vague? "Integrated with?" If I'm
going to work with AppleScript, I want it clear who's working for whom, and
there's only one right answer to that. I hope you know that I don't like to
throw my weight around, but I didn't lose my seniority during my sojourn at
Claris, okay? 'Nuff said.
Where are we now? What is this place? This isn't Apple. The sign outside said
Taligent. What's a Taligent?
I'm sorry I asked. Apple is working with IBM? That's about as spooky as if the
head of system software were to defect to Microsoft, or Apple started pricing
its machines to compete with Dell and Compaq.
You don't say?
Hey, what's this box? Now I call that a nice looking Mac. I particularly like
the latch with the Frog design on it. But what's that label say? INTEL
INSIDE!?!?
Whew. Had me going there for a minute. It wasn't until I tried to touch that
"Intel Mac" and my hand went right through it that it even occurred to me that
it might all be a virtual reality model. Nice work. The gang in ATG, right?
Took me in completely. Good job, guys.
So--just how much of it was real, anyway?


































May, 1993
EDITORIAL


So What's It Going to be, the Highway or the Low Way?




Jonathan Erickson


When someone or something moves from the backwaters to the front page
overnight, you can usually write it off as just another instance of Andy
Warhol's "15-minutes-of-fame" axiom, particularly when said item fades from
the headlines faster than it appeared. It's true that after its moment in the
Silicon Valley sun, almost all mention of Bill Clinton and Al Gore, Jr.'s
"information highway" disappeared. But plans for the information highway are
alive and kicking, and work is underway on what may be one of the more
important proposals made in recent years -- and not just for techno-buffs who
want more and bigger toys.
As corny as it sounds, the "highway" analogy is right-on. As with today's
interstate highway system (which, as a cog in our national defense wheel, was
championed by Al Gore, Sr.), backers of the multibillion-dollar project see an
ocean-to-ocean, fiberoptic, electronic-data roadway carrying information to
your curbside just as cable television or telephone systems do -- but at the
rate of 1 billion-plus bits per second. Along the way, it will have countless
on- and off-ramps allowing businesses, educators, researchers, and individuals
to access distributed databases and communicate with each other.
But if the network is really to be built, a number of pieces --technological,
financial, and political -- have to fall into place. Predictably, the "if" is
a very big one.
In theory, the technology issue is easy: You simply lay thousands of miles of
fiberoptic cable, install sophisticated switching equipment that just happens
to conform to standardized protocols, and write software that makes the thing
useful. Granted, there are stumbling blocks, the least of which is that fiber
optics is still prohibitively expensive and continues to be challenged by
technological advances in relatively inexpensive copper cable (like 100-Mbit
Ethernet at less than $100.00 per node).
And then there's the standards issue. There are many competing proposals,
including the ANSI-backed open-interconnect Fibre Channel protocol that IBM,
HP, and Sun have teamed up to support. ISDN is still hanging around, although
now it may never have a chance to get off the ground, no matter what your RBOC
says.
Assuming the cable is laid, the protocol is agreed upon, and the software is
written, how do you know if it works? You test it, and that's where any number
of other competing consortia jump into the fray. AT&T, DEC, and MIT, for
instance, have formed the "Wideband All-Optical Network Consortium" (funded by
an $8.4 million DARPA grant) to build a testbed for proof-of-concept
demonstrations of universal, scalable, optical networks. DARPA pumped another
$7 million into a similar Bellcore consortium that includes the seven RBOCs,
Columbia University, Hewlett-Packard, Hughes Aircraft, Lawrence Livermore
National Labs, Northern Telecom, Rockwell, and United Technologies. And at
least three other similar testbed projects are proposed or underway.
How much will all this cost? That depends on who you talk to and what exactly
they're talking about. Clinton has allocated $17 billion to jumpstart the
project, followed by $54 billion in 1994 and $150 million in 1995. No matter
how you slice it or dice it, a lot of money is going to change hands before
this thing is up and running.
Okay, now we're down to it -- big money for a big project. So the big question
is, who's going to pay for it? There are all kinds of suggestions for this,
most with fingers pointing the other way. Gore's idea is that the government
should fund, build, and regulate the information highway, arguing that a
public information highway is at least as important to the country's future as
the interstate highway system. He adds that the inevitable private-sector
network-access charges would limit public access (as with toll roads), making
it available only to those who could afford it. (One funding solution might
lie with the House's recently passed bill to reallocate the radio spectrum.
Clinton has supported auctioning rights to the highest bidder, raising
billions of dollars.)
At the other end of this ten-foot pole are those like AT&T CEO Robert Allen,
who speaks for the telephone and cable-television companies when he says that
the government can't afford to build the system and, even if it could,
construction should still be left to private enterprise.
Straddling the electronic median strip are those favoring public/private
partnership, a position supported by companies like Apple and IBM that are as
much motivated by the prospect of selling more boxes as by the public good.
(Apple CEO John Sculley can still pitch some of that old-time Apple religion
when he says, "through a public-private partnership, we can create an
infrastructure that will forever change the way we educate our children ...
earn a living ... deliver services ... and interact with family and friends.")
Assuming the technological and financial barriers can be skirted, the biggest
hurdles may be the gaggle of commercial and political special interests
involved, each with its own agenda. The commercial players see a potentially
huge market, while the politicians are already licking their chops over this
high-tech pork barrel.
That isn't to say that there aren't dedicated people in both camps who truly
believe the information highway is critical to our future and want to make it
happen no matter what. If the highway project succeeds, it will be because of
their energy, spirit, and ability to mold disparate interests into a common
vision.
Whatever happens, it's clear that the information highway isn't going to be a
free ride. One of the pet scenarios of those pushing a publicly funded,
open-access network is that of an isolated child in a rural locale who can't
get to the local library, but can get into the electronic stacks of the
Library of Congress via the information highway. This is fine until you
consider that the Library of Congress recently petitioned Congress to begin
charging the general public for information services, particularly those
wanting computer access.
Information-highway proponents also like to point to the success of the
Internet. But keep in mind that the Internet didn't have all these competing
interests vying for special favors. If nothing else, what we've learned from
the Internet experience is that, given something as powerful and flexible as a
nationwide network, you can't predict how and why it will be used or guess at
all the new businesses and jobs that will spring up to support it.
In the short term, what we're likely to see emerge is a hybrid system littered
with compromise that's nothing like what visionaries envisioned. There's no
question that the information highway is too important to trust to either
private avarice or governmental inertia. But neither is there any question
about whether or not we should move forward with it.

































May, 1993
LETTERS







A Simpler Snippet


Dear DDJ,
This is in reply to C Snippet #38, which appeared in the November 1992 issue.
The snippet showed a method to remove trailing blanks from a character string.
It used a backward scan to find the trailing whitespace and then calculated an
index for each character, until it reached a non-whitespace character. There
are simpler ways to accomplish this.
The routine in Example 1 performs the same task as Snippet #38, but does it in
a single forward pass. Like Snippet #38, it also skips over embedded blanks
and handles all the special cases I can think of.
Example 1

 /**********************************************************
  * rmtrail--Removes trailing whitespace from a
  * null-terminated string.
  * R. Richert 11-92
  **********************************************************/

 #include <ctype.h>

 char *rmtrail(char *str)
 {
     char *strptr, *gotBlank;

     for (gotBlank = 0, strptr = str; *strptr; strptr++) {
         if (isspace(*strptr)) {
             if (!gotBlank) gotBlank = strptr;
         }
         else
             gotBlank = 0;
     }
     if (gotBlank) *gotBlank = '\0';

     return str;
 }

One caution is in order. The braces separating the two If statements are
necessary. Without them, the Else statement will line up with the wrong If
statement.
I do not claim that this is the fastest and best way to remove trailing
blanks, but it is simple, and it gets the job done. Isn't that what snippets
are all about?
R. Richert
Sunnyvale, California


LUC


Dear DDJ,
Regarding Peter Smith's article, "LUC Public-key Encryption" (January, 1993),
I'd like to point out that the decryption key d (for a fixed e) is strictly
dependent on the message P to be sent; remember how r is calculated with
respect to the discriminant D = P^2 - 4!
Furthermore, LUC isn't new. In 1981, W.B. Muller and W. Nobauer introduced the
scheme in "Some Remarks on Public-key Cryptosystems" (Studia Sci. Math.
Hungar. 16), calling it the "Dickson Scheme." They also worked out
r = lcm(p_1^(e_1 - 1)(p_1^2 - 1), ..., p_t^(e_t - 1)(p_t^2 - 1)), which is
obviously not message dependent. Since then the system has been reinvented
several times, largely because the relationship between Dickson polynomials of
the first kind and the Lucas sequence V[n](P,Q) is little known.
There also exists a "Cryptanalysis of the Dickson-Scheme," by W.B. Muller and
R. Nobauer (1986) in Advances in Cryptology, EUROCRYPT '85 (Springer-Verlag).
The article by R. Lidl and W.B. Muller entitled, "Permutation Polynomials in
RSA-Cryptosystems," Proceedings of Crypto '83 (University of California-Santa
Barbara, Plenum Press, 1984) provides a more general approach towards new
public-key cryptosystems by changing the exponentiation in the RSA
cryptosystem against other permutation polynomials and permutation functions.
Willi More
Klagenfurt, Austria
Dear DDJ,
Peter Smith's nicely written article, "LUC Public-key Encryption," with its
comparisons to the RSA public key cryptosystem, caught our attention. Like
recent work on elliptic curves, LUC combines old mathematics with new
cryptography and appears to offer some interesting properties.

Its advantages over RSA, however, remain unproven. Indeed, it is not clear
that, as the author claims, "LUC will be at least as efficient [as RSA]." Many
of the exponentiation heuristics that speed up RSA computation seem
ineffective for LUC, while heuristics for LUC remain effective for RSA.
Therefore, it is more accurate to say that RSA will be at least as efficient
as LUC. Nevertheless, the speeds are likely to be comparable for a given key
size.
Although LUC does appear to avoid adaptive chosen-message forgery, it is
susceptible to another type of forgery. Suppose P is a message and k is a
signer's private exponent. Then V[k](P,1) is the signature of P. Given that
signature, an attacker can forge the signature of the message V[n](P,1) for
any n. Since the attacker knows V[k](P,1) and n, he or she can compute
V[n](V[k](P,1),1). According to Example 4(e) in the article,
V[n](V[k](P,1),1) = V[nk](P,1) = V[k](V[n](P,1),1), which is the signature of
V[n](P,1).
Overcoming this "existential" type of forgery appears to require a hash
function or message formatting, contradicting one of the claimed advantages
over RSA. Since a hash function allows one to sign messages of arbitrary
length with a single RSA (or LUC) encryption, the property of not requiring a
hash function is not necessarily an advantage.
Taher Elgamal,
Director of Engineering
RSA Data Security
Burton S. Kaliski, Chief Scientist
RSA Laboratories
Peter responds: Willi More's point -- that r is dependent on the message -- is
covered in the listings that accompanied the article. While at first sight
this dependency appears to invalidate the LUC scheme, it can readily be seen
that there are only four possible d values (for a fixed e) that can be
precomputed. The correct value for a particular message is chosen by the
calculations outlined in the listings. This actually adds to the security of
LUC, and does not seriously affect the decryption time.
Willi also pointed out the earlier work of Muller and Nobauer, of which I was
unaware until after the article was printed. They describe a remarkably
similar system, but do not provide any practical way of actually carrying out
encryption of a message. (They use high powers of irrational numbers!) Using
their r instead of my (message dependent) r would lead to a doubling of the
decryption time.
In summary, LUC is both practicable and new. The article contains a reasonably
full description of the mathematics behind it.
Taher Elgamal and Burton Kaliski's courteous letter makes three points about
LUC. First, the heuristics necessary to give LUC parity with exponentiation
are admittedly more complicated, but certainly not ineffective: LUC has yet to
meet an exponentiation heuristic that can't be adapted to Lucas-function
calculation. Exponentiation heuristics are a well-studied field. It may well
be that more effective heuristics await Lucas functions, whose study for these
purposes is quite new.
The second point, that LUC appears "to avoid adaptive chosen-message forgery,"
is the most important advance of LUC over RSA, and I appreciate the
professional integrity of RSA Laboratories in recognizing this fact.
The presence of "existential" forgery in LUC may require message formatting.
This type of forgery produces random, nonsensical results, and is regarded as
a minor nuisance rather than a meaningful forgery. LUC's message signing will
use hashing, but only as an elective message-digesting process. RSA, on the
other hand, requires hashing for this purpose, as well as to guard against
adaptive chosen-message forgeries.
RSA has done a stalwart job leading the fight to establish public-key
cryptography over the last fifteen years. It appears that the mantle now
passes to LUC, RSA's match for speed and its superior in terms of
cryptographic security.


Curmudgeons Abound


Dear DDJ,
Regarding Scott Guthery's "A Curmudgery on Programming-language Trends"
(December 1992), why was the article given such a title? Are
programming-language trends so holy that to criticize them makes one a
curmudgeon?
A computer language isn't much more than a means of expression. Every
elegant design, whether hardware or software, begins with the phrase, "let A
be Z." That is, the problem is so well understood that a starting place is
declared, and everything else follows. The final code can be written in any
language understood by a computer and an elegant design will stand.
Obviously, writing in C is faster than writing in assembly language. With
practice, either assembler or C code can be written to be reused with a
minimal effort. Scott is absolutely correct when he suggests getting on top of
the learning curve of whatever tools you use. After mastering those tools,
take the time to write some tools of your own.
Do we need C++? I've been using object techniques since writing my first
concurrent scheduler in 1978. The fact that C++ translates into C first tells
you that everything you need already exists in C. Will using C++ make you a
better programmer or a better program designer? I doubt it. If it takes C++ to
give a new perspective on programming, shouldn't we ask why such a perspective
has to be bundled with a specific implementation language? Isn't such a
perspective applicable to programs in general? Does using a schematic editor
make one a better engineer? Underlying principles must be understood before
elegant designs are possible. Once first principles are understood, C is more
than powerful enough to program in. In fact, it is superior to C++ because
there are fewer hidden details to trap the blissful. Data hiding may be
appropriate in some circumstances, but when your compiler or operating system
hides details that prevent specific debugging, then it's a hobble.
Another constant has been the promise of the latest language or compiler.
Software developers know that they have to continue promising, and magazines
love to help them purvey their promises. This set of conditions guarantees a
popular future for C++ and all the languages and dialects sure to follow. That
popularity will be used as evidence of the value of the latest programming
fad.
Programming productivity won't be boosted much by any programming language.
Take a look at a modest program. First, there is a set of decisions that are
forced by the application. That is, how do you squeeze the application problem
into a computable context? What are the general rules and what are the
exceptions? Let's call all the decisions forced by the application "X." Next,
there are all the internal coordination decisions that are necessary to bind
the application decisions into a coherent program. Let those decisions be "Y."
X+Y decisions must be coded before a program works. If a poor design
translates the application into a large X, and that is compounded by
structureless convolutions of Y, then the result is obvious. Productivity
depends on reducing the sum of X and Y, not on how we finally code the
problem. Focusing on a language or a dialect is part of the problem, not part
of a solution.
David Smead
Seattle, Washington


Japatent Revisited


Dear DDJ,
This is in response to the October 1992 letters from Donald Kenney and Shohei
Nakazawa concerning my "Japatent challenge" ("Letters," July 1992).
Reader Kenney correctly delineates the broad parameters involved in
translating a document from one language to another, but those parameters
might be tightened somewhat, as follows.
First, the text of a patent involves primarily the communication of a
technology. This may relieve the translation burden somewhat over that
involved in translating, say, a novel. Technical words are largely the same in
both English and Japanese. Kenney's idea of using two university students
(instead of software!) to translate and proofread is a good one.
Second, a reference to the original American patent would appear in the
translated patent; thus, inaccuracies such as the example Kenney gives
(translating "track jam" to "jelly on the track"), would be embarrassing but
not serious.
Third, one might translate using a software dictionary of phrases instead of
just words and even groups of phrases, "ideas," while focusing on the
particular technical area of immediate interest.
As to whether the Japan Patent Office might someday accept patent filings in
English, this may not be as improbable as one thinks. After all, my letter of
inquiry to the Japan Patent Office (in English) was promptly replied to with
one which was quite extensive and detailed, and in perfect English! Many
scientific papers written in Japan are written in English. And I recently
attended a technical conference at the University of Tokyo which was entirely
in English, except for one Russian lady who spoke in Russian with a
Russian-to-English translator from the local Russian embassy!
Indeed, I hereby predict that within approximately a year after a practical,
affordable translation program is developed, Japan will begin accepting patent
applications in English. (That prediction is based on one of Murphy's laws.)
Already the Japanese have taken a giant step forward by allowing patent
applications to be on computer disk; this relieves us of the burden of
printing out all those thousands of characters.
On that note, I would like to acknowledge the assistance of Pacific Software
Publishing, and thank them for sending me a copy of the AX Technical Reference
Guide: Kanji at Environment. This 318-page handbook contains, among other
things, the double-byte codes for many, many Kanji characters. It is printed
in Japan (in English) and is a must for anyone pursuing this subject.
Homer B. Tilton
Tucson, Arizona

















May, 1993
THE COMMONS OF INFORMATION


Lee Felsenstein


Lee is an electronic-design engineer responsible for some of the early designs
of personal computers. In 1975 he helped organize the meetings of the Homebrew
Computer Club, and was a founding member of The Community Memory Project. He
currently holds a senior position at Interval Research Corporation, 1801 Page
Mill Road, Bldg. C, Palo Alto, CA 94304. He can also be contacted at
lee@interval.com.


One of the most important factors in the development of personal computers has
been the urge to permit people to connect, for whatever purpose may suit them.
Where does this come from, and where is it going? How important is it? What
does it mean in thinking about society in cyberspace?
I'll argue here that there is a "commons of information" which has been
fundamentally important in the survival and development of mankind, that it
has been minimized in our present-day society, but that the inbred need for
the functions provided by such a commons are very much with us. I'll further
argue that the development of personal computers has been seriously affected
by the quest for the revitalized commons of information, and that this quest
is showing signs of success.
I'll refer to the commons of information by the Greek word for the
city-state's place of assembly--the "agora."


The Commons of Information: The Agora


It isn't natural for people to live in isolation from each other. All
traditional societies are based upon the village or at least the nomadic clan.
All villages are centered around some space of assembly, which usually
functions as a marketplace.
What goes on in these marketplaces is more than commerce. People hang out
there, display their identities (usually as members of groups), gather groups
of friends, banter and gossip within and among the groups, overhear others'
conversations, and inject themselves temporarily into those conversations. In
short, they get to know who the other people are who share their society, and
keep up with their daily doings.
Does this perhaps sound a bit like life at the mall? If so, there's good
reason. The mall, like the village square, is a space where transactions of
various sorts are carried out in public. These include commercial, social, and
political transactions. Obviously, the entire transaction isn't carried out in
public, but it is set up so that it can be completed in private.
People have been living around village squares for thousands of years. This
must have had its effect on our cultural evolution. Look at what happens when
people in urban societies get to stay put for a generation or two. European
cities developed structures of plazas each with a street life suitable for a
small village within the larger city. Neighborhoods developed identifications
with their particular plazas. This is all considered very livable by less
fortunate city dwellers from America, where people rarely remain living in one
place for more than 10 or 15 years.
The degree to which such a "village square" is unavailable to people is, I
maintain, the degree to which people are strangers to each other, and this
situation is directly related to the development of social pathologies such as
criminality, alcoholism, brutality, and the like. I claim that we all have an
inherent need for the function of the village square, which I call "the
function of the agora."


Industrialization and Privatization


The village square is a commons--it belongs to no one but is used by all. The
agora is a commons of information--a way of interacting. It is not property.
The process of industrialization began in England with the "Enclosures Act"
which deeded the village common grazing lands to whomever could build a fence
around them. Needless to say, the landlords were the only ones who could raise
the capital to do this, so the common lands went to them. This enabled them to
enlarge their holdings, and the resultant surplus of income over expenses
provided the pool of capital upon which the process of industrialization was
based. The peasants lost a source of food and were driven into dependence on
what wages they could get from serving as hired labor to the landowner. All in
all, a very tidy move by the landowners. Too bad about the resulting
starvation and homelessness, but, of course, there were too many peasants,
anyway. The whole thing was justified because the landowners could supposedly
make more efficient use of the land than the peasants.
As urbanization proceeded, a somewhat similar process of privatization of the
commons of information took place. The place where people could gather and
exchange information began to lose its function to the gradually centralizing
mass media. There was no money in an agora which could be concentrated at some
central point. But a newspaper could command a price for advertising space. So
small-town papers were supplanted by larger-scale publications that could
underprice them. I remember vividly the neighborhood shopping paper printed in
a storefront in my native Philadelphia neighborhood. It was filled with little
gossip items, each one of them of importance only to a small circle of people,
but the totality of which chronicled and defined the life of the community (a
Jewish neighborhood for more than one generation). It couldn't compete for
advertising dollars with larger throwaways which printed only ads and
"boilerplate" generic news that was produced nationally.
All of the media had the characteristic of concentration. The only anomalies
were the telephone system (but the directory was concentrated) and the postal
system (which was socialized). All the rest became structured with a central
point through which the information is funneled and from which information is
distributed in identical form. I call this a "broadcast" structure, and print
media qualify as well as electronic media. I remember the moment in 1969 when
I looked out my window down the street and saw all the living room windows
glowing with the blue light of TV. I realized that they were all getting their
information from Walter Cronkite in New York, but that we had no ready way to
get information from each other.
In the '60s I thought that the cause of re-establishing functioning
communities could be served by the newly established "underground press," and
for a while I helped at the Berkeley Barb, one of the oldest such papers. But
I saw the structure of that medium determine its economics and thereby its
content. By 1970 I knew that broadcast media were never going to serve the
cause of decentralization of power within society.
An encounter that year with mainframe-based network computing (through
learning Basic at the SDC training facilities) alerted me to the fact that
such a network had no geographical restrictions and that information items
could be made accessible to variously defined groups of users. It was clear to
me that through computer networks it would be possible to support the
information needs of an overlapping set of communities of interest. But where
to get a computer in those days?
Fortunately, other people had come to the same conclusions, and a group named
"Resource One" had formed in San Francisco to secure a timesharing computer
for roughly this purpose. In August 1973, we were able to try an idea proposed
by Efrem Lipkin: We placed terminals in public places (a record store in
Berkeley, California followed by a branch of the San Francisco Public Library)
that people could use as a bulletin board. We called it "Community Memory."
What happened was that an agora appeared, with an unknowable number of
different needs, desires, suggestions, proposals, offers, statements, poems,
and declarations cropping up. We, who had expected only a few categories of
classified-ad items, were amazed at the discovery. It became clear that the
crucial element was the fact that people could walk up to the terminals and
use them hands-on, with no one else interposing their judgment. The computer
system was not interposing itself between the individuals who used it, either.
It was serving a "secondary" information function, like the telephone
directory, except that you could make your own rules as to how you were
listed. When you completed your computer transaction, you knew who you wanted
to talk with. Subsequent transactions were carried out through other
nonbroadcast media, mostly the phone.


The Start of the Personal Computer Market


The next month, September 1973, an article appeared in Radio-Electronics
magazine that produced another discovery. The "TV Typewriter" was introduced,
a construction project by Don Lancaster which promised that you could build a
computer terminal that could display characters on your TV set. The response
for the mail-order plans was 50,000 percent above expectations--10,000 people
sent in their money.
What was going on here? Later I spoke with Lancaster and asked him why the
design was not really usable as a computer terminal. He responded that "people
just want to put up characters on their TV sets," and he was right. The
promise of "inverting the media," of controlling the display of one's own TV
set, especially through a sacred-cow technology like digital computer
electronics, was hard to resist. A cultural vein had been tapped.
This was the start of the hobbyist market for personal computers, from which
the industry bootstrapped itself in the years 1975-77. Most hobbyists didn't
have a good, sound, rational reason why they wanted a computer. They could
make excuses about recipe files and checkbook programs, but nobody did that
when they had a computer. They just wanted to get their hands on the
technology and control it from below.
In 1978 personal computers found their first big function: communication. Alan
Kay had been saying for years that a computer was first, second, and thirdly a
communications tool. Ward Christensen and Randy Seuss opened a BBS in Chicago
in February of that year, and the rest is history. The number of BBS systems
is unknown and probably unknowable. All this in spite of primitive software
technology (of which more later). This was indeed a demand-driven application,
and it is important to note that the demand was not for official, certified,
top-down information, but for contact with other people having kindred
interests.
Many a well-financed "videotex" system has foundered and sunk because the
operators would not consider opening the system on a person-to-person basis.
People don't want to be subjected to centralized information. They do want to
be able to explore the social space of their surroundings and to ask the
question, "Who's out there?"
And then there's the Internet. Like Citizen's Band radio, it's totally out of
control, impossible to map accurately, and being used far beyond its original
intentions. So far, so good. It has, however, developed outside a commercial
structure, and if it is placed within a commercial structure, we may expect
the increasing centralization of control to which other media have already
fallen prey. Packet radio, which was developed by radio amateurs, is likewise
entering the commercial arena.


Technical Problems


Remember that the excuse for privatizing the commons was efficiency. The
technology used to implement the agora function through BBS systems is
actually quite primitive. The systems in use today are all derivatives of the
EIES system designed by Turoff and Hiltz starting in 1971; basic development
had stopped by 1977.
These systems are message based rather than data based, having originally been
conceptualized as allowing meetings with remote, asynchronous participants. In
such a meeting, you wait your turn for an opening while you marshal facts and
figures to support a statement. When permitted by the protocol, you make your
statement and await the response, ready to fend off criticism by any means
necessary. This all takes place in view of all other participants and under
the supervision of a moderator, who manages the information flow.
This is fine for bureaucratic or academic environments, but what happens when
the host stays out of the information-management role and recedes into the
background? The result is like having a do-it-yourself library containing not
books, but scrolls, written by the people who use the library. You must take
the desired scroll down from the shelf (log into a conference), read through
from the beginning (there are no page numbers or indexes), suffer all the
frustrations of topic drift, and finally come to the end, where you may enter
your message or comment. Once written, it may not be changed. You can refer to
another entry only by item number. If that item happens to be in another
scroll (conference), then you may not access it without closing the open
scroll.
You generally write using an editor--made by and for computer
programmers--which isn't fun to use, or to teach, for nonprogrammers. The
operative attitude towards these technological shortcomings seems to be rather
myopic, something along the lines of, "It's good enough for me and my friends,
so what's the problem?"



The Challenge


If it's going to survive, the agora of the future will have to be designed
somewhat better from the perspective of non-technical users.
Right away, there'll be objections from some quarters as to the desirability
of letting in the nontechs. But they will be there whether we want them or
not--the only question is whether the agora will be open or controlled for the
benefit of a few. Technology matters. The personal computer took the form it
did (with open architecture) because a hobbyist-based industry was free to set
its own criteria without guidance from the investment community for the first
two years of its life. IBM fielded the 5100--a thoroughly closed architecture
machine--as soon as it could, following the PC eruption in 1975. The company
had to withdraw that machine by 1979 and adopt open architecture in order to
have impact in this marketplace. Even after it grabbed the market share with
the 5150 (the IBM PC), IBM couldn't control the technology and close it up.
The design continued to propagate like a virus through the technological body
and has left the company behind.
Community Memory did not dissolve after the 1973 experiment. That system was
turned off in January 1975 and, in 1977, the people involved decided to set up
their own nonprofit corporation, The Community Memory Project. Under Lipkin's
technical leadership, we made a number of good calls (UNIX, relational
databases, and X.25 as future leading technologies) and worked out a solution
to the problem of system centralization through packet networking.
It took longer than anticipated, but in 1984 we put up a pilot version of the
intended system at four public locations in Berkeley. In 1989, assisted by a
Telecommunications Education Trust grant from the California Public Utilities
Commission, we put up a ten-terminal system using a front-end/back-end
architecture. We've continued to upgrade the user interface and have brought
it to the level of a stable design with designed-in upgrade paths.
Most significantly, this was done without classical marketing research, but
with the direct involvement of a base of users over a number of years. This
was "patient capital" in action. Had we gone the usual for-profit route, we
would have had a product out quickly, but it would have been indistinguishable
from the others and designed without input from users.


The Community Memory System


The Community Memory software defines a database server running on a UNIX
System V
system with 4 Mbytes of RAM that can serve requests from PC clones running the
front-end program. This program, developed under UNIX and ported to the PC
(and therefore portable to other systems), manages the display, keyboard,
coinbox, and modem. Data is exchanged with the host using an error-detecting
packet protocol only when a data request is pending. The program, which runs
in 512 Kbytes, buffers data items in local memory and allows scrolling locally
without support from the server. We estimate that 50 users can access the
server simultaneously without performance problems, but we've not tried a
test.
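The article says only that the front end exchanges data with the host "using an error-detecting packet protocol"; the actual Community Memory framing is not documented here. As a rough illustration of the idea, a sketch of the simplest such scheme, with invented names and a one-byte additive checksum:

```c
#include <stddef.h>

/* Compute a one-byte additive checksum over the payload. Purely
 * illustrative; the real Community Memory protocol is not specified
 * in the article. */
unsigned char frame_checksum(const unsigned char *payload, size_t len)
{
    unsigned sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += payload[i];
    return (unsigned char)(sum & 0xFF);
}

/* Receiver side: returns 1 if the trailing checksum byte matches the
 * payload that precedes it, else 0 (the frame should be retransmitted). */
int frame_ok(const unsigned char *frame, size_t len_with_sum)
{
    if (len_with_sum < 1)
        return 0;
    return frame_checksum(frame, len_with_sum - 1) == frame[len_with_sum - 1];
}
```

A corrupted byte anywhere in the payload changes the sum, so the receiver can detect single-byte line noise and request the packet again.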
The front end is based on a windowing system written for the PC that allows
multiple overlapping windows. Control is accomplished by moving the cursor
with the cursor keys and "clicking" using the Enter key. The display is
alphanumeric and uses a monochrome display adapter. When the system is
inactive the front end runs a "teaser" program that creates a lively, animated
display.
Pressing any key stops the teaser and establishes a connection to the server.
The modem is dialed, and login sequences are exchanged with the server. (Why
is this not done with existing systems?) The server downloads some screens,
and the user is presented with an initial screen. At any point after this, a
delay of a preset number of seconds without key activity on the PC will result
in a warning of an impending logout followed by a logout after another delay.
The system can be navigated with the cursor and with the F1, F2, and F3 keys.
F1 is always OPTIONS, which vary depending on context. F2 is always BACKUP,
which will return the user to the previous operation. F3 is always HELP, which
is context sensitive. In the public implementation the keys are color coded.
The system is data based, with two types of databases running. Messages (a
maximum of 99 lines of text, the number being a parameter) can be stored
either under an indexing system or in a network structure as comments on
another message. Items are indexed automatically by date last edited, author,
and a system-unique message ID (the tag, a six-character alphabetic string).
The user
can select index words from a list prepared by the host of the forum (our name
for conference) or may type in any amount of text, including spaces, as index
words. Angle brackets (<>) are used as delimiters. The index field may be
edited by the user under a password-protection scheme.
The ability to include text within angle brackets inside the message brings
about a limited hypertext capability. You can embed references to index words,
author tags, or message tags in the text, and the reader can land the cursor
on them and click to effect a search. The next message on the reader's screen
will be the one referenced.
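The angle-bracket convention above amounts to a small scanner over the message text. A minimal sketch of such a scanner follows; the function name and buffer handling are assumptions for illustration, not Community Memory's actual code:

```c
#include <string.h>

/* Extract the next <...>-delimited index word from a message body.
 * Copies the tag text (without the brackets) into 'out', truncating
 * to cap-1 characters (cap must be at least 1). Returns a pointer
 * just past the closing '>', or NULL when no complete tag remains. */
const char *next_index_word(const char *text, char *out, size_t cap)
{
    const char *open = strchr(text, '<');
    if (open == NULL)
        return NULL;
    const char *close = strchr(open + 1, '>');
    if (close == NULL)
        return NULL;
    size_t len = (size_t)(close - open - 1);
    if (len >= cap)
        len = cap - 1;
    memcpy(out, open + 1, len);
    out[len] = '\0';
    return close + 1;
}
```

Calling this repeatedly walks all the embedded references in a message, which is all the reader's click-to-search feature needs: land the cursor on a tag, extract its text, and issue the search.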
The reader can extend or narrow a search by subsequent search commands
selected through the OPTIONS menu. At any point the reader is told how many
messages were found and can scroll quickly up and down the list of found
messages using the arrow keys. Each message occupies a window and can be paged
within the window. If a comment has been attached, a "button" is present, on
which the reader may click to see the comment. The comment will be indented
and displayed immediately below the item. Comments on comments use multiple
indentation.
The hypertext capability is not formal hypertext because there's no way for a
user to see who has made reference to his or her message. We have tried to
create a function that the ordinary user will expect, not one that will
satisfy a purist.
The commenting feature means that one can start a lateral thread and continue
it indefinitely without disrupting the forum. Comment messages are not
displayed until requested. People are used to handling interruptions in their
conversations, and used to interrupting others' conversations. This
multidimensionality is a requirement for any system which is used without an
agreed-upon discipline.
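The comment structure described above is a tree: each message carries a chain of comments, each of which may carry its own comments, displayed one indent deeper per level. A hypothetical in-memory shape for this (the struct and field names are invented, not Community Memory's implementation):

```c
#include <stddef.h>

/* One message or comment. 'comments' heads the chain of replies to
 * this item; 'sibling' links replies at the same level. */
struct msg {
    const char *text;
    struct msg *comments;   /* first comment on this item, or NULL */
    struct msg *sibling;    /* next comment at the same level, or NULL */
};

/* Depth of the deepest comment chain under a message; the display
 * indents a comment once per level of nesting. */
int max_indent(const struct msg *m)
{
    int deepest = 0;
    for (const struct msg *c = m->comments; c != NULL; c = c->sibling) {
        int d = 1 + max_indent(c);
        if (d > deepest)
            deepest = d;
    }
    return deepest;
}
```

Because replies hang off individual messages rather than being appended to a single linear transcript, a lateral thread can grow indefinitely without disrupting the forum it branched from.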
Community Memory was conceptualized as a system of publication, in the sense
that every message is public. We've recently added a private-message (e-mail)
capability which is still in test as of this writing. Thus, when one adds a
message, one must respond to a three-entry menu asking whether the item is to
be a new message, a comment, or a mail message to the author of the selected
message.
Messages in Community Memory can be associated with any number of forums. They
are not physically stored in a file associated with the forum--the connection
is virtual. This association of the message to forums can be edited by the
message owner or added to by a forum host.
Under the coinbox rules currently in effect, it costs 25 cents to add a
message and $1.00 to start a forum. The rates and structure are experimental,
and with the coinbox under control of the PC, the range of possible charging
strategies is wide. A credit-balance system has been written but not
implemented for dial-in use.
When we started Community Memory, we purposely did not put it online because
we didn't want it to be overwhelmed by computer types, which would have raised
a cultural barrier for users unfamiliar with computers. We took a lot of
criticism from those online for this decision, but we were rewarded by having
a much more broadly representative user profile than would have otherwise been
the case.


Off We Go


When I was hanging around the radical fringes of Berkeley 25 years ago, we
assumed that once the inconsistencies of the System were exposed and made
plain, everyone would just go about doing things the right way, whatever that
was. We had only to concern ourselves with talking about how bad things were,
not how to organize things right.
Now it's recognized that as a society we're a lot closer to the crunch. The
fate of the Soviet Union is staring our organizational culture in the face.
The issue of "how to do things right" cannot be avoided. Pop business writers
like Tom Peters are suddenly discovering that decentralized, laterally
connected enterprises function better than the centralized monsters we grew up
with. The door is open for both us and the former Soviets to walk through,
out into the agora. But if we don't, it could well close, and we might follow
them into a period of authoritarian reaction.
Our task, as technologists, is to build the tools that get us through this
door into the future we want. We've already done half the job by creating the
personal computer such that it took on a life of its own and evaded capture.
Now the task of furnishing the agora remains. Anyone up for another adventure?




























May, 1993
MODELESS DIALOG BOXES FOR WINDOWS


Graphical developer interfaces can speed development cycles




Joseph M. Newcomer


Joe received his PhD in the area of compiler optimization from Carnegie Mellon
University in 1975. He was involved in some of the earliest interactive
language design in the late 1960s, and has been dealing with user interfaces
since that time. He is currently a consultant and Windows-application
developer.


While almost all Windows programs take advantage of the operating
environment's rich graphical user interface, I have found that it's possible
to use the environment as a graphical "developer" interface too. For instance,
in many cases, information that comes out as debug print statements in a
conventional program can be represented more readily in a graphical form.
While this doesn't replace the need for the general debug print mechanism
(such as the WINIO package described by Schulman, Maxey, and Pietrek in
Undocumented Windows), GUIs have considerably shortened my development time on
a number of projects.
To expedite the development of these developer interfaces, I wanted to do
direct screen layout and not have to develop my own windows. Therefore, I
implemented these windows as modeless dialog boxes, allowing me to use the
dialog editor for all necessary layout. However, when I attempted to make the
modeless dialog boxes do what I wanted, I encountered a number of problems,
which I'll discuss here, along with their workarounds.


Debugging a Parser


For one job, I had to develop a report-generator application. The client had a
specification of the report language, and I had to implement the compiler for
this language, along with display mechanisms. I decided to create a modeless
dialog box with controls such as:
A static text item that was the input stream as seen by the lexer.
A static text item that was the current lexeme being scanned.
A static text item that was the lexeme code.
A multiline scrollable static text item that was the representation of the
parse stack.
A collection of buttons and checkboxes.
The buttons and checkboxes allowed me to single-step the lexer and parser, and
control the level of detail of the information I was seeing. At each point I
could examine the various states. When it came time to debug the generated
interpretive tree, I built another modeless dialog box that gave me
single-step execution capability on the tree, showing the current operator,
current evaluation stack, and so forth. Then I added another box that let me
see the underlying database records, and one that let me scroll through the
computed internal state vector (based on computations the client had already
written to massage the database and derive useful information). It was amazing
how fast development went when I could see the information in a reasonable
form. Like debug print statements, the output doesn't have to be fancy, and
once you've done a couple of these you can clone them forever.


Handling IsDialogMessage


As I added each new modeless dialog box, I discovered that I had to add an
IsDialogMessage test to my main message loop; this meant that my main message
loop had to know about each of the dialog boxes being used. Somehow this
violated a principle of abstraction: You shouldn't have to keep changing the
main message loop each time you add a new dialog box. I therefore implemented
a dialog-box registry mechanism: Whenever I created a new dialog box, I
registered it with the dialog-box handler, which was responsible for ensuring
that IsDialogMessage was passed on to each of the dialog boxes in turn; the
first one that handled IsDialogMessage would terminate the dialog handling.
The modeless registry (see Listing One, page 82) maintains a linked list of
pointers to window handles. It is limited in that the window handle is
expected to be in either static memory or in memory that's not freed until
after the handle is unregistered, as it determines the identity of
window-handle references by equality of pointers. Thus, if the modeless dialog
window is ever destroyed, its handle will be set to NULL. However, if it is
re-registered without having been unregistered, the old entry will be reused.
The ProcessModeless routine handles all modeless dialog boxes. If any modeless
dialog box handles the IsDialogMessage call, ProcessModeless returns True;
otherwise it returns False. The main message loop (Listing Two, page 82) is
quite simple and never needs to know how many modeless dialog boxes exist.
The UnregisterModeless procedure takes a window handle and unregisters it.
This is done in Listing One by setting the variable that holds the window
handle to NULL, then removing the window-registry item from the list. This is
normally not called by the application, but by the dialog message handler I'm
about to describe.


Iconizing Dialog Boxes


With all this power and flexibility, my screen was suddenly cluttered with
many dialog boxes. When I minimized them, however, I discovered that they used
the class icon of the main window, so I couldn't readily tell one from the
other.
To address this, I first tried setting the GCW_ICON word of the dialog to hold
the icon I wanted to display. This didn't work because this was the class word
for dialog boxes, which caused the icon of the most-recently created dialog
box to be displayed for all active modeless dialog boxes. I discovered,
however, that if I set the GCW_ICON word of the dialog to NULL, it would
display the canonical empty-white-square icon.
Next, I needed to set up my own icon and attach it to the dialog box. There
are two ways to attach extra information to a window: SetProp and
SetWindowLong using the DWL_USER longword offset. Because the DWL_USER
longword is very powerful and can be used to attach arbitrary data structures
to a dialog window, I did not want to usurp it for the icon or require that
the object it references have the icon reference in a known place. Therefore,
I used SetProp.
The SetProp function allows you to attach, by name, an arbitrary 16-bit value
to any window. I attached the handle of the icon I wanted under the property
name "icon"; see Listing Three, page 82. In this procedure, I pass in the name
of the icon, which is the name that appears in the resource file with the ICON
declaration. I could easily have written one which took an icon handle, but
that wasn't necessary.
When the window is closed, you need to remove the property. This is handled in
my default dialog-box handler. Normally, a dialog box returns True to indicate
that it has handled the message or False to indicate that it has not; some
special messages have special values returned. If it returns False, the
message was not handled, and the default dialog-box handler, DefDlgProc, will
be called by Windows. (It must not be called explicitly in a dialog-box
handler.) I added my own default handler, in the style of DefWindowProc; see
Listing Four, page 82. If a dialog box doesn't want to process the message, it
calls MyDefDialogProc and returns whatever value MyDefDialogProc returns.
My DefDialogProc's main purpose is to handle the dialog-specific icon. The
messages it handles are WM_PAINT, WM_NCPAINT, WM_ERASEBKGND, WM_CLOSE,
WM_NCDESTROY, and WM_QUERYDRAGICON.
After some experimentation, I found that the class icon for the dialog would
be changed each time a new modeless dialog (and perhaps modal dialog) was
created; it was set to the class icon of the parent window. Thus, all existing
modeless dialog windows displayed their icons using the parent window's icon,
no matter what icon they had displayed previously. This wasn't immediately
obvious since the icon did not actually change until repainting occurred. I
therefore had to keep resetting the class icon of the dialog class to NULL.
The solution is to intercept the WM_NCPAINT message and simply force the
GCW_HICON class word to NULL. While this may not be the most elegant solution,
it does work consistently and correctly and has done so for over a dozen
Windows applications.
The WM_PAINT operation is key; if the dialog window is iconic, it calls
PaintIconicDialog; see Listing Five, page 82. This erases the background for
the icon and then draws the icon in the window location. The magic constant
"2," which is added to the x,y position passed to DrawIcon, is another piece
of empiricism; the area erased for the icon is a rectangle 36x36 logical units
in size, so these magic constants place the 32x32 icon in the correct place,
centered horizontally and vertically, in that area. If the window is not
iconic, the WM_PAINT message is not handled here, and will ultimately be
handled by the dialog class handler. This probably won't work for CGA
displays, but supporting a CGA is not high on my list of priorities. It does,
however, suggest a certain fragility in this code if Windows starts supporting
64x64 or scalable icons for high-resolution displays someday.
After I successfully got the correct icon displayed, I discovered that every
icon appeared in a white rectangle, no matter what I tried. Using Spy and
Codeview, I found that WM_ERASEBKGND was the culprit; a 36x36 hole was left on
the screen whenever the window was iconized, so the later DrawIcon operation
put the icon on this hole. This is not what you'd expect to see, so I
intercept this message. If the window is iconic, I report that I've handled
WM_ERASEBKGND (by returning True), while in fact I actually do nothing. Then
the icons appear drawn as expected. If the window isn't iconic, I return False
and let the operation proceed normally. WM_ICONERASEBKGND, which normally
handles this, is not sent if the class icon is NULL.
When the window is finally closed, WM_CLOSE removes the icon property by
calling DestroyWindow to close the modeless dialog box; a modeless dialog box
must not call EndDialog, which is only for modal dialog boxes. If properties
are not removed from Windows before they are destroyed, they may result in
unreclaimable resource consumption in the USER heap.
The last message needed to deal with custom icons is WM_QUERYDRAGICON. If
you're using this technique, the class icon is NULL, so a drag operation drags
a blank rectangle, which is not what you would expect. To get a drag icon that
resembles the icon you are using, you must respond to the WM_QUERYDRAGICON
with the handle to a cursor or icon that will be used to do the dragging. If
you provide an icon, it will be converted to a cursor, so there's no need to
keep two identical drawings -- one in icon form and one in cursor form --
unless Windows' default transformation for converting an icon to a drag cursor
results in something you find unacceptable. If you want to supply a cursor,
you'll have to add a "cursor" property, and have SetWindowIcon attempt to load
a cursor of the same name: If it succeeds, add the cursor property; if not,
either add the cursor property with the icon handle or don't add it at all.
At this point, it appears that WM_NCDESTROY is the last message ever sent to a
window. The MyDefDialogProc handler calls the UnregisterModeless procedure to
remove the window from the registry.



Other Useful Techniques


I discovered that if I create a dialog box in the dialog editor without a
thick border, or a maximize box, but with a system menu, the system menu
contains both a Size and Maximize selection. The Size selection lets me resize
the box using the cursor keys and the Maximize selection will maximize it. The
only way to handle these was to intercept the WM_INITMENUPOPUP message in
those dialog boxes in which I was not prepared to deal with resizing or
maximizing and gray out those system-menu options; see Listing Six, page 82.


Summary


Modeless dialog boxes are useful for both user and developer interfaces.
However, using them effectively requires awareness of what Windows does when
you want to iconize them and have them display unique icons. If you want to
use several modeless dialog boxes, it's much simpler if the main loop does not
need to know how many of them are active.
The code in this article was derived empirically. While not exactly
"undocumented Windows" in the same sense that Schulman, Maxey, and Pietrek
mean, the amount of work required to discover these less-than-obvious
interactions was considerable. This may not be the most correct or most
elegant way to handle the problems, but I have successfully used it for two
years, for Windows 3.0 and 3.1, without modifications.

_MODELESS DIALOG BOXES FOR WINDOWS_
by Joseph M. Newcomer


[LISTING ONE]

typedef struct modeless {
 LPHANDLE hpWnd;
 struct modeless * next;
 } modeless;
modeless * modeless_list = NULL;

/********************* RegisterModeless *************************************
* Inputs: LPHANDLE hpWnd: Pointer to modeless dialog handle
* Result: void
* Effect: Registers the handle for a modeless dialog
* Notes: Stores the pointer to the handle, so the handle must live in
* storage that persists (static or unfreed heap memory) and is
* not created for each reference.
****************************************************************************/
void RegisterModeless(LPHANDLE hpWnd)
 {
 modeless * hm;
 modeless * p;
 p = modeless_list;
 while(p != NULL)
 { /* scan list */
 if(p->hpWnd == hpWnd)
 return; /* already registered */
 p = p->next;
 } /* scan list */
 /* The window handle was not already registered */
 hm = (modeless *)malloc(sizeof(modeless));
 if(hm == NULL)
 return;
 hm->hpWnd = hpWnd;
 hm->next = modeless_list;
 modeless_list = hm;
 }
/***************************** UnregisterModeless ***************************
* Inputs: HWND hWnd: Window handle to unregister
* Result: void
* Effect: Locates the entry which references this modeless dialog box and
* removes it, after setting the handle to NULL.
****************************************************************************/
void UnregisterModeless(HWND hWnd)
 {
 modeless * * prev_next = &modeless_list;
 modeless * p;
 p = modeless_list;
 while(p != NULL)
 { /* scan list */
 if(*p->hpWnd == hWnd)
 { /* found it */
 *prev_next = p->next;
 *p->hpWnd = NULL;
 free(p);
 return;
 } /* found it */
 prev_next = &p->next;
 p = p->next;
 } /* scan list */
 }
/********************** ProcessModeless **************************************
* Inputs: LPMSG msg: Pointer to a message
* Result: BOOL. TRUE if IsDialogMessage returned TRUE. FALSE if it returned
* FALSE (or no modeless dialog windows were registered)
* Effect: Handles IsDialogMessage for registered modeless dialog windows
* Notes: If a registered modeless window is destroyed, the destroyer must set
* window handle which has been registered with this routine to NULL. If the
* window is later re-created, the handle can be set to the new value.
****************************************************************************/
BOOL ProcessModeless(LPMSG msg)
 {
 modeless * p;
 p = modeless_list;
 while(p != NULL)
 { /* scan list */
 if(*p->hpWnd != NULL)
 if(IsDialogMessage(*p->hpWnd,msg))
 return TRUE;
 p = p->next;
 } /* scan list */
 return FALSE;
 }






[LISTING TWO]


while(GetMessage(&msg, NULL, 0, 0))
 { /* message loop */
 if(ProcessModeless(&msg))
 continue;
 TranslateMessage(&msg);
 DispatchMessage(&msg);
 } /* message loop */







[LISTING THREE]

/************************** SetWindowIcon ************************************
* Inputs: HWND hWnd: Window into which to set icon. HANDLE hInst: Instance
* handle for fetching icon char * name: Name of icon
* Result: HICON Icon handle in case it is useful; NULL if there was a failure
* to load the icon
* Effect: Sets the "icon" property used by PaintIconicDialog
****************************************************************************/
HICON FAR PASCAL SetWindowIcon(HWND hWnd, HANDLE hInst, char * name)
 {
 HANDLE hIcon;
 hIcon = LoadIcon(hInst, name);
 if(hIcon == NULL)
 return NULL;
 SetProp(hWnd,"icon",hIcon);
 SetClassWord(hWnd,GCW_HICON,NULL);
 return hIcon;
 }






[LISTING FOUR]

/******************************* MyDefDialogProc *****************************
* Inputs: HWND hDlg: Dialog window handle unsigned message: Message received
* WORD wParam: word value from message. LONG lParam: long value from message
* Result: int
* Effect: Handles the most common cases of dialog boxes for this application
****************************************************************************/
int MyDefDialogProc(HWND hDlg, unsigned message, WPARAM wParam, LPARAM lParam)
 {
 switch(message)
 { /* stock responses */
 case WM_PAINT:
 return PaintIconicDialog(hDlg);
 case WM_ERASEBKGND:
 /* pretend we've erased the background for an iconic window */
 if(IsIconic(hDlg))
 return TRUE;
 else
 return FALSE;
 case WM_NCPAINT:
 SetClassWord(hDlg,GCW_HICON,NULL);
 return FALSE;
 case WM_CLOSE:
 RemoveProp(hDlg,"icon");
 DestroyWindow(hDlg);
 break;
 case WM_NCDESTROY:
 UnregisterModeless(hDlg);
 return TRUE; /* let normal processing proceed */
 case WM_QUERYDRAGICON:
 {
 HICON hIcon = GetProp(hDlg,"icon");
 return hIcon;
 }
 default:
 return FALSE;
 } /* stock responses */
 return TRUE;
 }






[LISTING FIVE]

/************************* PaintIconicDialog *****************************
* Inputs: HWND hDlg: Dialog window
* Result: BOOL -- TRUE if painted icon; FALSE if did not paint icon
* Effect: Paints the icon if the window is iconic. Requires that the
* icon be installed via the SetProp function
****************************************************************************/
static BOOL PaintIconicDialog(HWND hDlg)
 {
 if(IsIconic(hDlg))
 { /* iconic */
 PAINTSTRUCT ps;
 HDC hDC;
 DWORD xy;
 HICON hIcon = GetProp(hDlg,"icon");
 if(hIcon != NULL)
 { /* found it */
 hDC = BeginPaint(hDlg, &ps);
 xy = GetWindowOrg(hDC);
 SendMessage(hDlg,WM_ICONERASEBKGND,hDC,0L);
 DrawIcon(hDC, LOWORD(xy)+2, HIWORD(xy)+2,hIcon);
 EndPaint(hDlg,&ps);
 return TRUE;
 } /* found it */
 } /* iconic */
 return FALSE;
 }





[LISTING SIX]


int FAR PASCAL WhateverWndProc(HWND hDlg, unsigned message, WPARAM wParam,
LPARAM lParam)
 {
 switch(message)
 { /* decode message */
 case WM_INITMENUPOPUP:
 if(HIWORD(lParam))
 { /* system menu */
 HMENU sys;
 sys = GetSystemMenu(hDlg, FALSE);
 EnableMenuItem(sys, SC_SIZE, MF_GRAYED);
 EnableMenuItem(sys, SC_MAXIMIZE, MF_GRAYED);
 } /* system menu */
 return MyDefDialogProc(hDlg,message,wParam,lParam);
 ... other messages here























































May, 1993
 OS/2 2.X INITIALIZATION FILES AND PROFILE MANAGEMENT


Here's an initialization-file browser and editor based on the Profile Manager
API


 This article contains the following executables: INITOR.ARC


Derrel Blain, Kurt Delimon, and Jeff English


Derrel, Kurt, and Jeff are members of the Systems Development team at
Micrografx. They are the authors of Real-world Programming for OS/2 (Howard
Sams, 1993), on which this article is based. You can contact them at 1303
Arapaho, Richardson, TX 75081 or on CompuServe at 70743,351.


According to IBM, OS/2 2.x initialization files are "a convenient place to
store information between sessions." When you get to know them, however,
they're much more useful. This article has two purposes. One is to give you a
basic understanding of initialization files for OS/2 and the API calls used to
interact with them. The other is to provide you with a useful
initialization-file editor.
An OS/2 2.x initialization file is just like any other file. It's not hidden
or read only, nor does it need to be in any special place on the disk.
Initialization files can be copied, renamed, deleted, and so forth.
Information can easily be added to or deleted from the file. However, OS/2
initialization files (INI files) are binary, unlike those of Microsoft
Windows, which are ASCII. This makes OS/2 INI files less approachable since a
specific API is needed to interact with them.


The Profile Manager API


OS/2 2.x provides a number of function calls to open, write, read, and close
initialization files. For example, an application may wish to save its current
screen location and size. It can do so in an initialization file using
functions from the Profile Manager API. When the application is restarted, it
should look for its INI file entry, again using functions from the Profile
Manager API. It can then use the information it obtained from the INI file to
resize itself accordingly.
INI files are built using the Profile Manager API within OS/2. Two special INI
files are used by the operating system; any other file created using the
Profile Manager API can be used for various purposes, including those more
involved than simply storing information between instances of an application.
The Profile Manager API entries simply provide a consistent way to interact
with these files. Table 1 lists this API's functions. The Profile Manager API
should also be used to add information to an INI file and to delete it,
rather than creating your own functions.
Table 1: Profile Management API.

 Function Description
 -----------------------------------------------------------------------

 PrfCloseProfile Closes a private profile file.
 PrfOpenProfile Opens a private profile file.
 PrfQueryProfile Retrieves full path name for system
 initialization files (OS2.INI and OS2SYS.INI).
 PrfQueryProfileData Retrieves information from the profile file.
 PrfQueryProfileInt Retrieves an integer from the profile file.
 PrfQueryProfileSize Retrieves size to hold desired information from
 the INI file.
 PrfQueryProfileString Retrieves a string from the profile file.
 PrfReset              Resets Presentation Manager initialization files
                       with new files.
 PrfWriteProfileData Writes or deletes binary data in the profile
 file.
 PrfWriteProfileString Writes or deletes a string in the profile file.

Using the Profile Manager API gives you and your applications a consistent
way to interact with these files. This interaction can involve
application-specific data, information contained within the operating system's
INI files, and even data kept in the INI files of other applications.
An application can create its own INI file in which to store data, or it can
use the system initialization files. The latter is often preferable. OS/2 uses
two INI files to store system data and configuration information. OS2.INI and
OS2SYS.INI are the default names for the user and system profile files,
respectively. These files can be found in the \OS2 directory. They have a
considerable number of entries that include information about system display
colors, printers, printer queues, settings for the serial ports, and much
more.
These system initialization files, like all initialization files built using
the Profile Manager API, have a three-tiered structure. The highest level of
organization is the application name. An INI file may have one or more
application names, each of which may in turn have one or more key names. Both
the application name and the key name are NULL-terminated strings that may not
exceed 1024 bytes (including the terminating NULL character). Each key name
has a data stream associated with it. This is the key data, and it may be a
NULL-terminated string or a block of binary data that does not exceed 64
Kbytes. For example, the default OS2.INI file contains the application name
(PM_colors, for example), a key name (like DialogBackground), and key data
(such as 204 204 204, which represents an RGB shade of grey). Note that the
key data is also a NULL-terminated string and that case is preserved in all
stored strings.


The INITOR Application


Much of this information is hidden from you because of the binary form of OS/2
initialization files--and IBM doesn't provide you with a file browser.
Consequently, we wrote our own initialization-file browser and editor called
INITOR, which is based on the Profile Manager API and is designed to display
the application name, key name, and key data simultaneously; see Table 2. To
this end, it uses two list boxes and an application-defined data window. A row
of buttons across the bottom of INITOR gives access to pop-up menus that
display system or private INI files and enable you to add or delete entries in
the open file.
Table 2: Files that make up the INITOR application.

 File Description
 -----------------------------------


 INITOR.C INITOR application's C
 code.

 INITOR.DEF Module-definition file.

 INITOR.H An include file.

 INITOR.ICO The application's icon
 file.

 INITOR.RC INITOR'S resource file.

 INITOR.DLG INITOR'S dialog-definition
 file.

Because of space constraints, we'll discuss only that portion of the code that
deals directly with enumerating the values in INI files using the Profile
Manager API; see Listing One (page 84). All the required files (resource file,
icon file, and others) are available electronically; see "Availability," page
7.
Before going further, remember that if you intend to edit any system file, you
should be sure to save a copy of that file in a safe place. One way of doing
this is to let OS/2 automatically back up your current INI files each time you
start your system. You will always have a copy of your INI files as they
existed at system startup if you add the following line to your CONFIG.SYS
file: CALL=C:\OS2\XCOPY.EXE C:\OS2\OS2*.INI C:\OS2\*.BAK.


Initialization-file Handles


Reading from or writing to an INI file requires an initialization-file handle
HINI. An INI file must be opened to get an INI-file handle. The user and
system profile files are opened by the system at startup as a result of a call
to PrfReset. Private INI files must be opened using the PrfOpenProfile
function. To access the system INI files, three predefined handles are
supplied by the operating system: HINI_USERPROFILE, for reading/writing user
profile; HINI_SYSTEMPROFILE, for reading/writing system profile; and
HINI_PROFILE, for reading both user and system profile and writing user
profiles. (The term "profile file" is used almost interchangeably with
"initialization file," apparently following a historical convention passed
down from Windows coding.)


Enumerating Entries in an INI File


Though differing API entries are called, each of the three levels in the
structure of initialization files is enumerated in much the same manner. The
INITOR sample application begins this process by enumerating the application
names. The primary function for this in INITOR is enum_app_names. We describe
the process of that function here, but first, let's take a look at the code
for enum_key_name and get_key_data. Notice that their structure and process are
similar to those of enum_app_names. This is understandable since the logical
structure of INI files is the same at each of the three levels. The
enum_app_names function in INITOR takes two parameters: an initialization-file
handle and the ID of a string to load and set in the title bar. This ID is
defined in the resource file. enum_app_names will enumerate all application
names in the INI file by calling PrfQueryProfileString. The application names
are then added to the application list box displayed by the INITOR
application. PrfQueryProfileString is defined as shown in Table 3.
Table 3: PrfQueryProfileString function definition.

 PrfQueryProfileString (hIni,pAppName,pKeyName,pDefault,pBuffer,ulBufferSize)
 -------------------------------------------------------------------------

 HINI hIni Initialization-file handle.

 PSZ pAppName Pointer to a NULL-terminated string that identifies
 the application name to be queried.

 PSZ pKeyName Pointer to a NULL-terminated string that identifies
 the key name to be queried.

 PSZ pDefault Pointer to a NULL-terminated string to be copied to
 pBuffer if the requested entry is not found.

 PSZ pBuffer Pointer to the location where return string is to be
 copied.

 ULONG ulBufferSize   Length in bytes of the memory pointed to by pBuffer.

PrfQueryProfileString performs a case-insensitive search for the string
pointed to by pKeyName in the application section identified by pAppName. If a
match is found, the key-name data is copied to pBuffer, and the number of
bytes--including the NULL terminator copied--is returned. If a match is not
found, the contents of pDefault are copied to pBuffer. If pBuffer is NULL,
nothing is copied to pBuffer and 0 is returned. By specifying NULL for the
pAppName parameter, PrfQueryProfileString returns all application names in the
INI file as an array of NULL-terminated strings. The last string will be
double-NULL terminated.
If NULL is passed as the pKeyName parameter and pAppName is the pointer to a
string which is found in the file, all key names in that application section
are returned as an array of NULL-terminated strings. As with the enumerated
application names, the last string will be double-NULL terminated.
Since the number and size of the entries in an INI file vary, enum_app_names
determines the space required to hold the application names by calling
PrfQueryProfileSize, which takes a set of parameters similar to those of
PrfQueryProfileString; see Table 4. enum_app_names will pass NULL for both the
pAppName and pKeyName since we want to know how much memory is required to
hold all of the application names. If the function in Example 1(a) is
successful and the length returned in ulSize is nonzero, memory is allocated
and the strings are retrieved.
Table 4: PrfQueryProfileSize function definition.

 PrfQueryProfileSize (hIni,pAppName,pKeyName,pDataLength)
 ------------------------------------------------------------------------

 HINI hIni Initialization-file handle.


 PSZ pAppName Pointer to a NULL-terminated string that
 identifies the application name to be queried.

 PSZ pKeyName Pointer to a NULL-terminated string that
 identifies the key name to be queried.

 PULONG pDataLength Pointer to a ULONG which will contain the length
 of keyname data if the function is successful.

Example 1: (a) Returning the number of bytes (including NULLs) copied to
pData; (b) allocating memory and retrieving strings.

 (a)

 if(PrfQueryProfileSize(hini,NULL,NULL,(PULONG)&ulSize) && ulSize)

 (b)

 DosSubAlloc(pMem,(PPVOID)&pData,ulSize);
 if(PrfQueryProfileString(hini,NULL,NULL,"No Entries",pData,ulSize))

The strings in Example 1(b) are copied to the memory pointed to by pData, and
the actual number of bytes (including NULL) copied to pData is returned. Since
we already know the length of the returned data, the return length is ignored.
The application list box is disabled and its contents are deleted. Disabling
the list box prevents redrawing after each string is added. The array of
application names is walked, and each string is added to the list box in
ascending order. When all items are added, the first item is selected and
updates to the list box are re-enabled. The memory used for the application
names is freed. By selecting an entry in the list box, a WM_CONTROL message
with a notification code of LN_SELECT is sent to the owner window. The owner
window calls enum_key_name, which will fill up the key-name list box in a
similar fashion.
After the key-name list box is filled by enum_key_name, the first entry is
selected causing another WM_CONTROL message to be sent to the owner window.
This time the owner calls get_key_data. If a data item currently exists, the
memory for it is freed and PrfQueryProfileSize is called to determine the
storage requirements of the new data. The data is retrieved by calling
PrfQueryProfileData since the data may be a NULL-terminated string or a stream
of binary data. PrfQueryProfileData is defined as shown in Table 5. As
with PrfQueryProfileString, the search is case insensitive, and passing NULL
for either pAppName or pKeyName has the same effect described earlier. If the
data is successfully returned, the data window is invalidated, causing the new
data to be displayed.
Table 5: PrfQueryProfileData function definition.

 PrfQueryProfileData (hIni,pAppName,pKeyName,pBuffer,pDataLength)
 ------------------------------------------------------------------------

 HINI hIni Initialization-file handle.

 PSZ pAppName Pointer to a NULL-terminated string that
 identifies the application name to be queried.

 PSZ pKeyName Pointer to a NULL-terminated string that
 identifies the key name to be queried.

 PSZ pBuffer Pointer to the location where the data is to be
 copied. The return data is not NULL terminated
 unless the last byte of the data stored was a
 NULL.

 PULONG pDataLength Pointer to a ULONG that contains the length of
 buffer pointed to by pBuffer. If the function
 is successful the value will be replaced with the
 number of bytes actually copied to pBuffer.



Using Your Own INI File


For the purpose of testing the application, we've written INITOR so that it
saves its screen location in the system INI file. Applications created for
commercial distribution should store their configuration information in a
private initialization file. This prevents the loss of settings if OS/2 is
reinstalled or the system INI files are rebuilt by the user. Holding down
Alt+F1 for 20 seconds before the first OS/2 logo panel appears will reset the
default desktop configuration. Before data may be written to a private INI
file, it must be opened using PrfOpenProfile. If the file exists, it is
opened; otherwise, a new file is created. An initialization-file handle is
returned if the function is successful; NULL is returned if an error occurs.
The PrfOpenProfile function requires an anchor-block handle and a pointer to a
NULL-terminated string that contains the filename.
When you select the Open Private menu item from the display pop-up menu,
INITOR invokes the common open-file dialog. You may then open or create any
private INI file. While most initialization files use the .INI file extension,
it is not required. The PrfOpenProfile function will fail if the current
system or user initialization filenames are specified. Attempting to open an
initialization file created by Windows applications in a WIN/OS2 session will
also fail since these files are stored in a different format.
Once a private profile file is opened or created it may be queried or written
to using the same API as the system initialization files. However, when your
application is through updating the file, it should be closed using
PrfCloseProfile. This function takes the INI file handle returned from a
PrfOpenProfile call as its only parameter and cannot be used to close one of
the system initialization files. The function returns TRUE on success and
FALSE otherwise.

_OS/2 2.X INITIALIZATION FILES AND PROFILE MANAGEMENT_
by Derrel R. Blain, Kurt Delimon, Jeff English


[LISTING ONE]


void APIENTRY enum_app_names(HINI hini,USHORT usStringID)
/*-----------------------------------------------------------------*\
This function will query the size required to hold all of the application
names for the current INI file. If the file contains entries, a temporary block of
memory is allocated and the application names are queried from PM. The strings
are then added to the application name listbox and the memory is freed. The
first entry in the listbox is selected, causing its owner, the client, to be
notified. The client window will then fill the key name listbox by calling
enum_key_name function.
\*-----------------------------------------------------------------*/
{
 PVOID pData;
 PBYTE pCurrent;
 ULONG ulSize = 0L;
 if (PrfQueryProfileSize(hini,NULL,NULL,(PULONG)&ulSize) && ulSize)
 {
 DosSubAlloc(pMem,(PPVOID)&pData,ulSize);
 if(PrfQueryProfileString(hini,NULL,NULL,"No Entries",pData,ulSize))
 {
 pCurrent = pData;
 WinEnableWindowUpdate(hAppLBox,FALSE);
 WinSendMsg (hAppLBox,LM_DELETEALL,NULL,NULL);
 while (*pCurrent)
 {
 WinSendMsg (hAppLBox,LM_INSERTITEM,(MPARAM)LIT_SORTASCENDING,
 (MPARAM)pCurrent);
 while(*pCurrent)
 pCurrent++;
 pCurrent++;
 }
 WinSendMsg (hAppLBox,LM_SELECTITEM,MPFROMSHORT(0),MPFROMSHORT(TRUE));
 WinEnableWindowUpdate(hAppLBox,TRUE);
 if (usStringID)
 {
 WinLoadString (hab,0,usStringID,MAX_TITLE,szTitle);
 WinSetWindowText(hWndFrame,(PSZ)szTitle);
 }
 }
 DosSubFree(pMem,pData,ulSize);
 }
 else
 {
 WinAlarm(HWND_DESKTOP,WA_WARNING);
 WinEnableWindowUpdate(hAppLBox,FALSE);
 WinSendMsg (hAppLBox,LM_DELETEALL,NULL,NULL);
 WinSendMsg (hAppLBox,LM_INSERTITEM,(MPARAM)LIT_SORTASCENDING,
             (MPARAM)"No Entries");
 WinSendMsg (hAppLBox,LM_SELECTITEM,MPFROMSHORT(0),MPFROMSHORT(TRUE));
 WinEnableWindowUpdate(hAppLBox,TRUE);
 // No entries exist; force the data window to free up its
 // data buffer and repaint.
 get_key_data("","");
 WinInvalidateRect(hDataWnd,NULL,FALSE);
 }
}
void APIENTRY enum_key_name(PSZ pAppName)

/*-----------------------------------------------------------------*\
This function will query the size required to hold all of the key names for
the current application name in the current INI file. If the application name
contains key strings, a temporary block of memory is allocated and the key
names are queried from PM. The strings are then added to the application name
listbox and the memory is freed. The first entry in the listbox is selected
causing its owner, the client, to be notified. The client window will then fill
the data window by calling get_key_data.
\*-----------------------------------------------------------------*/
{
 PVOID pData;
 PBYTE pCurrent;
 ULONG ulSize = 0L;
 if (PrfQueryProfileSize(hCurrentIni,pAppName,NULL,(PULONG)&ulSize) &&
 ulSize)
 {
 DosSubAlloc(pMem,(PPVOID)&pData,ulSize);
 if(PrfQueryProfileString(hCurrentIni,pAppName,NULL,"No Entries",
 pData,ulSize))
 {
 pCurrent = pData;
 WinEnableWindowUpdate(hKeyLBox,FALSE);
 WinSendMsg (hKeyLBox,LM_DELETEALL,NULL,NULL);
 while (*pCurrent)
 {
 WinSendMsg (hKeyLBox,LM_INSERTITEM,(MPARAM)LIT_SORTASCENDING,
 (MPARAM)pCurrent);
 while(*pCurrent)
     pCurrent++;
 pCurrent++;
 }
 WinSendMsg (hKeyLBox,LM_SELECTITEM,MPFROMSHORT(0),
 MPFROMSHORT(TRUE));
 WinEnableWindowUpdate(hKeyLBox,TRUE);
 }
 DosSubFree(pMem,pData,ulSize);
 }
 else
 {
 WinEnableWindowUpdate(hKeyLBox,FALSE);
 WinSendMsg (hKeyLBox,LM_DELETEALL,NULL,NULL);
 WinEnableWindowUpdate(hKeyLBox,TRUE);
 // No entries exist; force the data window to free up its
 // data buffer and repaint.
 get_key_data("","");
 WinInvalidateRect(hDataWnd,NULL,FALSE);
 }
}
void APIENTRY get_key_data(PSZ pAppName,PSZ pKeyName)
/*-----------------------------------------------------------------*\
This function will attempt to query the key data for the current
application-key pair. If key data currently exists, it is freed before the new
entry is queried. If successful, a message is sent to the client to update the key data
queried. If successful, a message is sent to the client to update the key data
window title with the size of the new data. The data window is invalidated.
\*-----------------------------------------------------------------*/
{
 if (pKeyData)
 {

 DosSubFree(pMem,pKeyData,ulKeySize);
 pKeyData = NULL;
 ulKeySize = 0;
 }
 if (PrfQueryProfileSize(hCurrentIni,pAppName,pKeyName,(PULONG)&ulKeySize)
 && ulKeySize)
 {
 DosSubAlloc(pMem,(PPVOID)&pKeyData,ulKeySize);
 if(PrfQueryProfileData(hCurrentIni,pAppName,pKeyName,pKeyData,
 (PULONG)&ulKeySize))
 {
 // Update the title
 WinSendMsg(hDataWnd,WM_UPDATE_TITLE,0L,0L);
 // Force the data window to repaint
 WinInvalidateRect(hDataWnd,NULL,FALSE);
 }
 }
}




May, 1993
DYNAMIC LINKING UNDER BERKELEY UNIX


Invoking code at run time




Oliver Sharp


Oliver is a graduate student at the University of California, Berkeley, doing
research into parallel programming environments. He can be reached at
oliver@cs.berkeley.edu.


The job of a linker is to take a group of object modules and combine them into
a single program. With a normal (static) linker, you supply all the different
pieces of your program and it assembles the final binary. Most of the work is
patching up the locations of code and data; each module of a language like C
is compiled independently, so the compiler doesn't know exactly where in the
final program the current module will go. When it sees a call to a function,
what address should it use? Even if the function is defined in the current
module, the compiler can't know how the different modules will be assembled
into the final program. If the function is in another module, the compiler
doesn't have any idea where the function will be--in fact, the programmer may
have forgotten to write it entirely. So, the compiler leaves the addresses
unresolved, relying on the linker to handle them.
A dynamic linker is more flexible--it can be invoked on new pieces of code at
run time, combining them with the already executing program. Unlike other
dynamic techniques such as overlays, dynamic linking is not set up ahead of
time. You can add any code that you like while your program runs; the new code
may not have even been written when you started executing your program.
Loading code on-the-fly is useful in many situations, but it is a particularly
important feature of an interactive programming environment.


Added Complications of Dynamic Linking


One problem with linking to an existing binary is that the result will be read
into some memory buffer. We need to give the address of that buffer to the
linker, so it can figure out where the new routines will be located. A more
interesting problem is the exporting and importing of symbols.
Suppose you have a running program called "parent" that decides to load the
module "child." Child consists of a group of functions; when parent calls the
dynamic linker, the linker returns a list of all the functions in child and
their addresses. That might be good enough--child sits inside parent like a
foreign body, providing a set of services--but such a restricted solution has
its disadvantages. Parent may have functions in it that would be useful for
child to access. Also, if parent and child both use the same library routines,
each would need its own copy, wasting space.
Since the idea of a linker is to let many pieces of code act together as a
single entity, we would like a dynamic linker to offer the same level of
integration. In other words, the linker should also have a list of parent's
symbols, to which child is allowed access. If parent wants the dynamic linker
to act just like a static linker, it can give the linker a complete symbol
table. Alternatively, parent can restrict the list, limiting child's access to
internal routines. This has the additional advantage that child can use any of
the unexported symbols inside parent for its own purposes without causing a
conflict.


UNIX Support


Berkeley UNIX (BSD) added support for dynamic linking because the Franz LISP
system that came with it offered the ability to invoke code written in more
traditional languages. Franz could have executed the foreign code as a
separate program, but passing data between languages would have been extremely
inefficient. Rather than incorporating a modified linker into Franz, the
implementors chose to add the functionality into the standard linker, ld.
Unfortunately, the man page for ld devotes a single paragraph to dynamic
linking, enough only for someone well versed in the details of ld's behavior.
I figured out how the facility works by poking around in the UNIX source code
for ld and nm, and by getting some helpful tips from Keith Sklower (a Berkeley
staff programmer and one of the Franz LISP implementors). Listing One (page
86) contains dyn_link, a simple program that invokes ld on a new module, loads
the result, and lets the user call functions within it. Listing Two (page 88)
is a sample module to load.


Invoking ld


To prepare a module for dynamic linking, you invoke the linker with
ld -A old_binary -T F000 -N new_file.o
The -A flag notifies ld that it is being used as a dynamic linker. Instead of
constructing a new binary from scratch, it will base its address resolution on
an existing binary called old_binary. It scans through old_binary, keeping
track of all the symbols in it which are publicly exported. If new_file.o
refers to those symbols, the addresses will be patched up correctly. If we
specify our real initial binary, all our symbols will be visible to
new_file.o. To achieve better control, dyn_link sets up an ititial binary with
a restricted symbol table that contains only those symbols we are willing to
export.
When dyn_link loads the new code, it first figures out how large it is and
allocates a large enough buffer. We have to tell ld the buffer's starting
address so it knows what the new code's addresses will be after being loaded.
The -T flag specifies the address, which in this invocation is set to F000h.
Many linkers require that this address be on an even-page boundary.
The values called "magic numbers" in UNIX are just codes that tell the
system what kind of information is in a file and how to treat it. Unlike older
operating systems (OS/360, for example), files in UNIX are simply a sequence
of bytes. However, many kinds of files require special treatment. The system
must know, for instance, that a given file is an executable, so it doesn't try
to run a data file. One way UNIX distinguishes file types is by putting these
magic numbers at the beginning of the file. The magic number for an executable
also tells the system how to do memory management. UNIX, like MS-DOS, can load
whole programs at once, but it is much more efficient to do demand
paging--bring in pieces of the program when needed--because most of the time a
program only executes a small fraction of its body. Why waste all that
valuable RAM to store unused information? Since dyn_link reads all the new
code into a buffer, the magic number in the executable is never seen by the
operating system. For simplicity's sake, dyn_link specifies the -N flag,
asking that old-fashioned memory management be used.
If the linking process succeeds, ld creates a binary with just the new code,
ready to load into memory. The binary's symbol table has all the symbols from
the original binary, together with the new ones. The dyn_link program figures
out if the link succeeded by looking at the return value; ld returns 0 if
there were no errors and nonzero otherwise.


UNIX Binary File Format


There are three main object-module formats in current implementations of UNIX.
The oldest, a.out, was developed for the DEC VAX and is still used in BSD. The
other two, COFF and ELF, are used in other implementations of UNIX (notably
those developed by AT&T). Since dyn_link is designed for BSD, it only knows
about the a.out format.
Figure 1 shows the structure of an a.out file, which is made up of five
regions. The first is the exec structure, which gives a road map to the rest
of the file. The fields of interest to dyn_link are the text, data, and bss
segment sizes, and the symbol-table size. The exec structure is followed by
the text and data segments; text contains executable code, and data has space
for global and initialized static variables. The bss segment is not stored in
the binary, but created when the program is executed; bss is used to store
uninitialized static variables and is initially filled with 0s. The text and
data segments follow the exec structure immediately, unless the binary is set
up for demand-paged memory management, in which case they (usually) follow on
the next page boundary. UNIX lets the programmer ignore this bit of complexity
by providing a macro (N_TXTOFF) which returns the starting address of the text
segment. After the two segments, there may be some relocation records; we
won't worry about those. The last two regions of the file are the symbol table
and the string table, the addresses of which are given by N_SYMOFF and
N_STROFF, respectively.
Figure 1: Structure of an a.out file.

 N_TXTOFF: Exec structure
 (possibly padded to page boundary)
 Text and data segments
 Relocation records
 N_SYMOFF: Symbol table
 N_STROFF: String table


There is a separate string table so that each symbol can be described in a
fixed amount of space; the symbol description gives an offset into the string
table for the symbol's name. Each name in the string table is followed by
'\0'. It would be a little more convenient if the string table's size were
also in the exec structure, but it isn't. It is the first piece of information
in the table; see Figure 2 for an example.
Figure 2: Sample string table.

 12
 'g'
 'r'
 'e'
 'e'
 'n'
 '\0'
 'r'
 'e'
 'd'
 '\0'

The symbol table is a series of nlist structures; see Figure 2. The structure
fields we'll be using are the symbol's type (internal, external, text, and so
on), an index into the string table for the name, and the value. In an
executable binary, the value is its address in memory.


The Code


The dyn_link program in Listing One is a simple demonstration of dynamic
linking. It loads SAMPLE.o, the object module derived from Listing Two,
exporting any symbols inside itself which begin with export. While loading the
new code, dyn_link displays the names of all the newly defined external
functions. It then loops, waiting for the user to specify functions to call.
The dyn_link program counts the number of times the user specifies a function
to call, passing that value as the argument. When the user hits a carriage
return, dyn_link exits.
Begin by setting up a binary that reflects the currently running program. We
could use the actual dyn_link binary, but then its internal symbols would be
exported. To restrict the exported symbols to those that start with export,
we'll construct a fake binary with a restricted symbol table. First, open the
original dyn_link binary and read in the exec structure at the front. The
find_symbol_table() routine prepares us to look through the symbols. It uses
the N_STROFF macro in calling fseek() to move to the start of the string
table. The first word in the string table is an integer that gives its length.
When we scan through the symbols, we can use the string table to figure out
their names. We prepare an output file by opening it destructively and moving
the file pointer forward to leave space for the exec structure. We can't write
the new exec structure yet, because we don't know how many symbols will be in
the fake binary.
With the string table in hand, one file pointer sitting at the beginning of
the old binary's symbol table, and another file pointer to an empty file, we
are ready to pick out the symbols to export. The output_exported_symbols()
routine loops through the symbols in the old binary (having computed their
number from the information in the exec structure), building up a new string
table. Each symbol that starts with _export is copied into the new file; the
n_un.strx field is patched to point to the name in the new string table. (The
preceding underscore is automatically added to symbols by the C compiler.)
After all the symbols have been scanned, we write out the size of the new
string table and its contents. Finally, we rewind the output file and write
the new exec structure at the beginning. The new exec is a copy of the old
one, modified to reflect the smaller number of symbols and the absence of
code.
To access the functions in the new code after we load it, we need to find all
the text symbols (like function names) that it exports. It is easiest to find
them in the object module being loaded, before the old and new symbols are
mixed together into a single binary by the linker. This is the job of
get_ops(), which scans through the symbols in the object module in much the
same way that output_exported_symbols() reads the initial binary. The
get_ops() function looks at each symbol, checking to see if it is an external
text symbol by looking at the n_type field in its nlist structure. If so,
get_ops() adds it to a linked list of new functions.
Before invoking the linker, we must allocate enough space in memory for the
loaded code, so that we can tell the linker what the new starting address will
be. Many linkers insist that the address be at a page boundary, so
allocate_space() uses the PAGE_SIZE constant to enforce that constraint.
Finally, we use system() to invoke the linker. If there was a problem, the
linker returns a nonzero value, in which case we exit. If everything
succeeded, there should now be a file called "a.out" which contains the new
code ready to be loaded.
To read in the code, open a.out and read the exec structure. Use the N_TXTOFF
macro to find the beginning of the text and data segments and read them into
the previously allocated buffer. Then get the string table and go through the
symbols one more time. Running through the symbols in the linked list, scan
through the symbol table to find each one and figure out its address (which is
stored in the n_value field of the nlist structure). To avoid scanning the
entire symbol table for each symbol, dyn_link relies on the fact that the
symbols in the linked binary's table appear in the same order as they did in
the object code's table.
With the code loaded into memory, calling it is straightforward; when the user
types in the name of a routine, dyn_link runs through the linked list looking
for it. If the name is in the list, the structure contains a pointer to the
function.
Note that the loaded code cannot use any of the standard C library calls
because we did not export them. To allow the routines to print something,
dyn_link exports a routine called exported_printf(), which the loaded code can
use. Alternatively, you could modify dyn_link to export more of its internal
symbols, including the standard C library routines it contains.
Listing Four, page 88, shows a sample run of dyn_link. The program lists the
routines exported by SAMPLE.o and lets the user execute them. The first one,
simple(), just calls exported_printf() with a message. The call_back()
function prints a message, calls the routine export_hook() inside dyn_link,
and prints the return value.


Portability


I've tested the code in Listing One under SunOS (Sun 3 and SPARC), BSD 4.3
Reno, and Dynix. Under Dynix, the linker looks for a symbol called start, so
you must export it along with the ones starting with _export. On a SPARC, you
must compile the original dyn_link binary with the -Bstatic flag. (See the
makefile in Listing Three, page 88.) To use dyn_link on a different flavor of
BSD, you may have to make a few changes. There might be special symbols that
must be defined, or the linker might add some segments to the a.out file,
which you must leave space for in your allocated buffer. Include files also
have an unfortunate habit of moving or changing their names slightly. Make sure
that the constant PAGE_SIZE is an even multiple of your system's page size.
(If you aren't sure of the value for your system, 4096 is a pretty safe bet.)
A further complication is that modern RISC processors have made life more
difficult for dynamic linkers. Because they have a variety of caching
strategies, it is possible to run afoul of the processor by trying to cross
back and forth between old and newly loaded code. David Keppel's "A Portable
Interface for On-the-Fly Instruction Space Modification" provides a detailed
discussion of the problem and its solution.
If your linker does not support dynamic linking, or you want to have more
control over the process, Wilson Ho has released a dynamic linker called "dld"
through the GNU project. Unfortunately, it does not yet support System V. GNU
code is available on many archives and bulletin boards (notably
prep.ai.mit.edu on the Internet).


References


Keppel, David. "A Portable Interface for On-the-Fly Instruction Space
Modification." Fourth International Conference on Architectural Support for
Programming Languages and Operating Systems (ASPLOS). Santa Clara, CA (April,
1991).
Ho, W. Wilson and Ronald A. Olsson. "An Approach to Genuine Dynamic Linking."
Software--Practice and Experience (April, 1991).

_DYNAMIC LINKING UNDER BERKELEY UNIX_
by Oliver Sharp


[LISTING ONE]

/* dyn_link.c - a simple dynamic linker */
#include <stdio.h>
#include <a.out.h>


/************ Declarations ************/
#define TRUE 1
#define FALSE 0
#define PAGE_SIZE 4096 /* machine-dependent constant */

/* FUNC_INDEX - a linked list of these structures stores the names and
 * addresses of dynamically linked code. */
typedef struct index {
 char *name;
 int (*function)();
 struct index *next;
} FUNC_INDEX;
/* some declarations to pacify careful compilers */
FUNC_INDEX *dynamic_load();
char *find_symbol_table();
FUNC_INDEX *get_ops();
FUNC_INDEX *reverse_list();
char *allocate_space();
char *malloc(), *calloc();

/************ Main Program ************/
main()
{
 FUNC_INDEX *dynamic_load(), *functions, *index_p;
 char buffer[BUFSIZ];
 setup_initial_binary("dyn_link");
 functions = dynamic_load("sample.o");
 for (index_p = functions ; index_p ; index_p = index_p->next)
 puts(index_p->name);
 while (TRUE) {
 printf("Enter name of routine to call (<CR> to exit): ");
 fflush(stdout);
 if ((fgets(buffer,BUFSIZ,stdin) == NULL) || (strlen(buffer) == 1))
 exit();
 buffer[strlen(buffer)-1] = '\0'; /* strip off \n */
 call_routine(buffer,functions);
 }
}
/* call_routine - given the name of a routine and a list of available ones,
 * scan and find address. If you find routine, call it with an argument equal
 * to number of times call_routine has been invoked. If not, complain. */
call_routine(name,index_p)
 char *name;
 FUNC_INDEX *index_p;
{
 static argument = 1;
 for ( ; index_p ; index_p = index_p->next) {
 if (strcmp(index_p->name,name) == 0) {
 (index_p->function)(argument++);
 return;
 }
 }
 printf("Sorry, there is no function called '%s'\n",name);
}
/************ Setting Up ************/
/* setup_initial_binary - read our image, creating a code-less binary a.out
 * with only those symbols that we want the new code to be able to see. */
setup_initial_binary(my_name)

 char *my_name;
{
 FILE *fp, *outp;
 if ((outp = fopen("a.out","w+")) == NULL) {
 fprintf(stderr,"Can't open a.out for destructive writing\n");
 exit(-1);
 }
 /* leave room for the exec at the front */
 fseek(outp,(long) sizeof(struct exec),0);
 if ((fp = fopen(my_name,"r")) == NULL) {
 fprintf(stderr,"That's funny, I thought I was called %s\n",my_name);
 exit(-1);
 }
 output_exported_symbols(fp,outp);
 fclose(fp);
 fclose(outp);
}
/* output_exported_symbols - run through symbols in binary, looking for ones
 * that start with "_export". Put them in the output file's symbol table. */
output_exported_symbols(fp,outp)
 FILE *fp, *outp;
{
 struct exec the_exec, fake_exec;
 struct nlist symbol;
 char *binary_strings, *name;
 char name_buffer[BUFSIZ];
 int i, new_table_size = sizeof(int), how_many_symbols = 0;
 if (!fread(&the_exec,sizeof(the_exec),1,fp)) {
 fprintf(stderr,"I can't read my own header structure\n");
 exit(-1);
 }
 binary_strings = find_symbol_table(&the_exec,fp);
 for (i = 0 ; i < (the_exec.a_syms / sizeof(struct nlist)) ; i++) {
 if (!fread(&symbol,sizeof(symbol),1,fp)) {
 fprintf(stderr,"Error reading symbol #%d\n",i);
 exit(-1);
 }
 if (!symbol.n_un.n_strx) /* symbol doesn't have a name */
 continue;
 name = binary_strings + symbol.n_un.n_strx - sizeof(int);
#ifdef DYNIX
 if ((strncmp(name,"_export",7) == 0) || (strcmp(name,"start") == 0)) {
#else
 if (strncmp(name,"_export",7) == 0) {
#endif
 symbol.n_un.n_strx = new_table_size; /* fix offset */
 how_many_symbols++;
 if (new_table_size-sizeof(int) + strlen(name) >= BUFSIZ) {
 fprintf(stderr,"Error: string table overflowed\n");
 exit(-1);
 }
 strcpy(name_buffer + new_table_size-sizeof(int),name);
 new_table_size += strlen(name) + 1; /* keep track of size of table */
 fwrite(&symbol,sizeof(symbol),1,outp);
 }
 }
 /* write out the string table */
 fwrite(&new_table_size,sizeof(int),1,outp);
 fwrite(name_buffer,new_table_size-sizeof(int),1,outp);

 /* rewind and write out the proper exec structure */
 rewind(outp);
 bcopy(&the_exec,&fake_exec,sizeof(fake_exec));
 fake_exec.a_magic = OMAGIC; /* simple memory management */
 fake_exec.a_syms = how_many_symbols * sizeof(struct nlist);
 fake_exec.a_text = fake_exec.a_data = fake_exec.a_bss = 0;
 fwrite(&fake_exec,sizeof(fake_exec),1,outp);
}
/************ Doing the Dynamic Link ************/
/* dynamic_load - figure out how big the file is, allocate a buffer, call the
 * linker, and load resulting code. Return a linked list of structures giving
 * name and address of every function in new code that can be called. */
FUNC_INDEX *
dynamic_load(object_file)
 char *object_file;
{
 FILE *ofp;
 char *buffer;
 FUNC_INDEX *index;
 int how_big;
 if ((ofp = fopen(object_file,"r")) == NULL) {
 fprintf(stderr,"I can't read the file %s\n",object_file);
 exit(-1);
 }
 index = get_ops(ofp);
 buffer = allocate_space(ofp,&how_big); /* allocate space for new code */
 fclose(ofp);
 if (!run_the_linker(object_file,buffer)) {
 fprintf(stderr,"Linker reports errors, exiting\n");
 exit(-1);
 }
 read_code(buffer,how_big,index);
 return(index);
}
/* run_the_linker - given a filename, run "ld" on it to create an a.out file.
 * Return TRUE if the link succeeds, FALSE otherwise. */
run_the_linker(object_file,buffer)
 char *object_file, *buffer;
{
 char command[BUFSIZ];
 sprintf(command,"ld -N -A a.out -T %x %s",buffer,object_file);
 return(system(command) ? FALSE : TRUE); /* system returns true for error */
}
/************ Manipulating Files ************/
/* find_symbol_table - read the string table, seek the file pointer to
 * the beginning of the symbol table, and return the string table. */
char *
find_symbol_table(the_exec,fp)
 struct exec *the_exec;
 FILE *fp;
{
 char *string_table;
 int table_size;
 fseek(fp,((long) N_STROFF(*the_exec)),0);
 if (fread(&table_size,sizeof(int),1,fp) != 1) {
 fprintf(stderr,"couldn't read string table size\n");
 exit(-1);
 }
 if ((string_table = malloc(table_size)) == NULL) {

 fprintf(stderr,"couldn't allocate space for string table\n");
 exit(-1);
 }
 if (fread(string_table,table_size-sizeof(int),1,fp) != 1) {
 fprintf(stderr,"couldn't read string table\n");
 exit(-1);
 }
 fseek(fp,((long) N_SYMOFF(*the_exec)),0);
 return(string_table);
}
/* get_ops - given a pointer to beginning of a .o file, build a FUNC_INDEX
 * structure with names of external text identifiers in symbol table. */
FUNC_INDEX *
get_ops(fp)
 FILE *fp;
{
 char *string_table;
 struct nlist symbol;
 struct exec the_exec;
 int num_symbols, i;
 FUNC_INDEX *index = NULL, *temp;
 rewind(fp);
 if (fread(&the_exec,sizeof(the_exec),1,fp) != 1) {
 printf("couldn't read file header\n");
 return(NULL);
 }
 num_symbols = the_exec.a_syms / sizeof(struct nlist);
 string_table = find_symbol_table(&the_exec,fp);
 /* run through the symbol table, making an index struct for each
 * external text identifier */
 for (i = 0 ; i < num_symbols ; i++) {
 if (fread(&symbol,sizeof(symbol),1,fp) != 1) {
 fprintf(stderr,"Can't read symbol #%d\n",i);
 exit(-1);
 }
 if (symbol.n_type != (N_TEXT | N_EXT)) /* check if exported function */
 continue;
 if ((temp = (FUNC_INDEX *) calloc(1,sizeof(FUNC_INDEX))) == NULL) {
 fprintf(stderr,"Couldn't allocate a new function index structure\n");
 exit(-1);
 }
 /* get the name, adding 1 to skip the initial underscore */
 temp->name = string_table + symbol.n_un.n_strx - sizeof(int) + 1;
 temp->next = index;
 index = temp;
 }
 /* since we insert at the front of the list, we need to reverse it to
 * maintain symbols in their original order */
 if (index)
 index = reverse_list(index);
 return(index);
}
/* allocate_space - given a pointer to an object module, figure out how big
 * it is and return a buffer to hold it. Leave the size in buffer_size. */
char *
allocate_space(fp, buffer_size)
 FILE *fp;
 int *buffer_size;
{

 struct exec the_exec;
 int size, buffer;
 rewind(fp);
 if (fread(&the_exec,sizeof(struct exec),1,fp) != 1) {
 fprintf(stderr,"I couldn't read the exec header of the object module\n");
 exit(-1);
 }
 *buffer_size = size = the_exec.a_text + the_exec.a_data + the_exec.a_bss;
 printf("Sizes: text = 0x%lx, data = 0x%lx, bss = 0x%lx. Total = 0x%x,%d\n",
 the_exec.a_text,the_exec.a_data,the_exec.a_bss,size,size);
 if (size & (PAGE_SIZE-1)) /* if not on page boundary, leave extra space */
 size += PAGE_SIZE;
 /* allocate the buffer; use calloc() so bss is automatically zeroed */
 if ((buffer = (int) calloc(size,1)) == NULL) {
 fprintf(stderr,"Couldn't allocate %d byte buffer\n",size);
 exit(-1);
 }
 /* return the address rounded up to the nearest page */
 return((char *) ((buffer + (PAGE_SIZE - 1)) & ~(PAGE_SIZE - 1)));
}
/* read_code - read code and data from fp and put it in buffer (which is
 * "size" bytes long). Fill in new addresses of routines in function index. */
read_code(buffer,size,index)
 char *buffer;
 int size;
 FUNC_INDEX *index;
{
 struct exec the_exec;
 struct nlist symbol;
 int code_size, num_symbols, symbols_so_far = 0;
 char *string_table, *symbol_name;
 FILE *fp;
 if ((fp = fopen("a.out","r")) == NULL) {
 fprintf(stderr,"I can't read a.out\n");
 exit(-1);
 }
 if (fread(&the_exec,sizeof(the_exec),1,fp) != 1) {
 fprintf(stderr,"Couldn't read the new image's header\n");
 exit(-1);
 }
 fseek(fp,(long) N_TXTOFF(the_exec),0);
 code_size = the_exec.a_text + the_exec.a_data;
 if ((code_size + the_exec.a_bss) > size) {
 fprintf(stderr,"Allocated buffer is %d bytes, need %d\n",size,
 (code_size + the_exec.a_bss));
 exit(-1);
 }
 if (!fread(buffer,code_size,1,fp)) {
 fprintf(stderr,"couldn't load the code\n");
 exit(-1);
 }
 string_table = find_symbol_table(&the_exec,fp);
 num_symbols = the_exec.a_syms / sizeof(struct nlist);
 while (index) {
 if (++symbols_so_far > num_symbols) {
 fprintf(stderr,"Ran out of symbols while looking for %s\n",index->name);
 exit(-1);
 }
 if (fread(&symbol,sizeof(struct nlist),1,fp) != 1) {

 fprintf(stderr,"Died reading symbol\n");
 exit(-1);
 }
 if (symbol.n_un.n_strx == 0) /* doesn't have a name */
 continue;
 /* if we have found the next name in the index list, stash address */
 symbol_name = string_table + symbol.n_un.n_strx - sizeof(int) + 1;
 if (strcmp(symbol_name,index->name) == 0) {
 index->function = (int (*)()) symbol.n_value;
 index = index->next;
 }
 }
 fclose(fp);
}
/************ Utility Routines ************/
/* reverse_list - run down a non-empty linked list, reversing its order.
 * Return the new head (i.e. the old tail). */
FUNC_INDEX *
reverse_list(list)
 FUNC_INDEX *list;
{
 FUNC_INDEX *last = list, *temp;
 list = list->next;
 last->next = NULL;
 while (list) {
 temp = list->next;
 list->next = last;
 last = list;
 list = temp;
 }
 return(last);
}
/************ Exported Functions ************/
/* export_hook - this function starts with "export", so it will be visible
 * to loaded functions. */
export_hook(string,value)
 char *string;
 int value;
{
 printf(" ** this is export_hook, and I was called with '%s' **\n",string);
 return(value * 2);
}
/* exported_printf - the loaded functions don't have access to the libraries
 * so we define a printf stub for them to use. */
exported_printf(string,number)
 char *string;
 int number;
{
 printf(string,number);
}






[LISTING TWO]

/* sample.c - a few functions to test the dynamic loader */

#include <stdio.h>
simple(value)
 int value;
{
 exported_printf("I'm a simple routine and got argument: %d\n",value);
}
call_back(value)
 int value;
{
 exported_printf("This is call back; I got argument %d\n",value);
 value = export_hook("test string",value);
 exported_printf(" The final value is %d\n",value);
}






[LISTING THREE]

# Makefile for dynamic linker dyn_link - choose one of the targets
CFLAGS = -g
# Use this for most systems (BSD Reno, SunOS without dynamic libraries, etc.)
generic: dyn_link.o sample.o
 $(CC) $(CFLAGS) -o dyn_link dyn_link.o
# Dynamic linking doesn't work correctly on systems running SunOS with
# dynamic libraries unless you link the original binary with the -Bstatic flag.
sundl: dyn_link.o sample.o
 $(CC) $(CFLAGS) -Bstatic -o dyn_link dyn_link.o
# For dynix machines, you need the symbol "start" in the constructed binary
dynix: dyn_link.c sample.o
 $(CC) $(CFLAGS) -DDYNIX -o dyn_link dyn_link.c





[LISTING FOUR]

Sizes: text = 0x5c, data = 0x78, bss = 0x0. Total = 0xd4,212
simple
call_back
Enter name of routine to call (<CR> to exit): simple
I'm a simple routine and got argument: 1
Enter name of routine to call (<CR> to exit): call_back
This is call back; I got argument 2
 ** this is export_hook, and I was called with 'test string' **
 The final value is 4
Enter name of routine to call (<CR> to exit): simple
I'm a simple routine and got argument: 3
Enter name of routine to call (<CR> to exit): call_back
This is call back; I got argument 4
 ** this is export_hook, and I was called with 'test string' **
 The final value is 8
Enter name of routine to call (<CR> to exit):


































































May, 1993
A PORTABLE LIBRARY FOR EXECUTING CHILD PROCESSES


A multi-environment library for C programmers




Matt Weisfeld


Matt is currently employed by the Allen-Bradley Company in Highland Heights,
Ohio. He is responsible for the design and development of test software on
VAX/VMS, UNIX, DOS, and other platforms. This article is an excerpt from his
book, Building and Testing Portable Libraries in C, to be published by QED.
You can reach him on CompuServe at 71620,2171.


One of the primary advantages of the C programming language is the ease with
which code can be ported between various platforms. While the ANSI standard
provides the vehicle for this portability, many useful system calls are
specific to certain platforms, and consequently reduce portability. One group
of system calls not bound to a standard involves the creation and execution of
child processes. Despite their nonstandard, nonportable nature, these system
calls provide a powerful tool for managing child processes. This article
presents a library of functions that make the creation and execution of child
processes portable across many different computer platforms.


Portability Issues


No matter how uniform a language design attempts to be, some functions simply
can't be performed on certain systems. These limitations can be the result of
either hardware or software constraints. For example, both VMS and UNIX are
multitasking; thus, a parent process can continue to execute even while one of
its children is executing. DOS, on the other hand, is not multitasking, so a
parent and child cannot execute concurrently. Because of differences like
these, it is unreasonable to expect one portable call to handle all
possibilities for all conceivable systems.
However, it is possible to construct libraries that perform limited
functionality common to many operating systems. In this context, the objective
is to have the parent create the child process, wait for a return code from
the child and then, based on the result of the return code, continue execution
(capabilities shared by VMS, UNIX, and DOS). The intent is to make the
function call look portable to the application program, regardless of the
compiler used.


Child-process Concepts


UNIX, DOS, and VMS all require that sufficient memory exist before a child
process can be created. If this isn't the case, the system commands will fail.
Memory management is handled differently, depending on the system call
invoked. For VMS and UNIX, creating a child process is handled by a
combination of system calls. The UNIX fork() command copies the parent's
memory space exactly into newly acquired memory space. At this point two
identical copies of the parent exist. (There are times when it is desirable
for two identical processes to execute concurrently.) To initiate a new,
totally distinct child, the exec() family of commands is used. An exec()
command actually overlays the process space of the duplicated parent with a
new process, thus resulting in two distinct processes. There are many flavors
of the exec() command and they will all be investigated later.
VMS uses a version of the fork() command called vfork(). The fork() command
can be incredibly inefficient as it copies each memory address of the parent.
No process is actually created by vfork(), just some setup functions necessary
for a subsequent call to an exec(). fork() can be used without an exec(), but
vfork() must be combined with a call to exec(). Thus, when it is not necessary
to have two identical processes running asynchronously, vfork() is more
efficient. Some UNIX systems also have a vfork() command. However, one man
page stated that the vfork() command will be eliminated at some point in the
future because the fork() command is becoming much more efficient. The vfork()
command will be used for the remainder of this discussion because it is
available on
both UNIX and VMS and it satisfies the functionality needs of the portable
library.
If the parent has no concern about the child's fate, it can proceed as if the
child does not exist. However, if the parent needs a return code from the
child, it must execute a wait() command. The wait() command will suspend the
parent until the child completes and passes back a return value. (If a parent
does not execute a wait(), it may still communicate with a child via pipes and
mailboxes. These data structures have no bearing on the fork()/exec() and are
beyond the scope of this article.)
The basic use of fork()/exec()/wait() is shown in pseudocode in Example 1. The
format is a bit confusing at first. The call to fork() actually returns values
at two different times. When fork() is initially called, it returns a 0 and
enters the else part of the example where the exec() is invoked. If the exec()
is successful, control is passed back to the point of the fork(), which now
returns a nonzero value (the child's process ID). The parent can then either
ignore the child and continue on its way
or execute a wait() command and suspend until the child completes.
Example 1: Pseudocode illustrating fork/exec/wait.

 if ( (status = fork()) != 0) {
 /* parent code */
 if (status < 0)
 error (fork failed);
 if (wait (&child_status) == -1)
 error (wait failed);
 } else {
 /* exec the child */
 if (exec () == -1)
 error (exec failed);
 }

A UNIX exec() command can determine if a child-program image exists. Thus, if
a child executable cannot be found, the exec() will fail at the point of the
call. VMS cannot determine if the image exists. Therefore, the exec() will
appear to succeed even though the image does not exist. The fact that the
creation of the child process failed will not be reported until the status
from the wait() command is inspected.
The DOS exec() command behaves differently than its counterparts on VMS and
UNIX. Because DOS does not support multitasking, it is impossible for both the
parent and the child to be running at the same time. The DOS exec() command
terminates the parent when the child is created. When the exec() command is
called, DOS actually overlays the parent process with that of the child. The
only way the parent can regain control is if the exec() on the child fails.
Once the child is running, the parent process cannot be recovered.
DOS also provides a facility for the parent to suspend itself while the child
initiates, and then re-awaken when the child process completes. The commands
to accomplish this are called spawn() commands. DOS must find enough space for
the child to occupy, or the spawn() will fail. A sample spawn() format is
shown in Example 2.
Example 2: Using spawn().

if ((status = spawn ()) == -1 )
 error (spawn failed);

The DOS spawnv() command requires less code to perform a function similar to
the fork()/exec()/wait(). However, the fork()/exec()/wait() affords much more
flexibility. The remainder of this discussion will focus on the VMS/UNIX
exec() and the DOS spawn() family of commands.


Command Line and Environment Variables



Two types of parameters can be passed via the exec()/spawn() commands:
mandatory and optional. The mandatory parameter is the argument list by which
the child is called. The optional parameter is the environment-variable list.
The child is called in the same way any C program is called from the command
line. Recall that when accepting arguments from a command line, a program's
main procedure looks like Example 3.
Example 3: Main procedure when accepting arguments from a command line.

 main (argc, argv, envp);
 int argc;
 char *argv [];
 char *envp[];

The environment variables (envp) are optional. Under normal circumstances, a
child inherits most of the parent's environment attributes. Using the
environment variables in the exec()/spawn() command allows the child to be
initiated within a different environment. The variables that can be changed
are HOME, TERM, PATH, and USER.


The Different Flavors of exec()/spawn()


In the previous section I mentioned that the use of the environment variable
is optional. The manner in which the environment variable is utilized depends
upon which exec()/spawn() command is used. The two primary categories are
differentiated by the way the argument list is passed. The execv()/spawnv()
command format is shown in Example 4(a) .
Example 4: (a) execv()/spawnv() command format; (b) execl() format; (c)
execl() function call; (d) hard coding the command line; (e) calling getenv
using execv(); (f) argument lists when executables are kept in another
directory; (g) command to utilize path information; (h) command that allows
parent to change environment variables of the child.

 (a)

 execv (argv[0], argv);
 spawnv(P_WAIT, argv[0], argv);

 (b)

 execl (command, command, arg1, ..., argN, NULL);

 (c)

 execl ("command", "command", "-x", "-c", "-v", NULL);
 spawnl (P_WAIT, "command", "command", "-x", "-c", "-v", NULL);

 (d)

 char *argv[] = {"command", "-x", "-c", "-v", NULL};

 execv (argv[0], argv);
 spawnv (P_WAIT, argv[0], argv);

 (e)
 char *argv[] = {"getenv", NULL};

 execv(argv[0], argv);
 spawnv(P_WAIT, argv[0], argv);

 (f)

 char *argv[] = {"{test.exe}getenv", NULL}; /* for VMS */
 char *argv[] = {"/test/bin/getenv", NULL}; /* for UNIX */
 char *argv[] = {"C:\\bin\\getenv", NULL}; /* for DOS */

 (g)

 execvp(argv[0], argv);
 spawnvp(P_WAIT, argv[0], argv);

 (h)


 char *argv[] = {"getenv", NULL};

 char *envp[] = { "HOME=/test/home",
 "TERM=vt100",
 "PATH=/test/bin",
 "USER=test",
 NULL };

 execve(argv[0], argv, envp);
 spawnve(P_WAIT, argv[0], argv, envp);

The DOS command spawnv() has one more parameter than execv(). The Turbo C
Reference Guide states that the first parameter passed with spawnv() specifies
whether or not the parent should wait for the child to return, either P_WAIT
or P_NOWAIT. A later note explains that P_NOWAIT is not supported
(understandable because DOS cannot support two concurrent processes). This
parameter is provided for future use; however, it still needs to be included.
In all flavors of execv()/spawnv() the argument list argv must be built and
its pointer passed. (From this point on, any reference to an exec() command
will cover the equivalent spawn() command.) The command argv[0] is the first
parameter. Notice that the command is actually passed twice, once as a
standalone parameter and then as the first member of the argument list.
The argument list can also be passed explicitly, that is, as hard-coded
strings. This command is called execl() and its format is shown in Example
4(b). Again, notice that the command is passed twice, and the list is
terminated by a null. For instance, Example 4(c) invokes the execl() function
using command -x -c -v as the command line.
There are uses for the execl() family of commands; however, the hard-coding
aspects of the command make it less flexible than the execv() commands. In
fact, if you're intent on hard coding the command line, execv() can still be
used; see Example 4(d) . Because of this, only the execv() functions will be
discussed further.
When the execv() command is called, the current directory is searched for the
executable specified by argv[0]. If it is not found, an error will result. If
the executable resides in another directory, the entire path must be
explicitly provided. To illustrate, a program called "getenv," which simply
prints out the current value of the environment variables, is called by
execv(); see Example 4(e).
The getenv executable must reside in the current directory or the command will
fail. However, assume that all program executables are kept in a directory
called [test.exe] (for VMS), /test/bin (for UNIX), or C:\bin (for DOS). The
argument lists would look like Example 4(f).
Explicitly providing the path information is not always the most elegant
solution. There is a way to take advantage of the execution paths UNIX and DOS
provide. VMS does not provide a path, but a path variable called VAXC$PATH can
be set. The command to utilize the path information is called execvp(), which
looks similar to the regular execv(); see Example 4(g).
The only difference is that if the executable is not found in the current
directory, the PATH environment variable is searched. In VMS, the VAXC$PATH
variable can be set either at the command line or in the login.com file as
define VAXC$PATH [test.exe].
The last version of execv() is called execve(). This command allows the parent
to change the environment variables of the child; see Example 4(h). This
command will create an environment for the child without changing the
environment of the parent. All of these execv() commands have an execl()
counterpart.
DOS also provides a command called spawnvpe(). This command combines the
functionality of execve() and execvp(). VMS and UNIX do not provide this
function, but code can be added to the library to include this functionality.


Building a Portable execv()


Because the execv() command is the basis for all of the other extensions, this
command will be used as an illustration. The code for the complete library is
in Listing One (page 90). The intent is to create a library command, called
xexecv, that will handle all of the overhead involved in creating and
executing a child process. In essence, the only code required in an
application program should be: status = xexecv (argv[0], argv);.
This one line of code will fork() and exec() (in UNIX and VMS) or spawn() (in
DOS), execute the command, and return the proper status.
The first order of business is to determine what platform the program is
running on. This is defined in the header file, exec.h, presented in Listing
Two (page 90). Portability issues can be addressed here, such as whether or
not the platform is ANSI standard. Once this information is known, either
defined on the compilation line or in the exec.h file itself, the ifdefs in
the code can handle the portability issues.
After xexecv is called, the first ifdef determines whether or not the platform
is DOS. If it is, only the spawnv() command is executed. If it is not DOS, the
fork(), execv(), and wait() calls are made. In any event, the status--whether
the actual child return status or an error code--is passed back to the calling
program.


Special Portability Issues


Besides the fact that execv() and spawnv() are different commands, there are
other portability issues to consider. One, the way VMS uses VAXC$PATH, has
already been mentioned. Further portability concerns pertain to compiler/
operating-system bugs, return codes, and word alignment. As of this writing,
there is a bug registered against the VMS command execvp(). The problem is
that the proper value of VAXC$PATH is not returned. The way to get around this
is to use the getenv() command to obtain the information. Then the path can be
prepended to the command. The code workaround appears in the xexecvp function
in the library. It is somewhat curious that the VMS command xexeclp does work
properly.
Architecture also plays a role in the lack of portability. The program return
codes for VMS and DOS differ from those of some of the UNIX platforms, at
least the ones already ported to, SunOS and HP-UX. These UNIX systems place
the return code in the high-order byte, so, to conform with the other
platforms, the contents need to be shifted as: child_status = child_status >>
8;.
The intent of the return codes poses a problem. VMS, UNIX, and DOS have
different ways of interpreting returned information. This portability issue is
dealt with in the error-handling section.
The three operating systems/compilers referenced by this article are not the
only ones that can be incorporated into the library. Any operating system that
supports the creation of child processes and conforms to the functionality
discussed can be added.


Error Handling


Error handling is the responsibility of the calling routine, not the execv()
library (which returns only status information). This status information can
be either an error code or the status of a successfully invoked child process.
The error codes are defined in the include file exec.h. It is up to the
calling program to process the return status and report any anomalies. The
error codes can be tailored to a specific application.


Sample Application


To illustrate how the execv() library works, a short example is presented. The
example program is called child.c; see Listing Three (page 90). This short
program tests each flavor of the execv() library: xexecv, xexecve, xexecvp.
The data structure envp holds information that will be used to alter the
environment variables. The routine convert(), which is presented in Listing
Four (page 92), takes a (char*) command line as input and converts it into a
(char**) argv structure. A malloc, used within the convert() routine, acquires
the appropriate amount of memory for the argv structure. In this example, a
command called getenv is converted into an argv list, and then is passed to
the appropriate execv(). A status is passed back and checked by the routine
check_status(). When the process is complete, the memory used by the argv
structure is released.
The program getenv is presented in Listing Five (page 92) and will compile on
all three of the target systems. The routine calls the library function
getenv() and displays the system's environment information.
To utilize the library in an application, simply compile the modules execv.c,
child.c, and convert.c, and then link all three into an executable (in this
case called "child"). Remember, either a compiler switch or an entry in the
exec.h file must define the target platform.
Sample output from the program child is presented in Listing Six (page 92).
Notice the output from the call to execve(). The environment variables have
been altered and reflect the data defined in the data structure envp.


Conclusion



Despite the fact that C is a very portable language, portability is still a
problem in some areas. System calls, such as the execv() and spawn() family of
commands, can be specific to an individual platform. When used, these commands
may obstruct portability. These problems can be resolved with code libraries
that, when linked into application programs, free the programmer from dealing
with these system concerns. With the vast number of platforms and compilers
now available, it is helpful to have these libraries of reusable code ready at
hand.

_A PORTABLE LIBRARY FOR EXECUTING CHILD PROCESSES_
by Matt Weisfeld


[LISTING ONE]

#include <stdio.h>
#include <stdlib.h>     /* getenv(), malloc() */
#include <string.h>     /* strlen(), strcpy(), strcat() */
#ifdef DOS
#include <process.h>
#endif
#include "exec.h"

int _execv(char **argv)
{
 int status, child_status;

 /* fork off the child process */

 /* vfork returns values on two different occasions :
 1) It returns a 0 the first time it is called, before
 the exec function is called.
 2) After the exec call is made, control is passed
 back to the parent at the point of the vfork.
 */
#ifdef DOS
 if ((child_status = spawnv(P_WAIT,argv[0],argv)) == -1) {
 return (BADEXEC);
 }
#else
 if ((status = vfork()) != 0) {
 /* after the exec, control is returned here */
 if (status < 0) {
 printf ("Parent: child failed\n");
 return (BADFORK);
 } else {
 printf ("Parent - Waiting for child\n");
 if ((status=wait(&child_status)) == -1) {
 printf ("Parent: wait failed\n");
 return (BADWAIT);
 }
 }
 } else { /* if vfork returns a 0 */
 /* execute command after the initial vfork */
 printf ("Parent - Starting Child\n");
 if ((status=execv(argv[0], argv)) == -1) {
 printf ("Parent: execv on child failed\n");
 return (BADEXEC);
 }
 }
#endif

#ifdef UNIX
 child_status = child_status >> 8;
#endif
 return (child_status);
}

int _execvp(char **argv)
{
 /* fork off the child process */
 /* vfork returns values on two different occasions :
 1) It returns a 0 the first time it is called, before
 the exec function is called.
 2) After the exec call is made, control is passed
 back to the parent at the point of the vfork.
 */
 int status, child_status;
 int pathlen,comlen;

 char *path;
 char *command;
 /* This code has to be here; if it were within the vfork domain, any
 error return code would be interpreted as an error from the child. */

#ifdef VMS
 if ( (path = getenv("CHILD$PATH")) == NULL) {
 printf ("Error: CHILD$PATH not defined\n");
 return(BADPATH);
 }
 pathlen = strlen (path);
 comlen = strlen (argv[0]);

 if ( (command = malloc(pathlen+comlen+1)) == NULL) {
 printf ("Error: malloc failed\n");
 return(BADMALLOC);
 }
 strcpy (command, path);
 strcat (command, argv[0]);
 argv[0] = command;

 printf ("command = %s\n", command);
#endif

#ifdef DOS
 if ((child_status = spawnvp(P_WAIT,argv[0],argv)) == -1) {
 return (BADEXEC);
 }
#else
 if ((status = vfork()) != 0) {
 /* after the exec, control is returned here */
 if (status < 0) {
 printf ("Parent: child failed\n");
 return (BADFORK);
 } else {
 printf ("Parent - Waiting for child\n");

 if ((status=wait(&child_status)) == -1) {
 printf ("Parent: wait failed\n");
 return (BADWAIT);
 }
 }
 } else { /* if vfork returns a 0 */
 /* execute command after the initial vfork */
 printf ("Parent - Starting Child\n");

#ifdef VMS

 if ((status=execv(argv[0], argv)) == -1) {
#else
 if ((status=execvp(argv[0], argv)) == -1) {
#endif
 printf ("Parent: execv on child failed\n");
 return (BADEXEC);
 }
 }
#endif

#ifdef UNIX
 child_status = child_status >> 8;
#endif
 return (child_status);
}
int _execve(char **argv, char **envp)
{
 int status, child_status;
 /* fork off the child process */

 /* vfork returns values on two different occasions :
 1) It returns a 0 the first time it is called, before
 the exec function is called.
 2) After the exec call is made, control is passed
 back to the parent at the point of the vfork.
 */

#ifdef DOS
 if ((child_status = spawnve(P_WAIT,argv[0],argv,envp)) == -1) {
 return (BADEXEC);
 }
#else
 if ((status = vfork()) != 0) {
 /* after the exec, control is returned here */
 if (status < 0) {
 printf ("Parent: child failed\n");
 return (BADFORK);
 } else {
 printf ("Parent - Waiting for child\n");
 if ((status=wait(&child_status)) == -1) {
 printf ("Parent: wait failed\n");
 return (BADWAIT);
 }
 }
 } else { /* if vfork returns a 0 */
 /* execute command after the initial vfork */
 printf ("Parent - Starting Child\n");

 if ((status=execve(argv[0], argv, envp)) == -1) {
 printf ("Parent: execve on child failed\n");
 return (BADEXEC);
 }
 }
#endif

#ifdef UNIX
 child_status = child_status >> 8;
#endif
 return (child_status);

}






[LISTING TWO]

/* One of the following must be defined here or on the compile line
#define DOS 1
#define UNIX 1
*/

#define VMS 1

#define BADFORK -1
#define BADEXEC -2
#define BADWAIT -3
#define BADPATH -4
#define BADMALLOC -5






[LISTING THREE]

#include <stdio.h>
#include <stdlib.h>     /* free() */
#include "exec.h"

char **convert(char *ptr);
int check_status(int status);

char envp1[50] = "HOME=sys$sysdevice:[weisfeld]";
char envp2[50] = "TERM=vt100";
char envp3[50] = "PATH=sys$sysdevice:[weisfeld.exe]";
char envp4[50] = "USER=test";

int main()
{
 int status;

 char **argv;
 char *envp[6];

 envp[0] = envp1;
 envp[1] = envp2;
 envp[2] = envp3;
 envp[3] = envp4;
 envp[4] = 0;


 printf ("CALL EXECV\n");
 argv = convert("getenv");
 status = _execv(argv);
 check_status(status);
 free (argv);

 printf ("CALL EXECVE\n");

 argv = convert("getenv");
 status = _execve(argv, envp);
 check_status(status);
 free (argv);

 printf ("CALL EXECVP\n");
 argv = convert("getenv");
 status = _execvp(argv);
 check_status(status);
 free (argv);

 return 0;
}

int check_status(int status)
{
 switch(status) {
 case BADFORK:
 printf ("Error: bad fork\n");
 break;
 case BADEXEC:
 printf ("Error: bad exec\n");
 break;
 case BADWAIT:
 printf ("Error: bad wait\n");
 break;
 case BADPATH:
 printf ("Error: bad path\n");
 break;
 case BADMALLOC:
 printf ("Error: bad malloc\n");
 break;
 default:
 printf ("Child status = %d\n", status);
 break;
 }
}






[LISTING FOUR]

#include <stdio.h>
#include <stdlib.h>
#include <string.h>     /* strcpy() */
#include <ctype.h>

#define EXIT_NORMAL 1
#define EXIT_ERROR 5

/* convert line to 'c' like argv */
char **convert(char *ptr)
{
 static char buffer[100];
 char *bufptr;
 int offset = 0;

 int counter = 0;
 char **argv;

 strcpy (buffer, ptr);

 bufptr = buffer;

 for (;;) {
 /* bypass white space */
 while (isspace(*bufptr)) {
 bufptr++;
 }
 /* if we have a null string break out of loop */
 if (*bufptr == '\0')
 break;
 counter++;
 /* continue until white space or string terminator is found */
 while ((!isspace(*bufptr)) && (*bufptr != '\0')) {
 bufptr++;
 }
 }
 /* get space for argument */

 if ((argv = (char **)calloc(counter+1, sizeof(char *))) == (char **) 0) {
 printf ("Error: no space available\n");
 return ((char **) 0);
 }
 bufptr = buffer;
 /* build argument */
 for (;;) {
 while (isspace(*bufptr)) {
 *bufptr = '\0';
 bufptr++;
 }
 if (*bufptr == '\0')
 break;
 *(argv + offset++) = bufptr;
 while ((!isspace(*++bufptr)) && (*bufptr != '\0'));
 }
 /* return all the arguments */
 return(argv);
}






[LISTING FIVE]

#include <stdio.h>
#include <stdlib.h>     /* getenv() */

#define EXIT_NORMAL 1

int main()
{
 char *logical;

 logical = getenv("HOME");

 printf ("HOME = %s\n", logical);
 logical = getenv("TERM");
 printf ("TERM = %s\n", logical);
 logical = getenv("PATH");
 printf ("PATH = %s\n", logical);
 logical = getenv("USER");
 printf ("USER = %s\n", logical);

 return(EXIT_NORMAL);
}





[LISTING SIX]

CALL EXECV
Parent - Starting Child
HOME = sys$sysdevice:[weisfeld]
TERM = vt300-80
PATH = sys$sysdevice:[weisfeld.test]
USER = WEISFELD
Child status = 1
CALL EXECVE
Parent - Starting Child
HOME = sys$sysdevice:[weisfeld]
TERM = vt100
PATH = sys$sysdevice:[weisfeld.exe]
USER = test
Child status = 1
CALL EXECVP
command = [WEISFELD.EXE]getenv
Parent - Starting Child
HOME = sys$sysdevice:[weisfeld]
TERM = vt300-80
PATH = sys$sysdevice:[weisfeld.test]
USER = WEISFELD
Child status = 1


Example 1:

if ( (status = fork()) != 0) {
 /* parent code */
 if (status < 0)
 error (fork failed);
 if (wait(&child_status) == -1)
 error (wait failed);
} else {
 /* exec the child */
 if (exec() == -1)
 error(exec failed);
}



Example 2:


if ((status = spawn ()) == -1 )
 error (spawn failed);


Example 3:


main(argc, argv, envp)
int argc;
char *argv[];
char *envp[];

Example 4:

(a)

execv(argv[0], argv);
spawnv(P_WAIT, argv[0], argv);


(b)

execl(command, command, arg1, ..., argN, NULL);


(c)

execl ("command", "command", "-x", "-c", "-v", NULL);
spawnl (P_WAIT, "command", "command", "-x", "-c", "-v", NULL);


(d)

char *argv[] = {"command", "-x", "-c", "-v", NULL};

execv(argv[0], argv);
spawnv(P_WAIT, argv[0], argv);


(e)

char *argv[] = {"getenv",NULL};

execv(argv[0], argv);
spawnv(P_WAIT, argv[0], argv);


(f)

char *argv[] = {"[test.exe]getenv",NULL}; /* for VMS */
char *argv[] = {"/test/bin/getenv",NULL}; /* for UNIX */
char *argv[] = {"C:\\bin\\getenv",NULL}; /* for DOS */


(g)

execvp(argv[0], argv);
spawnvp(P_WAIT, argv[0], argv);



(h)

char *argv[] = {"getenv",NULL};

char *envp[] = { "HOME=/test/home",
 "TERM=vt100",
 "PATH=/test/bin",
 "USER=test",
 NULL };

execve(argv[0], argv, envp);
spawnve(P_WAIT, argv[0], argv, envp);





May, 1993
FLASH FILE SYSTEMS


A new approach to storing information


 This article contains the following executables: FLASH.ARC


Drew Gislason


Drew is vice president of research and development at Datalight and can be
contacted at 307N. Olympic Ave., Suite 201, Arlington, WA 98223 or via
CompuServe at 70671,3272.


Flash memory promises radical changes in the laptop, palmtop, and
embedded-systems markets. Because Flash memory requires little power to write
information into its cells--and none at all to retain it--long-life,
battery-operated systems with mass storage are possible.
Flash memory, typically packaged as cards (see Figure 1) as defined by the
Personal Computer Memory Card International Association (PCMCIA), plugs into
slots in the computer. These cards store from 1 to 8 megabytes of data, with
64-Mbyte capabilities on the horizon. One of Flash cards' big advantages is
that they alleviate power-consumption problems inherent in power-hungry hard
and floppy disks, as well as static and dynamic RAM.
Still, Flash memory requires an extremely long time to erase--up to 30 seconds
per 128-Kbyte memory chip, and a 1-Mbyte card uses eight chips! Furthermore,
erasure can occur only in large, discrete-sized blocks. These limitations have
given rise to the Flash file system, a new method of storing information.
Two approaches for storing file information in Flash memory on Flash cards
have emerged. One method, akin to the familiar RAM disks (of RAMDISK.SYS and
VDISK.SYS fame), simply divides the Flash memory into equal-sized chunks called
sectors. This article provides full source to just such a FAT-style,
sector-oriented Flash file system. The other approach is more byte-conscious.
It records file information in a byte-wise, sequential fashion in the Flash
memory, thus avoiding the "dead-space" problems associated with FAT file
systems and allowing files to be erased and/or modified. But it presents
challenges in terms of proprietary schemes and garbage collection.
Regardless of which method works best for your application or system, DOS
still manages the higher-level file I/O functions. In the places where DOS
file I/O falls short in dealing with Flash memory, a Flash file system fills
the gap.
While this article supplies source to a FAT-style Flash system, it isn't a
complete solution unto itself like those provided by DOS vendors. Microsoft,
for instance, offers a byte-oriented Flash file system for MS-DOS 5.0
embedded-systems developers and Datalight (the company I work for), developer
of a DOS 5.0-compatible operating system (a work-alike, not just MS-DOS
redistributed), also delivers a complete Flash file system.


The FAT-style Flash File System


DOS thinks of a disk as a large array of discrete-sized chunks called sectors.
It organizes these sectors on a disk into various areas, one of which is
called the file allocation table (FAT). The sectors are very similar to Flash
memory blocks. Sectors are 512 bytes in length. Flash memory blocks range from
2-128K in size, depending on manufacturer. Hence, the natural choice for a
Flash file system is the FAT, or sector-oriented method.
This sectored approach is what makes the FAT-style Flash file system so
simple. A sector is really just a portion of the Flash memory. A 1-Mbyte Flash
card may contain eight 128K Flash memory chips, each of which is divided into
256 sectors of 512 bytes, for a total of 2048 sectors. The sector size of 512
bytes is purely convention. Sector sizes of 128 bytes or even 1024 bytes are
possible, but floppies and hard disks both use 512 bytes.
A FAT-style Flash file system is really just a disk device driver for DOS that
understands Flash memory. DOS device drivers are well documented in books such
as Ray Duncan's Advanced MS-DOS Programming (Microsoft Press, 1988).
Additionally, a FAT-style Flash file system can be hooked into the BIOS via
interrupt 13H. Among other things, the BIOS reads and writes floppy- and
hard-disk sectors. DOS reads, writes, and formats (erases) those small chunks
of data (sectors) using BIOS interrupt 13H. By hooking into interrupt 13H, a
FAT-style Flash file system can fool any utility that deals directly with
disks (Norton Utilities, for example) into thinking the Flash disk is nothing
other than a very fast, odd-sized floppy.
Another advantage of the FAT-style Flash file system is that even non-Flash
file systems from any vendor can read a Flash disk. A Flash disk, written in
the FAT-style method, contains the same layout as ROM and SRAM disks. PCMCIA
Socket Services (that is, the PCMCIA BIOS) are designed to read Flash, ROM,
and RAM cards. Writing to Flash memory is another matter, hence the need for a
Flash file system.
A problem arises when a sector must be overwritten with new data. The FAT, for
instance, is periodically updated as files are added and deleted. FAT entries
containing 0s are set to some value representing the location of the file on
disk. For a floppy or hard disk, overwriting a sector with new data is not an
issue; it simply works. It's impossible to overwrite data in Flash memory.
Erased Flash memory (and ROM, for that matter) consists of binary 1s. Writing
a byte to Flash memory changes some of the bits to 0s; bits written as 1s
leave the cells unchanged. In Flash memory (as in ROMs), once a bit has been
changed to 0, it stays 0 until the whole chip has been flashed (erased).
Unfortunately, most DOS structures are set to 0 upon initialization (FORMAT),
including both the FAT and root directory. This would make a Flash disk
completely unusable! The Flash disk device driver presented here overcomes
this limitation by inverting each and every bit: all 1s become 0s, all 0s
become 1s.
Besides the bit problem, other anomalies surface when writing to Flash memory
in a FAT-style, sector-oriented fashion. Files cannot be deleted, as the
directory entry and FAT entries occupied by the file cannot be changed.
Neither can files be updated or appended to. Anything written to Flash memory
is there until the entire Flash chip is erased.
DOS commands such as COPY, DIR, and TYPE work as expected, as does the
create/write/close sequence in many applications. But ERASE, RENAME, and "ECHO
TEXT >>FILENAME" fail. The FAT-style flash disk effectively becomes a
write-once/read-many (WORM) drive. When the drive is full, there's no choice
but to stop writing or erase the whole thing.
In general, sector-oriented schemes share the disadvantage of sector dead
space that occurs when the file length doesn't exactly match the length of one
or more sectors. For instance, if a file is exactly 1036 bytes long, it will
be stored in three 512-byte sectors. The first two are filled with data, and
the final sector contains only 12 bytes of data with the remaining 500 bytes
unused. This problem grows to terrible proportions when a file contains just a
single byte of data. In this last case, 99.8 percent of the sector is unused.


The Device Driver


The FAT Flash disk device driver FLASHDEV.ASM (see Listing One, page 94) isn't
as robust as commercial Flash file systems. I haven't, for example, provided a
facility to erase a Flash disk. The FAT-style Flash disk device driver assumes
the Flash memory is either erased or contains a valid Flash disk. Listing One
presents the bare-bones version of FLASHDEV.ASM. A fully commented version
along with the source code for the Flash memory I/O (which reads and programs
Intel Flash memory) is available electronically; see "Availability," page 7.
The write routines blindly assume Intel parts. While it's possible to query
Flash chips to determine both manufacturer and type, this device driver takes
the dumb approach: It's hardcoded to Intel parts. An intelligent algorithm
would detect the manufacturer and write or flash (erase) accordingly.
The first half of the FLASHDEV.ASM comprises the actual device-driver
interaction with DOS. This code is generic to any disk driver. A function
table diverts the "request" from DOS to the code that must fulfill that
request. Simple, generic entry and exit routines handle all communiques from
DOS.
Each device-driver function--meminit, memchanged, memread, and memwrite--calls
one of the real "meat" routines to do the actual work, and all must intimately
understand Flash memory. They are prototyped in the comments of the device
driver.
The meminit routine formats the Flash memory if it hasn't already been
formatted. The formatting process will produce a disk the FAT-style Flash file
system can use. The memread and memwrite routines read and write Flash memory.
The memchanged routine currently does nothing; it could be expanded to detect
whether a Flash card has been removed and proceed accordingly.


The Byte-oriented Flash File System


The byte-oriented Flash file system takes a different approach to storing file
information. This method has been refined by a number of vendors, including
Microsoft and Datalight.
The byte-oriented method does not store file information in discrete-sized
sectors, but in a byte-wise, sequential fashion. Typically, a structure
surrounds the actual data, providing the byte-oriented Flash file system with
the information necessary to keep track of what data belongs to which file or
directory.
At Datalight, for instance, we implement the Flash file system with a linked
list of records. Each record contains a header of approximately 12 bytes
(describing the data's owner) and a variable-length data field. The overhead
of this header is far less than the typical dead space associated with
FAT-style Flash file systems.
Take, for example, the 1-byte file described earlier: A 1-byte file in a
FAT-style, sector-oriented Flash file system requires one full sector, or 512
bytes. That means that 511 bytes of overhead exist for that one piece of data.
Contrast this to 12 bytes of overhead, and you see why the byte-oriented Flash
file system saves space. The most compelling reason to use a byte-oriented
Flash file system, however, is its ability to append, overwrite, and delete
files in Flash memory. A FAT-style Flash file system can't do this.
These advantages don't come without a price. No standards yet exist that
define the data layout in physical Flash memory for a byte-oriented Flash
disk. That means you're tied to whatever software vendor you choose for the
Flash file system. It also means that Flash cards produced on your system
can't be shared with systems that use a different Flash file system. This
disadvantage, more than any other, makes the byte-oriented Flash file system
of limited use.

Other unfortunate things occur when files are deleted, renamed, or changed. As
mentioned earlier, Flash memory has the limitation that once data is written
to it, the data can't change without flashing (erasing) the whole chip. Or
more correctly, bits that contain a 0 cannot change. This has led makers of
byte-oriented Flash file systems to simply mark off a section of Flash memory
that must change as "no longer used." The data is then changed and written to
unused Flash memory elsewhere. From the point of view of DOS or an application
program, the data was simply modified; the bookkeeping happens invisibly.
The marked-off Flash memory is now dead space that remains unused until the
entire Flash chip is erased. A Flash disk can quickly fill with dead space if
files are continuously rewritten or the data in them is frequently modified.
The only way to clear this dead space out of a byte-oriented Flash disk is
with garbage collection. Typically, this involves reading much of the data
into RAM, erasing various Flash chips, and writing data back out. When the
procedure is through, the beginning of the Flash disk contains valid, useful
data and the end of the disk is erased, empty, ready-to-be-used Flash memory.
Unfortunately, since Flash memory takes a long time to erase, this garbage
collection can take minutes on a large Flash disk.
An issue I haven't touched on is how DOS talks to a byte-oriented Flash file
system. While FAT-style systems use standard DOS disk device drivers,
byte-oriented systems can't. Typically, a byte-oriented Flash file system uses
a little-known mechanism in DOS called the "installable file system" (IFS).
DOS has two file systems built into it: the FAT file system (FFS) and IFS.
These systems talk to a disk in very different ways.
The FAT file system is the most familiar. DOS on a desktop uses it and its
built-in device drivers to talk to the floppy and hard disks. All access to
the disk is performed via sectors that are buffered by DOS (the BUFFERS=
command in CONFIG.SYS).
DOS talks to the IFS in a different way. The IFS answers DOS with high-level,
file- and directory-oriented responses. In this way, the physical layout of
the data has no constraints. DOS may ask the IFS to open a file or delete a
directory, but DOS won't ask anything about the layout of that data.
With this lack of constraints, the IFS allows a byte-oriented Flash file
system to store data in any manner it deems appropriate while DOS remains
blissfully ignorant of the details. For a good discussion of IFS, see Andrew
Schulman's book, Undocumented DOS, second edition (Addison-Wesley, 1993).


Conclusion


Some vendors have created hybrids of the FAT-style and the byte-oriented Flash
file systems to produce a Flash disk that looks and feels to applications like
a fast, small floppy disk. Other vendors hook BIOS interrupt 13H to cause
such bit-twiddling programs as Norton Utilities to think they are dealing with
standard, sector-oriented, FAT-style disk. Be aware, however, that unless true
sectors are written to the Flash memory in a standard FAT-style configuration,
the data layout in the Flash memory itself will be proprietary, confining
Flash cards written by the system to one vendor's software.
No matter which Flash file system suits your needs, it's important to weigh
the advantages and disadvantages of each--and carefully consider how it will
be used.

_FLASH FILE SYSTEMS_
by Drew Gislason


[LISTING ONE]

;----------------------------------------------------------
; FLASHDEV.ASM--02/16/93--Drew Gislason
; This FAT-Flash disk driver is designed as a Write Once Read Many (WORM)
; FAT-style disk driver. Normal DOS commands such as COPY, DIR, and TYPE will
; work as expected. DOS commands like RENAME, ERASE and "ECHO text >>file"
; will fail. Part 1 is generic to all disk devices. It could be used as a
; template for any FAT-style disk device driver. Part 2 (available
; electronically) actually reads and programs Flash memory. It is written
; for Intel Flash memory, but can be ported to other Flash memory mechanisms.
;----------------------------------------------------------

;----------------------------------------------------------
; PART 1: GENERIC DISK DRIVER CODE -- Driver relies on a number of routines
; to read and write Flash memory. All of these routines could be rewritten to
; accommodate ANY kind of Flash memory, or even ROM, RAM or an actual disk
; for that matter. Routines that MUST be supplied to make this generic driver
; into a working Flash Disk Driver are defined (in C syntax) as follows:
; bool meminit(char far *cmdline, unsigned *endseg);
; int memchanged(void);
; bool memread(long offset, unsigned len, char far *buffer);
; bool memwrite(long offset, unsigned len, char far * buffer);
; Note that all the routines follow C conventions: that
; is, the last item "pushed" on the stack is the 1st
; argument, and each routine MUST preserve SI,DI,BP and
; all segment registers besides ES.
;----------------------------------------------------------

; Device Requests -- When DOS calls the device driver, the "request"
; (pointed to by ES:BX) may contain some or all of the following fields:
REQ_CMDLEN equ 00H ; length of command
REQ_UNIT equ 01H ; unit # for disks
REQ_CMD equ 02H ; the command itself
REQ_STATUS equ 03H ; status word
REQ_MEDIA equ 0DH ; media byte (disks)
REQ_TRANS equ 0EH ; transfer address
REQ_COUNT equ 12H ; # sectors for rd/wr
REQ_START equ 14H ; start sector for rd/wr
REQ_DRIVE equ 16H ; drive # (0=A, 1=B, etc)


; Device Error Codes -- If the device must return an error, the following
; list defines the possible error conditions:
ERR_WRITEPROT equ 00H
ERR_BADUNIT equ 01H
ERR_NOTREADY equ 02H
ERR_BADCMD equ 03H
ERR_CRC equ 04H
ERR_BADSTRUCT equ 05H
ERR_SEEK equ 06H
ERR_BADMEDIA equ 07H
ERR_SECTORNOTFOUND equ 08H
ERR_NOPAPER equ 09H
ERR_WRITEFAULT equ 0AH
ERR_READFAULT equ 0BH
ERR_GENERALFAIL equ 0CH

; C Function Arguments -- In small CODE models (that is, where CS never
; changes), the 1st argument begins at 4 bytes above the Base Pointer.
; +--------+
; | Return |
; +--------+
; |   BP   |
; +--------+
;
P equ 4

; Segment Ordering -- SEGMENT ORDERING is defined here for convenience.
; The linker gathers segments in the order it first encounters them. By
; defining the segment ordering immediately, segments will automatically
; appear in correct order as they do here. The segment alignment and combine
; type are specified here to simplify segment directives.
_TEXT segment byte public 'CODE' ; Code
 assume CS:_TEXT
_TEXT ends
_DATA segment word public 'DATA' ; Initialized data
_DATA ends
_BSS segment word public 'BSS' ; Uninitialized data
 DGROUP group _DATA, _BSS
 assume DS:DGROUP
_BSS ends
_ENDDEV segment para public 'ENDDEV' ; End of device driver
_ENDDEV ends

; Device Header -- Every DOS device driver, whether for Character or
; Disk devices, MUST begin with a device header.
_TEXT segment
 public flash_header
flash_header:
 dw -1, -1 ; Pointer to next header, set by DOS
 dw 0000H ; Disk device, not removable
 dw strategy ; Strategy routine
 dw service ; Service routine
maxdrv db 1,0 ; For future expansion
 ; DOS makes "requests" to device driver to perform following actions...
cmdtbl:
 dw flash_init ; 0 - initialize device
 dw flash_media ; 1 - detect media change
 dw flash_bpb ; 2 - build a bpb

 dw dev_cmd_err ; 3 - ioctl input (not implemented)
 dw flash_read ; 4 - read from device
 dw dev_cmd_err ; 5 - non-destructive read (unused)
 dw dev_cmd_err ; 6 - input status (not implemented)
 dw dev_cmd_err ; 7 - input flush (not implemented)
 dw flash_write ; 8 - write to device
 dw flash_write ; 9 - write to device w/ verify
cmdend:
_TEXT ends
; Device Strategy and Service Routines -- DOS makes two calls to a device
; driver for each request, one to the "strategy" routine and one to
; the service routine. The first call, to the device driver's "strategy"
; routine, was intended for multi-tasking that was never implemented in DOS
; and so, can be ignored. The second call, to the device driver's "service"
; routine, does the real work.
_BSS segment
 public _dos_req_ptr
 _dos_req_ptr DD ?
 stack_bottom DW 512 dup (?) ; 1K stack
 local_stack DW ?
 org_stack DD ?
_BSS ends
_TEXT segment
 public strategy
strategy:
 retf
 public service
service:
 ; save all registers we might change
 pushf
 push AX
 push BX
 push CX
 push DX
 push SI
 push DI
 push DS
 push ES
 ; save current stack
 mov DX, DGROUP
 mov DS, DX
 mov WORD PTR DGROUP:org_stack, SP
 mov WORD PTR DGROUP:org_stack+2, SS
 ; switch to our stack
 mov AX, OFFSET DGROUP:local_stack
 cli
 mov SS, DX
 mov SP, AX
 sti
 ; ES:BX points to "request" from DOS, save it
 mov word ptr [_dos_req_ptr], BX
 mov word ptr [_dos_req_ptr+2], ES
 ; load some basics that ALL commands use into registers
 mov CX, word ptr ES:[BX+REQ_COUNT] ; CX = count
 mov DX, word ptr ES:[BX+REQ_START] ; DX = start sector
 mov AL, byte ptr ES:[BX+REQ_CMD] ; command code
 cmp AL, (OFFSET cmdend - OFFSET cmdtbl)/2
 jae dev_cmd_err ; bad command
 cbw
 shl AX, 1 ; get index into cmd table
 mov SI, offset cmdtbl
 add SI, AX
 les DI, dword ptr ES:[BX+REQ_TRANS] ; ES:DI = transfer buffer
; we've set up the following:
; DS = DGROUP
; CX = count
; DX = start sector
; ES:DI = transfer area
 ; do the command
 jmp word ptr CS:[SI]
 ; Unknown command error
 public dev_cmd_err
dev_cmd_err:
 mov AL, 3
 ; Mark error status, error code is in AL
 public dev_err_exit
dev_err_exit:
 mov AH, 81H
 jmp SHORT err1
 ; Normal exits (no error) occur through here
 public dev_exit
dev_exit:
 mov AH, 01H ; done bit set
 ; save status (error or not) in "request" header from DOS
err1: les BX, [_dos_req_ptr]
 mov WORD PTR ES:[BX+REQ_STATUS], AX
 ; restore original stack
 mov AX, WORD PTR DGROUP:org_stack
 mov DX, WORD PTR DGROUP:org_stack+2
 cli
 mov SS, DX
 mov SP, AX
 sti
 ; restore all registers we may have changed
 pop ES
 pop DS
 pop DI
 pop SI
 pop DX
 pop CX
 pop BX
 pop AX
 popf
 retf
_TEXT ends
; Device Data Area
_BSS segment
 ; if an error occurs, what is it?
 _memerr db ?
 PUBLIC start_bpb
 start_bpb LABEL BYTE
 sec_size DW ? ; bytes per sector
 sec_per_clus DB ? ; sectors/cluster
 res_sec DW ? ; reserved sectors
 num_fat DB ? ; number of fats (2 for a floppy)
 dir_ent DW ? ; number of directory entries
 num_sec DW ? ; total number of sectors
 media_desc DB ? ; media descriptor
 sec_per_fat DW ? ; sectors per fat
 sec_per_trak DW ? ; sectors/track
 heads DW ? ; number of heads
 spec_res DW ? ; reserved for disk partitioning
 end_bpb LABEL BYTE
_BSS ends
_DATA segment
 PUBLIC bpbptr
 bpbptr DW offset DGROUP:start_bpb
_DATA ends
; Initialize Device (COMMAND=0) -- "request" defines following fields:
; On Entry
; 13-Byte header * set by DOS
; BYTE Number of units (undefined)
; DWORD End Address (undefined)
; DWORD Cmdline Pointer * ptr to "DEVICE=" command line
; BYTE Block Device # * this disk is... (0=A, 1=B, etc)
;
; device= \dev\name.sys /1
; ^
; Cmdline points here, string ends in a CR (0DH) or LF (0AH)
; On Exit
; 00H 13-Byte header (set by DOS on entry)
; 0DH BYTE Number of units * set by this driver
; 0EH DWORD End Address * set by this driver
; 12H DWORD Ptr to BPB array * set by this driver
; 16H BYTE Block Device # (set by DOS on entry)
;
_DATA segment
 endseg dw SEG _ENDDEV
_DATA ends
_TEXT segment
 public flash_init
flash_init:
 ; initialize the Flash memory hardware
 ; meminit(char far * cmdline, unsigned *endseg)
 mov AX, offset DGROUP:endseg
 push AX
 les BX, [_dos_req_ptr]
 push WORD PTR [BX+REQ_COUNT+2]
 push WORD PTR [BX+REQ_COUNT]
 call _meminit
 add SP, 6
 or AX, AX
 jz init_err
 ; read the BPB from Flash memory into device driver BPB
 call readbpb
 or AX, AX
 jz init_err
 ; set up the i/o packet
 les BX, _dos_req_ptr
 mov BYTE PTR ES:[BX+REQ_MEDIA], 1 ; 1 unit (drive)
 mov WORD PTR ES:[BX+REQ_COUNT+2], DS ; point to BPB list
 mov WORD PTR ES:[BX+REQ_COUNT], OFFSET DGROUP:bpbptr
 mov AX, word ptr endseg
 mov WORD PTR ES:[BX+REQ_TRANS], 0
 mov WORD PTR ES:[BX+REQ_TRANS+2], AX
 jmp dev_exit
 ; error occurred during init, fail

init_err:
 mov AL, ERR_GENERALFAIL
 jmp dev_err_exit

; Detect Media Change (COMMAND 1) -- Has flash memory been changed somehow?
; (A PCMCIA card might be pulled out and swapped for another).
; int memchanged(void)
; Returns -1 if Device HAS changed
; Returns 0 if Device MAY have changed
; Returns 1 if Device has NOT changed
 public flash_media
flash_media:
 call _memchanged
 les BX, [_dos_req_ptr]
 mov byte ptr [BX+REQ_TRANS], AL
 jmp dev_exit
; Build a BPB (COMMAND 2) -- DOS makes this "request" when it believes the disk
; has or may have been changed.
 public flash_bpb
flash_bpb:
 ; read the BPB from Flash memory into device driver BPB
 call readbpb
 or AX, AX
 jnz gotbpb
 ; we failed to read BPB
 mov AL, _memerr
 jmp dev_err_exit
 ; return BPB ptr in request header
gotbpb: les BX, [_dos_req_ptr]
 mov WORD PTR ES:[BX+REQ_COUNT], offset DGROUP:start_bpb
 mov WORD PTR ES:[BX+REQ_COUNT+2], DS
 jmp dev_exit
readbpb:
 ; read the BPB from the disk into our structure
 ; memread(long offset, unsigned len, char far * buffer)
 push DS
 push bpbptr
 mov AX, (OFFSET end_bpb - OFFSET start_bpb)
 push AX
 xor DX, DX
 mov AX, 11 ; offset 11L (start of BPB on disk)
 push DX
 push AX
 call _memread
 add SP, 10
 ret
; Read from Device (COMMAND 4) -- Read from Flash memory. DOS "requests" that
; this device driver read 1 or more sectors from Flash memory. The device
; driver, in turn, calls "memread" to do the real work.
; Upon Entry
; DS = ROM-DOS data segment
; ES:DI = buffer data
; CX = # of sectors to read
; DX = starting sector
 public flash_read
flash_read:
 mov SI, OFFSET _memread
 ; This generic read/write routine is passed a
 ; pointer to the read or write routine in "SI"
rdwr:
 ; convert # of sectors to # of bytes
 push DX
 mov BX, OFFSET DGROUP:start_bpb
 mov AX, [BX] ; sector size (512)
 mul CX
 mov CX, AX ; CX = bytes to transfer
 pop DX

 ; calculate starting offset
 mov AX, [BX] ; sector size (512)
 mul DX ; DX:AX = offset (in bytes)

 ; call memread(offset,len,buffer) to do the work
 push ES
 push DI
 push CX
 push DX
 push AX
 call SI ; call read or write routine
 add SP, 10
 or AX, AX
 jz rdwr_failed
 mov AL, 0
 jmp dev_exit

rdwr_failed:
 mov AL, byte ptr _memerr
 jmp dev_err_exit


;------------------------------------------------
; Write to Device (COMMAND 8 or 9)
;
;
; Write to Flash memory. DOS "requests" that
; this device driver write 1 or more sectors to
; the Flash memory. The device driver, in turn,
; calls "memwrite" to do the real work.
;
; Upon Entry
; DS = ROM-DOS data segment
; ES:DI = buffer data
; CX = # of sectors to read
; DX = starting sector
;
 public flash_write
flash_write:
 mov SI, OFFSET _memwrite
 jmp short rdwr

_TEXT ENDS

May, 1993
YOUR OWN NETWORK DATA SNOOPER


Keeping track of network traffic


 This article contains the following listings: SNOOP.ARC


Rahner James


Rahner is an independent consultant living near Sacramento, California. He can
be reached by phone at 916-722-1939 or through CompuServe at 71450,757.


One problem with today's complex computer systems is that they require more
and more sophisticated test gear. I remember, for example, when you could use
an AM radio to find out if your S-100 bus was working--just by putting it next
to the cable and listening for the static. But this trick won't work anymore.
Not long ago I was trying to figure out if data packets were being transmitted
through my network cable. I couldn't feel, touch, or hear them--even my trusty
radio couldn't pick them up. Consequently, I ended up writing my own "network
data snooper" that lets me view data packets in real time. The snooper, which
is presented in this article, gives me a feel for the continuity of the
request packets and their corresponding responses, and logs the received
packets to a file for later retrieval. It also allows me to "tune in" specific
node addresses and sockets.
For instance, one project I'm working on involves a program running as a
Netware-loadable module (NLM) on a file server attached to the network. The
NLM acts as a process synchronizer for requests made by the other nodes on the
network using IPX or SPX packets. The network uses D-Link's NE-2000-compatible
Ethernet interface cards. The heart of these cards is National Semiconductor's
DP8390 Network Interface Controller (NIC) chip.
The NIC allows the PC to receive and transmit data packets over the network
according to the IEEE 802.3 specification. Table 1 describes the 802.3 basic
data-packet format. The minimum size of a transmitted packet is 128 bytes. The
first 64 bytes and the FCS are generated by the NIC. The application must
provide the destination address, source address, length, and at least 46 bytes
of data. If the size of the structure is less than 128 bytes, whatever data
happens to be in the transmission buffer will be sent, too.
Table 1: IEEE 802.3 data-packet format.

 Byte Description
 ------------------------------------------------------------------------

 00-61 Preamble. These bytes are transmitted before each packet and
 stripped by the NIC as they are received.

 62-63 Start of Frame Delimiter. Indicates that the meat of the
 packet is about to start and is stripped with the preamble.

 64-69 Destination Address. Address of the interface board to which
 this packet is destined.

 70-75 Source Address. Address of the interface board from which this
 packet was generated.

 76-77 Length. Number of bytes contained within the body of this
 packet.

 78-??? Data. The body of the packet.

 ?+1-?+4 FCS. A 32-bit CRC appended to all packets.
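That padding rule is easy to state in C. A minimal sketch (the function name is mine; the 46-byte floor comes from the text above):

```c
/* Sketch of the minimum-payload rule: the data field of a packet is at
   least 46 bytes, so shorter payloads go out padded with whatever
   happens to follow them in the transmit buffer. */
unsigned wire_data_len(unsigned payload_bytes)
{
    return payload_bytes < 46 ? 46 : payload_bytes;
}
```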



NIC Transmissions and Receptions


The NIC performs all packet transactions using an external memory buffer.
Because the NIC uses 16-bit addresses, this external buffer can be up to 64
Kbytes. D-Link uses 8K for the DE-1000 (an NE-1000 compatible interface) and
16K for the DE-2000 (NE-2000). Using the NIC's control registers, you can
divide the memory into two separate circular buffers--one for transmissions,
the other for receptions. I created a simple, two-packet ping-pong buffer for
transmissions and use the rest of the NIC's memory for a circular-reception
buffer.
Transmissions by their very nature are nonvolatile. They can be scheduled and
buffered externally so as to eliminate any asynchronous traits. The amount of
time required to transmit a packet at 10 Mbits/second is more than it takes to
move a
new data packet into position to be transmitted; therefore, in an
interrupt-driven system, only two transmit buffers are required.
Receptions will always be asynchronous, ready to catch us at the most
inopportune times. The lion's share of available memory should be used to
buffer received packets.
To define the reception buffer, the NIC has the page-start and page-stop
registers. The page-start register contains the address of the start of the
memory area used for packet receptions. The page-stop register contains the
address of the memory that immediately follows the last byte used to store
received packets. Both of these ports are 8-bit registers, so only the most
significant byte of a 16-bit memory address is described. The least
significant byte is set to 0. To define a circular reception buffer that
starts at 0600h and ends after 08FFh, page start would be loaded with a 6 and
page stop with a 9. Most other memory-pointer registers contain only the upper
eight bits of an address; therefore, the memory controlled by the NIC is
divided into 256-byte logical regions called "pages."
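Since pages are 256 bytes, loading these registers amounts to taking the high byte of an address. A sketch of the arithmetic (function names are mine):

```c
/* NIC pages are 256 bytes, so the page-start and page-stop registers
   hold just the high byte of the reception buffer's bounds. */
unsigned page_start(unsigned buf_first)   /* first byte of the buffer */
{
    return buf_first >> 8;
}
unsigned page_stop(unsigned buf_last)     /* last byte of the buffer */
{
    return (buf_last + 1) >> 8;           /* first page past the end */
}
```

For the buffer in the text, page_start(0x0600) yields 6 and page_stop(0x08FF) yields 9.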
The reception buffer is also controlled by the current-page and
boundary-pointer registers. The current-page register points to the page where
the next-received packet will be placed by the NIC. The boundary-pointer
register contains the page number from which the received packets can be
drawn. Both of these registers are updated by the NIC when packets are
received and when those packets are removed by the host CPU.
The transmission of packets uses the transmit page-start and transmit
byte-count registers. The transmit page-start register contains the page
number of the start of a packet to be transmitted. The transmit byte-count
register (divided into the most- and least-significant bytes as register 0 and
1) tells the NIC how many bytes are to be transmitted. Any
transmission-buffering scheme must be implemented externally to the NIC.
The RAM used by the NIC can be a memory area shared with the host CPU or, as
with the NE-2000, a memory area isolated from the CPU by the NIC. In the case
of the isolated memory setup, the NIC provides DMA registers that allow the
host CPU to move into and out of that buffer. The NIC documentation refers to
the isolated memory area as "remote" memory. Two 16-bit DMA registers are used
for accessing this remote memory: the remote start address and the remote byte
count.
Moving data into the NIC buffer for transmission is simple. First, the remote
start address needs to be set to the starting address of the buffer into which
the prepared packet will be placed. Because packets always start on page
boundaries, the least-significant byte of this address will usually be set to
0. Second, the remote byte-count register needs to be set to the number of
units (bytes or words) that will be sent to the memory buffer. The NE-2000
accepts 16-bit data, so 100 bytes of packet data would require a remote byte
count of 50. The number of bytes moved to remote memory is disconnected from
the number of bytes that will be transmitted; therefore, if you only have 30
bytes of data to transmit, there is no need to move in 34 more bytes to fill
out the minimum packet size of 64 bytes. To move the data into the NIC buffer,
the Command register is set to remote write/start and the bytes (or words) are
output to the data port.
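A sketch of that write sequence in C may help. The register offsets below are my reading of the DP8390's page-0 map and command bits, not values from this article, and outp() is a stub that records which register was touched so the ordering can be exercised without an NE-2000 installed:

```c
#include <stdint.h>

/* Remote-DMA write sketch. Register offsets and command bits are my
   reading of the DP8390 documentation; outp() is a recording stub. */
enum {
    REG_CMD    = 0x00,   /* command register           */
    REG_RSAR0  = 0x08,   /* remote start address, low  */
    REG_RSAR1  = 0x09,   /* remote start address, high */
    REG_RBCR0  = 0x0A,   /* remote byte count, low     */
    REG_RBCR1  = 0x0B,   /* remote byte count, high    */
    REG_DATA   = 0x10,   /* data port (NE-2000)        */
    CMD_START  = 0x02,
    CMD_RWRITE = 0x10    /* remote write DMA           */
};

static uint8_t port_log[64];   /* which registers were written, in order */
static int port_log_n;

static void outp(int reg, uint8_t val)   /* stub: log, don't touch I/O */
{
    port_log[port_log_n++] = (uint8_t)reg;
    (void)val;
}

/* Load a prepared packet into NIC memory starting at 'page'. */
void remote_write(uint8_t page, const uint8_t *buf, uint16_t len)
{
    uint16_t i;
    outp(REG_RSAR0, 0);                  /* packets start on a page boundary */
    outp(REG_RSAR1, page);
    outp(REG_RBCR0, (uint8_t)len);
    outp(REG_RBCR1, (uint8_t)(len >> 8));
    outp(REG_CMD, CMD_RWRITE | CMD_START);
    for (i = 0; i < len; i++)            /* stream the bytes out the data port */
        outp(REG_DATA, buf[i]);
}
```

On real hardware, outp() would be the compiler's port-output intrinsic aimed at the card's base port plus the register offset.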
Once a packet has been placed in a remote buffer, it can be sent out to the
Ethernet cable by writing to the transmit page-start, transmit byte-count, and
command registers. The transmit page-start register defines the page number
from which a transmission will be initiated. The transmit byte-count register
determines the number of bytes that will be sent (minimum of 64 bytes). Once
the transmit page-start and byte-count registers have been set appropriately,
the packet can be sent by asserting the transmit packet bit of the command
register.
Pulling received data from the remote buffer is roughly the same as putting
the data into the remote buffer. The remote start-address and remote
byte-count registers are set up in the same fashion. The command register is
set to remote read/start and then the bytes (or words) are input from the data
port.
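The read sequence, sketched the same way (register offsets are again my reading of the DP8390 documentation; in_data() returns canned bytes in place of the real data port):

```c
#include <stdint.h>

/* Remote-DMA read sketch: same register setup as the write, but the
   command selects remote read and bytes come in from the data port. */
enum {
    R_CMD = 0x00, R_RSAR0 = 0x08, R_RSAR1 = 0x09,
    R_RBCR0 = 0x0A, R_RBCR1 = 0x0B,
    R_START = 0x02, R_RREAD = 0x08    /* remote read DMA */
};

static uint8_t fake_nic_mem[256] = { 0x01, 0x23, 0x45 };  /* canned data */
static int fake_pos;

static void out_reg(int reg, uint8_t v) { (void)reg; (void)v; } /* stub */
static uint8_t in_data(void) { return fake_nic_mem[fake_pos++]; }

/* Read 'len' bytes starting at page 'page' into 'buf'. */
void remote_read(uint8_t page, uint8_t *buf, uint16_t len)
{
    uint16_t i;
    out_reg(R_RSAR0, 0);
    out_reg(R_RSAR1, page);
    out_reg(R_RBCR0, (uint8_t)len);
    out_reg(R_RBCR1, (uint8_t)(len >> 8));
    out_reg(R_CMD, R_RREAD | R_START);
    for (i = 0; i < len; i++)           /* stream the bytes in */
        buf[i] = in_data();
}
```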

Because the movement of data to the NIC buffer uses the same registers as the
data from the NIC buffer, it is up to the driver or application to control
that process with a semaphore of some kind.


Packet Structure


The first 14 bytes of any packet sent to the NIC's transmission buffer should
contain addressing and size information. The first six bytes of this header
are the address of the node to which the packet is destined. The second six
bytes are the address of the Ethernet adapter from which the packet is being
sent. The last two bytes are the number of bytes contained within the
packet--including this header. The packet can be transmitted regardless of the
actual contents of this header. The destination address is required to get the
packet to the physical address for which it's intended. Neither the source
address nor the byte-count value is used or validated by the transmitting or
receiving NIC. Although the last two fields are not important for
communication, the receiving node's application may use them for its own
purposes. For example, a Netware 3.11 file server expects them to contain the
correct source-node ID and the size of the packet (rounded up to the nearest
word) in Motorola format.
When a packet is received, the NIC creates a 4-byte prefix that can be read by
the host CPU. The first byte of this prefix is the status associated with the
received packet. The second is the number of the next page following the end
of the packet. The last two bytes of this prefix contain the number of bytes
actually received from the source of the packet. The transmission header
created by the source node follows this prefix. For various reasons, the
packet size in the prefix does not necessarily have to match the count in the
transmit header.
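The 14-byte transmission header and the 4-byte reception prefix just described can be captured as packed C structs (the field names are mine):

```c
#include <stdint.h>

/* Layouts follow the text: a 14-byte header the application prepends to
   each transmitted packet, and the 4-byte prefix the NIC places in
   front of each received one. */
#pragma pack(push, 1)
typedef struct {
    uint8_t  dest[6];    /* destination node address          */
    uint8_t  src[6];     /* source (this adapter's) address   */
    uint16_t length;     /* byte count, including this header */
} xmit_header;

typedef struct {
    uint8_t  status;     /* status of the received packet     */
    uint8_t  next_page;  /* page following the packet's end   */
    uint16_t byte_count; /* bytes actually received           */
} recv_prefix;
#pragma pack(pop)
```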


Bits and Crumbs to Get Us Back Home


For the NIC to receive packets other than broadcasts, the physical-address
registers must be filled with a legal address. Generally, this address is
retrieved from a ROM read after the NE-2000 is reset. The ROM-based address
can be ignored, and any 6-byte value may be placed in these registers (except
all-1s). Multiple boards can have the same physical address and all will
receive any packets directed at that address.
If an application is so inclined, the NIC can be configured to accept all
packets passing through the network by asserting bit 4 of the
receive-configuration register. This is called the "promiscuous physical bit"
in the NIC documentation. When this bit is set, all packets, regardless of
their destination address, will be received by the NIC but not (contrary to
the uncertainty principle) affected by it.
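In SNOOP.H, the SLUT flag is exactly this bit 4. A sketch of building the receive-configuration value (the function name is mine):

```c
/* Bit 4 of the receive-configuration register is the "promiscuous
   physical" bit; set it to snoop every packet on the wire. */
#define RCR_PROM (1 << 4)

unsigned make_recv_config(unsigned base_config, int snoop_everything)
{
    return snoop_everything ? (base_config | RCR_PROM)
                            : (base_config & ~RCR_PROM);
}
```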


Implementation and Things That Matter


The major parts of the snooper are the NIC driver, the real-time memory
manager, a task manager, and the display functions. Listing One (page 98) is
SNOOP.H, the program's include file, and Listing Two (page 98) is SNOOP.C, the
main entry and looping functions for the program. Because of its size, the
complete system is available electronically; see "Availability," page 7. I've
compiled the code using Watcom's 286 C compiler, Microsoft's Macro Assembler
6.0, and my own user-interface library, which is included. The make and link
files are also included to aid in converting to another compiler.
My goal was to be able to view packets in real time and store them for study.
Viewing packets in real time requires a nonscrolling visual because it allows
static placement of data. Nonmoving visual fields enhance the user's ability
to comprehend fast data transitions. The data-packet display format was
shamelessly stolen from the venerable DDT. The packet data is written to a
disk file, and if the network traffic overflows the storage buffers, data will
be discarded.
The basic flow of the program starts when the incoming packet generates an
interrupt, which is handled by the interrupt service routine (ISR). The ISR
reads in the packet header and allocates a chunk of memory from the memory
manager. If a memory chunk is available, the remainder of the packet is read
in and placed in that memory area. The packet pages in the NIC's remote buffer
are released to be used again.
The foreground process gets around to checking if any packets have been read
by comparing the packet storage count to 0. If any packets have been received,
the application retrieves a pointer to the packet buffer and checks the
source/destination addresses against those that are of interest. The qualified
packet is displayed and logged. The memory used to store the packet is
released back to the memory manager, and life goes on.
A majority of the functions are written in assembly language because I was
losing packets when I wrote them in C. The video routines expect the monitor
to be of the standard monochrome or color memory-mapped variety.


The Datastream


Having a snooper is a lot like being a passenger on a glass-bottom boat--there
sure are a lot of fish and they sure are pretty, but what are their names?
In a typical datastream, there are SAPs, routing information, and Netware
core-protocol (NCP) packets. The SAPs and routing packets are nice, but are
really not all that interesting. The NCPs may be a more interesting target, so
let's look at how to decipher them.
Before going further, let me say that this information should not be taken as
the absolute word on NCP. I don't have documentation from Novell that can be
used as a reference. All the information in this article was obtained by
observation, with no official communications from Novell. On the bright side,
I've never signed a nondisclosure agreement with Novell, so I'm not
constrained.
NCP uses the basic IPX packet structure to communicate from the node to the
server. The process is usually initiated by the node, followed by a response
packet from the server. The packet-type number used for request and response
is 17. On a single-server network, the request packet is directed at the
physical address of the server (placed in the transmission header of the NIC).
The IPX header contains the logical network and node address of the NCP
server. The destination socket is 451H. The source-address portion of the IPX
header contains the address of the node to which a response is expected. The
source-socket number is different from 451H. On all the networks I've
observed, the response socket is 4003H. The rest of the NCP header is the same
as any other IPX packet header. The structure of an NCP request is shown in
Table 2; Table 3 shows the structure of an NCP response.
Table 2: Structure of an NCP request.

 Offset Description
 ------------------------------------------------------------------------

 00-01 Request signature. Contains 02222h to indicate an NCP
 request. The signature of 01111h is used to acquire a
 connection number.

 02 Sequence. Contains the sequence number of this request packet.
 The response will contain the same sequence number.

 03 Connection number. The connection number that the node is
 using to access file-server resources. This number is provided
 by the file server when the node attaches for the first time.

 04-05 GOK1. Unknown. Varies between 1 and 2, but doesn't seem to
 affect how any of the requests are processed.

 06 Command. The command request number. The values used here
 tend to have a direct correlation to the INT 21h shell
 commands. For example, request number 17h represents all
 the E3h INT 21h requests.

 07 GOK2. Unknown (usually 0).


 08 Start of the data specific to the command number.

Table 3: Structure of an NCP response.

 Offset Description
 -------------------------------------------------------------------------

 00-01 Response signature. Contains 03333h to indicate an NCP response.

 02 Sequence. Contains the sequence number of its associated request
 packet.

 03 Connection number. The connection number of the node being
 serviced.

 04-05 GOK1. Tends to be one less than the request.

 06 Error code. If nonzero, the request is denied for the reason
 defined by this error-code number.

 07 GOK2. Unknown.

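Tables 2 and 3 map naturally onto packed C structs (the field names are mine; the two-byte fields are kept as byte arrays because the tables don't pin down their byte order):

```c
#include <stdint.h>

/* NCP headers as observed in Tables 2 and 3; "gok" fields are the
   unknowns from the tables. */
#pragma pack(push, 1)
typedef struct {
    uint8_t signature[2];  /* 0x2222 request, 0x1111 get-connection  */
    uint8_t sequence;      /* echoed by the response                 */
    uint8_t connection;    /* connection number from the server      */
    uint8_t gok1[2];       /* unknown; varies between 1 and 2        */
    uint8_t command;       /* request number, e.g. 0x17              */
    uint8_t gok2;          /* unknown, usually 0                     */
} ncp_request;

typedef struct {
    uint8_t signature[2];  /* 0x3333                                 */
    uint8_t sequence;      /* sequence of the associated request     */
    uint8_t connection;    /* connection number being serviced       */
    uint8_t gok1[2];       /* tends to be one less than the request  */
    uint8_t error;         /* nonzero means the request was denied   */
    uint8_t gok2;          /* unknown                                */
} ncp_response;
#pragma pack(pop)
```

Command-specific data follows the request header at offset 8, as Table 2 notes.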
To illustrate how this process works, we first need to get a connection number
so that we can make more involved NCP requests. In C, this request must use
the format shown in Example 1. The response from the server is to provide a
NULL packet that has the connection field filled with the number that the node
should use for further communication.
Example 1: A connection number request.

 0x1111 Signature
 0x00 Initialize the sequence counter
 0xFF Need a connection
 0xFF00 No reason for this
 0x00 Command 0

To illustrate opening a file, I'll use a print-server configuration file (see
Example 2) that allows access to unlogged connections. If successful, the
response could be like Example 3.
Example 2: An example print-server configuration file.

 0x2222 Signature
 0x01 Sequence
 0x07 My connection number
 0x0002 <blah> <blah> <blah>
 0x4C Open file command
 0x00 Filler
 0x0106 Open attribute
 0x1D Length of the filename to follow
 xxxxxx SYS: SYSTEM/19000003/PRINT.000
 Note: the '.' that separates PRINT from 000 is
 actually a 0xAE, because Netware requires
 the periods to be converted

Example 3: A successful response to Example 2.

 0x3333 Response signature
 0x01 Sequence
 0x07 Connection
 0x0001 Yep, it's one less
 0x00 Good return code
 0x00 Filler
 0x13,0x53 Handle to use to access the file's data
 0x7C,0x2D
 0x00,0x00

 0x0000 ??? Always 0.
 PRINT.000 Filename. Always 12 bytes, NULL padded.
 4 bytes File attribute of some sort. Always is 0x200000.
 4 bytes File size in Motorola format.
 0x2719 File creation date. MS-DOS format.
 0x2719 Date of last access to this file. MS-DOS format.
 0x2719 Date of the last update to this file. MS-DOS format.
 0x287E Time of the last update to this file. MS-DOS format.

The login process involves three steps. The first is acquiring an encryption
key from the server. The second is to take that key and generate a password
sequence that involves a table-driven XOR folding of the text password and the
bindery ID number with that encryption key. The last step is to send that
encryption sequence to the server in a login request packet.
To get the encryption key, transmit the request packet in Example 4(a); the
response is shown in Example 4(b).
Example 4: (a) Transmitting a request for the encryption key; (b) the
response.

 (a)

 0x17 Command
 0x00 Filler
 0x01 Number of bytes to follow
 0x17 Request for encryption key. This key will be
 different for every request made of the server.

 (b)

 0x00 Good return code
 0x00 Filler
 0xDE,0x6C Eight byte encryption key.
 0x18,0x36
 0x93,0x72
 0xD2,0xA3

Now, suppose the bindery-ID number for the account we are trying to log into
is 0x18000001, and the password is "YO HO HO." Passing those parameters
through the encryption function yields a login request packet of the form
shown in Example 5.
Example 5: Passing parameters through the encryption function yields a login
request packet.

 0x17 Command
 0x00 Filler
 0x11 Length of the request data to follow
 0xC4,0xA6 Encryption sequence
 0x39,0x23
 0xFA,0x0B
 0xFA,0x32
 0x100 Type of object logging into
 0x05 Length of object name to follow
 TEST1 Name of the account

Needless to say, once I had the proper tools, I did an extensive mapping study
of the Netware commands and their corresponding NCP packets.


Conclusion


This snooper gives you a powerful tool with which to hunt for the various
protocol bugs that crop up. I've found it particularly helpful with regard to
IPX/SPX internode communication. In closing, I'd like to thank Henry Ngai at
D-Link (Irvine, California) for providing information on the address of the
data and reset ports.

_YOUR OWN NETWORK DATA SNOOPER_
by Rahner James


[LISTING ONE]

/*****************************************************************************
* Title: SNOOP.H -- Copyright (c) October 1992, Ryu Consulting by Rahner James
* This is the header file to be used for print server and NIC activities
*****************************************************************************/

#include <stdlib.h>
#include <stddef.h>
#include <string.h>
#include <rjuser.h>

#define SAVE_BAD_PACKETS 1
#define ACCEPT_RUNT_PACKETS (1 << 1)
#define ACCEPT_BROADCAST (1 << 2)
#define ACCEPT_MULTICAST (1 << 3)
#define SLUT (1 << 4)
#define MONITOR_MODE (1 << 5)

#define IPX_HEADER_SIZE 30
#define NIC_HEADER_SIZE 18

/******* ECB Flags *******/
#define ECB_OWNER_DRIVER 0
#define ECB_OWNER_APP 1
#define ECB_FREE_XMIT (1 << 1)

/* Socket numbers */
#define NCP_SOCKET 0x451
#define SAP_SOCKET 0x452
#define ROUTE_SOCKET 0x453

/* Error return codes */
#define NCPERR_NOERROR 0
#define NCPERR_NO_QJOB -43
#define NCPERR_FILENAME -128
#define NCPERR_ALLOC -129
#define NCPERR_TIMEOUT -130
#define NCPERR_NOACCOUNT -131
#define NCPERR_NOACCESS -132
#define NCPERR_LOST_CONNECT -133
#define NCPERR_MALFORMED -134

typedef struct NIC_ECB_S
{
 ui next;
 us flags;
 void (FAR *esr)();
 volatile char in_use;
 volatile char completion;
 uc node[6];
 us NIC_length;

/* The IPX structure follows: */
 us checksum;
 us ipx_length;
 uc transport;
 uc packet_type;
 ul dest_network;
 uc dest_node[6];
 us dest_socket;
 ul src_network;

 uc src_node[6];
 us src_socket;
} NIC_ECB_T;
typedef struct TASK_S
{
 struct TASK_S FAR *next;
 struct TASK_S FAR *prev;
 uc priority;
 uc status;
 void (FAR *function)( struct TASK_S FAR * );
 void FAR *buffer;
} TASK_T;

extern ui Base_Port;
extern ul Int_Count;
extern uc IRQ_Number, DMA_Channel, NE2000_Address[6];
extern ui Buffer_Start,Buffer_Size,Total_Allocated,First_Free,Last_MCB;
extern volatile ul Tick_Count;
extern volatile ui Total_Unread_Buffers;
extern volatile ui Total_Xmit_Buffers;
extern volatile ui Total_Xmit_Errors;
extern volatile ui Total_Recv_Errors;
extern volatile ui Total_Missed_Packets;
extern volatile ul Total_Xmit_Packets;
extern volatile ul Total_Recv_Packets;
extern uc NIC_Normal_Recv_Config;

extern int init_ne2000( void );
extern int start_ne2000_interrupts( void );
extern int stop_ne2000_interrupts( void );
extern int init_memory( void );
extern void FAR *alloc_NIC_ecb( ui );
extern int free_NIC_ecb( void FAR * );
extern NIC_ECB_T FAR *read_packet( void );
extern void start_time_isr( void );
extern void stop_time_isr( void );
extern void show_packet_body( void FAR *, int, int );
extern void show_net_address( int, int, void FAR * );
extern void show_node_address( int, int, uc FAR * );
extern int is_packet_valid( NIC_ECB_T FAR * );

extern int is_ipx( void );
extern int add_task( TASK_T FAR * );
extern int do_next_task( void );
extern void atoaddr( char FAR *, char FAR * );
extern void FAR *farset( void FAR *, char, unsigned int );

#ifdef __WATCOMC__
 #pragma aux init_ne2000 "_*";
 #pragma aux start_ne2000_interrupts "_*";
 #pragma aux stop_ne2000_interrupts "_*";
 #pragma aux init_memory "_*";
 #pragma aux alloc_NIC_ecb "_*" parm [cx] value [es di];
 #pragma aux free_NIC_ecb "_*" parm [es di];
 #pragma aux read_packet "_*" value [es di];
 #pragma aux start_time_isr "_*";
 #pragma aux stop_time_isr "_*";

 #pragma aux is_ipx "_*";

 #pragma aux show_packet_body "_*" parm [dx ax] [cx] [bx];
 #pragma aux show_net_address "_*" parm [di] [ax] [cx si];
 #pragma aux show_node_address "_*" parm [di] [ax] [cx si];
 #pragma aux is_packet_valid "_*" parm [es di];

 #pragma aux add_task "_*" parm [es di];
 #pragma aux do_next_task "_*";

 #pragma aux atoaddr "_*" parm [es di] [dx si];
 #pragma aux farset "_*" parm [es di] [ax] [cx] value [es bx];
#endif


[LISTING TWO]

/****************************************************************************
* Title: SNOOP.C -- Copyright (c) October 1992, Ryu Consulting by Rahner James
* This file contains the main entry and looping functions
******************************************************************************/
#define _SNOOP_C_
#include "snoop.h"

/***************************************************************************
* Global Data Items--All variables that have global accessibility have first
* letters of each whole word within their name capitalized. All variable names
* (local or global) begin with a noun or a word that is a noun by context.
*****************************************************************************/
/* Names of the type of packets that will be passed */
uc *Type_Names[] =
{
 "Unknown Type, maybe IPX", // 00 - anything, but probably IPX
 "Routing Info", // 01 - Routing information packet
 "Echo Packet", // 02 - Hello, hello, hello ...
 "Error Packet", // 03 - Never saw one
 "IPX", // 04 - IPX packet
 "SPX", // 05 - SPX packet
 "unknown", // 06
 "unknown", // 07
 "unknown", // 08
 "unknown", // 09
 "unknown", // 10
 "unknown", // 11
 "unknown", // 12
 "unknown", // 13
 "unknown", // 14
 "unknown", // 15
 "unknown", // 16
 "NCP" // 17 - Netware Core Protocol packet
};
int Match_ID_Count = 0; // Number of IDs in the match array
uc Match_IDs[20][6]; // Storage for arrays we have to match
int Match_Socket_Count = 0; // Sockets in socket match array
us Match_Socket[20]; // Sockets we have to match
int Handle = -1; // Handle for the logging file
int Full_Display = 1; // !0 if we want to see entire packet
char And_Flag = 0; // 0 = socket or ID, !0 = socket and ID
char Broadcast_Flag = 0; // 0=no checking, !0=check

/*****************************************************************************
* void PARSE_LINE( int ARGC, char *ARGV[] )--Parses command line. Given: ARGC=
* number of command line tokens; ARGV -> array of pointers to command line
* tokens. Returns: Command line parsed and variables updated
******************************************************************************/
void parse_line( int argc, char *argv[] )
{
 int i,j;
 farset( NE2000_Address, 0, 6 );
 for ( i=1 ; i < argc ; ++i )
 {
 if ( (*argv[i] == '/') || (*argv[i] == '\\') || (*argv[i] == '-') )
 ++*argv[i];
 switch ( *argv[i] )
 {
 case 'A': // Catch all packets
 case 'a':
 NIC_Normal_Recv_Config = SLUT;
 break;
 case 'B': // Check broadcast as well
 case 'b':
 Broadcast_Flag = 1;
 break;
 case 'C': // AND flag
 case 'c':
 And_Flag = 1;
 break;
 case 'D': // OR flag
 case 'd':
 And_Flag = 0;
 break;
 case 'F': // File to output the packet to
 case 'f':
 ++argv[i];
 if ( (*argv[i] == '=') || (*argv[i] == ' ') )
 ++argv[i];
 print( "\fOpening file %s ...", argv[i] );
 Handle = dos_open( argv[i], OPEN_RW+OPEN_CREATE );
 print( "\nReturn = %d", Handle );
 if ( Handle < 0 )
 exit( 2 );
 dos_write( Handle, &i, 0 );
 break;
 case 'I': // Match ID
 case 'i':
 if ( Match_ID_Count >= 20 )
 break;
 ++argv[i];
 if ( (*argv[i] == '=') || (*argv[i] == ' ') )
 ++argv[i];
 atoaddr( Match_IDs[Match_ID_Count++], argv[i] );
 break;
 case 'N': // Set current node ID
 case 'n':
 ++argv[i];
 if ( (*argv[i] == '=') || (*argv[i] == ' ') )
 ++argv[i];
 if ( strlen(argv[i]) == 12 )
 atoaddr( NE2000_Address, argv[i] );

 break;
 case 'P': // Base port
 case 'p':
 ++argv[i];
 if ( (*argv[i] == '=') || (*argv[i] == ' ') )
 ++argv[i];

 j = atoh( argv[i] );
 if ( (j >= 0x200) && (j <= 0x360) )
 Base_Port = j;
 break;
 case 'R': // IRQ number
 case 'r':
 ++argv[i];
 if ( (*argv[i] == '=') || (*argv[i] == ' ') )
 ++argv[i];
 j = atoi( argv[i] );
 if ( (j > 1) && (j < 16) )
 IRQ_Number = j;
 break;
 case 'S': // Match socket
 case 's':
 if ( Match_Socket_Count >= 20 )
 break;
 ++argv[i];
 if ( (*argv[i] == '=') || (*argv[i] == ' ') )
 ++argv[i];
 Match_Socket[Match_Socket_Count++] = atoh( argv[i] );
 break;
 case '?':
 print( "\n\nCommand line syntax is:"
 "\n\ttest [option 1] [option 2] ... [option n]" );
 print( "\n\nCommand line options:"
 "\n\tA - accept all packets "
 "\n\tB - accept broadcast packets (use with I option)"
 "\n\tC - accept only correct IDs and sockets"
 "\n\tD - accept correct IDs or sockets (default)"
 "\n\tFxxxxxxxx - name of output file for data logging"
 "\n\tI######## - node ID to accept data for"
 "\n\tN######## - set our packet ID as this number"
 "\n\tP### - set base port address (in hex)"
 "\n\tR## - IRQ number (in decimal)"
 "\n\tS#### - socket number to accept data for (in hex)"
 "\n\nNote: if no ID or socket qualifiers are set, all "
 "packets will be accepted."
 "\n" );
 exit( 0 );
 break;
 default:
 break;
 }
 }
}
/******************************************************************************
* int DISPLAY_PACKET( NIC_ECB_T FAR *NP )--Displays contents of a packet. Given:
* NP -> reception packet. Returns: 0 if nothing going on, !0 if exit program
******************************************************************************/
int display_packet( NIC_ECB_T FAR *np )
{

 int y, i, j;
 char FAR *cp;
 static char first_time = 1;

 if ( first_time )
 {
 clear_line( 0, BLUE + (WHITE BACKGROUND) );
 clear_line( 1, (INTENSE WHITE)+(CYAN BACKGROUND) );
 center( "IPX and NIC Header Information", 1, (INTENSE WHITE)+
 (CYAN BACKGROUND) );
 print( "%p%r[ ]", 34,0 );
 print( "%p%rTransport:%pType:", 2,2, 26,2 );
 print( "%p%rIPX Length:%pSrc:", 2,3, 26,3 );
 print( "%p%rNIC Length:%pDest:", 2,4, 26,4 );
 first_time = 0;
 no_cursor();
 }
 print( "%p%r%ld packets%pUnread =%4d", 2,0, Total_Recv_Packets, 67,0,
 Total_Unread_Buffers );
 show_node_address( 35,0, np->node );
 if ( Full_Display == 0 )
 return 0;
 show_net_address( 32,3, &np->src_network );
 show_net_address( 32,4, &np->dest_network );

 clear_area( 36,2, _Screen_Width-1,2, 0 );
 print( "%p%r%4d%p%4d%p%2X%p%d - %s", 14,3, np->ipx_length,
 14,4, np->NIC_length,
 16,2, np->transport,
 32,2, np->packet_type,
 np->packet_type > 17 ? "unknown" : Type_Names[np->packet_type] );
 switch ( np->src_socket )
 {
 case 0x451:
 print( " (File Service)" );
 break;
 case 0x452:
 print( " (SAP)" );
 break;
 case 0x453:
 print( " (Route Information)" );
 break;
 case 0x455:
 print( " (NetBIOS Packet)" );
 break;
 case 0x456:
 print( " (Diagnostic Packet)" );
 break;
 default:
 break;
 }
 show_packet_body( &np[1], np->ipx_length, np->NIC_length );
 return 0;
}

/******************************************************************************
* int INIT_SNOOPER( int ARGC, char *ARGV[] )--Initializes snooper, program's
* internals, and display. Given: ARGC = number of command line tokens ARGV ->
* array of pointers to the command line tokens. Returns: 0 if all went well;
* otherwise an exit code
******************************************************************************/
int init_snooper( int argc, char *argv[] )
{
 int i;
 print( "\fEthernet Snooper for NE-2000 Compatible Adapters on "
 "Netware, version 1.00"
 "\nCopyright (c) September 1992, Ryu Consulting, 916/722-1939"
 "\nWritten by Rahner James\n" );
 if ( is_ipx() )
 {
 print( "\nIPX is currently installed, please remove it" );
 beep();
 return 1;
 }
 if ( argc > 1 )
 parse_line( argc, argv );
 print( "\nReturn from initialization of memory pool = %d", init_memory() );
 print( "\nReturn from initialization of NE2000 = %d", i=init_ne2000() );
 if ( i )
 return 2;
 clear_screen( (INTENSE WHITE)+(BLUE BACKGROUND) );
 clear_line( 5, (INTENSE WHITE)+(CYAN BACKGROUND) );
 center("IPX Packet Data Display",5,(INTENSE WHITE)+(CYAN BACKGROUND) );
 clear_area( 0,6,_Screen_Width-1,_Screen_Height-1,(INTENSE WHITE)+
 (GREEN BACKGROUND) );
 print( "%p%rEthernet Snooper, version 1.00, waiting for first "
 "packet\n%58c", 2,0, 0xc4 );
 return 0;
}
/******************************************************************************
* int MAIN( int ARGC, char *ARGV[] )--Initial entry point into this program.
* Given: ARGC = number of command line tokens ARGV -> array of pointers to
* command line tokens. Returns: Exit code
******************************************************************************/
int main( int argc, char *argv[] )
{
 int i,j, x,y, loop_flag = 1;
 ui key;
 uc FAR *cp;
 NIC_ECB_T FAR *np;
/* First, set everything up */
 if ( i=init_snooper(argc,argv) )
 return i;
/* Now, loop waiting for packets or a keystroke */
 while ( loop_flag )
 {
 do_next_task();
 if ( kbhit() )
 {
 if ( charin() == ESC )
 loop_flag = 0;
 }
 if ( Total_Unread_Buffers )
 {

 if ( (np = read_packet()) == NULL )
 {

 print( "%p%a\nGot a packet but no data", 0,_Screen_Height-1, WHITE );
 beep();
 break;
 }
 if ( is_packet_valid(np) )
 {
 display_packet( np );
 if ( Handle >= 0 )
 dos_write( Handle, np, np->NIC_length+0x12 );
 }
 free_NIC_ecb( np );
 } // if (Total_Unread_Buffers)
 } // while
/* Close everything up and go home */
 if ( Handle >= 0 )
 dos_close( Handle );
 print( "%p%a\n", 0,_Screen_Height-1, WHITE );
 clear_line( _Screen_Height-1, WHITE );
 print( "%p%aStopping ...\n", 0,_Screen_Height-1, WHITE );
 i = stop_ne2000_interrupts();
 stop_time_isr();
 print( "Return = %d", i );
 print( "\nTotal interrupts received = %ld", Int_Count );
 show_cursor();
 return 0;
}


Example 1

 0x1111 Signature
 0x00 Initialize the sequence counter
 0xFF Need a connection
 0xFF00 No reason for this
 0x00 Command 0



Example 2

 0x2222 Signature
 0x01 Sequence
 0x07 My connection number
 0x0002 <blah> <blah> <blah>
 0x4C Open file command
 0x00 Filler
 0x0106 Open attribute
 0x1D Length of the filename to follow
 xxxxxx SYS:SYSTEM/19000003/PRINT.000
 Note: the '.' that separates PRINT from 000 is
 actually a 0xAE, because Netware requires
 periods to be converted.

Example 3

 0x3333 Response signature
 0x01 Sequence
 0x07 Connection
 0x0001 Yep, it's one less
 0x00 Good return code

 0x00 Filler
 0x13,0x53 Handle to use to access the file's data
 0x7C,0x2D
 0x00,0x00
 0x0000 ??? Always 0.
 PRINT.000 Filename. Always 12 bytes, NULL padded.
 4 bytes File attribute of some sort. Always 0x200000.
 4 bytes File size in Motorola format.
 0x2719 File creation date. MS-DOS format.
 0x2719 Date of last access to this file. MS-DOS format.
 0x2719 Date of the last update to this file. MS-DOS
 format.
 0x287E Time of the last update to this file. MS-DOS
 format.


Example 4:

(a)

 0x17 Command
 0x00 Filler
 0x01 Number of bytes to follow
 0x17 Request for encryption key. This key will be
 different for every request made of the server.

(b)
 0x00 Good return code
 0x00 Filler
 0xDE,0x6C Eight byte encryption key.
 0x18,0x36
 0x93,0x72

 0xD2,0xA3


Example 5

 0x17 Command
 0x00 Filler
 0x11 Length of the request data to follow
 0xC4,0xA6 Encryption sequence
 0x39,0x23
 0xFA,0x0B
 0xFA,0x32
 0x0100 Type of object logging into
 0x05 Length of object name to follow
 TEST1 Name of the account














May, 1993
DIRECT MEMORY ACCESS FROM PC FORTRANS


No assembly required


 This article contains the following executables: CPTPUT.ARC


Kenneth G. Hamilton


Ken has a PhD in physics from the University of California at San Diego and
has been involved in solid-state theory, atomic and molecular calculations,
numerical hydrodynamics, exploration geophysics, and signal processing. He can
be contacted at Garjak Research, 5330 Carroll Canyon Road, Suite 100, San
Diego, CA 92121 or on CompuServe at 72727,177.


The Fortran language has traditionally been used on large systems where
multiuser protection is an issue. Under these conditions, it has never been
tolerable for user programs to access pure physical addresses, because they
would then be able to interfere with other concurrent programs. As a result,
no direct memory-access capabilities have ever been built into standard
Fortran that would be comparable to, say, Basic's PEEK and POKE. However,
because DOS is a single-user, nonprotected environment, there is no one else
to interfere with. In addition, DOS does not provide an operating-system
control library as, for example, VAX/VMS does. Therefore, it is often up to
the individual programmer to figure out how to get at the bells and whistles
of the machine.
Most PC Fortrans now provide a routine for calling DOS and BIOS interrupts,
leaving them with one major deficiency: the inability to access memory using
absolute addresses to get at places such as the BIOS data area and the
memory-mapped video. This article remedies that by describing how to perform
direct memory access using some of the better-known PC Fortrans, including the
Microsoft, Watcom, and Lahey compilers.


Microsoft Fortran


Microsoft's real-mode Fortran compiler normally passes all arguments by
reference: Before jumping to a subroutine, a calling routine pushes the
segment: offset address of each argument onto the stack. Ordinarily, the
compiler and linker control the numerical values of these addresses, but now
we want to bypass this mechanism and specify an address ourselves. The
approach is to construct a 4-byte address, define the argument as a 4-byte
integer, and pass the argument by value rather than by reference. The calling
routine will then push this number (rather than a pointer to it) onto the
stack. The subroutine below it will pop that value off and treat it as though
it were the address of a normal variable somewhere in RAM.
For example, suppose that you want to POKE into a particular location. This
can easily be done by passing two arguments: a 4-byte address sent by value
(but received by reference), and a second, conventional argument that holds the
value to store into the indicated location. The called subroutine then merely copies
from the second argument to the first, unaware of the fact that one of those
addresses is artificial. You, of course, can make the first address point
anywhere you want.
As a concrete example, the INTERFACE block in Listing One, page 102,
references the PEEKB0 subroutine. The INTERFACE block specifies the passing
protocol for each argument. Arguments can be individually specified as
[REFERENCE] or [VALUE]. Any argument whose passing method is not explicitly
declared is by default passed by [REFERENCE]. Therefore, PEEKB0 requires that
its first argument be passed by value and its second by reference. However,
PEEKB0 (see Listing Two, page 102) knows nothing of this pass-by-value
business. PEEKB0 expects the addresses of two ordinary INTEGER*1 variables and
should copy the first into the second. In this case, the second argument
points to a byte in the BIOS data area. This byte, at address 449h, contains a
value that indicates the video mode. Monochrome text is mode 7, and all others
are color modes.
Listing One declares a second subroutine, CRTPUT, to pass its first argument
by value and its other four arguments by reference. In the main body of
Listing One, that first argument icrt is given the value B8000000h (B0000000h,
in the case of monochrome) and then passed to CRTPUT. This subroutine (see
Listing Two) thinks it is being sent the segmented address B800:0000 (or
B000:0000) and maps the screen array into that location. Of course, this
causes screen to be located exactly where the video adapter is sitting, and so
the messages appear on the screen in the rows and columns indicated. It should
be noted that the screen memory consists of both characters and video
attributes. The first, third, fifth, and so on bytes are the characters to be
displayed, starting from the upper-left corner; the second, fourth, sixth, and
so on bytes are the attributes that go with each of those characters. We have
defined the screen array so that the first subscript indicates character vs.
attribute, the second subscript is the column number, and the third subscript
is the line number. This effectively trivializes the offset calculation within
the display adapter.
The Microsoft compiler is smart enough to know that we are trying to trick it,
and will issue a diagnostic message if both the calling program
(CRT_WRITE_MSF) and the subroutines (PEEKB0 and CRTPUT) are compiled from the
same source file. While the generated code works, you can avoid these spurious
errors by placing the caller and callee in different source files and
compiling them as separate modules.


Protected Mode


Microsoft's 32-bit protected-mode compiler, Fortran PowerStation 1.0, takes
advantage of the Phar Lap DOS extender. Unfortunately, there appears to be no
direct way of carrying out a corresponding trick using this compiler and
extender. In a protected-mode environment, an address is not a simple segment
and offset. Rather, it involves a pointer to a selector in a memory-allocation
table. The Phar Lap loader prevents user programs from accessing memory areas
that do not belong to them. There is a selector that points to the screen in
that environment, but it is necessary to use assembly code in order to work
with it.
For those willing to tackle some assembly language, I've included a module for
screen-writing from Fortran PowerStation (see "Availability," page 7). The
example writes to the screen. If you need to access some other area of memory,
you'll need to find a selector that points to that region. Also note that the
PowerStation compiler expects entry-point names to be in the "decorated"
format, with a leading underscore character and the list of the number of
bytes of each argument after the root name. This routine was written to be
assembled by MASM 6.0. (The new MASM 6.1 is able to automatically decorate
names.)


Lahey Fortran


Lahey Computer Systems' real-mode compiler, F77L, ordinarily passes numeric
data by pushing the segment and offset onto the stack, just like the Microsoft
compiler. Fortunately, F77L includes a %VAL() function that allows you to pass
by value. %VAL(), which was syntactically copied from VAX Fortran, can only
be used in the argument list of a CALL statement or function invocation.
As an example, Listing Three (page 102) makes calls to the very same PEEKB0
and CRTPUT routines (in Listing Two) as Listing One. In fact, both PEEKB0 and
CRTPUT can be used unmodified with each of the compilers discussed--all of the
compiler-specific code is contained in the calling routine. F77L does not
require the two routines to be separately compiled.
Note, when comparing the Lahey and Microsoft versions, that the format of the
hexadecimal numbers is different. Unfortunately, the ANSI Fortran-77 standard
did not include an official definition for hex numbers, but the new Fortran-90
standard will finally provide one. Thus, this compatibility nuisance will
eventually go away.
Lahey also has a 32-bit compiler, F77-EM/32, which is sold with a version of
the Phar Lap extender. Just as is the case with the Microsoft PowerStation,
access to absolute addresses requires assembly code. This is certainly
feasible, but will not be considered in this article. A sample output module
is available, which writes through the screen selector, 1C(hex).


Watcom Fortran


Watcom provides a c$pragma mechanism similar to Microsoft's INTERFACE block.
The c$pragma declaration specifies how each argument is passed. If the
subroutine is called with more arguments than the c$pragma has defined, the
attribute of the last position in the declaration is applied to all of the
additional arguments. For example, Figure 1 shows a c$pragma declaration. If
the subroutine has more than two arguments, then the reference attribute will
be applied to the third, fourth, and so on arguments. Note again that the
pass-by-value address must always be declared as value*4 even if it is the
address of an integer*1, integer*2, real*4, or whatever. (This is because
addresses are always 32 bits in length.) Finally, it is necessary to compile
the calling routine separately from the called one, or else the value
attribute becomes known to the callee, which then proceeds to receive by
value.
Figure 1: Syntax for the c$pragma declaration.

 c$pragma aux <routinename> parm
 (value*4, reference)

With Watcom, it is possible to access absolute addresses even when using the
32-bit protected-mode compiler. Watcom conveniently defines a conditional
compilation variable, __386__, which allows you to have one compiler-specific
module that works in either real or protected mode. Listing Four, page 102,
shows how to write to and read from the screen using either Watcom's real-mode
compiler, F77, or their protected-mode F77/386. A pair of c$pragma directives
is being used in order to specify that the first argument to PEEKB0 and CRTPUT
must be passed by value, and that the remaining ones should go by reference.
The conditional compilation directives in Listing Four can be seen to be
automatically selecting the appropriate address format for real or protected
mode, so that the same source code can be used with either of the Watcom
compilers.
Watcom has a clever feature for absolute addressing already built into their
compilers. If a program allocates an array, an optional location=loc clause
can be used to place the resulting array at a specified place in memory. As an
example, we could specify allocate (screen(2,80,25), location='B8000000') and
then use the screen array to access the video adapter. This is a wonderfully
seamless capability: The only drawback is that none of the other compilers
have it. This could make a multicompiler program somewhat awkward, as the
screen array should be allocated in the main program, rather than in a
compiler-specific interface module. (We wouldn't want to allocate and
deallocate for each video I/O call.)
The protected-mode compiler also allows the use of the location=loc clause in
an allocate statement, although the location specified needs to reflect the
flat addressing format, rather than the segment-offset manner of the
real-memory version.



Silicon Valley Software Fortran


SVS only sells a protected-mode Fortran. Like the Lahey implementation, SVS
provides a VAX-like %VAL() function; hexadecimal values are written with a
leading dollar sign. The sample program for use with this compiler is shown in
Listing Five,
page 102. Note that the format of the video address must be changed to reflect
the 32-bit flat memory layout, just as when using the Watcom protected-mode
compiler.


Text Graphics


Other applications where this capability comes in handy include reporting the
status of an executing program in static boxes on the screen, as well as
character graphics. To demonstrate the latter case, Listing Six (page 102)
presents a mocked-up, one-dimensional hydrodynamics program that computes the
propagation of a shock wave, as pressure (P) vs. distance (X). Both of these
working arrays are located in a common block--a typical practice in large
simulations. (In fact, the real guts of this program are missing, replaced by
a simple analytical expression in the CALCULATION subroutine.)
The portions of the program dependent on Microsoft's implementation of the
compiler have been isolated into an interface module (SCR_INIT and its
alternate entry points) that serves as an interface to a plot-drawing routine,
VIDEO. This drawing routine (see Listing Seven, page 103) makes a profile of
pressure vs. distance at each time step, and can also display additional
information about the progress of the calculation. The particular operation
performed depends on the value of the switch variable IFUNC. VIDEO scales the
axes to match the data being plotted, but this particular scaling routine
expects only positive values. If you want to use it to plot points with both
positive and negative values, some modifications will be necessary.
The progress of the calculation can then be assessed by simply watching the
screen, even though (in a real computation) the major part of the output,
composed of mass quantities of numbers, would be going to a disk file. The
problem time and time-step number, along with other selected data, are written
to fixed positions on the screen and are also updated every cycle.
The screen shows something called the "Courant zone," named after the
mathematician Richard Courant who, in 1928, showed that the solution of a
partial differential equation can be approximated by the solution of a
difference equation, and defined the conditions for numerical stability. In a
real simulation of this kind, the Courant condition requires that the time
steps be small enough that a sound wave not be able to cross any of the zones
during a single step. This involves computing a local sound speed in each zone
(sound speed goes up with increasing pressure in any real material), computing
the thickness of each zone, and identifying the zone with the shortest travel
time. This is the Courant zone, and it is normally in the area of greatest
compression at the shock front; the next time step is taken as a fraction
(typically 0.3-0.9) of its traversal time. In this demonstration, we are
stepping the Courant zone forward at one full zone per cycle in order to have
a parameter to use in the calculation of the pressure pulse.


Direct-memory Access


Direct-memory access from PC Fortrans is relatively simple to code and fast to
execute. And while the examples in this article deal primarily with video I/O,
the method can be used in a wider range. For example, the machine ID byte at
address FFFF:000E (in the bootstrap segment of the ROM BIOS chip) is easily
accessible and can provide a program with useful information about the
hardware it is running on. In my experience, the most maintainable method of
using this in a large program is to write one primary video-output subroutine
that, when called, is passed the screen address and a switch variable that
tells the routine which part of the display to update. Internally, the switch
variable can be used to direct the flow of control to the proper section. The
general structure of a program that does direct memory operations is shown in
Figure 2.


Conclusion


It's unfortunate that absolute data addressing has never been adequately dealt
with by any of the Fortran standards committees. The capability to declare a
POINTER variable is now included in Fortran-90, but there is really no reason
why a named COMMON block cannot be given an "absolute" attribute and a
starting location. This could, in fact, be done entirely in a linker without
any changes to the associated compiler. If that capability were available,
locations of scalars would be completely resolved prior to execution time,
with addresses being present in instructions as immediate data, thus allowing
the fastest possible access in real-time control situations. By comparison,
the PEEK and POKE of Basic require a subprogram call, and the pointer
mechanism of C or Fortran-90 constitutes indirect addressing, both of which are
slightly slower.
While we are waiting for Fortran-90, however, we can still go ahead and access
memory with most PC Fortrans, with all compiler-dependent code contained in
one source file and no machine-dependent "contamination" in the major part of
a program.


Acknowledgments


I'd like to thank Blair Learn of Lahey Computer Systems, and John Norwood and
Bruce McKinney of Microsoft, for advice on getting around the Phar Lap extender
used with their respective protected-mode compilers.



[LISTING ONE]


 INTERFACE TO SUBROUTINE PEEKB0(L,I)
 integer*4 L [VALUE]
 integer*1 I
 end

 INTERFACE TO SUBROUTINE CRTPUT(L,I1,I2,I3,C)
 integer*4 l [VALUE]
 integer*2 i1, i2, i3
 character*1 c
 end

 PROGRAM CRT_WRITE_MSF
 integer*4 laddr/#00000449/ ! Address of video mode byte
 integer*1 ividmod ! Value at that location
 integer*4 imono/#B0000000/ ! B&W adapter address
 integer*4 ivga /#B8000000/ ! Color adapter address
 integer*4 icrt ! Adapter in use

 integer*2 iat1, iat2, iat3 ! Video attributes
c
c Program to demonstrate direct memory access using Microsoft Fortran
c Kenneth G. Hamilton
c
 call peekb0(laddr,ividmod) ! First, get the video mode
c
 if (ividmod.eq.7) then ! Mono is video mode 7
 icrt = imono
 iat1 = #07 ! Normal video
 iat2 = #0F ! Bold
 iat3 = #87 ! Blinking
 else ! All other modes are color
 icrt = ivga
 iat1 = #07 ! Normal white-on-black
 iat2 = #1F ! Bold white on blue
 iat3 = #9E ! Flashing yellow on blue
 endif
c
 call crtput(icrt, 17, 21, iat1,
 & 'Message from Microsoft Fortran Follows')
 call crtput(icrt, 18, 21, iat2, 'HOW''S ')
 call crtput(icrt, 18, 27, iat3, 'THIS?')
c
 stop
 end






[LISTING TWO]

* Listing 2 - Direct Memory Access - K. G. Hamilton

 SUBROUTINE PEEKB0(I,IBYTE)
 integer*1 i,ibyte
 ibyte = i ! Get what's there
 return
 end

 SUBROUTINE CRTPUT(SCREEN,ILIN,JCOL,IATT,A)
 character*(*) a
 integer*1 screen(2,80,25),ic1
 integer*2 ilin, jcol, iatt
c
c This routine puts a character string to the screen with a
c video attribute.
c
c Input: screen = array mapped to the screen
c ilin = line number
c jcol = starting column number
c iatt = video attribute
c a = string to write
c
 la=len(a) ! Length of message
c
 do 20 j=1,la

 ic=ichar(a(j:j))
 ic1=iand(ic,255)
 screen(1,jcol+j-1,ilin)=ic1 ! Character
 20 screen(2,jcol+j-1,ilin)=iand(iatt,255) ! Attribute
c
 return
 end






[LISTING THREE]

 PROGRAM CRT_WRITE_F77L
 integer*4 laddr/z'00000449'/ ! Address of video mode byte
 integer*1 ividmod ! Value at that location
 integer*4 imono/z'B0000000'/ ! B&W adapter address
 integer*4 ivga /z'B8000000'/ ! Color adapter address
 integer*4 icrt ! Adapter in use
 integer*2 iat1, iat2, iat3 ! Video attributes
c
c Program to demonstrate direct memory access using Lahey F77L
c Kenneth G. Hamilton
c
 call peekb0(%val(laddr),ividmod) ! First, get the video mode
c
 if (ividmod.eq.7) then ! Mono is video mode 7
 icrt = imono
 iat1 = 7 ! Normal video
 iat2 = 15 ! Bold
 iat3 = 8*16+7 ! Blinking
 else ! All other modes are color
 icrt = ivga
 iat1 = 7 ! Normal white-on-black
 iat2 = 1*16+15 ! Bold white on blue
 iat3 = 9*16+14 ! Flashing yellow on blue
 endif
c
 call crtput(%val(icrt), 17, 21, iat1,
 & 'Message from Lahey F77L Follows')
 call crtput(%val(icrt), 18, 21, iat2, 'HOW''S ')
 call crtput(%val(icrt), 18, 27, iat3, 'THIS?')
c
 stop
 end






[LISTING FOUR]

 PROGRAM CRT_WRITE_WATCOM
c$pragma aux peekb0 parm (value*4, reference)
c$pragma aux crtput parm (value*4, reference)
 integer*4 laddr/z00000449/ ! Address of video mode byte

 integer*1 ividmod ! Value at that location
c$ifdef __386__ ! Use flat memory addresses
 integer*4 imono /z000B0000/ ! Monochrome adapter
 integer*4 ivga /z000B8000/ ! Color adapter
c$else ! Use segmented memory addresses
 integer*4 imono /zB0000000/ ! Monochrome adapter
 integer*4 ivga /zB8000000/ ! Color adapter
c$endif
 integer*4 icrt ! Adapter in use
 integer*2 iat1, iat2, iat3 ! Video attributes
c
c Program to demonstrate direct memory access using either
c Watcom F77 or F77/386.
c Kenneth G. Hamilton
c
 call peekb0(laddr,ividmod) ! First, get the video mode
c
 if (ividmod.eq.7) then ! Mono is video mode 7
 icrt = imono
 iat1 = 7 ! Normal video
 iat2 = 15 ! Bold
 iat3 = 8*16+7 ! Blinking
 else ! All other modes are color
 icrt = ivga
 iat1 = 7 ! Normal white-on-black
 iat2 = 1*16+15 ! Bold white on blue
 iat3 = 9*16+14 ! Flashing yellow on blue
 endif
c
c$ifdef __386__
 call crtput(icrt, 17, 21, iat1,
 & 'Message from Watcom F77/386 Follows')
c$else
 call crtput(icrt, 17, 21, iat1,
 & 'Message from Watcom F77 Follows')
c$endif
 call crtput(icrt, 18, 21, iat2, 'HOW''S ')
 call crtput(icrt, 18, 27, iat3, 'THIS?')
c
 stop
 end






[LISTING FIVE]

 PROGRAM CRT_WRITE_SVS
 integer*4 laddr/$00000449/ ! Address of video mode byte
 integer*1 ividmod ! Value at that location
 integer*4 imono/$000B0000/ ! B&W adapter address
 integer*4 ivga /$000B8000/ ! Color adapter address
 integer*4 icrt ! Adapter in use
 integer*2 iat1, iat2, iat3 ! Video attributes
c
c Program to demonstrate direct memory access using SVS Fortran
c Kenneth G. Hamilton

c
 call peekb0(%val(laddr),ividmod) ! First, get the video mode
c
 if (ividmod.eq.7) then ! Mono is video mode 7
 icrt = imono
 iat1 = $07 ! Normal video
 iat2 = $0F ! Bold
 iat3 = $87 ! Blinking
 else ! All other modes are color
 icrt = ivga
 iat1 = $07 ! Normal white-on-black
 iat2 = $1F ! Bold white on blue
 iat3 = $9E ! Flashing yellow on blue
 endif
c
 call crtput(%val(icrt), 17, 21, iat1,
 & 'Message from SVS Fortran Follows')
 call crtput(%val(icrt), 18, 21, iat2, 'HOW''S ')
 call crtput(%val(icrt), 18, 27, iat3, 'THIS?')
c
 stop
 end






[LISTING SIX]

 PROGRAM PLOT_DEMO_WITH_MSF
 common /ctl/ ividmod, ncycle, maxcyl, time, maxz, iactz, icourn
 common /ctl/ istatus
 common /probvars/ x(1000), p(1000)
c
c Demonstration of screen text graphics
c This program is intended to look like a one-dimensional
c finite-difference calculation of shock wave propagation.
c Kenneth G. Hamilton
c
 maxz = 1000 ! Maximum number of zones
 ncycle = 1 ! Initialize cycle number
 maxcyl = 800 ! Maximum number of cycles to run
c
 call scr_init ! Initialize the plot
 istatus = 1 ! Status is "running"
 call scr_status ! Show status
c
c Main problem loop
c
 100 call calculation ! Hydrodynamics done here
 call scr_draw ! Draw the data
 call time_step ! Move to next cycle
 if (ncycle.le.maxcyl .and. iactz.le.maxz) go to 100
c
c Display completion message and
c wait for key press before exiting
c
 istatus = 2 ! Status is "done"

 call scr_status ! Show status
 call press_any_key
c
 stop
 end

 SUBROUTINE CALCULATION
 common /ctl/ ividmod, ncycle, maxcyl, time, maxz, iactz, icourn
 common /ctl/ istatus
 common /probvars/ x(1000), p(1000)
 integer*4 itime/0/, itime0/0/
 save itime, itime0, amp
c
c This is where a program would normally compute the
c pressures, velocities, positions, etc., using a finite
c difference scheme.
c We're just faking it here, for this demonstration.
c
 if (ncycle.eq.1) then ! Perform some initialization
 amp = 500. ! Original amplitude of shock
 icourn = 15 ! Fake the Courant zone number
 iactz = icourn + 5 ! Number of active zones
 else
 amp = 0.995 * amp ! Let the peak decay a bit
 endif
c
 do i=1,iactz ! For all active zones
 x(i)=0.1*float(i) ! This is the zone position
 if (i.gt.icourn) then ! Front of shock
 p(i) = amp*float(iactz-i)/float(iactz-icourn)
 else ! Decaying coda
 p(i) = amp*exp(-0.01*float(iactz-i))
 endif
 enddo
c
c This delay loop is intended to mimic the main part of the program.
c You can set INC=0 to get maximum speed, or a larger value
c in order to slow things down for better visibility of the plot.
c INC is the time delay in 1/100ths of a second, between
c consecutive cycles.
c
c inc=10
 inc = 0
 do while (itime .le. itime0+inc)
 call gettim(ih,im,is,ic)
 itime = int4(ic) + 100*(int4(is) + 60*(int4(im) + 60*int4(ih)))
 enddo
 itime0 = itime
c
 return
 end

 SUBROUTINE TIME_STEP
 common /ctl/ ividmod, ncycle, maxcyl, time, maxz, iactz, icourn
 common /ctl/ istatus
c
c This is where the time would normally be incremented, based on
c the sound speed and some characteristic times and lengths.
c

 ncycle = ncycle + 1
 time = time + 1.5E-3
 icourn = icourn + 1
 iactz = icourn + 5
c
 return
 end

* Microsoft-specific portion follows

 INTERFACE TO SUBROUTINE PEEKB0(L,I)
 integer*4 L [VALUE]
 integer*1 I
 end

 INTERFACE TO SUBROUTINE VIDEO(L,I1)
 integer*4 l [VALUE]
 integer*2 i1
 end

 INTERFACE TO SUBROUTINE INTDOS [C] (ir1,ir2)
 integer*2 ir1 [REFERENCE] ! Regs into INTDOS
 integer*2 ir2 [REFERENCE] ! Regs returned
 end

 SUBROUTINE SCR_INIT
 integer*4 laddr/#00000449/ ! Address of video mode byte
 integer*1 ividmod ! Value at that location
 integer*4 imono/#B0000000/ ! B&W adapter address
 integer*4 ivga /#B8000000/ ! Color adapter address
 integer*4 icrt ! Adapter in use
 integer*2 iregs(7) ! For INTDOS
 save icrt
c
c Microsoft-specific screen interface routine
c Kenneth G. Hamilton
c
 call peekb0(laddr,ividmod) ! First, get the video mode
 if (ividmod.eq.7) then ! Mono is video mode 7
 icrt = imono
 else ! All other text modes are color
 icrt = ivga
 endif
 call video(icrt,1) ! Set up frame on screen
 return
c
 ENTRY SCR_DRAW ! Draw the data
 call video(icrt,2)
 return
c
 ENTRY SCR_STATUS ! Report status
 call video(icrt,3)
 return
c
 ENTRY PRESS_ANY_KEY ! Wait for key press
 iregs(1) = #0800 ! Load into AX register
 call intdos(iregs,iregs) ! Read from CON, no echo
 return
c

 end





[LISTING SEVEN]

 SUBROUTINE VIDEO(SCREEN,IFUNC)
 common /ctl/ ividmod, ncycle, maxcyl, time, maxz, iactz, icourn
 common /ctl/ istatus
 common /probvars/ x(1000), p(1000)
 integer*1 screen(2,80,25)
 integer*2 iat1, iat2, iat3
 character buf*80
 integer*1 kblank /32/ ! ' '
 integer*1 kstar /42/ ! '*'
 integer*1 kuplf /-38/ ! PC line-drawing char 218, upper-left corner
 integer*1 klort /-39/ ! 217, lower-right corner
 integer*1 khorz /-60/ ! 196, horizontal bar
 integer*1 ktlft /-61/ ! 195, left-edge tick
 integer*1 kttop /-62/ ! 194, top-edge tick
 integer*1 ktbot /-63/ ! 193, bottom-edge tick
 integer*1 klolf /-64/ ! 192, lower-left corner
 integer*1 kuprt /-65/ ! 191, upper-right corner
 integer*1 ktrgt /-76/ ! 180, right-edge tick
 integer*1 kvert /-77/ ! 179, vertical bar
 save paxis0, xaxis0, iat1, iat2, iat3
c
c Screen text graphics routine
c Kenneth G. Hamilton
c
 go to (100,200,300) ifunc
c
 100 if (ividmod.eq.7) then ! Monochrome mode
 iat1 = #07 ! Normal video
 iat2 = #70 ! Reverse video
 iat3 = #F0 ! Flashing reverse video
 else ! Color modes
 iat1 = #1F ! Bold white on blue
 iat2 = #2F ! Bold white on green
 iat3 = #4F ! Bold white on red
 endif
c
 do ilin=1,25 ! Clear the entire screen
 do jcol=1,80
 screen(1,jcol,ilin) = kblank
 screen(2,jcol,ilin) = iand(iat1,255)
 enddo
 enddo
c
 screen(1,73,25) = ichar('X') ! Label the X-axis
 screen(1, 3, 6) = ichar('P') ! Label the Y-axis
c
 do il=2,3 ! Draw left and right sides
 screen(1, 1,il) = kvert ! Left side of top box
 screen(1,80,il) = kvert ! Right side of top box
 enddo
 do il=5,23

 screen(1, 5,il) = kvert ! Left side of main box
 screen(1,80,il) = kvert ! Right side of main box
 enddo
c
 do il=8,20,4 ! Put tick marks on L & R
 screen(1, 5,il) = ktlft ! Left side
 screen(1,80,il) = ktrgt ! Right side
 enddo
c
 do jc=2,79 ! Draw horizontals
 screen(1,jc, 1) = khorz ! Top line
 screen(1,jc, 4) = khorz ! Division between boxes
 enddo
 do jc=6,79
 screen(1,jc,24) = khorz ! Bottom line
 enddo
c
 do jc=21,66,15 ! Put tick marks on T & B
 screen(1,jc, 4) = kttop
 screen(1,jc,24) = ktbot
 enddo
c
 screen(1, 1, 1) = kuplf ! Mark the corners
 screen(1,80, 1) = kuprt
 screen(1, 1, 4) = klolf
 screen(1, 5, 4) = kttop
 screen(1,80, 4) = ktrgt
 screen(1, 5,24) = klolf
 screen(1,80,24) = klort
c
 call crtput(screen,1,34,iat1,' BOGUS CODE ')
 call crtput(screen,2,60,iat1,'Status: ')
 return
c
 200 write (buf,110) ncycle,maxcyl,time
 110 format ('Cycle',i5,' of ',i5,', Time:',1PE12.4)
 call crtput(screen, 2, 3, iat1, buf(:38))
 write (buf,120) icourn,iactz
 120 format ('Courant Zone :',i5,', Active Zones:',i5)
 call crtput(screen, 3, 3, iat1, buf(:39))
 if (iactz.le.0) then
 xmax=x(maxz)
 else
 xmax=x(iactz)
 endif
c
 pmax = 0
 do i = 1, iactz
 pmax = max(pmax,p(i))
 enddo
c
 write (buf,130) xmax,pmax
 130 format ('Max X:',1PE11.3,' Max P:',1PE11.3)
 call crtput(screen,3,44,iat1,buf(:35))
 if (xmax.le.0 .or. pmax.le.0) return
c
 if (ncycle.eq.1) then
 paxis0=0.
 xaxis0=0.

 endif
c
c Scale vertical axis
c
 call plot_scale(pmax,ppower,paxis,*190)
c
 if (paxis.ne.paxis0) then ! Rewrite p-axis labels
 do i=0,4 ! There are five labels
 il=24-4*i ! These are their line numbers
 ptemp = paxis*float(i) ! This is the label value
 write (buf,140) ptemp ! Put it into the buffer
 140 format (F4.1)
 call crtput(screen,il,1,iat1,buf(:4)) ! Write to screen
 enddo
 endif
c
c Scale horizontal axis
c
 call plot_scale(xmax,xpower,xaxis,*190)
c
 if (xaxis.ne.xaxis0) then ! Rewrite x-axis labels
 do i=0,5
 ir=5+15*i
 xtemp=xaxis*float(i)
 write (buf,140) xtemp
 call crtput(screen,25,ir-3,iat1,buf(:4))
 enddo
 endif
c
 do jc=6,79 ! Redraw bottom line
 screen(1,jc,24) = khorz ! to eliminate old stars
 enddo
 do jc=21,66,15
 screen(1,jc,24) = ktbot
 enddo
c
 do ilin=5,23 ! Blank the rest of the screen
 do jcol=6,79
 screen(1,jcol,ilin) = kblank
 enddo
 enddo
c
 do 180 iz=1,iactz ! Plot the data points
 xtemp=x(iz)
 if (xtemp.le.0) go to 180
 ix=nint((75.*xtemp)/(5.*xaxis*xpower))
 if (ix.le.0) go to 180
 ptemp=p(iz)
 if (ptemp.le.0) go to 180
 ip=nint((20.*ptemp)/(5.*paxis*ppower))
 if (ip.le.0) go to 180
 il=24-ip
 ir=5+ix
 screen(1, ir, il) = kstar ! Plot that point
 180 continue
c
 paxis0=paxis
 xaxis0=xaxis
 190 return

c
 300 if (istatus.eq.1) then
 call crtput(screen,2,68,iat2,' Running ')
 else if (istatus.eq.2) then
 call crtput(screen,2,68,iat3,' * Done * ')
 endif
 return
c
 end

 SUBROUTINE PLOT_SCALE(TMAX,TPOWER,TAXIS,*)
c
c Scaling routine for positive axes
c Kenneth G. Hamilton
c Parameters:
c tmax = maximum data value to be plotted (input)
c tpower = power-of-ten scale factor for the axis (output)
c taxis = whole-number increment between axis labels (output)
c
 if (tmax.le.0) return 1 ! Max is zero - error
 tscale=tmax
c
 tpower=1.0
 do while (tscale .lt. 2.4) ! If we're dealing with small
 tscale=10.*tscale ! numbers, scale up
 tpower=0.1*tpower
 enddo
 do while (tscale .gt. 24.) ! If they're big numbers,
 tscale=0.1*tscale ! then scale down
 tpower=10.*tpower
 enddo
 if (tscale.lt.4.8) then ! Set the whole-number
 taxis=1. ! increment to display
 else if (tscale.lt.9.0) then ! in the axis labels
 taxis=2.
 else if (tscale.lt.14.0) then
 taxis=3.
 else if (tscale.lt.19.0) then
 taxis=4.
 else
 taxis=5.
 endif
c
 return
 end

















May, 1993
BUILDING A PORTABLE PROGRAMMING ENVIRONMENT


The MKS Toolkit makes it possible




Ian E. Gorman


Ian is a developer for the Canadian Technology Marketing Group and can be
contacted at 11 Holland Ave., Suite 700, Ottawa, Canada K1Y 4S1.


My company does system development for a number of different operating
systems, including MS-DOS (local and network), UNIX, VAX/VMS, and MVS/TSO
(IBM). We do most of the programming on the clients' computers, and many of
them have only the minimum necessary programming tools (editors, compiler,
linkers, and so on). Even if more tools were available, however, differences
between sites would prevent us from acquiring the familiarity necessary for
greatest productivity.
In general, we use DOS-based PCs as terminals to minicomputers and mainframes,
doing much of the work on the PCs. This is more productive because coding the
programs for large systems is really a text-processing operation, and DOS PCs
allow us to use the same set of effective and familiar tools to handle text
for a variety of operating systems. We don't use this approach with UNIX,
because UNIX already has a good set of tools, and the differences between UNIX
systems are not great.
About a year-and-a-half ago I began using the MKS Toolkit, a set of UNIX-like
utilities for MS-DOS. The MKS Toolkit makes a DOS PC much more productive
when writing programs for MS-DOS computers, minicomputers, and mainframes. The
Toolkit also makes the DOS working environment much more like UNIX, so that
it's easier to move between the DOS development environment and a UNIX system.
There are good reasons for setting up a UNIX-like programming environment on
DOS PCs. First, the UNIX environment, with a shell and utilities, makes it
easy to build useful, special-purpose utilities with very little effort,
because much of the work is already done. Second, many of the UNIX utilities
(awk, grep, sed, and so on) are designed to handle problems that arise
frequently when working with text files. Third, MS-DOS PCs are available (or
can be brought in) everywhere. Finally, it's easy to connect an MS-DOS PC to
practically any other system using common terminal programs.
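The second point deserves a concrete illustration. A special-purpose utility
often takes no more than a pipeline of standard pieces. This sketch (demo.src
is a made-up input file, not from the article) counts how often each
identifier appears in a source file:

```shell
# Hypothetical throwaway utility: count identifier occurrences in a file.
# tr, sort, and uniq do all the real work.
printf 'X = 1\nP = X + 2\nwrite P\nwrite X\nX = X + 1\n' > demo.src

tr -cs 'A-Za-z' '\n' < demo.src |  # one word per line, punctuation stripped
  sort |                           # group identical words together
  uniq -c |                        # count each group
  sort -rn                         # most frequent first
```

Under the MKS Korn shell the same pipeline runs unchanged on a DOS PC, which
is exactly why carrying the toolkit from site to site pays off.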
In this article, I'll describe the working environment I have set up on the
PC, then provide examples from two different projects that illustrate how the
environment reduces software-development costs. A UNIX system programmer would
not consider either of these examples to be in any way remarkable, but the two
projects represent cost savings not much less than my annual salary, obtained
using UNIX tools in a non-UNIX environment.


The Environment


An important feature of the environment is online reference manuals. Wherever
possible, I select programs that have both online and hardcopy reference
manuals. While it's easier to learn a program from a hardcopy manual, it's
faster to use the online manual after you've mastered the software. Also, as I
move from site to site, I prefer to carry everything on a few floppy disks,
leaving the manuals at home.
The foundation of the working environment is the MKS Toolkit: a collection of
about 150 UNIX utilities and commands, including a Korn shell (which combines
features of the C and Bourne shells). You can run these utilities from the DOS
command line like any other DOS program, or you can run the Korn shell and
work from a UNIX-like command line. Under the second option, you can still run
all of your MS-DOS programs from the command line.
The MKS Toolkit has an excellent online reference manual that uses man pages
like those on UNIX systems, except that the MKS man pages are better written and
easier to understand. Also, it is easier to browse back and forth in a man
page than it is with some UNIX systems. The MKS online man pages are identical
to the hardcopy man pages in the 500-page manual. There's also a good tutorial
(only in hardcopy).
The MKS Toolkit is designed to follow POSIX. Most of the man pages have a
Portability section that tells you how closely a utility follows
specifications in POSIX.1, X/Open, BSD UNIX, and System V UNIX. The differences
between the MKS Toolkit and UNIX systems are no greater than the differences
between several UNIX systems--SCO UNIX, DYNIX (BSD UNIX), SunOS, and Mach.
The biggest difference between UNIX and the MKS Toolkit imitation of UNIX is
the absence of multitasking. In MKS, you can't run a process in the
background, and you can't run a script that requires simultaneous processes.
Except for this, scripts that run in the MKS Toolkit can be run in UNIX with
minor changes, and vice versa. MS-DOS running the MKS Korn shell is similar to
SCO UNIX running the Korn shell.
Although the MKS Toolkit gives you some useful and powerful tools for
manipulating files, it's no substitute for a good file manager. Consequently,
my file manager is XTree Gold 2.5, which is also available for some UNIX
systems. If you prefer another file manager--and can start it from the DOS
command line--it should work with the MKS Toolkit.
I use Norton Utilities partly for some of the utilities, and partly for the
NDOS command processor. You still need a DOS command processor with the MKS
Toolkit. The MKS Korn shell executes DOS batch files by passing them to a DOS
command processor. Also, programs like XTree and WordPerfect that let you
execute a DOS command must execute the command through a DOS command
processor. COMMAND.COM has given me trouble (with pipes and redirection) when
the default drive is a network drive. Switching to NDOS resolves these
troubles. The NDOS shell accepts any commands or batch files that work with
COMMAND.COM, and has a good online manual for MS-DOS and NDOS. Furthermore,
using NDOS as the MS-DOS command shell reduces the differences between MS-DOS
versions 3, 4, and 5.
When Kermit is installed on the target system, I use it as my terminal
emulator to download or upload text files. Otherwise, I use whatever my client
normally uses. If the client's system had neither Kermit nor any other
file-transfer package, I'd fall back on Kermit's ASCII text-transfer functions
to upload and download files.
Kermit has ASCII documentation files that can be converted into man pages for
MKS Toolkit use, although the two files are so long that you need to put an
index at the top. This takes less than five minutes when using grep and vi
with the MKS Toolkit. Kermit also has a good programming language to create
scripts; I've used Kermit scripts to execute sequences of commands on a
VAX/VMS system.


Example Project #1


In the first project I'll describe here, I had to produce 89 reports from an
SAS database on an IBM 3090, operating under TSO. The database contained 8000
variables describing various aspects of the performance of several mainframes
with hundreds of peripheral devices. I had no previous knowledge of the
database, but the online documentation for the database was excellent. I had
50 days to learn the database, identify the required data, and produce the
reports and documentation. The completed set of reports contained 2,947 SAS
statements in 11,000 lines of code and comments.
Although the documentation was good, it was cumbersome and seldom used.
A hardcopy version would have been unmanageable, both because of its
bulk (10,000-plus pages) and because of confusing cross-references (a variable
description would say "see also" for several related variables and other
information). While the mainframe menu system was set up to let you easily
find and view individual descriptions, it still took several minutes to track
down the cross-references. By the time you had finished, you'd forgotten much
of what you had learned. I solved this by building a simple hypertext
documentation system on the PC to use for the actual programming work. I used
MKS vi as the display program, in the same way as I would use UNIX vi.
I first identified the appropriate database files, then downloaded the
description of each file from the mainframe to the PC. At the end of each file
description was a list of variables, each line giving a code that described
the time frames covered by the variable, the variable name, and a brief
one-line summary description. I used each description as input to a short
script (see Listing One, page 106) that extracted the variable list from the
file description. The variable list itself became one of the parts in the
hypertext system, but was also used as input to the next stage of
construction.
The second stage was to download the variable descriptions. The terminal
emulator would only download one description at a time, and there were 2000
descriptions to download. Fortunately, the download was implemented as a DOS
command that runs inside an MKS Korn shell script (see Listing Two, page
106). Line 23 of the script is two commands--the output of one command becomes
part of the command line for the other. The awk command extracts the first
field of every line in the file named by $3, which is the third parameter in
the command line that invoked the script. Enclosing awk in backquotes makes
the list a part of the for command, which invokes the following loop once for
every name in the list, letting $i represent that name on each iteration. Line
27 does the work; $1/$i is the DOS directory and filename, and '"$2"($i)' is
the TSO PDS name and member name, in the form PDSname (member). I made the
mistake of downloading everything into one DOS directory. DOS file access is
very slow when a directory has hundreds of files, so I used XTree Gold to
relocate the files in groups of 125 to different directories.
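The backquote construction is easy to try in isolation. In this sketch
(members.lst is a made-up member list, not the real download), awk's output
becomes the word list of the for command, just as in line 23 of the listing:

```shell
# Hypothetical member list: member name in field 1, other data after it
printf 'ALPHA 1993/01/05\nBETA 1993/01/07\n' > members.lst

for i in `awk '{print $1}' members.lst`  # backquotes splice awk's output
do                                       # into the for command's word list
  echo "would download member $i"
done
```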
The third stage was to make the index file that the hypertext system uses to
find files. MKS vi, like UNIX vi, has a tags facility--if you put the cursor
on a word that is the first string in a line of a sorted tags file, you can
press Ctrl-] and vi will display the file named in the second string of the
same line. The downloaded descriptions were in files that had the same name as
the variable being described. The script in Listing Three (page 106) builds
the index. Line 7 scans all files and subdirectories of c:/mic/doc to produce
a list of only the files. This list is piped into the awk program (lines 9 to
15), which extracts the filename from each complete pathname, and produces a
line with the filename, a complete filename, and a cursor address (actually, a
line number) in the file. The new list of lines is piped into a sort to ensure
that the list will be in the correct order.
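A tags file built this way might contain lines such as these (the variable
names and paths are hypothetical; each line carries the tag, the file that
describes it, and a line number used as the cursor address):

```
CPUBUSY  c:/mic/doc/1/cpubusy.1  1
HARDVA   c:/mic/doc/2/hardva.1   1
```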
Listing Four (page 106) shows the script that actually runs the system. Line
13 executes when the script has no arguments, to bring up a master list of
database files. Putting the cursor on any filename and pressing Ctrl-] brings
up the list of variables for that file. Line 15 executes when the script is
given a database filename as an argument and goes directly to the list of
variables for that file. The set ignorecase statement was an attempt to make
the system independent of case. It didn't work because the tags facility is
case sensitive even when case sensitivity has been turned off for editing.
The final step was to put the line set tags=c:/mic/tags in the initialization
file (ex.rc) for vi. This integrated the hypertext system with the editor,
allowing me to see documentation while editing, and allowing me to cut and
paste from the database documentation to comments in the SAS programs.
The time needed to download the descriptions and build a hypertext system out
of 3 Mbytes of text in 2000 files was one day. The system probably took 10 to
20 days off the time I would have needed to complete the project, just by
making it easier for me to use the database documentation.


Example Project #2


With another project, we had to produce, on a VAX/VMS system, programs for
about 250 reports as one part of a database. The number of different report formats
wasn't great; as many as three dozen reports had the same layout, and differed
only in details such as data and headings. Each program required 700 to 2000
lines of code, totalling about 250,000 lines of code. The problem was that we
originally estimated a much smaller amount of code and were committed to
delivery in three months. With one person on the job, it would take a year. I
was sent in to assist and discovered that we could cut more than 100 working
days off the job, making it possible for two people to deliver the code within
three months.
The database language posed some problems in this particular application. A
considerable amount of manual calculation was needed for things like the
location of data or a heading on a report line. The code was very repetitive.
The macro facility would not eliminate all of this repetition, and could not
be used at all if we wanted to compile the code. Running the code with an
interpreter instead of a compiler would give performance below our contract
specification. The limitations on the use of procedures caused an increase in
code repetition.
The cure for these problems was to automate much of the repetitive work and to
find a way to write things in detail only once. For example, if the same
complex data-set specification appeared at several points in a program, we
could name it and then use the name everywhere.
M4 (found in both UNIX and the MKS Toolkit) is an ideal tool for this job. You
can use M4 like a C preprocessor or an assembly language macro processor, but
you can apply M4 to any programming language. For such a powerful tool, M4 is
simple (the documentation is only eight pages long).
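A taste of M4 (a hypothetical macro, not one from the project): once a complex
specification has been named with define, every later occurrence of the name
expands to the full text, so the detail is written only once:

```m4
dnl Name a complex data-set specification once...
define(`_activedisks_', `all DEVICES where TYPE = "DASD" and STATUS = "ONLINE"')dnl
dnl ...then use the name wherever the full text is needed.
find _activedisks_ -> DiskSet1
count _activedisks_ -> NDisks
```

Running the file through m4 replaces each _activedisks_ with the quoted text;
the trailing dnl discards the newline after each define so no blank lines leak
into the output.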
We started by getting specifications for all of the reports before doing any
programming. We then classified the reports into groups of similar reports,
looking for common features. These common features formed building blocks for
programs--a general set of M4 macro building blocks for all programs and a
separate set of macros for each group of similar programs. We wrote a
generating file for each program, which built that program out of the general
macro set and a set of group macros.
A group macro and one program macro would typically have more lines of code
than the generated program, but would take less time to develop. A shorter
development time was achieved by eliminating redundancy; the program was ready
for test sooner (less work) and was less likely to have bugs. (There are no
discrepancies when things are only defined once.) The greater number of lines
was mainly because M4 is such an ugly language; it takes many comments to make
an M4 program understandable (even to yourself!).

After the first program macro was done, the other programs in the group went
very quickly. Nearly all of the analysis and testing were done when the group
macro and the first program macro were completed. Eliminating redundant work
means the generated programs contain much more code than you actually write,
so two people can average about 5000 lines of finished code per day.
Listing Five (page 106) shows two examples of the macro building blocks for
one group of programs. Listings Six and Seven (page 107) show examples of the
code generated by two one-line calls to macros in Listing Five.
The first example is the use of a macro to repeat the same code fragment in
different places and to permit us to specify column positions by relative
column number instead of by the absolute character position on the line. Lines
147 to 167 of Listing Five define a macro_colheads_ that is invoked without
arguments by the one-line call _colheads_(). The result is shown in Listing
Six. The_colheads_macro invokes the macro_zcolcentr_14 times (lines 153 to
156), to center headings over each column of data in the report.
The_zcolcentr_macro (not shown) converts the relative column number (parameter
1) to an absolute position and adjusts for the length of the string (parameter
2), so that the column heading will be centered over the data for that column.
The conversion of the column number to position is based on constants that
were created by executing another macro repeatedly to calculate the position
of each column from the position and width of the previous column.
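That running calculation can be sketched in miniature (hypothetical macros,
far simpler than the project's, using M4's eval builtin): each call records
the current position under a per-column name, then advances the running total
by the column's width:

```m4
dnl Hypothetical sketch of the running column-position calculation.
define(`_POS_', `1')dnl  running character position
define(`_defcol_', `define(`_col$1_', _POS_)define(`_POS_', eval(_POS_ + $2))')dnl
_defcol_(`1', `5')dnl    column 1 starts at 1, is 5 characters wide
_defcol_(`2', `20')dnl   column 2 starts at 6, is 20 wide
_defcol_(`3', `12')dnl   column 3 starts at 26
Column 3 starts at character position _col3_.
```

The last line comes out as "Column 3 starts at character position 26."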
The second example shows how we used M4 macros to get around limitations on
the use of procedures in the database language. To meet performance
specifications, we had to use indexes on searches. The database would do a
very fast search if comparing an index key to a constant, but would scan the
data set if comparing the index key to a variable. This meant that a variable
would have to be passed as a string parameter to a macro if we wanted a fast
search. But we could not use macros in compiled code and we needed compiled
code to meet performance specifications.
Lines 384 to 398 of Listing Five define a macro, _getanswers_, that generates
lines 90 to 96 of Listing Seven. Lines 408 to 427 define a macro, _finddata_,
that repeats the first macro 12 times (see lines 415 to 426). For coding and
testing, the result is the same as if we write a procedure--we invoke 12
search operations with a line very similar to what we would have done with a
procedure: _finddata_('4831988','32.5','=130','=1'). We were able to specify
the search values in a way that meant something to the client, but still have
the search values concatenated with the month values into constant strings
that allowed fast search. Some of the programs required several of these
calls, each of which generated 96 lines of compilable code from one line of
code we wrote.
One of the errors I made in this project was to design excessively complicated
macros because I did not fully understand M4. I fell into the trap of writing
the macros as if they themselves were procedures. There are no procedures in
M4, even though I used M4 macros to make building blocks that looked like
procedures in the target language. When M4 evaluates a macro, the result is
appended to the text just processed.
Listing Eight (page 107) shows a complex macro that was invoked by an outer
macro that generated 60 to 110 lines of code from a single one-line call. The
macro is invoked from within the outer macro by the call
_flagtotal_('x','y','z'), where z is a code that selects lines 258-262, 268-270, or
274-275 of the inner macro.
Listing Nine (page 107) shows the single macro split into three simpler macros
with different names. If the selection code were the second parameter of the
outer macro and had the value 0, we could invoke the appropriate inner macro
by the call _flagtotal_$2_('x','y'), which would become _flagtotal_0_('x','y')
when the outer macro was scanned and replaced. The result of the outer macro
would be scanned again before being appended to the output text and the new
name would be recognized as a macro, causing further replacement by lines 9 to
13 of Listing Nine.
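The renaming trick reduces to a miniature (hypothetical macros, much shorter
than the listings): the selection code becomes part of the inner macro's name,
and because the outer macro's result is rescanned, the constructed name is
recognized and expanded in turn:

```m4
define(`_total_0_', `print subtotal of $1')dnl
define(`_total_1_', `print grand total of $1')dnl
define(`_report_', `_total_$2_(`$1')')dnl
_report_(`sales', `0')
_report_(`sales', `1')
```

Here m4 expands _report_(`sales',`0') to _total_0_(`sales'), and the rescan
turns that into "print subtotal of sales".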
Either choice of macros generates the same code (see Listing Ten, page 107),
but Listing Nine, being less inscrutable, demonstrates the better choice.


Conclusion


The value of UNIX utilities and shell programming lies in the ability to
construct useful tools with surprisingly little effort. The tools described
here are not likely to be useful on many projects, but I built them at far
less expense than the savings they generated. This is typical--the cost of
building tools is low enough that it is worthwhile to build a tool for use on
only one project. MKS Toolkit, by bringing UNIX utilities and shell
programming to DOS, makes it possible to achieve similar savings in non-UNIX
environments. The availability of MS-DOS systems makes it possible to take your
working environment just about anywhere.

_BUILDING A PORTABLE PROGRAMMING ENVIRONMENT_
by Ian E. Gorman


[LISTING ONE]

 1 :
 2 # Extract variable names from a MICS file description
 3 # Ian E. Gorman, Canadian Technology Marketing Group, Ottawa
 4
 5 # Input:
 6 # A section of the MICS Data Dictionary describing the file and
 7 # listing the variables. For example, Section 4.4, The HARDVA
 8 # (Hardware Device Activity) File. The variables list is the last
 9 # section of this file.
 10 # Output (tab-delimited fields):
 11 # 1 variable name
 12 # 2 flags, in format "XDWMYTEC" (See the MICS data dictionary)
 13 # 3 variable description
 14 # The "C" flag is not described in MICS, and is the first letter
 15 # of the most recent subcategory heading in the variable list
 16
 17 awk '
 18
 19 { # remove the carriage control character
 20 gsub("^."," ")
 21 }
 22
 23 output_flag == 1 { # beginning of variables list
 24 output_flag = 2
 25 }
 26
 27 /^ *--* *--* *--* *$/ { # heading at top of variables list
 28 output_flag = 1
 29 next
 30 }
 31
 32 output_flag < 1 { # have not reached variable list yet
 33 next
 34 }
 35
 36 /^ [A-Za-z]/ { # Sub heading
 37 category = substr($1,1,1)

 38 next
 39 }
 40
 41 /^ *$/ { # null line
 42 next
 43 }
 44
 45 { # variable description line
 46 flags = sprintf("%-7.7s%1.1s", $1,category)
 47 gsub(" ",".",flags)
 48 title = ( substr( $0,index( $0,"-")+2 ) )
 49 printf("%-10.10s%-10.10s%s\n",$2,flags,title)
 50 }
 51
 52 ' "$@" |
 53
 54 sort # new order, ASCII sequence instead of EBCDIC







[LISTING TWO]

 1 :
 2 # 3270 command to download a PDS
 3 # Ian E. Gorman, Canadian Technology Marketing Group, Ottawa
 4
 5 # Transfers MVS PDS members as individual files to MS-DOS directory
 6 # Marks each received file as unchanged (turns off the archive bit)
 7 # Works only if all members are text
 8 # Requires a DOS file which contains a list of members to transfer
 9 # Member names are in field 1 of the file
 10 # You can get a list of members by using the TSO ISPF dataset list
 11 # (main menu option 3.4), choosing 'X' (print index listing), exiting
 12 # from ISPF, and noting the name of the saved listing file. Download
 13 # this file, and delete all the lines which do not have member names.
 14
 15 if [ $# -ne 3 ]
 16 then
 17 echo "Usage: $0 DOSdir MVSpds MVSmemberlist"
 18 echo " Download files in MVSmemberlist from MVSpds to DOSdir"
 19 echo " Works only for text. Requires full MVS name for MVSpds"
 20 exit 1
 21 fi
 22
 23 for i in `awk '{print $1}' $3` # get filenames from a list of members
 24 do
 25 echo
 26 echo MVS="'"$2"($i)'" to DOS="$1/$i"
 27 if c:/pc3270/receive "$1/$i" "'"$2"($i)'" ascii crlf # download file
 28 then
 29 chmod a=rwx "$1/$i" # if success, turn off archive bit
 30 else
 31 echo "\n$0: cannot continue" # if fail, stop entire run
 32 exit 2
 33 fi

 34 done






[LISTING THREE]

 1 :
 2 # make a 'tags' file for MICS documentation
 3 # Ian E. Gorman, Canadian Technology Marketing Group, Ottawa
 4
 5 cd c:/mic
 6
 7 find c:/mic/doc -type f -print |
 8
 9 awk -v 'OFS= ' '
 10 { basename = $1
 11 sub("^.*\/\$?","",basename)
 12 sub("\.[0-9]*","",basename)
 13 print toupper(basename),$1,"1"
 14 }
 15 ' |
 16
 17 sort '-t ' >tags






[LISTING FOUR]

 1 :
 2 # crude form of hypertext for MICS documentation
 3 # Ian E. Gorman, Canadian Technology Marketing Group, Ottawa
 4
 5 # Uses the 'tags' feature of 'vi' to establish links from names to files.
 6 # Put cursor on a name, and hit ctrl-] to go to next file.
 7 # Tags file is in c:/mic
 8
 9 cd c:/mic
 10
 11 if [ $# -lt 1 ]
 12 then
 13 vi -r -c 'set ignorecase' index
 14 else
 15 vi -r -c 'set ignorecase' doc/1/cat1/$1.1
 16 fi






[LISTING FIVE]

 1 dnl mlnonmet.m4 -- m4 definitions for monthly ledgers, metals and nonmetals

 2 dnl Ian E. Gorman, Canadian Technology Marketing Group
***
***
144 dnl
 145 dnl ------------------------------------------------------------------------
146 dnl
147 define(`_colheads_',`dnl -- column heads
148 dnl -- parameters: none
149 dnl When calling this macro, do not terminate the call with "dnl".
150 dnl The last line of this macro is left unterminated so that the line
151 dnl can be terminated either with CR or backslash-CR in the macro call.
152 dnl
153 _zcolcentr_(`1',`RSN',`newline') \
154 _zcolcentr_(`2',`Company Name') \
155 _zcolcentr_(`3',`JANUARY') \
156 _zcolcentr_(`4',`FEBRUARY') \
157 _zcolcentr_(`5',`MARCH') \
158 _zcolcentr_(`6',`APRIL') \
159 _zcolcentr_(`7',`MAY') \
160 _zcolcentr_(`8',`JUNE') \
161 _zcolcentr_(`9',`JULY') \
162 _zcolcentr_(`10',`AUGUST') \
163 _zcolcentr_(`11',`SEPTEMBER') \
164 _zcolcentr_(`12',`OCTOBER') \
165 _zcolcentr_(`13',`NOVEMBER') \
166 _zcolcentr_(`14',`DECEMBER')dnl
167 ')dnl -- _colheads_()
168 dnl
 169 dnl ------------------------------------------------------------------------
170 dnl
***
***
384 define(`_getanswers_',`dnl -- find Answers data
385 dnl -- parameters:
386 dnl 1 Qid and QvrsYr, e.g. ` 4501988'
387 dnl 2 QCellSect, e.g. `28.5'
388 dnl 3 QSrvyCensMn, e.g. ` 3'
389 dnl Uses macros __ROWCONDITION__, __COLCONDITION__ defined by _finddata_
390 dnl
391 find all _SetAns_ \
392 where QCellSect = "$1$3$2" and \
393 CurrentFlag = "Y" and \
394 QCellRow __ROWCONDITION__ and \
395 QCellCol __COLCONDITION__ and \
396 SrvyCensYr = fa2000.ScYrFr -> _REPTNAME_()SetAns
397 ListData ()'dnl
398 )dnl -- _getanswers_()
399 dnl
400 dnl
------------------------------------------------------------------------
401 dnl
402 dnl -- parameters:
403 dnl 1 Qid and QvrsYr, e.g. ` 4501988'
404 dnl 2 QCellSect, e.g. `28.5'
405 dnl 3 condition for QCellRow, e.g. `in (80,90,100)'
406 dnl 4 condition for QCellCol, e.g. `= 2'
407 dnl
408 define(`_finddata_',`dnl -- find Answers data
409 dnl -- parameters: none
 410 dnl usually preceded by "find number" and followed by the name of new set

411 dnl
 412 define(`__ROWCONDITION__',$3)dnl -- avoids the need to nest quotes on arguments
413 define(`__COLCONDITION__',$4)dnl
414 % Find data separately for each month, to allow use of an indexed field
415 _getanswers_(substr(x$1,1),$2,` 1')
416 _getanswers_(substr(x$1,1),$2,` 2')
417 _getanswers_(substr(x$1,1),$2,` 3')
418 _getanswers_(substr(x$1,1),$2,` 4')
419 _getanswers_(substr(x$1,1),$2,` 5')
420 _getanswers_(substr(x$1,1),$2,` 6')
421 _getanswers_(substr(x$1,1),$2,` 7')
422 _getanswers_(substr(x$1,1),$2,` 8')
423 _getanswers_(substr(x$1,1),$2,` 9')
424 _getanswers_(substr(x$1,1),$2,`10')
425 _getanswers_(substr(x$1,1),$2,`11')
426 _getanswers_(substr(x$1,1),$2,`12')'dnl
427 )dnl -- _finddata_()
428 dnl
429 dnl
------------------------------------------------------------------------
430 dnl






[LISTING SIX]
 Code That is Generated by the Macro Call _colheads_()
Headings are Centered over the Corresponding Data in Each Column of Report

218 "RSN" : column 6 newline: \
219 "Company Name" : column 26 : \
220 "JANUARY" : column 54 : \
221 "FEBRUARY" : column 68 : \
222 "MARCH" : column 85 : \
223 "APRIL" : column 100 : \
224 "MAY" : column 116 : \
225 "JUNE" : column 130 : \
226 "JULY" : column 145 : \
227 "AUGUST" : column 159 : \
228 "SEPTEMBER" : column 173 : \
229 "OCTOBER" : column 189 : \
230 "NOVEMBER" : column 203 : \
231 "DECEMBER" : column 218 : \






[LISTING SEVEN]
 Code That is Generated by the Macro Call
 _finddata_(` 4831988',`32.5',`= 130',`= 1')
 Lines 104 to 166 are nine more repetitions of similar code
 The repetitions differ only in the "where" lines (compare 91 to 98)

 89 % Find data separately for each month, to allow use of an indexed field
 90 find all Answers \

 91 where QCellSect = " 4831988 132.5" and \
 92 CurrentFlag = "Y" and \
 93 QCellRow = 130 and \
 94 QCellCol = 1 and \
 95 SrvyCensYr = fa2000.ScYrFr -> p2029SetAns
 96 ListData ()
 97 find all Answers \
 98 where QCellSect = " 4831988 232.5" and \
 99 CurrentFlag = "Y" and \
100 QCellRow = 130 and \
101 QCellCol = 1 and \
102 SrvyCensYr = fa2000.ScYrFr -> p2029SetAns
103 ListData ()
***
***
167 find all Answers \
168 where QCellSect = " 48319881232.5" and \
169 CurrentFlag = "Y" and \
170 QCellRow = 130 and \
171 QCellCol = 1 and \
172 SrvyCensYr = fa2000.ScYrFr -> p2029SetAns
173 ListData ()







[LISTING EIGHT]
 Example of Macro That Is Too Complicated -- Compare to Listing 9
 Lines longer than 80 characters have been wrapped at 80 characters

 1 dnl mlnonmet.m4 -- m4 definitions for monthly ledgers, metals and nonmetals
 2 dnl Ian E. Gorman, Canadian Technology Marketing Group
******
243 dnl
244 dnl
------------------------------------------------------------------------
245 dnl
246 define(`_flagtotal_',`dnl -- include a set of flag totals in a report
column
247 dnl -- parameters:
248 dnl 1 Relative column number on the page
249 dnl 2 Column ID (month)
250 dnl 3 selection: 0 ==> save flag
251 dnl 1 ==> print zero-suppressed total, print flag
252 dnl 2 ==> print total without zero suppress
253 dnl Parameter 3 selects one of three ifelse statements for expansion
254 dnl This macro generates code that both prints the total and saves the
255 dnl total in a local variable for later use.
256 dnl
257 ifelse($3,`0',`dnl -- save flag
258 (let lValueTypes = \
259 { $concat($tochar(lValueTypes,decr($2)), \
260 $tochar(ValueTypeFlag,1), \
261 $substring(lValueTypes,incr($2),80) ) \
262 where SrvyCensMn = $2, lValueTypes } ) : noprint :dnl
263 ')dnl -- close _ifelse_($3,`0', ... )
264 dnl
265 ifelse($3,`1',`dnl -- print zero-suppress total and one flag from
lValueTypes

266 dnl note that required column offset depends on length of mask
267 dnl note that new program line must be a continuation of previous program
line
268 $tonumber((let lCumulTot_$2 = $total(NumericValue where SrvyCensMn=$2)),0)
\
269 :column incr(_COL$1START_) mask "ZZZ,ZZZ,ZZ9": \
270 $tochar($substring(lValueTypes,$2,1),1) : column eval(_COL$1START_+13)
:dnl
271 ')dnl -- close _ifelse_($3,`1', ... )
272 dnl
273 ifelse($3,`2',`dnl -- print total without zero-suppress
274 $tonumber((let lCumulTot_$2 = $total(NumericValue where SrvyCensMn=$2)),0)
\
275 :column incr(_COL$1START_) mask "ZZZ,ZZZ,ZZ9":dnl
276 ')dnl -- close ifelse($3,`2', ... )
277 dnl
278 ')dnl -- close _flagtotal_()
279 dnl
280 dnl
------------------------------------------------------------------------
281 dnl





[LISTING NINE]
 Simplification of the Macro in Listing 8, by Splitting into Three Macros
 Lines longer than 80 characters have been wrapped at 80 characters

 1 dnl
 2 dnl
------------------------------------------------------------------------
 3 dnl
 4 define(`_flagtotal_0_',`dnl -- total 0, save a flag for use with total 1
 5 dnl -- parameters:
 6 dnl 1 Relative column number on the page
 7 dnl 2 Column ID (month)
 8 dnl
 9 (let lValueTypes = \
 10 { $concat($tochar(lValueTypes,decr($2)), \
 11 $tochar(ValueTypeFlag,1), \
 12 $substring(lValueTypes,incr($2),80) ) \
 13 where SrvyCensMn = $2, lValueTypes } ) : noprint :dnl
 14 ')dnl
 15 dnl
 16 dnl
------------------------------------------------------------------------
 17 dnl
 18 define(`_flagtotal_1_',`dnl -- print zero-suppress total and one flag from
lValueTypes
 19 dnl -- parameters:
 20 dnl 1 Relative column number on the page
 21 dnl 2 Column ID (month)
 22 dnl
 23 dnl note that required column offset depends on length of mask
 24 dnl note that new program line must be a continuation of previous program
line
 25 $tonumber((let lCumulTot_$2 = $total(NumericValue where SrvyCensMn=$2)),0)
\
 26 :column incr(_COL$1START_) mask "ZZZ,ZZZ,ZZ9": \
 27 $tochar($substring(lValueTypes,$2,1),1) : column eval(_COL$1START_+13)
:dnl
 28 ')dnl
 29 dnl
 30 dnl
------------------------------------------------------------------------
 31 dnl
 32 define(`_flagtotal_2_',`dnl -- print total without zero-suppress
 33 dnl -- parameters:
 34 dnl 1 Relative column number on the page

 35 dnl 2 Column ID (month)
 36 dnl
 37 $tonumber((let lCumulTot_$2 = $total(NumericValue where SrvyCensMn=$2)),0)
\
 38 :column incr(_COL$1START_) mask "ZZZ,ZZZ,ZZ9":dnl
 39 ')dnl
 40 dnl
 41 dnl
------------------------------------------------------------------------
 42 dnl






[LISTING 10 IS CURRENTLY UNAVAILABLE]






May, 1993
PROGRAMMING PARADIGMS


The Open Programmer




Michael Swaine


I always expect Bill Moyers to let me down. I'm always pleased when he
doesn't. Moyers, the San Jose Mercury News tells me, "is virtually revered, as
if he were the second coming of Edward R. Murrow." I admire him, too, but he's
always running off to China to explore Oriental medicine or interviewing some
original thinker not accepted by the scientific community, and I watch these
shows with reservations. Moyers, with his earnest look, is the very image of
the open-minded observer, and I want to believe that I'm going to see the
truth unadorned through his baby blues. But I can't get it out of my mind that
this guy isn't a scientist or a priest; he's the former press secretary to
Lyndon Johnson.
Often, Moyers remains unconvinced by what I consider unconvincing arguments
and demonstrations, and when that happens, I'm relieved.
I find that I believe passionately in open-mindedness, so long as it comes up
with the right answer. That may not be the right attitude.
This is a column about attitude.


The Innocence of Richard Feynman


Yoshiro NakaMats, the inventor of the floppy disk and holder of more patents
than Thomas Edison, brainstorms on a special waterproof pad while floating
underwater.
Whatever works, I guess. Some creative people are more artless than this.
Physicist Richard Feynman, surely one of the most creative scientists of this
century, was artfully artless.
In two books of autobiographical anecdotes, Feynman (through his "as-told-to"
ghoster, Ralph Leighton) paints a picture of himself as an innocent child,
curious about the world and approaching it with unprejudiced openness. Like
Bill Moyers, only funny. The style in which these books are written is the
style in which Feynman lived his life:
The main reason people hired me was the Depression. They didn't have any money
to fix their radios, and they'd heard about this kid who would do it for less.
So I'd climb on roofs to fix antennas and all kinds of stuff.
No wonder Feynman was as close as anybody has ever come to being a hero to
Bill Gates: They sound alike. Feynman brought this artlessness to everything
he did: fixing radios as a kid in Far Rockaway; cutting string beans on a
summer job; watching ants on his windowsill as a graduate student at
Princeton; disobeying orders and taking apart the calculators while working on
the atomic bomb at Los Alamos; doing his Nobel work in physics; and bucking
the red tape while serving on the committee investigating the Challenger
disaster, when he demonstrated the cause of the crash so vividly that any
12-year-old could understand. Feynman bragged that he was never impressed by
the credentials of the person, but always focused on the idea. "If the idea
was lousy, I said that it looked lousy. If it looked good, I said it looked
good. Simple proposition."
In James Gleick's biography of Feynman, Gleick also captures this image
clearly, but he goes on to examine the image and to pry it apart. Feynman went
to some lengths, it appears, to paint this picture of himself. "He could not,
or would not, distinguish between the prestigious problems of elementary
particle physics and the apparently humbler everyday questions that seemed to
belong to another era," Gleick tells us, supporting the image of the curious
child approaching the world with unprejudiced openness. But he then quotes
Murray Gell-Mann: "He surrounded himself with a cloud of myth, and he spent a
great deal of time and energy generating anecdotes about himself."
Ultimately, one comes away from these books with the image of a person
knowledgeably, deliberately, consciously, and artfully trying to be open,
spontaneous, intuitive, artless. And succeeding.
So it can be done.


The Errors of Donald Knuth


Donald Knuth, one of our most respected programmers, a legend, a Turing
laureate, and the author of a series of definitive books on algorithms, is one
of the most modest of men. Donald Knuth is not the sort of person to spend
time and energy generating anecdotes about himself. Quite the opposite: He's
willing to expose his shortcomings as a programmer to the scrutiny of the
community of programmers.
In Literate Programming, Knuth has the courage to publish his errors. Thinking
it might be useful to others, he presents the error log he kept while writing
his typesetting program, T[E]X.
As Figure 1 illustrates, this sort of thing may indeed be useful to other
programmers.
Figure 1: Excerpt from Knuth's error log.

 11 Mar 1978
 10 insert space before '(' on terminal when opening a new file.
 11 Put 'p <- link(p)' into the loop of show_token_list, so that it
 doesn't loop forever.
 12 Shift the last item found by scan_tocs into the info field.
 13 Fix the previous bugfix: I shifted by the wrong amount.
 13 Mar 1978
 36 Introduce a new variable hang_first [later the sign of hang_after].
 37 Simplify the new code, realizing that if hang_indent = 0 then
 hang_first is irrelevant.
 Time sharing is very slow today, so I'm mostly reading technical
 reports while waiting THREE HOURS for compiler, editor, and loading
 routine.
 14 Mar 1978
 (Came in evening after sleeping most of day, to get computer at
 better time.)
 (Some day we will have personal computers and we will live more
 normally.) 8:30pm, began to enter corrections to yesterday's problems.
 53 Issue an error message for non-character in file name or font name.

 54 Display '...' for omitted stuff in show_context routine.

Knuth points out that his error log doesn't have the statistical virtues of
pools of data from a large population of programmers and projects. "However, I
do have one advantage that the authors of previous studies did not have;
namely, the entire program for T[E]X has been published. Hence I can give
fairly precise information about the type and location of each error." He
adds, "I believe that a detailed list gives important insights that cannot be
gained from statistical summaries."
The error log may have been useful to other programmers; the published source
code surely has been. And it has probably been useful to Knuth, too. Cast your
bread on the waters and you'll get soggy bread, but cast your ideas on minds and
you'll get new ideas.


The Risks of Raymond Kurzweil


In Raymond Kurzweil's stylish book, The Age of Intelligent Machines, he
attaches to the section in which he discusses his own work the following
epigraph, attributed to Eric Vogt: "Success provides the opportunity for
growth, and growth provides the opportunity to risk at a higher level."
Kurzweil has taken a few risks in his life. He started three companies to
bring AI to different real-world applications. He founded Kurzweil Computer
Products in 1974 to solve the problem of omnifont optical character
recognition. That venture led to reading machines that have changed the lives
of many blind people and launched the OCR industry. In 1982, he founded
Kurzweil Applied Intelligence to master automatic speech recognition. Today,
radiologists dictate the results of examinations hands-free using Kurzweil
speech-recognition devices. Later in 1982 Kurzweil founded Kurzweil Music
Systems, inspired by a request from Stevie Wonder. The result can be seen in
the ads in any music magazine.
All of these ventures were ridiculously ambitious, and in no case can it be
said that Kurzweil has solved the problem he set out to solve. Is this
failure? Hardly.
Kurzweil summarizes his work with two points. The goal of the work has been
helping the handicapped and serving other useful social goals, and that was
satisfying. The other point, from which he seems to draw as much satisfaction,
has to do with the rewards of working with diverse teams.
All of the projects...have been highly interdisciplinary efforts and have
required the dedication and talents of many brilliant individuals in a broad
range of fields. Invention today is very much a team effort and its success is
a function of the quality of the individual members of the team as well as the
quality of the group's communication.
Drawing on diverse areas of expertise requires a kind of intellectual
openness. Kurzweil says it's worth it: "Perhaps my greatest pleasure has been
the opportunity to share in the creative process with the many outstanding men
and women who have contributed to these endeavors."
Well, yes, that does sound like 50 zillion other acknowledgments written or
delivered at the completion of a project or onstage at the awards ceremony.
But that doesn't mean it doesn't mean anything. It certainly means something
to Kurzweil, who applied the same multiple-experts approach to software
development. In his March, 1986, Byte article on the technology of the
Kurzweil Voice Writer, he says:
...rather than select a single technique such as Markov modeling, dynamic time
warping, robust feature analysis, or high-level feature extraction, the KVW
technology incorporates multiple experts, each of which uses a somewhat
different approach to the problem of large vocabulary speech recognition.
This multiparadigmatic approach is also a kind of intellectual openness.
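As a toy illustration of the multiple-experts idea (not Kurzweil's actual code; every name and score below is invented), a handful of scorers can be combined by simple summation, the winner being the candidate with the best combined score:

```c
#define NEXPERTS 3
#define NCANDS   3

/* Three invented "experts," each scoring the same candidates 0..100.
   Real recognizers would compute these scores from acoustic data. */
static int markov_score(int c)  { static const int s[NCANDS] = {60, 20, 10}; return s[c]; }
static int warping_score(int c) { static const int s[NCANDS] = {55, 40, 15}; return s[c]; }
static int feature_score(int c) { static const int s[NCANDS] = {70, 25, 30}; return s[c]; }

typedef int (*expert_fn)(int);

/* Combine the experts by summing their scores; return the best candidate. */
int best_candidate(void)
{
    static expert_fn experts[NEXPERTS] =
        { markov_score, warping_score, feature_score };
    int c, e, best, best_total;

    best = 0;
    best_total = -1;
    for (c = 0; c < NCANDS; c++) {
        int total = 0;
        for (e = 0; e < NEXPERTS; e++)
            total += experts[e](c);
        if (total > best_total) {
            best_total = total;
            best = c;
        }
    }
    return best;
}
```

A production system would weight and normalize its experts rather than simply sum them, but the shape of the idea is the same.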


The Mind of Marvin Minsky


In Society of Mind, Marvin Minsky presents a model of the human mind based on
this same image: the image of teams of experts working together. Any model of
the mind has to deal with how there can emerge from blind biological processes
something like mind or intelligence or soul or personality or self or some
such trait that we perceive as unitary. Minsky faces the question. "How can
intelligence emerge from nonintelligence? To answer that, we'll show that you
can build a mind from many little parts, each mindless by itself."
Well, maybe "expert" isn't the right word. Little parts, mindless by
themselves, are not what we usually think of as experts. But Minsky's mindless
mind parts are possessed of expertise. "I'll call 'Society of Mind'," he says,
"this scheme in which each mind is made up of many smaller processes. These
we'll call agents. Each mental agent by itself can only do some simple thing
that needs no mind or thought at all. Yet when we join these agents in
societies--in certain very special ways--this leads to true intelligence."
Minsky's book is aimed at a broad audience, so his presentation makes some
sacrifices to simplicity and clarity. This can have the effect of making his
argument seem to the knowledgeable reader either facile or obvious. Facile
because it skips over important points, obvious because it successfully hides
what it is skipping over. But I don't think that Minsky's model of the mind is
obvious. It isn't the most obvious model.
It's easier, I think, to view the mind as having somebody in charge, to
picture a hierarchical mental structure with a dispatcher sitting at the top.
That's the model implicit in the common use of words like "self" and "mind."
Minsky again:
One common image of the Self suggests that every mind contains some sort of
Voyeur-Puppeteer inside--to feel and want and choose for us the things that we
feel, want, and choose. But if we had those kinds of Selves, what would be the
use of having Minds? And, on the other hand, if Minds could do such things
themselves, why have Selves? Is this concept of a Self of any real use at all?
It is indeed--provided that we think of it not as a centralized and
all-powerful entity, but as a society of ideas that include both our images of
what the mind is and our ideals about what it ought to be.
Minsky's Mind is an open mind.


The Economics of Richard Stallman


"After nine years, people still don't get it." So begins Simson Garfinkel's
article on Richard Stallman in the premiere issue of Wired.
Stallman, the legendary last true hacker of Steve Levy's Hackers, is the
founder of the Free Software Foundation, the tireless and brilliant programmer
at MIT's AI Lab in the '70s, and the author of EMACS. Almost a decade ago,
Stallman launched the GNU project to create a free, portable, open operating
system that users could extend. His intent was and is that GNU be an
environment in which users share software as freely as researchers share ideas
in a scientific community. Or as freely as hackers at the MIT AI Lab shared
code in the '70s.
What is it that people don't get after nine years? That Stallman's philosophy
of free software isn't about giving away software; it's about the free flow of
ideas.
"Stallman's vision of freedom is software that has no secrets," Garfinkel
says. "It comes complete with source code so that anyone who gets it can take
it apart, see how it works, and make changes." The benefits of this kind of
openness are clear enough. Knuth's error log is useful, is meaningful,
precisely because the source code is all available. This master's lesson
depends on that openness. His books on T[E]X and Metafont are useful because
of the source code in them.
But Stallman has more in mind than that. Garfinkel continues: "But most
important, people can share free software with their friends--just by making a
copy--without having to pay royalties, shareware fees, or anything at all."
That's piracy to most of the software-development community. Stallman's view?
"I don't think that people should ever make promises not to share with their
neighbor."
This view runs counter to some other deeply held notions, and Stallman knows
it. I saw him on CNN the other day, and he summed it up: "Digital technology
is on a collision course with the concept of ownership of information."
People, he said, naturally want to share with their friends. If the law
prohibits this generous, natural act, "you're going to need a police state to
enforce it."
As we move into Cyberspace, onto the data highways that the Vice President
envisions, the threat of an information police is one of the dangers we need
to avoid. Another is a kind of information range war. Now, while there is a
chance to avoid some of these problems, we need to be open to all sorts of
ideas about how to manage that virtual space. We need to try on different
attitudes toward openness.
And we can, like Richard Feynman, choose our attitude. There are risks in
openness of any kind: Neither T[E]X nor GNU looks like a model of
profitable--or rapid--software development. And there are benefits: As Raymond
Kurzweil's experience shows, the idea you need is probably in somebody else's
head right now.
Or maybe the idea resides in the act of collaboration. Maybe we will discover
that Cyberspace is a Society of Mind, and we humans are its Minskyan agents.
Ah, but now I'm starting to sound like one of those people Bill Moyers
interviews.


Open Books




On the innocence of Richard Feynman:


Feynman, Richard P. Surely You're Joking, Mr. Feynman!: Adventures of a
Curious Character. New York, NY: Bantam Books, 1985.
Feynman, Richard P. What Do You Care What Other People Think?: Further
Adventures of a Curious Character. New York, NY: Bantam Books, 1988.
Gleick, James. Genius: The Life and Science of Richard Feynman. New York, NY:
Pantheon Books, 1992.
Thompson, Charles "Chic." What a Great Idea!: The Key Steps Creative People
Take. HarperCollins, 1992.



On the errors of Donald Knuth:


Knuth, Donald E. Literate Programming. Stanford University: Center for the
Study of Language and Information, 1992.
Knuth, Donald E. Metafont: The Program, Volume D of Computers and Typesetting.
Reading, MA: Addison-Wesley, 1986.
Knuth, Donald E. T[E]X: The Program, Volume B of Computers and Typesetting.
Reading, MA: Addison-Wesley, 1986.


On the risks of Raymond Kurzweil:


Kurzweil, Raymond. The Age of Intelligent Machines. Cambridge, MA: MIT Press,
1990.
Kurzweil, Raymond. "The Technology of the Kurzweil Voice Writer." Byte (March,
1986).


On the mind of Marvin Minsky:


Minsky, Marvin. The Society of Mind. New York, NY: Simon and Schuster, 1986.


On the economics of Richard Stallman:


Garfinkel, Simson L. "Is Stallman Stalled?" Wired (January, 1993).
Levy, Steven. Hackers: Heroes of the Computer Revolution. Garden City, NY:
Anchor Press/Doubleday, 1985.


On the culture of cyberspace:


Benedikt, Michael. Cyberspace: First Steps. Cambridge, MA: MIT Press, 1992.
Gengle, Dean. The Netweaver's Sourcebook: A Guide to Micro Networking and
Communications. Reading, MA: Addison-Wesley, 1984.



























May, 1993
C PROGRAMMING


Screen Snapper


 This article contains the following executables: DFPP01.ARC


Al Stevens


I'm taking a side trip from the development of D-Flat++ to share with you a
utility that I've found useful in the development and documentation of D-Flat
applications. Building a computer system always includes the painful task of
writing a user's guide. Most programmers don't like to write any kind of
documentation, and the user's guide is often the most difficult document to
write, because its audience is from a variety of other disciplines, each with
its own jargon. A vertical application involves a whole new lexicon and
perhaps a user base that doesn't speak computerese. The more horizontal the
application, the more diverse the audience, so the language must be
specialized enough to describe the application yet general enough that anyone
can understand it. Large software houses use professional technical writers to
clear these hurdles. Smaller ones have to rely on the programmers to do it. If
you're all by yourself, maybe doing the bound documentation for the registered
users of a shareware application, you're stuck with the job--you wrote the
code, you write it up.
Often the hardest part in creating a user's manual is getting the screen
illustrations into print. If you are documenting a Windows application, the
job is easier, because Windows and its desktop-publishing applications have
everything you need to capture screens and import them into a document. But if
you are doing a DOS, text-mode application, there are problems.
Let me describe the environment. You're documenting a DOS application. It has
multiple-color screens that use the graphics character set and colors to
define windows of one sort or another. In my case, it might be a D-Flat
application. You're using a typical black-and-white laser printer to print the
master copy of the document, and you will use a typical copier or local-office
print service to manufacture the documents. You need to run the application,
capture screen shots to a disk file, and import those pictures into your
document. The screen shots are sometimes of the full screen, and at other
times they are captured from a defined rectangle within the screen. What tools
are available?
In several years of writing books, I've used a number of tools designed for
capturing DOS screens. Most came from publishers who chose them because they
prefer to use a particular capture format to lay out book pages. While the
tools work fairly well for book publishing, none have been exactly right for
the environment I just described. Some are incredibly difficult to use. Others
have limited features. None produce an acceptable, laser-printed copy of a
multiple-color DOS screen. Usually I'll run the application in monochrome mode
just to get something readable, but an application that uses color to define
boundaries is hard to read in its raw black-and-white format. To see for
yourself, run the DOS 5.0 DOSSHELL program in one of its monochrome
configurations. It works, but the screens are less than pleasing and, when
printed, are less readable than you know they could be. The best displays are
in color, and they do not print well. Look at the documentation from the
vendors. Even Microsoft's manuals look shabby when it comes to the text-mode
screen prints.
There are a lot of reasons for that. Screens use white characters on a blue
background in one place and black characters on green somewhere else, for
example. When you capture these screens, you'll get a .PCX or .TIF file which
looks pretty good when displayed with a paint program, but loses a lot of
information and visual appeal when printed. Green, red, yellow, and blue don't
translate well into black-and-white. Mind you, those formats work better when
a publisher prints them with a 1200 dots per inch (dpi) or better Linotype
machine, but laser printers are typically 300 dpi, and if you have to do any
reduction whatsoever to fit your page, the added loss in resolution degrades
the picture's quality.


HP to the Rescue


One day I sat gazing at my LaserJet III after it had dumped out a bunch of
these marginal screen prints. I knew that it had a nice little font that
included the graphics character set and the ability to print filled rectangles
of dot patterns with white or seven different shades of gray, expressed as 2-,
10-, 15-, 30-, 45-, 70-, and 90-percent fill patterns. These features are
common to most laser printers, and the LJIII uses a command language that
other printers can emulate. I reasoned that a screen shot that uses shades of
gray to represent background colors and that always uses black characters in
the foreground might do the job. I hammered out some samples and was pleased
with the result.
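A sketch of what such a color-to-gray table might look like follows. The percentages are the LaserJet fill patterns listed above; the particular assignments are invented for illustration, not lifted from the program:

```c
/* Map a DOS text-mode attribute byte to a LaserJet fill percentage.
   Bits 4-6 of the attribute select one of eight background colors;
   the table below is one invented assignment of the printer's
   2-, 10-, 15-, 30-, 45-, 70-, and 90-percent patterns to them. */

static const int gray_for_bg[8] = {
    90, /* black   */
    45, /* blue    */
    30, /* green   */
    15, /* cyan    */
    70, /* red     */
    45, /* magenta */
    30, /* brown   */
    2   /* white   */
};

/* background color number (0-7) from an attribute byte */
int bg_color(unsigned char attr)
{
    return (attr >> 4) & 7;
}

int gray_percent(unsigned char attr)
{
    return gray_for_bg[bg_color(attr)];
}
```

Foreground characters always print black, so only the background nibble matters here.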
The pictures that the LJIII makes are just about right. The font I use is
about 17 characters per inch and 8.5 lines per inch (lpi), which prints a
25x80 screen dump in a 4.75x3.25-inch picture, a size that fits nicely onto
most standard manual-page sizes. Smaller screen segments make smaller
pictures. Figure 1 is an example of a D-Flat screen printed this way. Until I
see the figure in the magazine, I won't know if it will impress you, because
DDJ has to take camera-ready copy and get it into the magazine somehow. I
printed it on the LaserJet, and trust me when I say that it looks pretty good
in the flesh.


Grabbing the Screen


Printing screen shots is only the beginning. You need a way to capture the
screen shots themselves, and I'm ready for that one. In the March and April
1991 issues of DDJ, I published a screen-grabber TSR program that captured the
text part but not the color attributes of selected text-screen rectangles. The
point of the project was to teach C-language, event-driven programming, both
in anticipation of D-Flat and for fledgling Windows programmers. I spent very
little time on the screen-grabbing code itself other than to explain what it
did.
I modified that program to embed LaserJet commands into the screen captures so
that the files, when sent to the LaserJet, would print the screens using the
17-pitch font and different shades of gray for different-colored backgrounds.
That is the program you will find with this column. I've pulled out the
event-driven stuff, not needing to teach that again, and accessed the screen,
mouse, and keyboard by using more traditional function calls. The TSR code is
virtually unchanged. I won't explain it again, either. If you have a burning
need to know about TSRs, read the earlier columns, which explain them some,
refer you to other sources for more detail, and tell you why I think you
shouldn't care.


Recycled Listings


Listings One and Two, page 140, are console.h and console.c. They implement
the low-level code for reading the keyboard and mouse and managing the screen
cursor. They, too, are similar to the code used in the earlier project and in
D-Flat as well. I am including them here to make the project complete in one
installation. Listing Three, page 141, is tsr.c, the TSR engine that turns the
program into a reasonably well-behaved TSR, also virtually the same as before.
This engine is minimal. It does not include code to unload the TSR from
memory. You can use the MARK/RELEASE utilities or you can reboot to unload
the program. The engine does not allow you to change the hot key except by
recompiling the program. Change the KEYMASK and SCANCODE global variables to
change the hot key. Change the startup message in copyscrn.c as well. By the
way, the TSR works okay if you load it high. It occupies about 25K.


The Screen Grabber


Listing Four, page 142, is copyscrn.c, the part of the program that captures
screens. When you load it, the main function displays the shades of gray
associated with each background color and lets you change them. I found that
not all the fill patterns work well, and some screens display better if I
change the default. After you finish with that, the program declares itself
resident.


The Capture


The program captures the screen image into a file named SNAP.000. Subsequent
captures during the same load of the program are named SNAP.001, SNAP.002, and
so on. To start the program at a specific numbered extension, enter the number
as a command-line parameter when you load it.
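The naming scheme just described can be sketched in a few lines. This is an illustration of the behavior, not the code from copyscrn.c, which may implement it differently:

```c
#include <stdio.h>

/* Sequential capture-file names: SNAP.000, SNAP.001, and so on. */

static int snap_next = 0;

/* honor a numbered extension given on the command line */
void snap_start_at(int n)
{
    snap_next = n;
}

/* format the next capture filename (8.3 style) into buf */
void snap_filename(char buf[13])
{
    sprintf(buf, "SNAP.%03d", snap_next++);
}
```

Loading the program with `snap 2` would then produce SNAP.002, SNAP.003, and so on for successive captures.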
The file will have screen text and LaserJet-command statements to print the
screen image. You can copy the file directly to the printer to see what it
looks like. You can import the file into the document's text file, too. The
number of text lines is the same as the number of lines at 8.5 lpi. I find the
gobbledygook printer commands to be distracting in the text. Therefore, I
insert merge commands into the word processor so that it will retrieve the
files at print time. By changing margins and lines per inch in the word
processor, I can center the figures and maintain the page integrity. I use
XyWrite for DOS word processing, and it supports these operations well.


Using Different Laser Printers


Not everyone will use a LaserJet-compatible laser printer, although most of
them emulate the command language. If you want to modify the program to work
with another printer, it must at least behave like a LaserJet in that it must
have commands to move the print cursor and to print rectangles in shades of
gray. It must also have a compatible font.
The LaserJet III macros in copyscrn.c implement the interface. The push and
pop macros write the commands telling the printer to remember the current
cursor location and restore it later. This allows the program to temporarily
change the cursor position without having to know how to restore it. The CH
and CW macros define the character height and width in cursor-address
increments. The movehorizontal and movevertical macros accept an integer
parameter that specifies a relative cursor position. The macros move the print
cursor accordingly. The shading macro accepts an integer value that programs
the percentage of shading for a rectangular block's fill pattern. The
rectangle macro defines the rectangle to be filled, and the rectanglefill
macro draws the defined rectangle with the latest programmed fill pattern. The
selectfont macro selects the font for the screen dump, and the resetfont macro
resets it to the printer's default.



Building and Running Screen Snapper


I used the Turbo C extensions to build screen snapper, so you'll need a
Borland compiler to build it. Compile the three modules with the small memory
model and link them together using the command line bcc -esnap.exe copyscrn.c
console.c tsr.c. This will build snap.exe, the TSR program. When you load
snap, the program will display the menu in Figure 2(a). Press the digit
associated with a color, and you see Figure 2(b).
Figure 2: The screen-snapper main menu.

 (a)

 Screen Snapper Version 2 - snap.000
 Gray scale defaults:
 0. Black 45%
 1. Blue 45%
 2. Green 30%
 3. Cyan 15%
 4. Red 45%
 5. Magenta 15%
 6. Brown 30%
 7. White 0%

 Enter number of a color to change,
 Esc to return to defaults,
 Or Enter to accept settings
 ...

 (b)

 Black:

 Sp = change, Esc = quit, Enter = accept
 45%

Press the spacebar to step through the fill patterns, Esc to return to the
earlier setting, and Enter to accept whatever change you made. The program
goes back to the first menu. Continue to change colors until you have the
setting you prefer.
When you pop the program up, it saves the current mouse and cursor
configurations, sets them both to the upper-left corner of the screen, and
turns the mouse cursor on. Define a screen rectangle with either the keyboard
or the mouse. With the keyboard, you move the cursor to a corner of the
rectangle and press the F2 key to tell the program that you are now defining a
rectangle. Move the cursor to the opposite corner and press Enter to save the
screen shot, or Esc to reject it and return to the interrupted application.
Press F2 again if you want to change the anchored corner. As you move the
cursor, the screen inverts the background color so you can see the rectangle
being defined. With the mouse, move the mouse cursor to a corner and press and
hold down the left button. Drag the mouse cursor to the opposite corner.
Release the button. The rectangle is defined. You can ignore it by moving the
cursor and pressing the button again. When the rectangle describes the one you
want to print, press the right mouse button. The program saves the screen in a
disk file, restores the interrupted mouse and cursor configuration, and
returns to the interrupted program.


Jazz: Wild Bill and Wild Philippe


Time for some musical plugs. Anyone who has attended a Borland conference
knows that Philippe Kahn, the Boss at Borland, likes to play the saxophone and
flute. He keeps his Turbo Jazz band of Borland employees busy during the
social hours. These days Turbo Jazz is minus one alto sax player who now blows
his Selmer at Symantec. Philippe is rumored to believe that the change raised
the musical quality of both organizations.
Besides leading Turbo Jazz, Philippe also produces his own compact disks,
having three of them out now. The latest is called Paradiso, and Philippe
plays the flute exclusively on it. His recording band consists of some of the
best players in California, including Alan Broadbent, Terrance Blanchard, and
John Patitucci, some world-class heavyweights. Philippe's own playing has
improved so much in the few years since I first heard him that I now find
myself routinely listening to this CD while I work and liking it very much. I
don't know if the CD is available commercially, but you can try Pacific High
Productions, P.O. Box 536, Los Gatos, CA 95031.
As a jazz pianist I had the good fortune over the years to work and record
with a cornet player named Wild Bill Davison. I first met him when I was still
a teenager, and worked with him many times while he lived in Washington, D.C.
in the 1970s. He died in 1989 at the age of 83, having inspired and influenced
several generations of musicians. He wasn't as well known as Satch, Dizzy, or
Bix, but they all knew and respected him, and because of his contributions to
our culture, he was inducted into the Jazz Hall of Fame and declared a
National Treasure. In the last years of Bill's life, Tommy Saunders, also a
fine jazz cornet player, undertook the production of a video tribute to Wild
Bill with interviews with Bill and his wife, video and still clips of Bill
throughout the years, including a delightful conversation with Johnny Carson
on "The Tonight Show," and many samples of Bill's music. If you are interested
in jazz and its history and would enjoy hearing the crystal-clear
recollections of an irascible yet beguiling and engaging elder statesman, I
recommend this work. It runs 100 minutes and is called "Wild Bill Davison, His
Life, His Times, His Music," T.T.&T Network, 1158 Bedford, Suite 1506, Grosse
Pointe, MI 48230 ($40.00 plus $2.00 P/H).

_C PROGRAMMING COLUMN_
by Al Stevens


[LISTING ONE]

/* ----------- console.h ------------ */
#ifndef CONSOLE_H
#define CONSOLE_H

#define TRUE 1
#define FALSE 0
#define ESC 27
#define F2 188
#define UP 200

#define FWD 205
#define DN 208
#define BS 203
#define KEYBOARD 0x16
#define ZEROFLAG 0x40
#define SETCURSORTYPE 1
#define SETCURSOR 2
#define READCURSOR 3
#define HIDECURSOR 0x20

int getkey(void);
int keyhit(void);
void curr_cursor(int *, int *);
void cursor(int, int);
void hidecursor(void);
void unhidecursor(void);
void savecursor(void);
void restorecursor(void);
void set_cursor_type(unsigned);

#define MOUSE 0x33
void resetmouse(void);
unsigned mouse_buffer(void);
int mouse_installed(void);
int mousebuttons(void);
void get_mouseposition(int *x, int *y);
void set_mouseposition(int x, int y);
void show_mousecursor(void);
void hide_mousecursor(void);
int button_releases(void);
void intercept_mouse(void *);
void restore_mouse(void *);
#define leftbutton() (mousebuttons()&1)
#define rightbutton() (mousebuttons()&2)

/* ------- defines a screen rectangle ------ */
typedef struct {
 int x, y, x1, y1;
} RECT;

#endif






[LISTING TWO]

/* ----------- console.c ---------- */

#include <bios.h>
#include <dos.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "console.h"

static unsigned video_mode;

static unsigned video_page;
static int cursorpos;
static int cursorshape;

/* ---- Test for keystroke ---- */
int keyhit(void)
{
 _AH = 1;
 geninterrupt(KEYBOARD);
 return (_FLAGS & ZEROFLAG) == 0;
}
/* ---- Read a keystroke ---- */
int getkey(void)
{
 int c;
 while (keyhit() == 0)
 ;
 if (((c = bioskey(0)) & 0xff) == 0)
 c = (c >> 8) | 0x80;
 else
 c &= 0xff;
 return c;
}
static void videoint(void)
{
 static unsigned oldbp;
 _DI = _DI;
 oldbp = _BP;
 geninterrupt(0x10);
 _BP = oldbp;
}
void videomode(void)
{
 _AH = 15;
 videoint();
 video_mode = _AL;
 video_page = _BX;
 video_page &= 0xff00;
 video_mode &= 0x7f;
}
/* ---- Position the cursor ---- */
void cursor(int x, int y)
{
 videomode();
 _DX = ((y << 8) & 0xff00) + x;
 _AX = 0x0200;
 _BX = video_page;
 videoint();
}
/* ---- get cursor shape and position ---- */
static void near getcursor(void)
{
 videomode();
 _AH = READCURSOR;
 _BX = video_page;
 videoint();
}
/* ---- Get current cursor position ---- */
void curr_cursor(int *x, int *y)
{
 getcursor();
 *x = _DL;
 *y = _DH;
}
/* ---- Hide the cursor ---- */
void hidecursor(void)
{
 getcursor();
 _CH = HIDECURSOR;
 _AH = SETCURSORTYPE;
 videoint();
}
/* ---- Unhide the cursor ---- */
void unhidecursor(void)
{
 getcursor();
 _CH &= ~HIDECURSOR;
 _AH = SETCURSORTYPE;
 videoint();
}
/* ---- Save the current cursor configuration ---- */
void savecursor(void)
{
 getcursor();
 cursorshape = _CX;
 cursorpos = _DX;
}
/* ---- Restore the saved cursor configuration ---- */
void restorecursor(void)
{
 videomode();
 _DX = cursorpos;
 _AH = SETCURSOR;
 _BX = video_page;
 videoint();
 set_cursor_type(cursorshape);
}
/* ----------- set the cursor type -------------- */
void set_cursor_type(unsigned t)
{
 videomode();
 _AH = SETCURSORTYPE;
 _BX = video_page;
 _CX = t;
 videoint();
}
/* --------- generic mouse utility ---------- */
static void mouse(int m1,int m2,int m3,int m4)
{
 _DX = m4;
 _CX = m3;
 _BX = m2;
 _AX = m1;
 geninterrupt(MOUSE);
}
/* ----- test to see if the mouse driver is installed ----- */
int mouse_installed(void)
{

 unsigned char far *ms;
 ms = MK_FP(peek(0, MOUSE*4+2), peek(0, MOUSE*4));
 return (ms != NULL && *ms != 0xcf);
}
/* ---------- reset the mouse ---------- */
void resetmouse(void)
{
 if (mouse_installed())
 mouse(0,0,0,0);
}
/* ------ return true if mouse buttons are pressed ------- */
int mousebuttons(void)
{
 int bx = 0;
 if (mouse_installed()) {
 mouse(3,0,0,0);
 bx = _BX;
 }
 return bx & 3;
}
/* ---------- return mouse coordinates ---------- */
void get_mouseposition(int *x, int *y)
{
 if (mouse_installed()) {
 int mx, my;
 mouse(3,0,0,0);
 mx = _CX;
 my = _DX;
 *x = mx/8;
 *y = my/8;
 }
}
/* -------- position the mouse cursor -------- */
void set_mouseposition(int x, int y)
{
 if(mouse_installed())
 mouse(4,0,x*8,y*8);
}
/* --------- display the mouse cursor -------- */
void show_mousecursor(void)
{
 if(mouse_installed())
 mouse(1,0,0,0);
}
/* --------- hide the mouse cursor ------- */
void hide_mousecursor(void)
{
 if(mouse_installed())
 mouse(2,0,0,0);
}
/* --- return true if a mouse button has been released --- */
int button_releases(void)
{
 int ct = 0;
 if(mouse_installed()) {
 mouse(6,0,0,0);
 ct = _BX;
 }
 return ct;

}
/* --------- get mouse state buffer size --------- */
unsigned mouse_buffer(void)
{
 if (mouse_installed()) {
 mouse(21,0,0,0);
 return _BX;
 }
 return 0;
}
/* ----- intercept mouse in case an interrupted program is using it ------ */
void intercept_mouse(void *bf)
{
 if (mouse_installed()) {
 _ES = _DS;
 mouse(22, 0, 0, (unsigned) bf);
 }
}
/* ----- restore the mouse to the interrupted program ----- */
void restore_mouse(void *bf)
{
 if (mouse_installed()) {
 _ES = _DS;
 mouse(23, 0, 0, (unsigned) bf);
 }
}






[LISTING THREE]

/* --------- tsr.c --------- */
#include <dos.h>
#include <stdlib.h>
#include <stdio.h>
#include "console.h"

void tsr_program(void);
/* ------- the interrupt function registers -------- */
typedef struct {
 int bp,di,si,ds,es,dx,cx,bx,ax,ip,cs,fl;
} IREGS;
#define DISK 0x13
#define CTRLBRK 0x1b
#define INT28 0x28
#define CRIT 0x24
#define CTRLC 0x23
#define TIMER 8
#define KYBRD 9
#define DOS 0x21
#define KEYMASK 8
#define SCANCODE 52
unsigned highmemory;
/* ------ interrupt vector chains ------ */
static void (interrupt *oldtimer)(void);
static void (interrupt *old28)(void);

static void (interrupt *oldkb)(void);
static void (interrupt *olddisk)(void);
/* ------ ISRs for the TSR ------- */
static void interrupt newtimer(void);
static void interrupt new28(void);
static void interrupt newdisk(IREGS);
static void interrupt newkb(void);
static void interrupt newcrit(IREGS);
static void interrupt newbreak(void);
static unsigned sizeprogram; /* TSR's program size */
unsigned dossegmnt; /* DOS segment address */
unsigned dosbusy; /* offset to InDOS flag */
static int diskflag; /* Disk BIOS busy flag */
unsigned mcbseg; /* address of 1st DOS mcb */
static char far *mydta; /* TSR's DTA */
int hotkeyhit = 0;
int tsrss; /* TSR's stack segment */
int tsrsp; /* TSR's stack pointer */
/* -------------- context for the popup ---------------- */
unsigned intpsp; /* Interrupted PSP address */
int running; /* TSR running indicator */
char far *intdta; /* interrupted DTA */
unsigned intsp; /* " stack pointer */
unsigned intss; /* " stack segment */
unsigned ctrl_break; /* Ctrl-Break setting */
void (interrupt *oldcrit)(void);
void (interrupt *oldbreak)(void);
void (interrupt *oldctrlc)(void);
/* ------- local prototypes -------- */
static void resident_psp(void);
static void interrupted_psp(void);
static void popup(void);
void initialize(void);

void main(void)
{
 unsigned es, bx;
 initialize();
 /* ---------- compute memory parameters ------------ */
 highmemory = _SS + ((_SP + 256) / 16);
 /* ------ get address of DOS busy flag ---- */
 _AH = 0x34;
 geninterrupt(DOS);
 dossegmnt = _ES;
 dosbusy = _BX;
 /* ---- get the seg addr of 1st DOS MCB ---- */
 _AH = 0x52;
 geninterrupt(DOS);
 es = _ES;
 bx = _BX;
 mcbseg = peek(es, bx-2);
 /* ----- get address of resident program's dta ----- */
 mydta = getdta();
 /* ------------ prepare for residence ------------ */
 tsrss = _SS;
 tsrsp = _SP;
 oldtimer = getvect(TIMER);
 old28 = getvect(INT28);
 oldkb = getvect(KYBRD);

 olddisk = getvect(DISK);
 /* ----- attach vectors to resident program ----- */
 setvect(KYBRD, newkb);
 setvect(INT28, new28);
 setvect(DISK, newdisk);
 setvect(TIMER, newtimer);
 /* ------ compute program size ------- */
 sizeprogram = highmemory - _psp + 1;
 /* ----- terminate and stay resident ------- */
 _DX = sizeprogram;
 _AX = 0x3100;
 geninterrupt(DOS);
}
/* ---------- break handler ------------ */
static void interrupt newbreak(void)
{
 return;
}
/* -------- critical error ISR ---------- */
static void interrupt newcrit(IREGS ir)
{
 ir.ax = 0; /* ignore critical errors */
}
/* ------ BIOS disk functions ISR ------- */
static void interrupt newdisk(IREGS ir)
{
 diskflag++;
 (*olddisk)();
 ir.ax = _AX; /* for the register returns */
 ir.cx = _CX;
 ir.dx = _DX;
 ir.es = _ES;
 ir.di = _DI;
 ir.fl = _FLAGS;
 --diskflag;
}
/* ----- keyboard ISR ------ */
static void interrupt newkb(void)
{
 static unsigned char kbval;

 kbval = inportb(0x60);
 if (!hotkeyhit && !running)
 if ((peekb(0, 0x417) & 0xf) == KEYMASK)
 if (SCANCODE == kbval) {
 hotkeyhit = 1;
 /* --- reset the keyboard ---- */
 kbval = inportb(0x61);
 outportb(0x61, kbval | 0x80);
 outportb(0x61, kbval);
 outportb(0x20, 0x20);
 return;
 }
 (*oldkb)();
}
/* ----- timer ISR ------- */
static void interrupt newtimer(void)
{
 (*oldtimer)();

 if (hotkeyhit && (peekb(dossegmnt, dosbusy) == 0) &&
 !diskflag)
 popup();
}
/* ----- 0x28 ISR -------- */
static void interrupt new28(void)
{
 (*old28)();
 if (hotkeyhit)
 popup();
}
/* ------ switch psp context from interrupted to TSR ----- */
static void resident_psp(void)
{
 intpsp = getpsp();
 _AH = 0x50;
 _BX = _psp;
 geninterrupt(DOS);
}
/* ---- switch psp context from TSR to interrupted ---- */
static void interrupted_psp(void)
{
 _BX = intpsp;
 _AH = 0x50;
 geninterrupt(DOS);
}
/* ------ execute the resident program ------- */
static void popup(void)
{
 running = 1;
 hotkeyhit = 0;
 intsp = _SP;
 intss = _SS;
 _SP = tsrsp;
 _SS = tsrss;
 oldcrit = getvect(CRIT); /* redirect critical err */
 oldbreak = getvect(CTRLBRK);
 oldctrlc = getvect(CTRLC);
 setvect(CRIT, newcrit);
 setvect(CTRLBRK, newbreak);
 setvect(CTRLC, newbreak);
 ctrl_break = getcbrk(); /* get ctrl break setting */
 setcbrk(0); /* turn off ctrl break */
 intdta = getdta(); /* get interrupted dta */
 setdta(mydta); /* set resident dta */
 resident_psp(); /* swap psps */
 enable();
 tsr_program(); /* call the TSR C program */
 disable();
 interrupted_psp(); /* reset interrupted psp */
 setdta(intdta); /* reset interrupted dta */
 setvect(CRIT, oldcrit); /* reset critical error */
 setvect(CTRLBRK, oldbreak);
 setvect(CTRLC, oldctrlc);
 setcbrk(ctrl_break); /* reset ctrl break */
 disable();
 _SP = intsp; /* reset interrupted stack*/
 _SS = intss;
 running = 0; /* reset semaphore */

}





[LISTING FOUR]

/* ---------------- copyscrn.c -------------- */
#include <stdio.h>
#include <conio.h>
#include <stdlib.h>
#include <dos.h>
#include "console.h"

static void near highlight(RECT);
static void writescreen(RECT);
static void near setstart(int *, int, int);
static void near forward(int);
static void near backward(int);
static void near upward(int);
static void near downward(int);
static void init_variables(void);
extern unsigned _stklen = 1024;
extern unsigned _heaplen = 8192;
static int cursorx, cursory;
static RECT blk;
static int done = 0;
static int kx = 0, ky = 0;
static int mx, my;
static int px = -1, py = -1;
static int mouse_marking = FALSE;
static int keyboard_marking = FALSE;
static int marked_block = FALSE;
static int vpcts[] = {0,2,10,15,30,45,70,90,100};
static int extension = 0;
static int pcts[] = {45, 45, 30, 15, 45, 15, 30, 0};
static int dfpcts[] = {45, 45, 30, 15, 45, 15, 30, 0};
char *clrs[] = {"Black", "Blue", "Green", "Cyan",
 "Red", "Magenta", "Brown", "White"};
/* ----- LaserJet III Macros (each PCL command begins with ESC) ----- */
#define push(fp) fputs("\033&f0S",fp)
#define pop(fp) fputs("\033&f1S",fp)
#define CW 18
#define CH 40
#define movehorizontal(fp,n) fprintf(fp,"\033&a%dC",n)
#define movevertical(fp,n) fprintf(fp,"\033*p%dY",n)
#define shading(fp, c) fprintf(fp,"\033*c%dg",c)
/* rectangle continues the *c sequence that shading starts */
#define rectangle(fp,h,w) fprintf(fp,"%db%dA",h*CH,w*CW)
#define rectanglefill(fp) fputs("\033*c2P", fp)
#define selectfont(fp) \
 fputs("\033&l6C\033&l8D\033(10U\033(s0p16.67h8.5v0s0b0T",fp);
#define resetfont(fp) \
 fputs("\033&l0O\033(8U\033(s1p10v0s0b4101T",fp);
/* ----------- make a RECT from coordinates --------- */
RECT rect(int x, int y, int x1, int y1)
{
 RECT rc;
 rc.x = x;

 rc.y = y;
 rc.x1 = x1;
 rc.y1 = y1;
 return rc;
}
static char *FileName(void)
{
 static char fn[15];
 sprintf(fn, "snap.%03d", extension);
 return fn;
}
void initialize(void)
{
 int i, c, done = 0;

 if (_argc > 1)
 extension = atoi(_argv[1]);
 printf("\nScreen Snapper Version 2 - %s", FileName());
 while (!done) {
 printf("\n Gray scale defaults:");
 for (i = 0; i < 8; i++)
 printf("\n %d. %-7.7s %d%%",
 i, clrs[i], pcts[i]);
 printf("\n\nEnter number of a color to change,\n"
 "Esc to return to defaults,\n"
 "Or Enter to accept settings\n...");
 c = getch();
 putch(c);
 switch (c) {
 case 27:
 for (i = 0; i < 8; i++)
 pcts[i] = dfpcts[i];
 break;
 case '\r':
 done = 1;
 break;
 default:
 if (c > '0'-1 && c < '8') {
 int ch = 0;
 int cl = c - '0';
 int p;
 for (p = 0; p < 8; p++)
 if (pcts[cl] == vpcts[p])
 break;
 printf("\n %s:", clrs[cl]);
 printf(
 "\n Sp = change, Esc = quit, Enter = accept\n");
 while (ch != 27 && ch != '\r') {
 printf("\r%3d%%", vpcts[p]);
 ch = getch();
 if (ch == ' ') {
 if (++p == 9)
 p = 0;
 }
 else if (ch == '\r')
 pcts[cl] = vpcts[p];
 else
 putch('\a');
 }

 }
 else
 putch('\a');
 break;
 }
 }
 printf("\n\nHot key is Alt+Period\nSnapper is resident");
}
/* ----- process keystrokes ------ */
static void keystroke(void)
{
 int key = getkey();
 switch (key) {
 case FWD:
 if (kx < 79) {
 if (keyboard_marking)
 forward(1);
 kx++;
 }
 break;
 case BS:
 if (kx) {
 if (keyboard_marking)
 backward(1);
 --kx;
 }
 break;
 case UP:
 if (ky) {
 if (keyboard_marking)
 upward(1);
 --ky;
 }
 break;
 case DN:
 if (ky < 24) {
 if (keyboard_marking)
 downward(1);
 ky++;
 }
 break;
 case F2:
 mouse_marking = FALSE;
 setstart(&keyboard_marking, kx, ky);
 break;
 case '\r':
 done = 1;
 break;
 case ESC:
 done = -1;
 break;
 }
 cursor(kx, ky);
}
/* ---------- enter here to run screen grabber --------- */
void tsr_program(void)
{
 static char *bf = NULL;
 int bfsize = mouse_buffer();

 /* ------ save the video cursor configuration ------- */
 savecursor();
 set_cursor_type(0x0607);
 unhidecursor();
 if (bfsize)
 if ((bf = malloc(bfsize)) != NULL)
 intercept_mouse(bf);
 resetmouse();
 init_variables();
 curr_cursor(&cursorx, &cursory);
 unhidecursor();
 cursor(0, 0);
 set_mouseposition(0, 0);
 show_mousecursor();
 /* ----- event message dispatching loop ---- */
 done = 0;
 while (done == 0) {
 if (keyhit())
 keystroke();
 if (leftbutton()) {
 if (!mouse_marking) {
 px = mx;
 py = my;
 keyboard_marking = FALSE;
 setstart(&mouse_marking, mx, my);
 }
 }
 get_mouseposition(&mx, &my);
 if (mx != px || my != py) {
 if (mouse_marking) {
 if (px < mx)
 forward(mx-px);
 if (mx < px)
 backward(px-mx);
 if (py < my)
 downward(my-py);
 if (my < py)
 upward(py-my);
 }
 px = mx;
 py = my;
 }
 if (button_releases())
 mouse_marking = FALSE;
 if (rightbutton())
 done = 1;
 }

 /* ----- done ------ */
 if (marked_block) {
 highlight(blk);
 if (done == 1)
 writescreen(rect(min(blk.x, blk.x1),
 min(blk.y, blk.y1),
 max(blk.x, blk.x1),
 max(blk.y, blk.y1)));
 }
 resetmouse();
 if (bf != NULL) {

 restore_mouse(bf);
 free(bf);
 bf = NULL;
 }
 cursor(cursorx, cursory);
 restorecursor();
 init_variables();
}
/* ------- set the start of block marking ------- */
static void near setstart(int *marking, int x, int y)
{
 if (marked_block)
 highlight(blk); /* turn off old block */

 marked_block = FALSE;
 *marking ^= TRUE;
 blk.x1 = blk.x = x; /* set the corners of the new block */
 blk.y1 = blk.y = y;
 if (*marking)
 highlight(blk); /* turn on the new block */
}
/* ----- move the block rectangle forward one position ----- */
static void near forward(int n)
{
 marked_block = TRUE;
 while (n-- > 0) {
 if (blk.x < blk.x1)
 highlight(rect(blk.x,blk.y,blk.x,blk.y1));
 else
 highlight(rect(blk.x+1,blk.y,blk.x+1,blk.y1));
 blk.x++;
 }
}
/* ---- move the block rectangle backward one position ----- */
static void near backward(int n)
{
 marked_block = TRUE;
 while (n-- > 0) {
 if (blk.x > blk.x1)
 highlight(rect(blk.x,blk.y,blk.x,blk.y1));
 else
 highlight(rect(blk.x-1,blk.y,blk.x-1,blk.y1));
 --blk.x;
 }
}
/* ----- move the block rectangle up one position ----- */
static void near upward(int n)
{
 marked_block = TRUE;
 while (n-- > 0) {
 if (blk.y > blk.y1)
 highlight(rect(blk.x,blk.y,blk.x1,blk.y));
 else
 highlight(rect(blk.x,blk.y-1,blk.x1,blk.y-1));
 --blk.y;
 }
}
/* ----- move the block rectangle down one position ----- */
static void near downward(int n)
{
 marked_block = TRUE;
 while (n-- > 0) {
 if (blk.y < blk.y1)
 highlight(rect(blk.x,blk.y,blk.x1,blk.y));
 else
 highlight(rect(blk.x,blk.y+1,blk.x1,blk.y+1));
 blk.y++;
 }
}
static void fill(FILE *fp, int color, int len)
{
 if (color != LIGHTGRAY && pcts[color] != 0) {
 push(fp);
 // ---- go back len character positions
 movehorizontal(fp, -len);
 movevertical(fp, -30);
 // ----- set the shading
 shading(fp, pcts[color]);
 // ----- define the rectangle
 rectangle(fp, 1, len);
 // ----- fill the rectangle
 rectanglefill(fp);
 pop(fp);
 }
}
/* ------ write the rectangle to the file ------- */
static void writescreen(RECT rc)
{
 FILE *fp = fopen(FileName(), "wt");
 hide_mousecursor();
 extension++;
 if (fp != NULL) {
 int vx, vy, x = 0;
 int fg, bg, color, colorstart = 0, prevcolor = -1;
 int wd = rc.x1-rc.x+1;
 int margin = 6 + (70-wd)/2;
 int i;

 selectfont(fp);
 fputc('\n', fp);
 // ---------- write the text
 for (vy = rc.y; vy < rc.y1+1; vy++) {
 for (i = 0; i < margin; i++)
 fputc(' ', fp);
 for (vx = rc.x; vx < rc.x1+1; vx++) {
 int vid;
 gettext(vx+1, vy+1, vx+1, vy+1, &vid);
 /* ---- get the video attribute ---- */
 color = (vid >> 8) & 255;
 bg = (color >> 4) & 7;
 fg = color & 15;
 if (bg != prevcolor) {
 if (prevcolor != -1)
 fill(fp, prevcolor, x-colorstart);
 prevcolor = bg;
 colorstart = x;
 }
 fputc((bg == BLACK && fg == DARKGRAY) ?
 0xdb : (vid & 255), fp);
 x++;
 }
 fill(fp, bg, x-colorstart);
 fputc('\n', fp);
 x = 0;
 prevcolor = -1;
 colorstart = 0;
 }
 resetfont(fp);
 fclose(fp);
 }
}
#define swap(a,b) {int s=a;a=b;b=s;}
/* -------- invert the video of a defined rectangle ------- */
static void near highlight(RECT rc)
{
 int *bf, *bf1, bflen;
 if (rc.x > rc.x1)
 swap(rc.x,rc.x1);
 if (rc.y > rc.y1)
 swap(rc.y,rc.y1);
 bflen = (rc.y1-rc.y+1) * (rc.x1-rc.x+1) * 2;
 if ((bf = malloc(bflen)) != NULL) {
 hide_mousecursor();
 gettext(rc.x+1, rc.y+1, rc.x1+1, rc.y1+1, bf);
 bf1 = bf;
 bflen /= 2;
 while (bflen--)
 *bf1++ ^= 0x7700;
 puttext(rc.x+1, rc.y+1, rc.x1+1, rc.y1+1, bf);
 show_mousecursor();
 free(bf);
 }
}
/* ---- initialize global variables for later popup ---- */
static void init_variables(void)
{
 kx = ky = blk.x = blk.y = blk.x1 = blk.y1 = 0;
 px = py = -1;
 mouse_marking = FALSE;
 keyboard_marking = FALSE;
 marked_block = FALSE;
}


















May, 1993
ALGORITHM ALLEY


Methods of Solution


 This article contains the following executables: ALLEY.ARC


Tom Swan


Want to start an argument? Ask a roomful of programmers to define algorithm,
then stand back and get ready to duck. Everybody, it seems, has their favorite
interpretation. Robert Sedgewick defines algorithms as "methods for solving
problems that are suited for computer implementation." Donald E. Knuth
identifies the word's origin as the name of a Persian textbook author, "Abu
Ja'far Mohammed ibn Musa al-Khowarizmi." After suggesting that "an algorithm must
be seen to be believed," Knuth describes an algorithm as "a finite set of
rules which gives a sequence of operations for solving a specific type of
problem." The Encyclopedia of Computer Science and Engineering wins the prize
for brevity, calling an algorithm simply a "method of solution."
I like that one best--it's as neat as a paper clip. No definition, however,
will convince some programmers that algorithms are more than just academic
exercises. Real programmers don't study algorithms, critics might say; real
programmers just wing it. That's the view I aim to counter in this column,
"Algorithm Alley." Studying algorithms is a great way to improve your
programming skills while building a personal library of problem-solving tools.
Implementing them can also be challenging and fun. Each month I'll present one
or more algorithms, showing their inner workings in pseudocode, focusing on
practical applications and problem solving, and listing example programs in
Pascal.
Why Pascal? Several reasons. Pascal is widely available--most programmers have
a Pascal compiler hanging around. Pascal is algorithmic in nature--algorithms
and their implementations resemble one another. Pascal programs are easily
ported to other languages--whether you program in C, C++, Basic, assembly
language, or object-oriented Awk (just kidding), you can use the algorithms in
this column. This is not a column about programming in Pascal. It's a column
about programming with algorithms, illustrated using Pascal.
You might think of me as DDJ's algorithm gourmet. I dish out the recipes; it's
up to you to cook the turkey and season the sauce. If you can do it better,
let me know. If you want to provide detailed analysis of an algorithm or
suggest a new one to cover, be my guest. Go ahead, show me your code.
Together, perhaps we'll both learn some savory techniques.


Side Streets


Before jumping into this month's algorithm, it will help to define a few
ground rules. Borrowing from a variety of sources, I came up with four
elements of a proper algorithm. To be an algorithm, a "method of solution"
must have:
1. A problem to solve.
2. A beginning.
3. A step-by-step sequence of actions.
4. An ending.
A simple example illustrates these characteristics, and also introduces the
format I'll use for this column's algorithms. Example 1, Algorithm #0, shows
the pseudocode of an algorithm for finding the square root of a value, using a
formula known as the "Newton-Raphson technique." Algorithm #0 isn't
necessarily the best way to find a square root, but it has all the
requirements of a complete algorithm. (By the way, I'll number algorithms
sequentially for reference.)
Example 1: Pseudocode for Algorithm #0 (Square Root).

 begin
 y <-- 1; e <-- 5E-5;
 z <-- y * y;
 while |z - x| >= e do
 begin
 y <-- ((x / y) + y) / 2;
 z <-- y * y;
 end;
 Write ("The square root of ",x,"=",y);
 end.

Example 1 follows several commonly used pseudocode conventions. Language
structures such as repeat-until and while-do are the same as in Pascal.
Semicolons terminate statements. A left-facing arrow assigns the value at
right to the symbol at left. An asterisk stands for multiplication, as it does
in most programming languages. Vertical bars mean "absolute value." I'll use
scientific notation for floating point, unless a decimal representation makes
better sense. In the example, 5E-5 equals 5x10{-5}. Character strings are
double quoted.
Algorithm #0 is simple to understand. To find the square root of x, the method
sets a test value y to 1, then tests whether |y{2} - x| is greater than or equal
to some small value e. If not, y holds the square root of x, accurate to
within e. Otherwise, the method sets y to the next test value, using the
algorithm's formula to move y inevitably toward a solution.
Can you spot the flaw in Algorithm #0? The method works only for positive
nonzero values of x. A more complete algorithm might use an If statement to
test whether x is less than or equal to zero. Refinements like these are what
make the study of algorithms interesting--and, at times, frustrating.
Listing One (page 144) implements Algorithm #0 in Pascal. The program closely
matches the pseudocode. Run the sample, then rewrite it in your favorite
symbolic tongue. Use the Pascal implementation to check your results. Try any
improvements that come to mind. For example, consider how you might do away
with the duplicated statement that squares y.
Of course, if your language has a square-root function, there's little reason
to use Algorithm #0. The method might come in handy, however, in an assembly
language program with no access to a math library. That suggests another
benefit of collecting algorithms: The methods you have little use for now
might be invaluable somewhere down the road.
Next, let's try an algorithm with more potential. You can even use this one to
help your compiler run more efficiently.


Topological Sorting


Many sorting algorithms operate on independent data elements. The QuickSort
algorithm, for example, can alphabetize an array of strings regardless of
their original order.
Not all sorting techniques, however, are designed to put all of your ducks in
a row. Some methods rely on a data set's partial ordering, preserving existing
relationships between elements. One such method is called topological sorting.
Partially ordered data sets are commonplace. A university's courses, for
instance, typically list others as prerequisites. Topological sorting can
arrange a student's course selections so that all prerequisites are taken in
the proper sequence.
Topological sorting can also help a compiler run more efficiently. Listing the
dependencies between a program's modules and then sorting that list
topologically gives the most efficient module order, so the compiler opens the
fewest header files (or units in Turbo and UCSD Pascal) at a time.
More on this later.
Figure 1 shows a graph of a partially ordered data set. The circled letters
might represent university courses or program module names. Read arrows as
"precedes." Thus B precedes C, F is a prerequisite of G and E, and so on.
The goal is to rearrange the jumbled graph in Figure 1 into a linear sequence,
so it resembles Figure 2. Reading from left to right, the letters are
topologically sorted so that prerequisites always come first.

A linked list is convenient for representing partially ordered sets in a
computer's memory (see Figure 3). Head and tail pointers identify the list's
beginning and end. Items appear along the top row. A secondary list of
"follower records" shows each item's successors. In the figure, item C has two
successors, G and D. Compare this figure with the graphs in Figure 1 and
Figure 2. Note that item E precedes nothing.
The topological sorting algorithm first locates all "leader" items with no
predecessors. There must be at least one; otherwise, the list is not partially
ordered, and it can't be sorted. Deleting and outputting the leaders results
in a new, partially ordered list, again having at least one item with no
predecessors. The algorithm repeats, deleting and outputting items, until none
remains.
Example 2 shows the pseudocode for Algorithm #1, Topological Sort. Comments
are in braces. First, the method initializes an empty list. Then it reads
elements, designating pitem as the predecessor of item. The items are linked
into a list by an unspecified Search function, and a follower record is
created to keep track of pitem's successor.
Example 2: Pseudocode for Algorithm #1 (Topological Sort).

 {Initialize}
 New(head);
 tail <-- head;

 {Input data}
 Read(pitem);
 while not Eof(input) do
 begin
   Read(item);
   a <-- Search(pitem);
   b <-- Search(item);
   Create new follower record f;
   Identify f as item at b;
   Link f into chain at a;
   Read(pitem);
 end;

 {Find leaders}
 a <-- head;
 head <-- nil;
 while a <> tail do
 begin { Search list }
   b <-- a; a <-- a^.next;
   if b has no predecessors then
   begin { Link b into list at head }
     b^.next <-- head;
     head <-- b;
   end;
 end;

 {Sort and output}
 b <-- head;
 while b <> nil do
 begin { Output item }
   Write(b);
   c <-- b^.chain;
   b <-- b^.next;
   while c <> nil do
   begin { Update followers }
     a <-- c^.id;
     if a has no predecessors then
     begin { Link a into leader list at b }
       a^.next <-- b;
       b <-- a;
     end;
     c <-- c^.next;
   end;
 end;

After the algorithm reads all elements, the list resembles Figure 3. The
algorithm then finds the leaders--those items having no predecessors--and
links them into a new list at the head pointer. The final stage outputs and
deletes the leaders, then searches for new ones (sort of like what we do in
this country at election time). Eventually, the list is depleted, and the
algorithm ends.
Listing Two (page 144) implements Algorithm #1 and includes a Search function
that builds an element list in memory. I designed the sample program to read
string elements in pairs, separated by blank lines and stored in a file. To
test the program, I extracted a list of module names from a medium-size
program: a game that I wrote named "Mancala" (not shown here). With each word
on a separate line, the list looks something like this:

 UGlobals
 UEval

 WinTypes
 UGlobals

 WinProcs
 UGlobals

 Idents
 UGlobals
 <cr>
From this list, module UEval depends on (uses) UGlobals. The UGlobals module
uses WinTypes, WinProcs, and Idents. Of course, the actual list is much
longer. Sorting the list topologically gives the most-efficient module
ordering: Idents, WinTypes, WinProcs, UGlobals, and UEval.
Using this output, I can reorder each source file's include or uses statements
to help the compiler use fewer files and make better use of RAM. (I assume
that either the compiler is smart enough not to reread headers that it has
already seen, or you have used the common C programmer's trick of defining
symbols to prevent including the same headers more than once.)

Listing Two uses two records to store items on a linked list, and to keep
track of item relationships. The Leader record holds a key (representing data
stored in this element), a count of its predecessors, a pointer to the next
Leader record, and a chain pointer to the Follower records that keep track of
an item's successors.
Follower records address their Leader with a pointer named id. Another pointer
addresses the next Follower. Traversing this secondary list locates all of an
item's successors--duplicating in code the arrows from the graphs in Figure 1
and Figure 2.
The sample program also keeps a global itemCount. If this count is nonzero
after the program deletes all items from the list, the data set was not
partially ordered, and the program displays an error message.
To run the program, create a text file of dependent items with a blank line
between each pair. Follow the last line with a single carriage return. Feed
the input file to Listing Two. For example, under MS-DOS enter a command such
as topsort <sample.dat.


Your Turn


It's my pleasure to introduce this new column, and I'm looking forward to many
months ahead. Drop me a line in care of DDJ. I welcome all comments, and if
you have an algorithm to share, send it in. That'll be right up my alley.

_ALGORITHM ALLEY_
by Tom Swan


[LISTING ONE]

{ sqrroot.pas -- Algorithm #0: Square Root by Tom Swan }

program SquareRoot;
var
  e, x, y, z: Real;
begin
  Write('Value? ');
  Readln(x);
  if (x <= 0) then
  begin
    Writeln('Error: Value <= 0');
    Exit
  end;
  y := 1.0;
  e := 5E-5;
  z := y * y;
  while abs(z - x) >= e do
  begin
    y := ((x / y) + y) / 2;
    z := y * y
  end;
  Write('The square root of ', x, ' = ', y)
end.





[LISTING TWO]

{ topsort.pas -- Algorithm #1: Topological Sort by Tom Swan }

program TopSort;

type
  PLeader   = ^Leader;     { Pointer to Leader records }
  PFollower = ^Follower;   { Pointer to Follower records }
  TKey      = String[40];  { Item keys read from file }

  Leader = record          { Leader record }
    key: TKey;             { Data in this record }
    count: Integer;        { Count of key's predecessors }
    next: PLeader;         { Next Leader record in list }
    chain: PFollower;      { Pointer to first Follower in chain }
  end;

  Follower = record        { Follower record }
    id: PLeader;           { Pointer to following Leader }
    next: PFollower;       { Pointer to next Follower in chain }
  end;

var
  head, tail: PLeader;     { Leader list head and tail pointers }
  itemCount: Integer;      { Total number of items in list }

{ Search list for item. Return pointer to its Leader record }
function Search(item: TKey): PLeader;
var
  h: PLeader;              { Pointer to head of list }
begin
  h := head;
  tail^.key := item;       { Create sentinel at dummy record }
  while h^.key <> item do
    h := h^.next;
  if h = tail then
  begin                    { Insert new item at head of list }
    New(tail);
    itemCount := itemCount + 1;
    h^.count := 0;         { No predecessors for new item yet }
    h^.chain := nil;       { No follower chain }
    h^.next := tail        { Link new record into list }
  end;
  Search := h              { Return pointer to item's record }
end; { Search }

{ Read data into list and construct follower chains }
procedure InputData;
var
  pitem, item: TKey;       { Predecessor and item }
  a, b: PLeader;           { Leader record pointers }
  f: PFollower;            { Pointer to Follower record }
begin
  Writeln('Input data:');
  Readln(pitem);
  while not eof(input) do
  begin
    Readln(item);
    if not eof(input) then
    begin
      Writeln(pitem, ' << ', item);
      { Find or insert predecessor and item into list }
      a := Search(pitem);
      b := Search(item);
      { Construct follower and link into chain }
      New(f);
      f^.id := b;                { Address following item in chain }
      f^.next := a^.chain;       { Link new follower into chain }
      a^.chain := f;
      b^.count := b^.count + 1;  { Increment predecessor count }
      Readln;                    { Read blank line between item pairs }
      Readln(pitem)              { Read next item if any }
    end
  end
end; { InputData }

{ Find leader records with no predecessors }
procedure FindLeaders;
var
  a, b: PLeader;           { Leader record pointers }
begin
  a := head;
  head := nil;
  while a <> tail do
  begin
    b := a;
    a := a^.next;
    if b^.count = 0 then
    begin
      b^.next := head;
      head := b
    end
  end
end; { FindLeaders }

{ Sort and output records }
procedure OutputData;
var
  a, b: PLeader;
  c: PFollower;
begin
  Writeln; Writeln('Output data:');
  b := head;
  while b <> nil do
  begin
    Writeln(b^.key);
    itemCount := itemCount - 1;
    c := b^.chain;
    b := b^.next;
    while c <> nil do
    begin
      a := c^.id;
      a^.count := a^.count - 1;
      if a^.count = 0 then
      begin                { Insert a^ in b-list }
        a^.next := b;
        b := a
      end;
      c := c^.next
    end
  end
end; { OutputData }

begin { TopSort }
  New(head);
  tail := head;
  itemCount := 0;
  InputData;
  FindLeaders;
  OutputData;
  if itemCount <> 0 then
    Writeln('Error in data set: not partially ordered')
end.



Example 1

begin
 y <- 1; e <- 5E-5;
 z <- y * y;
 while abs(z - x) >= e do
 begin
 y <- ((x / y) + y) / 2;
 z <- y * y;
 end;
 write("The square root of ", x, " = ", y);
end.



May, 1993
UNDOCUMENTED CORNER


Exploring Windows Palettes




Jeffrey M. Cogswell


Jeff works as a Windows programmer for Tech Specialists in the Research
Triangle Park in Raleigh, North Carolina. He can be reached on CompuServe at
71222,1404 or via the Internet at charvel@salzo.Cary.NC.US.


Far too often, the solution to a programming problem lies in an undocumented
function. Still, it would be surprising if more than a handful of programming
problems couldn't be solved through documented means.
A case in point is this month's undocumented corner. Jeff Cogswell and his
coworkers had to change the palette in someone else's Windows program. Jeff
and his crew spent a lot of time looking at all sorts of undocumented things
in the Windows graphics device interface (GDI). When they were done, it turned
out that, to change another program's palette, absolutely nothing undocumented
was required! Microsoft's ToolHelp library can be used to enumerate all
palette handles on the system, and then the documented SetPaletteEntries
function can be used to alter any specified palette.
All their digging into the undocumented corners of GDI wasn't for naught,
however: in the course of staring at how the GDI palette code works, they
became confident that they could safely zap another program's palette, because
they saw that GDI doesn't track palette owners.
Along the way, Jeff also uncovered several internal palette structures used by
GDI. These were missing and/or wrong in the book Undocumented Windows that I
wrote with David Maxey and Matt Pietrek. Jeff also discovered why the GDIWalk
program that comes with Undocumented Windows doesn't work. So there were
genuine benefits to his explorations, even though it turned out that he could
use documented functions.
After Jeff reverse-engineered the internal palette structures, I realized I'd
seen some of this somewhere before. The Windows SDK comes with debug
versions of the KERNEL, USER, and GDI DLLs. Programmers should use these
debug versions rather than the retail versions, because they check for more
errors. Naturally, the debug versions also contain Codeview (CV) symbol
tables, with the names of functions, data structures, and so on.
Unfortunately, Microsoft
deliberately smashes these CV symbol tables when it gets ready to release a
final version of the SDK. They evidently have some utility that replaces the
nice symbolic names with white spaces. However, when Windows 3.1 was still in
beta testing, Microsoft shipped several builds of the SDK in which this
symbol-destroying utility hadn't yet been run. GDI.EXE in particular had a
very large symbol table, complete with the field names for the internal data
structures.
Using Borland's TDUMP utility (written by Matt Pietrek), you can find a lot of
interesting information in the SDK beta CV symbol tables. Matt himself has
used this information in his forthcoming book, Windows Internals, which
contains detailed pseudocode for many of the key functions inside Windows.
And, yes, these debug versions of GDI.EXE also contain the names of the fields
in the palette structures. These are shown in Figure 1(a) in the way that
TDUMP displays them. The symbol table includes the field names, but not the
names for
the structures themselves, so I have added these in. Compare Figure 1(a) with
Figure 4 to see what all this means.
Figure 1: (a) Internal palette structures from 3.1 debug GDI.EXE, as displayed
by TDUMP; (b) internal region structures from 3.1 debug GDI.EXE, as displayed
by TDUMP.

 (a)

 Palette:
 ilPaletteHead Offs: 00
 ilphPal Offs: 0C
 ilpUseCount Offs: 0E
 ilphLDevice Offs: 10
 ilpFlags Offs: 12
 ilphMetalist Offs: 14
 Colors:
 peForeIndex Offs: 00
 peCurIndex Offs: 02
 pePrevIndex Offs: 04
 peColor Offs: 06
 PalGlobal:
 phNumEntries Offs: 00
 phCurRealTime Offs: 02
 phColors Offs: 04

 (b)

 Region:
 rgnHead Offs: 00
 rgnSize Offs: 0C
 rgnSCnt Offs: 0E
 rgnMaxScan Offs: 10
 rgnBBox Offs: 12
 rgnScnList Offs: 1A
 Scans:
 scnPntCnt Offs: 00
 scnPntTop Offs: 02
 scnPntBottom Offs: 04
 scnPntsX Offs: 06
 scnPtCntToo Offs: 0A

Figure 4: Undocumented Windows palette structures.

 //structures within global heap
 typedef struct tagCOLORS {
   WORD peForeIndex;       //Apparently used for foreground palettes.
   WORD peCurIndex;        //Index into system palette. This
                           //number tells us where this particular
                           //RGB value is mapped into the system
                           //palette.
   WORD pePrevIndex;       //Previous index. Exact use unclear.
   PALETTEENTRY peColor;   //RGB and flags -- see Figure 2
 } COLORS;

 typedef struct tagPALGLOBAL {
   WORD phNumEntries;      //number of entries
   WORD phCurRealTime;     //number of times realized
   COLORS phColors[1];     //actual RGB values
 } PALGLOBAL;

 //structure within GDI's local heap
 typedef struct tagNewPALETTEOBJ {
   GDIOBJHDR ilPaletteHead;  //see Figure 3(a)
   HANDLE ilphPal;           //handle pointing to global structure
   WORD ilphUseCount;        //number of times currently selected
   HANDLE ilphLDevice;       //copied in from DC when selected
   WORD ilpFlags;            //used internally by RealizePalette ???
   WORD ilphMetaList;        //apparently used in metafiles
 } NewPALETTEOBJ;

Figure 1(b) shows two structures, also from a TDUMP of the old debug GDI.EXE,
that are related to regions. In the March 1993 "Undocumented Corner," Joseph
Newcomer and Bruce Horn reverse-engineered the GDI region structure. Looking
now at the GDI symbol table, it turns out that they were right on target:
Every one of their fields matches up correctly with the actual field name. For
example, what Joe and Bruce referred to as the "length of the region object"
turns out to be called rgnSize. Likewise, what they called the "bounding box
for the entire region," Microsoft calls rgnBBox.
Why does Microsoft destroy this information before shipping the retail SDK?
This seemingly minor question is interesting because it gets to the heart of
Microsoft's role in the software industry.
Usually when Microsoft doesn't document something, it reflects a
resource-allocation problem. The biggest problem with Microsoft, I feel, is
that its eyes are bigger than its stomach. The company desires to control all
PC software standards, yet when push comes to shove it keeps its development
teams so understaffed that often essentials such as documentation can't get
the attention they deserve. There is nothing wrong if Microsoft wants to set
all the key software standards; the industry needs a new IBM for the '90s. But
if Microsoft wants to play this role, it had better start devoting more
attention to documenting the industry standards it seeks to create. That this
is mostly a resource-allocation problem is shown by the fact that, with its
Microsoft Developer Network (MSDN) CD-ROM, the company is actually capable of
producing excellent documentation and sharing large amounts of important
information with developers.
The removal of Windows debug CV symbol-table names, however, is a case of
Microsoft deliberately withholding information that would be extremely useful
to Windows developers. It would cost Microsoft nothing to leave it in. In
fact, they have to do something special to remove it. I have seen other cases
like this. Right now, for the forthcoming second edition of Undocumented DOS,
I'm writing a chapter on the many strange interactions between MS-DOS and
Windows. A lot of this interaction, such as the private interface that
MS-DOS.SYS in DOS 5.0 and higher uses to communicate with the DOSMGR (the
Enhanced-mode DOS extender), isn't documented. Yet, Microsoft has internal
documents that describe this and other private interfaces. The documentation
already exists, and Microsoft simply isn't letting us see it.
So why doesn't Microsoft freely provide this information? I'm beginning to
realize that Microsoft does provide it--on a selective basis, as part of
"technology swaps." In exchange for something valuable from an OEM or ISV,
Microsoft will provide some piece of Microsoft "technology"--which sometimes
turns out to be nothing more than documentation and some source code for an
interface that Microsoft really ought to document in the first place. The DOS
network redirector is a good example of this.
I would love to hear from those of you who have anecdotes, documents, or even
rumors that would help shed some light on exactly what Microsoft is up to when
it withholds information from programmers. I suspect there are ultimately as
many reasons for this behavior as there are people who work for Microsoft.
Large companies, even those as well run as Microsoft, tend not to act in a
very concerted fashion.
Of course, I would also love to hear your article ideas, comments on this
column, and so on. Upcoming columns include the PIF file format, undocumented
WinHelp, and internals of Windows edit controls. This line-up looks a little
too slanted towards Windows. Or does this just reflect where developers are
today? Write to me on CompuServe at 76320,302 or on Internet at
andrew@pharlap.com and let me know.
--Andrew Schulman
When Microsoft created Windows 3.0, it put together a set of routines called
the "palette manager" for overseeing the use of graphics-card registers
containing the colors currently available for display. The palette manager is
intended to solve the problems that arise when a multitasking system allows
all programs simultaneous access to a single graphics card. Such problems include
the possibility of two programs running at once on a 256-color system, with
each program trying to use a different set of 256 colors.
The palette manager gives you a way of defining a so-called "logical palette."
Essentially, a logical palette is a set of colors a program creates and uses.
Each program can create a logical palette, and the palette manager determines
from all the logical palettes which colors actually get displayed on the
screen--that is, which colors get to be placed in what is called the "system
palette."
This palette manager works well for most cases, but it lacks certain things.
For instance, what if we want to modify another program's logical palette, and
we don't have access to that program's source code? This was a real world
problem that my coworkers and I faced. In this article we'll examine this
problem in detail. As we show how we solved this problem, we will explore the
low-level activity of the palette manager and its internal structures.


Palette-manager Basics


First, let's quickly go over how to use the palette manager.
To create a logical palette, we must determine the red, green, and blue (RGB)
levels of each color we need. We then allocate some memory and cast this
memory to a LOGPALETTE structure (see Figure 2), along with the PALETTEENTRY
structure used by LOGPALETTE. A PALETTEENTRY contains the actual RGB values
for a single color. Note the strange notation of an array of size 1; the
actual size isn't known until run time.
Figure 2: Documented Windows Palette structures.

 typedef struct tagPALETTEENTRY {
 BYTE peRed;
 BYTE peGreen;
 BYTE peBlue;
 BYTE peFlags;
 } PALETTEENTRY;

 typedef struct tagLOGPALETTE {
 WORD palVersion;
 WORD palNumEntries;
 PALETTEENTRY palPalEntry [1];
 } LOGPALETTE, FAR * LPLOGPALETTE;

Where NUMCOLORS is defined as the number of colors we wish to create, the
allocation, locking, and casting look like Example 1. Next, we start filling
in palPalEntry[0], palPalEntry[1], and so on, until we've covered all our
colors. We place the number of colors in palNumEntries, and a 0x300 in
palVersion to indicate both Windows 3.0 and 3.1. Then we call hpal =
CreatePalette(LogPal), where LogPal is the address of the LOGPALETTE
structure. The palette manager gives us back a secret number, the handle to
our palette. This palette is not the same as the LOGPALETTE we created, but an
internal data structure maintained by the Windows graphics device interface
(GDI). This in turn contains a handle to another internal structure. These two
structures are what we'll explore in this article.
Example 1: Creating a logical palette.

 hLocal = LocalAlloc (LPTR,
 sizeof(LOGPALETTE) + NUMCOLORS
 * sizeof(PALETTEENTRY));
 LogPal = (NPLOGPALETTE)
 LocalLock(hLocal);

The two internal structures are the only places in which GDI requires the
palette information to be; the initial LOGPALETTE we filled with our RGB
values is nothing more than a temporary mechanism for conveying the color
information to the palette manager. After calling CreatePalette we can
LocalUnlock and LocalFree the memory allocated above. GDI has no further use
for it.
After our program receives a WM_PAINT message, we call the GDI routines
BeginPaint, SelectPalette, and RealizePalette. This is shown in Example 2(a),
where oldpal is the palette previously associated with the window (probably a
system default) and mapped is the number of colors actually copied (or
"mapped") to the system palette.
Example 2: Palette processing: (a) At the start of a paint message; (b) at the
end of a paint message.

 (a)

 case WM_PAINT:
   hDC = BeginPaint(hWnd, &PtStr);
   oldpal = SelectPalette(hDC, hpal, FALSE);
   mapped = RealizePalette(hDC);

 (b)

 SelectPalette(hDC, oldpal, FALSE);
 EndPaint(hWnd, &PtStr);

SelectPalette tells GDI that we want our window to use a specified palette.
Internally, GDI stores the palette handle in the device context (DC) structure
and increments a field inside the internal palette structure. RealizePalette
then copies our selected palette into the system palette.
The palette is now in place, and graphics commands can freely use it. Later,
after we've drawn in our window, we restore the previous palette prior to
calling EndPaint. This takes place before we've left the WM_PAINT handler, as
in Example 2(b). A quick note about RealizePalette: This routine attempts to
map all selected palettes, even those for other programs, into the system
palette, starting with the front-most window and working its way back in
Z-order through all the windows on the screen until the system palette is
full.
If a program's window is too far back and the system palette gets filled up,
the palette manager must map that program's logical palette colors to what
Microsoft considers the "closest color." This produces an unfortunate effect:
It's as though you have multicolored wallpaper that becomes colored with a
different set of colors when you run a graphics program. The system palette
filled up, and the colors in the wallpaper got mapped to the closest colors
available in the system palette.


Enumerating Palette Handles


That's the proper procedure for using the palette manager, as described in the
SDK manuals, and is all fine and dandy until we run into a problem such as
having to change another program's palette. This was a real-world problem: A
client was using a man-machine interface (MMI) program for monitoring and
controlling devices and my company had to change this MMI's palette.
Our goal was simple: to get the MMI to display our colors. To accomplish this,
we had to do two things: First, find the MMI's palette; and second, change it.
How hard could that be? Famous last words!
Actually, it turned out not to be hard, but it took a considerable amount of
work and a long detour into undocumented Windows to find this out. We were
looking for a palette-manager routine that would give us the handle of a
palette in another program, and we temporarily forgot about ToolHelp's ability
to "enumerate" all GDI objects of a given type, including palette handles.
When we started the project, we turned to the book Undocumented Windows, which
has a program called GDIWalk that lists all the structures inside GDI's local
heaps in a nice, scrollable window. Of course, other programs have similar
functionality, but this one comes with something the others don't: source
code.
Using GDIWalk, we could see that each structure inside GDI corresponds to the
objects associated with graphics: pens, brushes, fonts, palettes, and the
like. The structures all begin with a set of object-header fields called
GDIOBJHDR. Unfortunately, the object header is different for the debug
version of Windows 3.1 that ships with the Windows SDK: it is four bytes
longer than in the other versions. Fortunately, the structure is
consistent between Windows 3.0 and 3.1 retail versions; see Figure 3(a).
Figure 3: (a) GDI object header structure; (b) enumerating palettes with
ToolHelp.

 (a)

 typedef struct tagGDIOBJHDR {
 HANDLE hNext;
 WORD wMagic;
 DWORD dwCount;
 WORD wMetaList;
 #ifdef DEBUG_31
 WORD wSelCount;
 HANDLE hOwner;
 #endif
 } GDIOBJHDR, FAR *LPGDIOBJHDR;


 (b)

 #include "toolhelp.h"
 // ...
 SYSHEAPINFO shi;
 LOCALENTRY le;
 shi.dwSize = sizeof(shi);
 SystemHeapInfo(&shi);
 hGDI = shi.hGDISegment;
 le.dwSize = sizeof(le);  // ToolHelp requires this before LocalFirst
 for (ok = LocalFirst(&le, hGDI);
      ok; ok = LocalNext(&le))
     if (le.wType == LT_GDI_PALETTE)
         printf("%04x\n", le.hHandle);

It's interesting that the debug version is larger because it has an "owner"
field. This field should tell us which program created the palette. However,
since our client's software would usually be running in the retail version,
this didn't help.
Most of the actual information on the object follows the object header in each
of GDI's object structures. I say "most of" because some objects (such as
palettes) have information elsewhere, in other structures. We'll see where
when we look at the undocumented structures for palettes.
GDIWalk uses the ToolHelp functions SystemHeapInfo, LocalFirst, and LocalNext
to walk through the GDI heap. (ToolHelp ships with Windows 3.1, but also runs
under 3.0.) SystemHeapInfo gives us the default data segment for GDI's local
memory. Note that GDI may have more than one data segment; SystemHeapInfo
gives us the first. Fortunately, palettes always sit in this first heap.
LocalFirst and LocalNext operate on a structure called LOCALENTRY, which is
described in the Windows SDK. These routines may be used to find the actual
objects internal to GDI. LocalFirst gets us to the first object in GDI. To use
it, we simply set up a LOCALENTRY structure and call LocalFirst; LocalFirst
then fills in the structure with information on the first object in GDI. After
that, we use LocalNext to get to the rest of the structures.
For the most part, GDIWalk works well after a small modification. As we
studied it, however, we realized it would have to be modified further to truly
get to the palette information. The data that GDIWalk listed for palettes was
incorrect. For one thing, the size of the data was much too small. Yet, the
program used the PALETTEOBJ structure shown on page 540 of Undocumented
Windows.
We began to modify GDIWalk. The final program was so different we renamed it
IOPal (since it uses the winIO library and operates on PALettes). It is
presented in IOPAL.C, available electronically from DDJ; see "Availability,"
page 7.
We implemented the changes gradually. First, we modified GDIWalk to ignore all
objects that are not palettes. This was a simple matter of walking through
GDI's entire heap and watching for objects whose corresponding LOCALENTRY
structure had a wType equal to LT_GDI_PALETTE. The code in Figure 3(b)
demonstrates this process.
A note about the wType field: GDIWalk uses code like that in Listing Five to
list all the objects in GDI's heap, but the wType always comes up 0. This is
due to a small bug in GDIWalk that modifies the heap selector, as in Example
3. The last two lines should simply not be there; removing them allows
GDIWalk (and our IOPal program) to function properly, using the documented
ToolHelp LT_GDI_PALETTE.
Example 3: A bug in GDIWalk modifies the heap selector like this.

 shi.dwSize = sizeof(shi);
 SystemHeapInfo(&shi);
 GDIHeap = shi.hGDISegment;
 GDIHeap &= 0xfffc;
 GDIHeap |= 1;

Originally, our IOPal program checked what's called "magic" inside the
undocumented GDI structures (see Undocumented Windows, Chapter 8, for more
information on GDI magic numbers) since the WType field was always 0. A
palette object has a magic number of 4F4Ah. Once we fixed the bug in GDIWalk,
we no longer needed to use GDI magic numbers.


The Undocumented Palette Structures


After our IOPal program could list all palettes, we further modified it to
dump the palette objects as raw hex values instead of formatting them with the
apparently incorrect PALETTEOBJ structure in Undocumented Windows. This
revealed a couple of WORD-size numbers immediately following the object header
that were certainly not RGB values.
The next modification was nerve-frazzling and borderline masochistic. To try
and discover what on earth these values were, we used every global
heap-walk-type program under the sun and searched through entire listings of
the global heap, trying to find these numbers embedded someplace else. Where?
Could be anywhere. After all, these numbers could be almost anything.
Addresses. Handles. Whatever.
After our eyes refocused, we found a chunk of data sitting in the middle of
the global heap with a handle corresponding to one of the 2-byte words in the
palette object mentioned earlier; this chunk was owned by GDI. But without a
structure through which to view the chunk, the data inside was little more
than a long sequence of bytes.
We dumped the raw hex values to the printer, and after more searching, we
found our RGB values scattered throughout the dump. We were there! It was the
palette! But heavens, it was a mess! There seemed no order to them.
Eventually, however, we began to make sense of it.
Then a silly thought occurred to us. We managed to find the palettes--that was
easy; our IOPal listed them all for us. But digging through all the
Undocumented Fun, we totally overlooked the fact that in plain sight was a
completely documented function, called GetPaletteEntries. We had simply
forgotten about it. Staring at hex values for so long has this sort of
unfortunate side effect.
GetPaletteEntries is used for getting all of the colors (that is, the palette
entries) in a palette--if you know the handle. And we did. In fact, using
ToolHelp, we knew the handle for every palette in the system. Any one of
these palette handles could be passed to GetPaletteEntries. Because GDI does
not track palette owners, we could pass any arbitrary palette handle to
GetPaletteEntries to find out what the palette contains. We didn't need to
know the actual underlying undocumented-palette data structure!
GetPaletteEntries did all the work for us. SetPaletteEntries could similarly
be used to modify any arbitrary palette--even one that belonged to another
program, such as the MMI.
So we quickly assembled our findings, drew up the two structures for the
internal logical palettes, and left them alone. They're presented in a greatly
refined form in Figure 4. Note that the global structure is given in the form
of two structures. (Later, we were able to refine these structures by tracing
through the assembler code for the CreatePalette, SelectPalette, and
RealizePalette routines. We were also lucky Andrew Schulman had an old Beta
copy of GDI that contained all of the Codeview symbols. This made life much
easier, and also spilled the beans on how some of Undocumented Windows was
researched.)
Note that one structure, NewPALETTEOBJ, replaces the incorrect PALETTEOBJ
structure given in Undocumented Windows. Objects of this type sit inside the
GDI's local heap. The other structure, PALGLOBAL, contains the actual RGB
values, coded here as an array of type COLORS. Oddly enough, this one doesn't
sit inside GDI's local heap; instead it's in the global heap, perhaps to save
valuable space inside the GDI heap. Both structures are created during a call
to CreatePalette.


Trying to Find Palette Owners


So we had a program that listed the handles to all the palettes in the system
and that, after further modifications, listed the RGB values for each color in
the palettes. Unfortunately, we still had to find a place where the palettes
were connected to their creators.
As mentioned earlier, we traced through the assembler for the CreatePalette,
SelectPalette, and RealizePalette routines. This allowed us to discover
something rather surprising: Nowhere in the retail version does GDI track who
created the palettes. It either doesn't know or doesn't care.
That means absolutely any program could safely modify the palette created by
another program! Unless, of course, the other program figures out what we did
and does something rebellious such as abort. In any case, GDI itself had no
problem with one program modifying another's palettes.
As amazing as this was to us, it had a bad implication: How did we know which
palette went with our MMI program? (We had almost forgotten about that!) After
all, that was the program whose palette we were trying to change.
In this particular case, we were lucky. We'd been told by the developer of the
MMI that the program maintained a static, "unalterable" palette. Our IOPal
program could easily list the palettes and dump all of them, and from there we
could see which one matched the colors used by the MMI. So we modified IOPal
further: We gave it the ability to save the palettes to disk, and after
manually recognizing which palette belonged to the MMI, we used it to dump the
MMI's palette to disk.
Next, we added to IOPal the ability to read in a palette from disk and, rather
than create a logical palette, to keep the read-in palette in a structure and
compare it to each palette in the system. If it finds a match, it opens
another file containing the preferred colors and uses the SetPaletteEntry
function (which behaves much like GetPaletteEntry, only it sets rather than
gets colors) to actually store our new colors in the logical palette we
located.
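The match test at the heart of this scheme is ordinary C. Here's a minimal sketch, assuming the saved palette and each live palette have already been read into arrays of PALETTEENTRY structures (the PalettesMatch helper is our illustration, not an actual routine from IOPal):

```c
#include <stddef.h>

/* Mirrors the Windows 3.1 PALETTEENTRY layout shown in Figure 2. */
typedef unsigned char BYTE;
typedef struct {
    BYTE peRed;
    BYTE peGreen;
    BYTE peBlue;
    BYTE peFlags;
} PALETTEENTRY;

/* Compare only the RGB components; peFlags may legitimately differ
   between the palette saved to disk and the live one in GDI. */
int PalettesMatch(const PALETTEENTRY *a, const PALETTEENTRY *b, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        if (a[i].peRed   != b[i].peRed   ||
            a[i].peGreen != b[i].peGreen ||
            a[i].peBlue  != b[i].peBlue)
            return 0;
    return 1;
}
```

Comparing only the RGB bytes, not the flags, is what makes it possible to recognize the same logical palette across runs.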
But what if for some reason the MMI gets aborted abnormally, and we have more
than one palette sitting around (which happens a couple of times a month)? In
that case, the comparison process in IOPal shows two palettes containing the
RGB values corresponding to MMI's static, no-longer-unalterable palette. The
solution was simple: modify both.
This particular version of IOPal did the job, but it was difficult because the
MMI, which could be linked with Excel, had to run an Excel macro that spawns
IOPal. This was acceptable to us, but to our boss it was a kludge. We also
thought of the incredibly remote possibility of the customer running some
other program that would, just by chance, have a color palette identical to
the MMI's. As unlikely as this is, it could indeed be a problem.
We were going to go with it anyway, but then I started writing this article,
and the editor told me the palette-comparison method wouldn't cut it for the
readers' own problems. He was right. It barely cut it for ours, and ours was a
unique case. Besides, we wanted a better approach.



Intercepting CreatePalette


This is where Plan B came in. Recall the original goal: to find the program's
palette and modify it. Plan B was something we had quietly mentioned among
ourselves when the boss wasn't around, for fear he'd want us to go through
with it. But now looked like the time. Plan B was this: Trap calls to
CreatePalette. Essentially, the steps are as follows:
1. Remap INT 3 (Breakpoint) so it calls our own trap-handling routine. (It
might be better to use something other than INT 3 since this can badly confuse
debuggers.)
2. Get the starting address of the CreatePalette routine. There are at least
two ways to do this. One is to call GetProcAddress. The other is the way we'll
do it here: Set a variable equal to the value of CreatePalette. The second way
works well but forces us to hardcode the CreatePalette name.
3. Stick an INT 3 at the beginning of CreatePalette. (This requires creating
additional selectors capable of writing to the same segment that holds the
read-only executable code for CreatePalette.)
The trap handler would copy back the original beginning of CreatePalette, call
CreatePalette, and process the results. In the program accompanying this
article, the "process the results" step is nothing more than a call to
MessageBox displaying the result, which happens to be the palette handle. The code for
this, in TRAPDLL.C and TRAPDLL.H, is available electronically.
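The patching in step 3 amounts to saving the first byte of CreatePalette and writing the INT 3 opcode (0xCC) over it; the trap handler restores the saved byte before chaining to the real routine. Here's a minimal sketch of that bookkeeping, shown against an ordinary writable buffer (under Windows, the write would have to go through the writable alias selector described in step 3):

```c
#define INT3_OPCODE 0xCC

/* Save the byte at the patch site and replace it with INT 3.
   Returns the original byte so it can be restored later. */
unsigned char PlaceBreakpoint(unsigned char *code)
{
    unsigned char saved = *code;
    *code = INT3_OPCODE;
    return saved;
}

/* Undo the patch before calling the real routine. */
void RemoveBreakpoint(unsigned char *code, unsigned char saved)
{
    *code = saved;
}
```

The same save/restore pair is what lets the trap handler call the genuine CreatePalette and then re-arm the breakpoint for the next caller.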
So what good is this? Instead of displaying the palette handle in a message
box, the trap handler could be modified to add it to a list. It could also use
GetCurrentTask to find who currently has the processor, and if it's the
necessary module, to call SetPaletteEntry to change the palette. Or, it could
instead be modified to change the calling program's LOGPALETTE structure
before it calls the real CreatePalette.


Conclusion


One really nice thing about IOPal is that it can perform all the needed work
successfully without doing anything undocumented. The palette routines and the
ToolHelp routines are all fully documented. (For completeness, IOPal provides
commands for dumping the undocumented structures.) Using only documented
features is good, because there's a better chance of upward compatibility than
with undocumented structures. Perhaps we can hope, however, that this
documented interface will at least work with Windows 4.0 (Chicago) when it's
available.
Throughout our digging, we realized how little was actually written on the
subject of color palettes. Even Petzold hardly touched on it in his big book.
But an excellent source of information is Ron Gery's article "The Palette
Manager: How and Why," found both in library #9 (GDI) of the WINSDK forum on
CompuServe and on the Microsoft Developer Network (MSDN) CD-ROM. To be
quite frank, Gery's article is written like a college textbook--that is, some
paragraphs I had to read over a few times before I figured out what he was
trying to say. But the information is by far the most complete I've seen in
the way of so-called "documented" palette functions.
During the project, we couldn't help but wonder if Microsoft had made the
right decisions in the way the palettes are managed. It would be nice if a
future version of Windows offered more sophisticated methods of working around
the 256-color limit on most cards without forcing the user to buy additional
hardware beyond a standard 1-Mbyte graphics card. Some machines allow the
programmer to set interrupts to occur at the end of each horizontal scan,
which would reset the color registers, thereby placing the color-limit on a
per-line, not per-screen basis. In fact, the Amiga does this, although through
custom hardware. This might even be a possibility for future graphics cards.


[TABLE 5]

_UNDOCUMENTED CORNER_
edited by Andrew Schulman
written by Jeffrey M. Cogswell



Figure 1:

(a)

Palette:
 ilPaletteHead Offs: 00
 ilphPal Offs: 0C
 ilpUseCount Offs: 0E
 ilphLDevice Offs: 10
 ilpFlags Offs: 12
 ilphMetaList Offs: 14
Colors:
 peForeIndex Offs: 00
 peCurIndex Offs: 02
 pePrevIndex Offs: 04
 peColor Offs: 06
PalGlobal:
 phNumEntries Offs: 00
 phCurRealTime Offs: 02
 phColors Offs: 04

(b)

Region:
 rgnHead Offs: 00
 rgnSize Offs: 0C
 rgnSCnt Offs: 0E
 rgnMaxScan Offs: 10
 rgnBBox Offs: 12
 rgnScnList Offs: 1A

Scans:
 scnPntCnt Offs: 00
 scnPntTop Offs: 02
 scnPntBottom Offs: 04
 scnPntsX Offs: 06
 scnPtCntToo Offs: 0A


Figure 2

typedef struct tagPALETTEENTRY {
 BYTE peRed;
 BYTE peGreen;
 BYTE peBlue;
 BYTE peFlags;
 } PALETTEENTRY;
typedef struct tagLOGPALETTE {
 WORD palVersion;
 WORD palNumEntries;
 PALETTEENTRY palPalEntry[1];
 } LOGPALETTE, FAR * LPLOGPALETTE;


Figure 3

(a)

typedef struct tagGDIOBJHDR {
 HANDLE hNext;
 WORD wMagic;
 DWORD dwCount;
 WORD wMetaList;
#ifdef DEBUG_31
 WORD wSelCount;
 HANDLE hOwner;
#endif
 } GDIOBJHDR, FAR *LPGDIOBJHDR;

(b)

#include "toolhelp.h"
// ...
SYSTEMHEAPINFO shi;
LOCALENTRY le;
HANDLE hGDI;
BOOL ok;
shi.dwSize = sizeof(shi);
le.dwSize = sizeof(le);  // ToolHelp requires dwSize to be set first
SystemHeapInfo(&shi);
hGDI = shi.hGDISegment;
for (ok = LocalFirst(&le, hGDI); ok; ok = LocalNext(&le))
 if (le.wType == LT_GDI_PALETTE)
  printf("%04x\n", le.hHandle);




Figure 4

//structures within global heap
typedef struct tagCOLORS {
 WORD peForeIndex; //Apparently used for foreground palettes.

 WORD peCurIndex; //Index into system palette. This
 //number tells us where this particular
 //RGB value is mapped into the system palette.
 WORD pePrevIndex; //Previous index. Exact use unclear.
 PALETTEENTRY peColor; //RGB and flags -- see Figure 2
} COLORS;
typedef struct tagPALGLOBAL{
 WORD phNumEntries; //number of entries
 WORD phCurRealTime; //number of times realized
 COLORS phColors[1]; //actual RGB values
} PALGLOBAL;
//structure within GDI's local heap
typedef struct tagNewPALETTEOBJ {
 GDIOBJHDR ilPaletteHead; //see Figure 3(a)
 HANDLE ilphPal; //handle pointing to global structure
 WORD ilphUseCount; //Number of times currently selected
 HANDLE ilphLDevice; //copied in from DC when selected
 WORD ilpFlags; //Used internally to RealizePalette ???
 WORD ilphMetaList; //Apparently used in metafiles
} NewPALETTEOBJ;




Example 1:


hLocal = LocalAlloc(LPTR, sizeof(LOGPALETTE) + NUMCOLORS *
sizeof(PALETTEENTRY));
LogPal = (NPLOGPALETTE) LocalLock(hLocal);



Example 2:

(a)

case WM_PAINT:
 hDC = BeginPaint(hWnd, &PtStr);
 hOldPal = SelectPalette(hDC, hPal, FALSE);
 mapped = RealizePalette(hDC);


(b)

SelectPalette(hDC, hOldPal, FALSE);
EndPaint(hWnd, &PtStr);



Example 3:

shi.dwSize = sizeof(shi);
SystemHeapInfo(&shi);
GDIHeap = shi.hGDISegment;
GDIHeap &= 0xfffc;
GDIHeap |= 1;


May, 1993
PROGRAMMER'S BOOKSHELF


Hard Driving in the Fast Lane




Al Stevens


Billionaire.
Gets your attention, doesn't it? We are irresistibly drawn to the story of a
billionaire. It doesn't matter who it is. It could be Ross Perot, Aristotle
Onassis, Howard Hughes, Donald Trump (a little while ago), or the Queen of
England; we just want to get a little closer to whatever it took for them to
get where they are, maybe to imagine that we could do it, too. Well, except
for the Queen, perhaps. Anyway, how many other billionaires do you know about?
Of them, how many would you care to spend much time with? Other than for the
availability of what Bill Gates calls "infinite money," how much of their
lives and personalities would you like to assimilate? Are they nice people?
Would you be like them to have what they have? If Bill Gates came to your
house to take your daughter to the prom, would you let him in the door? How
about if you didn't know who he was?
An analysis of the self-made billionaire usually reveals character and
personality disorders that most people do not have, do not respect, and do not
suffer. And you'll find them all in Hard Drive, a book written by two Seattle
Post-Intelligencer reporters who started out to do a series of articles about
the city's most famous citizen. To become a billionaire, you must understand
negotiation and have a consuming desire to win. You should have little or no
care or concern for anyone who would impede your progress. Oddly, you should
care little about money itself because you have to be willing to risk it
without concern about the consequences. Winning the deal is important. Dashing
your competitors. Being the best. Being the only one. The money is only a
badge of victory. There won't be time to enjoy it. You'll be making the next
deal and slaying the next competitor. Take everyone to the cleaners, if not
your clothes. Most of us don't know why that's so much fun, and so we will
never be billionaires. We don't have the right stuff.
There is a magnetic draw to the billionaire mystique, though. Get inside the
aura. What must it be like to own a $300,000 car that U.S. Customs won't allow
into the country because the manufacturer hasn't crash-tested three copies of
the model? And your partner has one just like it in the same impound lot.
What's it like to not really care all that much? Do the two of you get
together and go visit the cars and then go have a beer and laugh about it?
That world is as far removed from most of us as anything can get.
We do have something in common with Bill Gates, though, because we write code
and so does he, or at least he used to, and he was good at it. The book tells
us about it. Conjure up the typical image of a nerdy kid hacking out a Basic
interpreter, hand-coding it on yellow tablets, toggling it into the front
panel of a spit-and-baling-wire home-computer kit, selling a few copies,
making some deals, turning the venture into the world's biggest software
company, and becoming one of the world's richest men. Could have been you;
could have been me. I can write code that good; so can you. We aren't
reminded, however, that he already had a million bucks in a family trust fund
when he started. Forget that he has the incessant drive to put success ahead
of everything, including personal relationships, hygiene, and the professional
esteem of his colleagues. Never mind that he has the unique intelligence to
supplement that drive and turn it into success. It's just that he wrote a
program that many of us could have written, and now he is a billionaire. The
interesting part, though, is what happened in between.
Hard Drive is an unauthorized biography. Unauthorized works are free from the
personal bias that usually accompanies an authorized one. On the other hand,
they might lack some inside information that only the subject or his appointed
representatives could provide. Hard Drive seems to cover most of the Gates
story without missing much.
At first I worried about the book. Chapter 1 starts with Gates at age 11,
riding the elevator in the Space Needle, on his way to lunch with a teacher
and classmates, a reward for memorizing the Sermon on the Mount. "Blessed are
the poor...." The book next grandly purports to tell us about Bill Gates's
thoughts, which it tells us are some 3000 miles away at Cape Canaveral,
lifting off in a spaceship, thinking about Edgar Rice Burroughs. What? Is this
going to be that kind of book? How, thought I, could the authors of a
biography, proudly promoted as having been "undertaken without the help or
cooperation of Microsoft," know anything at all about what young Bill was
thinking, particularly some 26 years ago? They don't say how they know, and I
suppose they might have read a quote somewhere, but, fortunately for the
reader, that's the only obvious place where they veer off track and take
license with journalistic integrity. The balance of the book draws mainly from
interviews and published accounts of Gates and his company.
And it's an intriguing and well-told story. Amidst all the anecdotes about
fast cars, traffic tickets, hamburgers, and ruthless deal-making weaves the
story of how Microsoft, with some lucky breaks and a lot of sheer energy,
advanced from a couple of programmers and one program to become first the
principal microcomputer language company and then the purveyors of DOS,
applications, and Windows. The book follows the IBM lashup, and much later,
the breakup; the announcement of Windows and its interminable time to delivery
and the introduction of the term "vaporware" into our language; the
look-and-feel lawsuits; going public; hiring and firing presidents; the FTC
probe.
The story of our youngest billionaire is salted with accounts of coattail
riders. If you had gotten on board in the early days, maybe now you would be
one of the millionaire coders who rode along and cashed in on the stock
options when the company went public and did well. Maybe you would have. Not
me. I wouldn't have lasted long and many didn't. The book is filled with
stories about Gates's tantrums and tirades directed at subordinates who were
not delivering to his standards. He called them stupid, idiots, and worse.
Some of them took it, stayed the course, and got rich. I take comfort in the
knowledge that I did not miss out on anything. I could never have been among
the reams of paper plutocrats who weathered the lean years to reap the gravy,
because I would have punctuated the first such diatribe aimed at me with the
old Stevens one-two-three: Verb, followed by pronoun, followed by departure.
Just not cut out for glory, I guess.
One story I like is about Gates's treatment of a lady friend who was president
of a competing company. In a social one-on-one situation she mentioned that
she had sold a significant quantity of her product to Apple, a transaction
that was in direct competition with Microsoft. Gates began machine-gun firing
questions at her and furiously taking notes about the details of the
transaction--quantities, people, dates. Later, at dinner she asked why he
wanted to know all that. So he could kill the deal, came the answer. He was
going to call Apple and put some pressure on. Microsoft comes first, and she
should never tell him anything he could use against her. I don't think Bill
got lucky that night.
Even though Hard Drive chronicles the meteoric rise of the largest software
company in the world and one of the biggest of any kind of company, the book
will probably not become a textbook in any prestigious business schools,
because you can't use it to teach successful management. It is about Gates,
which is about Microsoft, which is about all that energy, drive, and success.
To duplicate Microsoft, you'd need to create another Gates, and that's not
something you can teach. At least I hope not. But the book is pure fun to
read, particularly if you were in this business through the '70s and '80s when
it all happened. Just think. If I had grabbed that January 1975 issue of
Popular Electronics, gotten an 8080 manual, hand-coded a Basic interpreter,
run down to Albuquerque, and toggled it in....


May, 1993
A VISUAL APPROACH TO DATA ACQUISITION


Seeing data in real time




James F. Farley and Peter D. Varhol


James is a project manager at Armtec Industries in Manchester, New Hampshire.
Peter is an assistant professor of computer science and mathematics at Rivier
College in Nashua, New Hampshire.


Data-acquisition systems are built by all kinds of companies, especially those
in the sensor industry. Some of these systems are destined for more or less
permanent use, long-term experiments, or as permanent parts of continuing
systems. Most, however, are short-lived, very specific, and designed for
whatever acquisition boards and/or instruments happen to be on hand.
This can result in a lot of code being written, largely by people not trained
as programmers, since different data is needed each time around. One common
solution is to write the code in Basic, because it's often the first (and
sometimes only) programming language learned by many engineers. These routines
can do the job, but they're sometimes hard to maintain and, more often than
not, wordy. Recently, some data-acquisition systems have begun to include C
routines, forcing engineers to begin coming to grips with this language. This
approach is somewhat better, in that it leads to the development of general,
reusable procedures. (We develop data-acquisition systems using a Forth-based
language, which has a lot of canned routines for rapid prototyping.)
These approaches to data acquisition might be termed "quick-and-dirty," in
that they're often developed using brute-force techniques in very little time.
As most engineers know, however, this approach can result in the most
time-consuming and error-prone systems imaginable. Further, such systems don't
provide easy access to all of your data. Many engineers will write data out to
a text file, and then load the text file into a spreadsheet for analysis.
This, too, is time consuming, and can result in analyses not being
performed--and in opportunities lost by not seeing the data as it is produced.
If you're used to any one of these methods, using LabView for Windows for the
first time is a revelation. With LabView, there's no code to write;
applications are written by manipulating icons in a window. LabView is based
on a graphical programming language National Instruments calls "G." For
engineers used to controlling instruments with code, this approach takes some
getting used to. After some acclimation, however, both permanent and
quick-and-dirty applications are easy to build.


The Application at Hand


Our goal was to build a prototype data-acquisition system to collect data on
infrared sensors. These IR sensors are to be used in an optical fire-detection
system that uses infrared and ultraviolet radiation to determine if a fire
exists, then sound an alarm if it does--all in less than a second. In our
project, four IR sensors were attached to amplifiers and shown fires (produced
primarily with gasoline and jet fuel) of various sizes and distances from the
sensors.
The data we collected helps determine the suitability of the IR sensor as well
as helping us make hardware decisions on the amplifier circuit attached to the
sensor. The amplifier is a necessary part of the design because the sensor
puts out microvolt signals, which must be increased to be useful to the rest
of the system. The data will also be used to help generate the embedded
algorithm since the fire detector will be microprocessor controlled.
Our data-acquisition hardware was a National Instruments AT-MIO-16F-5 board.
This board has many features, including digital I/O, analog-to-digital
converters, counter/timers, and digital-to-analog converters. In our project,
we used only the analog-to-digital converters, since we were inputting analog
signals and converting them to digital for display. Our test machine was a
486/33 PC with 8 Mbytes of RAM running Windows 3.1.
The data is displayed on a multiplot graph--one plot for each channel. We have
to give the user the ability to change both the number of samples per second
as well as the total number of samples recorded. The graphs will have a range
from 0 to 5 volts DC, corresponding to the output of the IR sensors, as
amplified. In addition, our application includes simple statistical functions
(mean, minimum, and maximum signals, for example) as well as the time to the
highest signal peak.
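None of these statistics requires anything elaborate once the samples are in memory. A sketch in plain C (the ChannelStats layout and ComputeStats name are ours, purely illustrative; time-to-peak is the returned peak index divided by the sample rate):

```c
#include <stddef.h>

/* Summary statistics for one channel's block of samples. */
typedef struct {
    double mean, min, max;
    size_t peak_index;  /* index of the highest sample */
} ChannelStats;

/* Compute mean, min, max, and the position of the highest peak.
   Assumes n >= 1. */
ChannelStats ComputeStats(const double *samples, size_t n)
{
    ChannelStats s = { 0.0, samples[0], samples[0], 0 };
    double sum = 0.0;
    size_t i;
    for (i = 0; i < n; i++) {
        sum += samples[i];
        if (samples[i] < s.min) s.min = samples[i];
        if (samples[i] > s.max) { s.max = samples[i]; s.peak_index = i; }
    }
    s.mean = sum / (double)n;
    return s;
}
```

With the samples-per-second setting exposed as a front-panel control, time-to-peak in seconds is simply peak_index divided by that rate.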


Working with LabView


LabView for Windows is a graphical programming system for data acquisition and
analysis. Working with a graphical programming language is very different from
traditional programming languages, in that developing an application is closer
to how an engineer thinks about the problem, rather than how the process
itself works. Further, design and development turn out to be the same process.
Once the icons representing the desired processes have been selected and
properly connected, you've completed building the application.
The downside is that programming in this language requires strict adherence to
the inputs of the icons. Each icon is required to have exactly the right
inputs in exactly the right spot on the icon. Fortunately, these inputs are
detailed not only in the manuals, but are available on screen and can be
displayed on-the-fly. Even well into the application, we often obtained the
"bad connector" indication simply because the output from one process didn't
match the input of another. LabView performs limited type coercions, but has
several complex structures (we're still not sure what a "cluster" is) that do
not lend themselves well to automatic coercion.
LabView applications are split into two screens. The first is the panel
screen, which shows the user interface, controls, and displays. The other is
the diagram screen, in which each item on the panel is displayed as an icon on
the diagram window. Function icons are also added to the diagram window and
wired to the control and indicator icons to build the application. These wires
are color coded on the screen to indicate what kind of data is being sent from
one icon to the other. (By all means, use a color monitor with LabView.)
Function icons can be simple routines or full blown applications.


Where to Begin


When writing on-the-fly acquisition applications, you usually start with the
data-acquisition engine. After using LabView for a while, however, we came to
the opposite conclusion--it's best to work in the other direction.
In other words, LabView invites you to create the user interface first,
starting with a blank window that prompts you to determine what data should be
displayed and how to display it.
This is an important distinction between LabView and traditional
data-acquisition approaches. Engineers normally don't expect to see data as it
is being produced, and therefore don't give a lot of thought as to how best to
view and analyze the data until the very end. In LabView, the data display and
analysis is the most important part of the application, and the details of
hardware interface are largely hidden.
Our first step, then, was to concentrate on displaying the data on the screen.
In traditional data-acquisition applications, you typically display some
little piece of the data in text form on the screen to ensure that the machine
has not locked up. All the rest of the data is dumped to a text file on disk
to be reviewed later using some commercial software package. The reason for
this approach is simple: Why spend the time writing complicated graphing and
analysis routines when a spreadsheet or statistical package will work just as
well? Granted, with this approach you miss the opportunity to analyze the data
in real time, and perhaps adjust the hardware while the data is still flowing,
but writing graphics drivers is too much trouble.
Not with a LabView application, however. When you use LabView, you'll probably
start with the data display and acquisition controls, which consist of a rich
assortment of knobs, switches, slides, buttons, text strings, and graphs that
can be used as controls or as indicators. Once you've picked how you want to
control your system and display your data, you can start working on the
engine.
Figure 1 shows our application's user interface. We used knobs and switches to
control acquisition through the hardware, and graphs and digital indicators to
display data. We confess that, while we started using LabView by playing with
graphical user interface objects, we then experimented with the hardware
interface before returning to complete the user interface. This was due
largely to our inexperience with the product, however. In future development
efforts, we'll probably complete the user interface before working on the
application internals.
From this point, the rest of the application is anticlimactic. Using the
graphic description language, we diagrammed primitive routines to control the
instruments, as illustrated in Figure 2. Although LabView supplies rudimentary
controls for the data-acquisition board, we found that a higher-level block included
with the software and identified as "AI Waveforms" works just fine for our
application.
Figure 3 shows a diagram of the internals of this icon block. Its canned
routines have the error handlers built right in as well as initialization and
set-up features. Using this block to shield us from the hardware, we merely
built our own custom diagrams on top of them. Then it was a simple matter to
have the LabView engine call those routines to collect the data, keeping in
mind how you would like to handle direct memory access, initialization,
errors, and the like.


Strengths and Limitations of the LabView Approach


One of the most valuable benefits of LabView is that it lends itself to
on-the-fly reconfiguration. For example, let's say that after observing a
process for a while, you want to calculate the moving average of the data and
plot it next to the actual read-out. This change can be made in the space of
minutes with LabView, and the data-collection process can resume with the new
statistic within a short period of time. Using the traditional approach,
modifying the code could result in delays of hours or even days, and may
require the services of a programmer.
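A trailing moving average of this sort is only a few lines of C when supplied as an external routine (MovingAverage is our illustrative name, not a LabView primitive):

```c
#include <stddef.h>

/* Trailing moving average: out[i] is the mean of the last `window`
   samples ending at i (fewer samples at the start of the run). */
void MovingAverage(const double *in, double *out, size_t n, size_t window)
{
    double sum = 0.0;
    size_t i;
    for (i = 0; i < n; i++) {
        sum += in[i];
        if (i >= window)
            sum -= in[i - window];  /* drop the sample leaving the window */
        out[i] = sum / (double)(i < window ? i + 1 : window);
    }
}
```

Keeping a running sum rather than re-summing the window on every sample keeps the cost per point constant, which matters when the plot must keep up with the acquisition rate.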
LabView also lets you observe and analyze data over long periods of time,
since the data need not be saved after it is viewed. Collecting it in a text
file and analyzing it in a spreadsheet, on the other hand, means that you
either take widely spaced samples, or view a short time period. There is no
such limitation with LabView.
We won't whitewash the process of building our first application. It took over
a month of part-time tinkering, experimenting, and redesigning before we
completed a working data-acquisition application. However, much of this time
can be attributed to the learning curve inherent in a complex application,
false starts, and the testing of different approaches. After almost completing
the application, we decided to redesign the user interface from scratch. At
that point, we knew LabView well enough to accomplish this task in a few
hours. With a little experience under our belts, we are confident that we can
produce effective and highly visual data-acquisition applications using
LabView in a few days.



Finishing off the Prototype


LabView lets you compile the completed application into a stand-alone
executable, enabling you to use it to perform permanent data-acquisition
tasks. It is also possible to incorporate your own external C routines into an
application. These features suggest many exciting possibilities to us.
For example, we can add some intelligent decision-making capacity as external
code, perhaps using the neural-net approach we presented in "Neural Nets for
Predicting Behavior" (DDJ, February 1993). This neural-network development
tool generates C code from a trained network, making it possible to feed our
data into the network and create external code that can recognize patterns and
make decisions based on real-time data.
Combining LabView's flexible and highly graphical approach to data acquisition
and monitoring with a decision-making neural network could produce a PC-based
system to acquire and display data in real time, then make process-control
decisions. This, in fact, is our next step in the process. With the proper
feedback mechanisms, it is possible to use a LabView application as part of a
completely automated process.
Another enhancement is to incorporate the LabView application with a
touchscreen, so that it can be controlled from a factory floor, without the
use of a keyboard. To this end, we included as a part of our application the
Touchmate from Visage, which uses strain-gauge technology to turn any monitor
into a touchscreen. The monitor simply sits on the flat unit and translates
the force of a touch into a Windows mouse movement or click. LabView's
controls are large and simple enough so that a properly designed application
can be controlled entirely by touch.
Data is the lifeblood of scientific computing. When that data must be
displayed and analyzed in real time, LabView is a superb tool for rapidly
prototyping a graphical data-acquisition application, perhaps cutting weeks
off of the R&D process. It can also go beyond that use, to help create a
complete process-control system that involves real-time data acquisition and
analysis. Either way, sophisticated software tools such as LabView make the
life of an engineer much easier than it used to be.






















































May, 1993
OF INTEREST





Nu-Mega has released Bounds-Checker for Windows, a debugging tool that
combines a heap checker, parameter validation more extensive than that of the
Windows debug kernel, and an interactive postmortem analyzer in a single
package. DDJ saw this product demonstrated at the
Software Development '93 conference and was impressed by the way it detects
memory corruption, null-pointer memory referencing, string overruns, and many
kinds of improper Windows API usage. Bounds-Checker for Windows begins where
Bounds-Checker for DOS leaves off, but has additional features for interactive
analysis and incorporates knowledge about Windows entities such as GDI
objects, resources, and heaps.
The mode of operation is straightforward: You compile with the compiler's
debug option on--no libraries or preprocessors are required. Then, run your
program under control of Bounds-Checker for Windows, which traps faults and
can detect certain problems even before they generate a fault. As each
transgression is encountered, a dialog will pop up and provide you with the
opportunity to view a variety of informational windows (source code, call stack,
memory, and so on) to get to the root of the problem. After your program has
finished executing, Bounds-Checker provides a summary of its stack and heap
usage, and lists the Windows resources that have not been freed. Reader
service no. 20.
Nu-Mega Technologies P.O. Box 7780 Nashua, NH 03060 603-889-2386
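As a hypothetical illustration (not taken from Nu-Mega's materials), the class of bug such a run-time checker traps might look like this; the commented strcpy is the transgression it would flag, and the bounded copy below is the fix.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Illustrative only: a heap overrun of the kind a run-time checker such
// as Bounds-Checker detects, alongside the bounded alternative.
char *make_name_buffer(const char *name) {
    char *buf = static_cast<char *>(std::malloc(8));
    // std::strcpy(buf, name);   // heap overrun whenever name > 7 chars
    std::strncpy(buf, name, 7);  // bounded copy stays inside the block
    buf[7] = '\0';
    return buf;  // if the caller never frees this, a checker can report
}                // it as a leak when the program exits
```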
Inmark Development has released its zApp application framework for OS/2 2.x.
zApp for OS/2 provides the same layer of abstraction over the OS/2
Presentation Manager API that is provided by versions of zApp for the Windows
3.1, Windows NT, DOS graphics mode, and DOS text mode environments.
The zApp application framework consists of 200 classes covering dialogs,
windows, controls, menus, fonts, graphics, bitmaps, and MDI windows, as well
as printing, memory management, object persistence, and data-entry forms.
Applications written with zApp are portable
in source-code form to the platforms mentioned earlier.
Inmark claims that zApp applications are about 50-80 percent smaller than
equivalent programs written to the native windowing API. zApp for OS/2 comes
with over 1200 pages of documentation and 60 sample programs, and lists for
$695.00. Reader service no. 21.
Inmark Development Corp. 2065 Landings Drive Mountain View, CA 94043
415-691-9000
Franz is now shipping Allegro CL\PC, a Common Lisp environment for the
Microsoft Windows platform. Allegro CL\PC is an object-oriented development
environment based on the Common Lisp Object System (CLOS). It includes an
incremental compiler, programmable integrated editor, structure editor,
inspector, debugger, tracer, stepper, and time profiler.
Programs written in the Common Lisp language are portable to platforms such as
UNIX and Macintosh. Applications for which Common Lisp is particularly
appropriate include knowledge-based systems, scheduling and process control,
simulation and modeling, computer-aided design and--with the help of the
Common Lisp Interface Manager (CLIM) standard--portable GUI-oriented
applications.
Allegro CL\PC integrates the object-oriented and functional programming
paradigms, and also provides an interface to procedural languages such as C
and Fortran via a Windows DLL. Since Common Lisp is a late-binding language,
an application does not have to be complete before it can be executed, in
contrast to other languages such as C++, in which a program must be complete
before it can be compiled. Developers can thus write and debug program
components incrementally without having to restart the program each time.
Allegro CL\PC's Garbage Collector manages memory allocation automatically,
reducing memory leaks and design errors.
Allegro CL\PC requires a 386 processor, 4 Mbytes of RAM and 8 Mbytes of disk
space and sells for $595.00. Reader service no. 22.
Franz Inc. 1995 University Avenue Berkeley, CA 94704 510-548-3600
Phar Lap has released Version 5.0 of its 386DOS-Extender, which allows
developers to use Microsoft's Windows NT 32-bit C/C++ compiler for developing
32-bit extended DOS applications. The 386DOS-Extender turns DOS into a 32-bit
operating environment with a flat, 4-gigabyte address space. It allows 32-bit
applications to access the compiler's 16-bit DOS run-time libraries, including
graphics. All of the standard 16-bit calls will work from 32-bit protected
mode. The 386DOS-Extender supports the XMS, VCPI, and DPMI specifications.
Included in the 386DOS-Extender SDK is 386SRCBug, a 32-bit source-code
debugger for protected-mode programs. Commercial applications that have been
built with 386DOS-Extender include Microsoft's FoxPro 2.5 for MS-DOS,
Autodesk's AutoCAD 386, and the compilers in Microsoft's Visual C++ and
Fortran PowerStation 1.0 packages.
Phar Lap also announced that its 16-bit DOS extender, 286DOS-Extender Lite, is
included in Microsoft's Visual C++ package. This allows users of Visual C++ to
write C and C++ protected-mode programs that can access up to 2 Mbytes of
memory, and run under DOS, DESQview and Windows. Users who would like CodeView
support and need to access more than 2 Mbytes of memory (up to 16 Mbytes) can
purchase Phar Lap's full 286DOS-Extender SDK, which lists for $495.00. Also
available from Phar Lap is the 286DOS-Extender Run-Time Kit (RTK), which
provides unlimited redistribution rights for compiled programs, at a price of
$995.00. Reader service no. 23.
Phar Lap Software 60 Aberdeen Avenue Cambridge, MA 02138 617-661-1510.
The Cyberspace Development Kit (CDK), a set of C++ class libraries for 3-D
visualization and simulation, has been released by Autodesk's multimedia
division. The CDK is designed to help programmers to build PC-based 3-D
applications that users can interact with in real time--virtual reality, in
other words.
The CDK addresses development in the areas of 3-D geometry and physical
phenomena (mass, density, gravity, friction, and so forth). The CDK sports an
open interface that provides an extensible architecture for transparent access
to a variety of input and output devices.
The CDK requires a 386/486 PC with 8 Mbytes of RAM and an 80-Mbyte hard disk,
VGA, and MS-DOS 3.1. It can be used with either the Zortech 386 C++ 3.0
or MetaWare High C/C++ 3.0 compilers, in addition to the Phar Lap linker and
386DOS-Extender SDK.
The Cyberspace Development Kit sells for $2495.00. Reader service no. 24.
Autodesk Inc. 2320 Marinship Way Sausalito, CA 94965 800-879-4233
Watcom has released Watcom C/C++ 32 Version 9.5, a multiplatform, optimizing
32-bit C/C++ compiler that supports DOS, Windows, OS/2, and other platforms.
The toolset includes a 32-bit debugger, plus profiler, linker, make, and other
tools such as DOS/4GW, a royalty-free 32-bit DOS extender developed by
Rational Systems.
Development can be hosted on DOS, OS/2 2.x, or Windows NT. The following
32-bit targets are supported: 32-bit DOS (with Rational, Phar Lap, or Ergo DOS
extenders), Windows 3.x (using the Watcom 32-bit Windows supervisor), OS/2
2.x, Windows NT, Win32s, and AutoCAD ADS/ADI.
The package provides support for creating true 32-bit Windows 3.x applications
and DLLs. Support components include a 32-bit Windows API library, a 32-bit
C/C++ library, and Watcom's Supervisor module, which enables execution of
32-bit flat-model Windows apps. These programs can be debugged with Watcom's Video
debugger. Watcom has licensed the necessary components from Microsoft's SDK so
that no additional tools are needed for creating Windows programs.
Watcom's C++ compiler supports AT&T C++ 3.0, including templates and
exceptions. The optimizing code generator now supports the Intel 486 and
Pentium processors. Its new superscalar optimization strategy uses
"riscification" and instruction scheduling to speed up both integer and
floating-point computation. Watcom's linker also supports certain C++
optimizations that are only possible at link time, such as elimination of
unreferenced virtual functions.
Watcom C/C++ 32 lists for $599.00. Upgrades from Version 9.0 of Watcom C/386
are available directly from Watcom for $199.00. Reader service no. 25.
Watcom 415 Phillip Street Waterloo, Ontario Canada N2L 3X2 519-886-3700



























May, 1993
SWAINE'S FLAMES


The Enterprise Sets Forth




Michael Swaine


One of the obligations of wisdom is to give advice. My advice this month is,
don't keep switching back and forth between the Presidential press conference
and that "Star Trek" rerun while eating anchovy pizza unless you want to have
dreams like this one.
It's odd, but none of us in the crew seems to remember much about the previous
captain; only that he was shot down on his first voyage and that he once threw
up on Sulu.
This new captain is certainly making an impression, mostly through words. He
loves the sound of his own voice. Whether he's briefing the crew from the
bridge or explaining to a female crew member how to get to his quarters, he
always works in a lecture on shared sacrifice and his awesome responsibility
as captain of the Enterprise. I don't mean to suggest that he's a stuffed
shirt, though. This captain is a man of the people, and a man of action, too.
I saw him jogging in the corridor again this morning.
Today the captain and the science officer, Mr. Spock, have come down to the
engine room to explain the new Federation technology policy. This is the first
chance many of us have had to see the two of them together up close, and the
contrast is interesting: the sandy-haired captain glad-handing everyone and
occasionally breaking out into a passionate speech; and the dark-haired
Vulcan, stiff, precise, and entirely devoid of any human emotion. Some of us
down here in Dilithium Gulch--that's what we call the engine room--thought
that Mr. Spock might be the next captain, but that didn't happen. We liked the
idea of a scientifically savvy captain, of course, but there were some who had
reservations about his green blood.
Mr. Spock presented the details of the technology policy. Two main features
were technology extension centers to bring our high Federation technology to
the backwaters of the galaxy, and an information highway to let schools and
hospitals benefit from the kind of subspace communication system that we in
Star Fleet enjoy.
"Speaking of technology," the captain interrupted, "I want to say something
about the Enterprise's technology. When I first took command of the bridge,
the nerve center of the Enterprise, I was shocked at the primitive
communication system. When a call would come in from Star Fleet Command, Lt.
Uhura would have to pick up a plug and put it in a little hole. It was like
Johnson's communication system." This was apparently a reference to some
20th-century politician; the captain likes to show off his knowledge of
ancient history. "I'm replacing the entire bridge communication system," he
went on. "I am responsible for the life of every man and woman on this ship,
and I must have state-of-the-art communications."
Mr. Scott stood up. "But captain," he said, "We canna afford a new
communication system. The ship's budget canna take it. You know the Federation
has a terrible deficit."
"Damn the deficit, Mister. The Romulans had a deficit as big as ours a few
years ago, and they embarked on a ten-year mission to bring their budget into
balance, and in the process increased investment, lowered unemployment, and
increased growth, all at the same time. Are you saying we're not as smart as
the Romulans?" Mr. Scott didn't answer.
And now we're off on this four-year mission, boldly going where no man has
gone before, as the captain likes to remind us. Wish us luck.








































June, 1993
EDITORIAL


Sometimes You Can Trade Secrets, Sometimes You Can't




Jonathan Erickson


What started out last fall as a story one brick shy of being interesting has a
chance of turning into a landmark court decision. The plot was simple:
high-energy, high-profile boss at company A bolts for the bright lights, big
bucks, and BMWs of rival company B. While Silicon Valley personnel moves are,
on the whole, pretty boring stuff (unless, of course, you're the one on the
move), everyone still got a chuckle when Gene Wang bolted from Borland to
Symantec--everyone with the possible exception of Philippe Kahn, that is. What
goes around comes around, those with good memories said, remembering the
brouhaha over Rob Dickerson's leap from Microsoft to Borland a couple of years
back.
After a few days though, the story showed signs of life with the news that a
posse including the Santa Cruz County district attorney and FBI was poking
around, search warrants in hand. Among the tidbits they reportedly found was a
passel of MCI electronic-mail messages from the still-Borland-employee Wang to
Symantec CEO Gordon Eubanks detailing what Borland claimed were some of its
innermost secrets. Almost immediately, Borland filed a lawsuit against
Symantec. More recently, a Santa Cruz County grand jury indicted Wang on 21
counts of violating trade-secret laws, and Eubanks on 11 counts of receiving
stolen property and conspiracy to misuse Borland trade secrets.
While it's high time for the courts to tackle one aspect of the case that's in
the spotlight--that of privacy and electronic communication--the subject may
never come up. Instead, the judge and jury will first examine the murky waters
of trade secrets. If privacy raises its head at all, it'll be as a means of
appeal if Eubanks and Wang are found guilty.
The trade secrets issue will be a tough enough nut to crack and, if the
district attorney prevails against Symantec, the definition of a trade secret
will be rewritten to some degree. Although the details are sealed as of this
writing, affidavits filed by Borland indicate that data Wang passed on
included product design specs and features, sales stats, and so
forth--information the district attorney says is worth "hundreds of thousands
of dollars." Under California criminal law, however, a trade secret is limited
to information that's scientific and technical in nature. Civil statutes, on
the other hand, more broadly define trade secrets to sometimes include
marketing information. Eubanks seems to have acknowledged the distinction
between civil and criminal definitions of trade secrets when he was recently
quoted as saying, "there are no criminal trade secrets here."
If the court rules in Symantec's favor, you won't see any sweeping
redefinition of trade secrets, and electronic privacy issues will remain
muddied. While the 1986 Electronic Communication Privacy Act covers e-mail
privacy across public networks, it doesn't address in-house corporate
situations like the Borland/Wang case. In this instance, Borland presumably
provided Wang with the MCI mail account, gave him a password, and paid his
e-mail bill, just as with his telephone. Most companies accept that employees
will attend to some personal business during the workday, whether at the water
cooler or over the phone, fax, or e-mail. The question, then, centers on who
owns the conversations or messages electronically transmitted. Certainly the
employer must have access to company-related information in the employee's
files (electronic or paper) that are important to the smooth operation of the
business. But at the same time, employees should be able to assume an implied
sense of privacy in their work environment.
If the criminal charges against Eubanks and Wang are upheld, the court may
still get into the privacy thing. The issue then shifts to whether or not the
information Borland uncovered can be used against Wang, not whether or not the
company had the right to examine his files.
Someday we'll get some clear-cut legal guidelines concerning e-mail privacy
... someday, but probably not someday soon.


Information Access Update


Last month, I briefly mentioned the Library of Congress's petition to start
charging the general public for computer access to public information stored
at the library. Since then, the GPO Access bill has come to light. This
bipartisan proposal would require the Government Printing Office (GPO) to
provide some federal records online, charging for the cost of
distributing--but not collecting--the data.
That's the good news. The bad news is that the bill would also continue to let
executive branch departments negotiate exclusive deals with private companies
to distribute for profit publicly owned information collected at taxpayer
expense.
What's interesting is that GPO Access applies to only those departments under
the control of Congress. The departments exempt from the public-access
components of GPO Access--the Justice Department, Health and Human Services,
and the like--are under the umbrella of the Executive Branch. As part of their
high-tech policy, Clinton and Gore have committed to providing as much
information to as many people as possible, and open access to public
information is central to the Information Highway they're proposing. Looks
like another fish-or-cut-bait time for the Prez.

































June, 1993
LETTERS







Stereo Glasses


Dear DDJ,
My vote for best article of March 1993 goes to "Algorithms for Stereoscopic
Imaging" by Duvanenko and Robbins. After seeing a stereo demonstration of a
Tektronix display using a full-screen polarizer and passive-polarizing
glasses a few years ago, I searched for a cheap alternative using glasses with
liquid crystal shutters that could be used with a PC VGA display. I shortly
discovered that Sega of America Inc. made such glasses for its Genesis
video-game product, so I obtained a pair for about $45.00 at a local toy
store.
After determining that 12 volts DC was sufficient to make a lens opaque, I
built a converter to accept vertical sync or field signals sent to the display
monitor and alternate the blanking voltage sent to the glasses lenses. An
adapter cable tapped the VGA monitor sync signal. I used techniques similar to
those described in the article to load two display pages and switch them in
response to vertical interrupts. The method worked fine in noninterlaced modes
using a Video 7 1012i adapter card with a Seiko CM-1440 monitor, and with a
few interrupt-handler modifications, a Diamond Speedstar VGA adapter, and NEC
Multisync 4-D monitor. I used the same converter to drive the glasses from a
Silicon Graphics 4D80GT workstation equipped with a Genlock card, the card
providing a vertical sync output.
Eventually I realized the converter box and monitor cable tap could be
eliminated in the case of the PC, by utilizing a serial port RS-232
control-signal output. On my PC, the DTR signal level switches between about
+12 and -12 volts. This signal can be controlled by writing directly to the serial
port. A simple circuit (Figure 1) consisting of two diodes and two resistors
routes the +12 or -12 volt level to one lens and about 0 volts to the other,
unblanked lens. I built the circuit into the shell of a 25-pin connector along
with a mini stereo jack into which the glasses plug. The result is a simple,
easily removable interface which ensures that a given output signal polarity
always blanks the same lens.
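The register-level idea can be sketched roughly as follows; the I/O addresses are the standard PC COM1 assignments (an assumption about the target machine), and the actual port write is shown only as a comment since it is hardware specific.

```cpp
#include <cstdint>

// Standard AT-compatible COM1 mapping (assumed for the target PC).
const std::uint16_t COM1_BASE  = 0x3F8;
const std::uint16_t MCR_OFFSET = 4;    // Modem Control Register
const std::uint8_t  MCR_DTR    = 0x01; // bit 0 drives the DTR line

// Byte to write to the MCR to raise (about +12 V) or drop (about -12 V)
// the DTR line; all other modem-control bits are left clear.
std::uint8_t mcr_for_dtr(bool dtr_high) {
    return dtr_high ? MCR_DTR : std::uint8_t(0);
}

// On the DOS machine itself, each vertical-sync interrupt would then do
// something like:
//   outportb(COM1_BASE + MCR_OFFSET, mcr_for_dtr(left_eye_frame));
// alternating which lens the DTR voltage blanks.
```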
James R. Jones
Colorado Springs, Colorado


Another Curmudgeon Heard From


Dear DDJ,
After reading "A Curmudgery on Programming Language Trends" by Scott Guthery
(DDJ, December 1992), I feel I can add a few comments. I agree with Guthery
that OOP brings nothing fundamentally new. I also agree that C++ is not a very
good language. It's a large language with many semantic pitfalls. But I feel
Guthery has not pointed out the weak points of OOP in general:
1. OOP languages often generate slower code. A common misconception is that
with today's fast microcomputers and optimizing compilers, there is no need
for low-level languages. Al Stevens remarked in an accompanying text box that
throughout computer history there has been a trend to move further from
machines; from plugging cables towards using high-level languages. Still, I
use a lot of assembly language. Not because I like it, but because I need the
speed. Buying a faster machine is not a solution because the software
requirements grow too. I used to need assembler to speed up graphics programs
for the CGA on a PC/XT. Now I use assembler for Super-VGA on a 486/33 MHz.
Regrettably, I can't get below the assembler level (and change the
microprocessors' microcode). If I could, I would probably reprogram it. But
Abrash's book The Zen of Assembly Language (ISBN 0-673-38602-3) taught me that
even if you cannot access the processor's internals, knowledge about it can
make an enormous difference.
2. Systems do not always fit into hierarchies, so in OOP you often have to
make them fit by changing the functional specifications or by adding extra
support routines. I am referring to an example of OOP from Borland's Turbo
Pascal 5.5 manual. In this example you create a location class with (x,y)
coordinates, a point class derived from location and adding a color field, and
a circle class derived from point and adding a radius field. As an exercise I
tried to add a line class, and I asked a few of my colleagues to do the same.
From what class do we derive it? A circle has a midpoint and a radius, but a
line has two endpoints. How would you solve this problem? My solution was to
take a "vector" approach. We derive line from point and add two fields, x_disp
and y_disp, that give the relative position of the second endpoint. (The first
endpoint is inherited from point.) But note that I have conveniently changed
the functional specification of a line (two endpoints) to a vector (one point
and a displacement). In this example, there is not really a problem because it
is trivial to convert from line to vector, but this is not always the case.
3. OOP software is said to be reusable. OOP does have its merits here, but
there are other methods for reusing software components without using OOP.
Modularity is the key word. Creating abstract data types is a modular
technique that can be used with most non-OOP languages.
4. When you derive one class from another, you can add new methods to the
inherited ones, and you can modify (redefine) inherited methods. But you
cannot delete inherited methods or data from the class. Normally, there should
be no need to delete elements from a class. You just ignore them, and
optimizing linkers should strip off any unused code or data. Still, I'd like
some manual control.
5. Code and data are not the same in most compiled languages, so putting them
into one data structure gives some difficulties when writing them to file.
This is in essence the problem of persistent objects in C++. The OOP concept
is much stronger in languages where code and data are exactly the same, such
as LISP.
6. A few years back everybody told you that OOP was a new way of thinking. You
were supposed to forget everything you knew about programming and restart from
the ground up thinking objects. Wrong! We forget and restart too often. That's
why history repeats itself.
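The line-versus-vector exercise in point 2 can be sketched in a few lines (here in C++ rather than Turbo Pascal; the class names mirror the manual's example but the code itself is illustrative, not taken from it):

```cpp
// A rough C++ rendering of the "vector" solution: a Line is its first
// endpoint (everything inherited from Point) plus a displacement giving
// the second endpoint's relative position.
struct Location { int x, y; };
struct Point : Location { int color; };

struct Line : Point {
    int x_disp, y_disp;
    int x2() const { return x + x_disp; }  // second endpoint, computed
    int y2() const { return y + y_disp; }
};
```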
I don't use OOP much, and I use Turbo Pascal rather than C++, but reusability
of code has my attention. I am trying something much more low level:
documentation. After creating a nice routine, you should document it so your
colleagues can decide whether this algorithm does something they need, too,
whether or not they can use it unmodified, and if it must be adapted to their
use, how it works. It should be the simplest and lowest level of code reuse:
Instead of reinventing the wheel, you adapt a working example that you copy
from a paper or a manual. But as long as this doesn't work (and it
doesn't--documentation always appears to be outdated, incomplete, or lacking),
I don't think OOP will help us.
Thiadmer Riemersma
Bussum, The Netherlands


Fuzzy Redux


Dear DDJ,
In the February 1993 article "Fuzzy Logic in C," Greg Viot attributes the
invention of many-valued logic to Lotfi Zadeh in 1965. However, this kind of
logical calculus was introduced in the '20s in the works of Emil L. Post, Jan
Lukasiewicz, and Alfred Tarski. This by no means diminishes Lotfi
Zadeh's merits. (Lukasiewicz is also a creator of the well-known Reverse
Polish Notation.)
In the same issue, Stephen Wolfram said: "The way you make progress in
mathematics is that you think of a theorem and generate a proof for it. In
every other field of science, experiment is the thing that originally drives
what goes on. People don't make models and theories and work out their
consequences."
I think he is wrong. I agree with Fred Hoyle's science fiction novel, The
Black Cloud: "Bloody bad science...Correlations obtained after experiments
done is bloody bad.... Only predictions really count in science.... It's no
good doing a lot of experiments first and then discovering a lot of
correlations afterwards, not unless the correlations can be used for making
new predictions. Otherwise it's like betting on a race after it's been run."
Janusz Rudnicki
Ste.-Madeleine, Quebec


DCW Update


Dear DDJ,
A brief comment on John Russel's letter about the DCW (Digital Chart of the
World): A 1:1,000,000 digitized map of the world is available on four CD-ROMs
for $200.00.
The four governments that contribute to this effort are indeed to be
congratulated and thanked for making this database available for geographic
application developers at a very reasonable price. As Russel states, the
VPF-VIEW software included with the DCW leaves a lot of room for improvement;
I hope some DDJ reader puts a much more robust packaging of this spatial
dataset on the market soon.
As a footnote, not all the participating governments are unanimous in their
support of the DCW project. Specifically, the Director of Great Britain's
Ordnance Survey apparently feels that selling this dataset for "only" $200.00
constitutes "dumping" and violates the GATT (General Agreement on Tariffs and
Trade). I for one take notice that $11,000,000 of "my" (U.S. taxpayer) money
was used to produce the DCW; I feel I am entitled to a copy for as little as
$200.00--or less.
Developers of geographic applications should also note that a remarkable
spatial dataset--TIGER--is available on CD for $250.00 per disk (42 disks for
the entire country). TIGER is essentially a digital street map of the entire
USA, including street names and house numbers required to link to and map any
data file that contains street addresses. For more information contact the
Data User Services Division of the Census Bureau: 301-763-4100.
Donald F. Cooke
Lyme, New Hampshire



Ada and Modula-3


Dear DDJ,
In regard to Spencer Roberts's reaction to Sam Harbison's October 1992 article
"Safe Programming with Modula-3" (DDJ, March 1993), Spencer asks for "just the
facts" but he himself has a few facts about Ada wrong. I have programmed in
Ada, mostly on Ada compilers, for six years now, and I can testify that Ada is
not as safe as Modula-3.
Ada's initialization of all pointers to null helps only with uninitialized
pointers, not dangling pointers. Uninitialized pointers can cause as much
havoc as dangling pointers but are much easier to find by reading code.
Dangling pointers are pointers to objects which were once allocated but have
since been deallocated.
Spencer states that Ada automatically sets pointers to null when the user
exits the block they are declared in. This accomplishes nothing, since the
pointers disappear altogether at this time and can never be used anyway.
If you read the Ada Language Reference Manual (LRM) carefully, it sort of
implicitly invites implementors to write a garbage collector (LRM 4.8-7..11).
I know of no implementation which has done this. If there is no garbage
collector, then Ada requires that all dynamically allocated objects of a
particular access type (i.e., pointer type) implicitly be kept around until
the user exits the block which declares the access type. This is absolutely
safe but is so extremely conservative that it precludes real dynamic
deallocation.
Instead, in Ada, one must use the predefined generic procedure aptly named
Unchecked_Deallocation. This is an explicitly programmed deallocate, and can
create dangling pointers. Unchecked_Deallocation does set the pointer passed to
it to null, but that is trivial. The real programmer's problem is all the
now-dangling copies of the pointer which might be stored, who knows where, in
a multilink data structure.
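The hazard can be sketched in C++ (not Ada; purely illustrative): nulling the one pointer handed to the deallocator does nothing for any other copy of it.

```cpp
#include <cstddef>

// An explicit deallocate can null the pointer it is handed, but another
// copy of that pointer, stored elsewhere in a linked structure, still
// refers to the freed node.
struct Node { Node *next; int value; };

int demo() {
    Node *p = new Node{nullptr, 42};
    Node *alias = p;   // a second copy, "who knows where"
    delete p;          // the deallocation itself
    p = nullptr;       // what the Ada deallocator does for its argument
    // 'alias' still holds the old address: it now dangles, and nulling
    // 'p' did not help it. (We deliberately never dereference it.)
    return p == nullptr ? 0 : 1;
}
```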
In my experience, eliminating dangling pointers in a complex data structure
when using explicit programmed deallocation is more difficult than writing an
Ada front end, minus the deallocation. I have worked on two Ada compilers that
don't even try. They just abandon the garbage nodes and hope the storage loss
will be tolerable until everything can be deallocated.
They even do this for a data structure which is stored long-term in files!
When you need such a dynamic data structure, garbage collection is the only
way to go.
I don't understand what Spencer means by "garbage collection is always machine
dependent." It is no more, or less, machine dependent than any high-level
language construct. In particular, it introduces no machine dependencies to
the source code in Modula-3. The implementation of garbage collection is
machine dependent, but so is the implementation of the whole runtime system
and the code generator. All have to be different for different machines,
regardless of the language.
Spencer contends that any runtime checking is less safe than static checking.
I agree, but this doesn't compare Ada and Modula-3. Both have a lot of static
checking as well as some things that are checked at run time. I don't know of
any instance in which Ada is more aggressive than Modula-3 in enforcing static
safety. On the other hand, garbage collection is a big example where Ada is
less aggressive in enforcing dynamic safety.
Rodney M. Bates
Wichita, Kansas














































June, 1993
COMPUTER SCIENCE AND THE MICROPROCESSOR


The battle for the desktop




Nick Tredennick


Nick did the logic design and microcode for the Motorola 68000 and IBM
Micro/370 microprocessors and is the author of Microprocessor Logic Design
(Digital Press, 1987). He can be contacted at 1625 Sunset Ridge Road, Los
Gatos, CA 95030.


The invention of the integrated circuit in 1959 began a beneficial technology
spiral in which, following Moore's Law, the possible number of transistors on
an integrated circuit has been doubling every year. Three effects combine to
sustain this trend: Chips grow, features shrink, and design techniques
improve. The number of transistors on a chip goes up directly with increases
in chip area--twice the area permits twice as many transistors. Circuits are
formed on a chip by a complex process involving coating, etching, doping, and
baking. The size of transistors and wires (features) is determined by the
sophistication of the process--the better the process, the smaller the
transistors and wires. If the width of a feature (wire or transistor)
decreases by a factor of two, the number of transistors in a fixed area
increases by a factor of four. Evolving implementation techniques improve the
match between circuit requirements and the constraints of the semiconductor
technology.
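The first two factors multiply: area growth contributes linearly, and feature shrink contributes as an inverse square. A toy model (the inputs are illustrative, not process data):

```cpp
// Toy density model: transistor count scales linearly with chip area
// and with the inverse square of feature width. Design-technique gains
// are ignored here.
long transistors(long base, double area_growth, double feature_shrink) {
    double per_area = 1.0 / (feature_shrink * feature_shrink);
    return static_cast<long>(base * area_growth * per_area);
}
```

Doubling the area while halving the feature width multiplies the budget by eight, which is why the three effects together can sustain an annual doubling.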
Integrated circuits improved the design of electronic systems. Before the
integrated circuit, electronic systems were built of resistors, capacitors,
inductors, diodes, and transistors (or vacuum tubes). Integrated circuits
replaced collections of transistors, diodes, resistors, and capacitors. Texas
Instruments and others introduced logic families, like the 74xx series TTL
devices. Designers partitioned electronic systems into available logic
modules. Systems became smaller, cheaper, and more reliable since each logic
module replaced many discrete components. Design with logic modules was
popular, and soon module catalogs such as TI's The TTL Data Book for Design
Engineers grew to include hundreds of different logic modules.
Custom-chip design was one alternative to designing with standard logic
modules. You could partition your system into unique chips and get someone
like Intel to build them. The design would use fewer chips than a design using
standard modules, so the implementation might be cheaper and more reliable to
manufacture. But custom chip designs were expensive, so manufacturing volumes
would have to be high to amortize the development cost enough to make custom
chips the right choice. In the late '60s, it looked as if desktop calculators
might have a high enough volume to justify custom-chip design.
In September 1969 the Japanese company Busicom approached Intel with a
proposal for a calculator design using seven custom chips. In October, Intel's
Ted Hoff countered with a three-chip design based on the idea of building
computer-like chips and programming the logic to perform the desired function.
One of these chips, the 4004, became the first commercial microprocessor. The
4-bit 4004 CPU, which processed data a nibble at a time, contained an
execution unit (registers, arithmetic unit, and connecting logic) and a
control (which interprets instructions and directs actions of the execution
unit). The microprocessor is connected to memory (to hold instructions and
data) and input/output logic (to communicate with the outside world) to make a
working system.
Intel introduced the 4004 commercially in 1971. Since then, the number of
transistors in a microprocessor implementation has doubled only every two
years; microprocessor implementations aren't keeping pace with Moore's Law.
Figure 1 graphically describes the introduction of Intel microprocessors since
1971. Each microprocessor part is plotted by year of introduction and number
of transistors per processor from the introduction of the 2,300-transistor
4004 in 1971 to the introduction of the three-million-transistor Pentium
processor in 1993. The solid line plots transistors doubling every two years,
starting in 1971.
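The solid line in Figure 1 can be reproduced with a single expression. A sketch (the function name and form are mine), starting from the 2,300-transistor 4004 in 1971:

```python
def doubling_line(year, start_year=1971, start_count=2300):
    """Transistor count on a line that doubles every two years,
    anchored at the 4004's 2,300 transistors in 1971."""
    return start_count * 2 ** ((year - start_year) / 2)
```

By 1993 the line reaches 2,300 x 2^11, about 4.7 million transistors--the same order of magnitude as the three-million-transistor Pentium plotted against it.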


Embedded Control


The first commercially available microprocessor wasn't invented as a natural
consequence of evolution in computer design. Instead, it replaced custom logic
in what became known as embedded-control applications: any use of a
microprocessor other than as the central processing unit (CPU) in a computer
system. Microprocessor-based logic was fair competition for logic
based on custom-chip designs, but the majority of system designs employed
standard TTL modules. If the microprocessor was to be a commercial success, it
would have to compete with TTL modules in system designs. Market emphasis in
embedded-control applications led to microprocessors designed to meet four
requirements: low cost; adequate performance; simple, flexible bus protocols;
and few pins.
The microprocessor had to be cheap to compete with the high-volume TTL modules
it replaced. Performance of the programmed logic employing the microprocessor
had to match the performance of the logic it replaced--not a big challenge.
Simple, flexible bus protocols allowed the microprocessor to work with a
variety of memory and peripheral chips on a common bus. Few pins meant cheaper
packages. Package size was also driven by the desire to match the pin and row
spacing of the TTL modules. The first microprocessors were expensive, so their
first design wins were probably as alternatives to custom-chip designs rather
than replacing standard logic module-based designs. As volumes grew and
technology improved, microprocessors got better and cheaper, which opened up
more application opportunities. Microprocessor design diversified into
microprocessors, which contain just the CPU, and microcontrollers, which
include memory or I/O logic (or both) on the same chip with the
microprocessor. The embedded-control market grew from essentially zero in 1971
to an expected volume of almost two billion units (worldwide) in 1993.
Embedded-control applications fall into four market segments: zero cost, zero
power, zero delay, and zero volume.
The zero-cost segment, to a first approximation, is 100 percent of the
embedded-control market. Virtually all embedded-control applications are in
high-volume, highly competitive, cost-sensitive consumer appliances: TVs,
VCRs, toasters, blenders, washers, dryers, and microwave ovens. Component cost
is usually the first and most important consideration in an embedded-control
application. In a microwave oven, for example, minimizing component count and
component cost is vastly more important than minimizing power dissipation or
maximizing performance. What difference does it make whether the
microprocessor is 0.1 watts or 10 watts in a 1500-watt microwave oven? The
wall outlet looks like an infinite power source to the microprocessor, and the
microprocessor's power dissipation is inconsequential compared to the power
dissipated by the oven. Furthermore, performance of even a bit-serial
processor would be lightning fast compared to the glacial pace of human
command inputs to the microwave.
The zero-cost segment, which accounts for almost all unit volume in embedded
control, employs 4- and 8-bit microcontrollers. The first commercial 4-bit
microprocessor began shipping in 1971. In 1993, shipping volume for 4-bit
microcontrollers should exceed 800 million units with an average selling price
of just under $1. The first commercial 8-bit microprocessor, the 8008 (also
from Intel), began shipping in 1972, just one year after the introduction of
the 4004. In 1993, shipping volume for 8-bit microcontrollers is expected to
be over 1 billion units, with an average selling price below $4. Even though
the 8-bit microprocessor followed the 4-bit microprocessor's introduction by
less than a year, it wasn't until 1990 that shipping volumes for 8-bit
microprocessors passed those of 4-bit microprocessors. This indicates the
importance of low cost and the unimportance of absolute performance in most
embedded-control applications. Microprocessor manufacturers competing for
shares of the zero-cost segment must have high-volume, low-cost production.
Zero power, the next-largest segment of the embedded-control market, is mostly
a special subset of the zero-cost segment: It includes applications for which
dissipating zero power is more important than achieving zero cost. Zero power,
to a first approximation, represents zero percent of the embedded-control
market. Zero-power applications include items such as smoke detectors, remote
controllers, and pocket calculators. We'd like to have these devices run
entirely on weak ambient light or run for a few years on a single watch
battery. Zero-power applications use the smallest, cheapest, slowest
microprocessor consistent with the requirements of the application. Since most
applications are consumer appliances, cost is still important. For most
applications, 4- and 8-bit microprocessors are sufficient, but the emerging
personal digital assistants (PDAs) probably require 16- and 32-bit
microprocessors. Microprocessor manufacturers competing for shares of the
zero-power segment must have efficient designs and good technology as well as
high-volume, low-cost production.
Zero delay is the third-largest segment. It includes applications such as
scanners, laser printers, and fax machines, for which performance is the most
important consideration. For these applications, zero processing delay is more
important than achieving zero cost. The market is competitive, so cost is
still important. Zero delay is the primary segment for the 16- and 32-bit
microprocessors. These are the high-end, embedded-control applications as
reflected by the expected 1993 average selling prices for 16- and 32-bit
microcontrollers of just under $10.00 and just under $60.00, respectively.
Unit volumes in the zero-cost segment are 20 times the unit volumes in the
zero-delay segment, but the substantially higher average selling price of the
16- and 32-bit microcontrollers brings the dollar value of the zero-delay
segment to about one-third the value of the zero-cost segment.
I thought I'd covered all the market segments with these three--until I talked
to John Wharton. I explained the zero-cost, zero-power, and zero-delay
segments to him and asked, "So what do you think?" He immediately replied,
"You forgot the zero-volume segment." Indeed I had.
Zero volume is the market segment for applications with (essentially) zero
volume, but which have some attraction for the manufacturer other than sales
volume and profit. Intel built and delivered the 960MX microprocessor--at the
time, Intel's fastest and most complex microprocessor--solely for the YF-22
Advanced Tactical Fighter. Since the crash of the single flying prototype of
the YF-22, the volume looks as if it will actually be zero, but Intel could
hardly have expected to sell more than a few thousand microprocessors for the
YF-22 even in the best of circumstances. The visibility conferred by such a
high-profile application made the design win desirable. The zero-volume
segment is not sensitive to cost. All microprocessor manufacturers can compete
for applications in the zero-volume segment. High-volume, low-cost production
is not required.
The technology spiral fed the expansion of the microprocessor market. By any
standard, the growth from introduction in 1971 to an expected market of close
to two billion microprocessors in 1993 is phenomenal.


Enter the Personal Computer and Computer Architect


By 1974, the microprocessor had gotten cheap and common enough for the
invention of microprocessor-based computer systems and the sale of these
"personal computers" to individuals. Invention of the PC served to split the
microprocessor market into two segments: embedded control and CPU. The two
market segments have different requirements: Embedded control wants low cost,
while CPUs want high performance. Most microprocessors go into
embedded-control applications, but CPU applications have grown from
essentially 0 percent of unit volumes in 1974 to an expected value of almost 2
percent in 1993. About 30 million microprocessors should ship as the computer
system CPU in 1993. Since embedded-control applications have always
represented 98 to 100 percent of unit volumes, manufacturers have
traditionally ignored the CPU market segment. Microprocessor designs supported
embedded-control requirements for low cost and adequate performance: If they
also got used as CPUs, so much the better.
As computers advanced, so did the field of computer science. In the academic
world, it progressed from a side-interest within the mathematics or electrical
engineering departments to being its own separate field, bringing with it its
own professionals in industry and academics with career aspirations in
computer-related topics.
The first computers were built of vacuum tubes and were huge, expensive
electromechanical engines. Only a few large companies (like IBM) capable of
making large business machines could build these "mainframe" computers.
Designers of mainframe instruction sets and microarchitectures were rare and
probably thought of themselves as engineers and programmers. After the
invention of the transistor and the integrated circuit, computers got smaller
and cheaper. More companies could build these smaller, cheaper
"minicomputers." Designers of minicomputers were still fairly rare and also
probably saw themselves as engineers and programmers. After the invention of
the microprocessor, any company capable of building integrated circuits could
design a computer instruction set. The number of instruction-set and
microarchitecture designers reached critical mass: The designers began to
think of themselves as "computer architects," and computer architecture became
its own profession.
Invention of the computer architect brought with it an avalanche of
experiments and publications as career computer researchers competed for the
best positions in universities and industrial research organizations. But the
study of computers is a weak science. When pencil and paper produced
quantitative results, researchers spent considerable effort deciding which
results were worth computing. The computer itself is the enemy of experiments
in computer science: It produces quantitative results so readily that little
thought goes into which results are worth having. Also, the field of computer
science is developing under intense
commercial pressure, which further weakens experimental procedure. Researchers
may have a financial interest in a point of view. There are few independent
investigators.


RISC


In the late '70s and early '80s, investigators at universities and industrial
research organizations noticed the mismatch between the implementation of
microprocessors and the requirements of a CPU. Consequently, they invented
RISC (reduced instruction set computers). Manufacturers were busy building
microprocessors to compete with standard logic modules for embedded-control
applications, since to a first approximation, embedded-control applications
represented 100 percent of the market for microprocessors. (CPU applications
were essentially 0 percent.) Microprocessors were designed for low-cost,
adequate performance (relative to standard logic-module solutions), few pins,
and leisurely bus protocols. Low cost was the most important feature.
But low cost isn't a major objective for the microprocessor in a computer
system. The cost of the power supply, display, hard disk, printer, keyboard,
chassis, and other components swamps the cost of the CPU. Performance is the
major objective for a microprocessor used as a CPU. Designing for best
absolute performance is so at odds with designing for lowest cost that
researchers investigating the design of microprocessors for CPU applications
found room for improvement over microprocessors designed for embedded control.
Early papers proposing RISC cited no fewer than 16 factors contributing to
enormous gains in reported performance, among them: simplified instruction
set, overlapped register windows, large register set, simplified addressing,
high-level language user interface, advanced compiler technology, delayed
branch, advanced procedure calls, single-cycle execution, simplified
implementation, quicker time to market, better design procedures, better
design tools, on-chip cache, wider external buses, and load/store
architecture. Wider, faster external buses, which increased bandwidth to
memory by a factor of six to ten, probably made the biggest contribution to
reported performance improvements.
Twelve years of subsequent investigation have not clarified or isolated the
contribution of any of these changes to increases in reported performance.
Instead, a pseudotechnical debate of epic proportions ensued, pitting RISC
(all that is good) against CISC (complex instruction set computers, or all
that is bad). The real issue had nothing to do with RISC or CISC. The real
issue has always been microprocessors with different design objectives.
Manufacturers supported designs for volume shipment--embedded-control
applications. RISC advocates supported designs for CPU applications.
Microprocessors for embedded control emphasized low cost. Microprocessors for
CPU applications emphasized performance.


The Battle for the Desktop


In a coincidence with unfortunate consequences, IBM introduced its personal
computer in 1981--just as researchers were inventing RISC. Sales of the IBM PC
took off, forever locking RISC out of the volume market in personal computers.
The invention of RISC merely split the CPU market segment into PCs and
workstations, in the same way the invention of the PC had split the
microprocessor market into CPUs and embedded controllers. In 1993, unit
volumes will be approximately 2 billion embedded controllers, 30 million
personal computer CPUs, and half a million workstation CPUs.

Even though CPU applications represent less than 2 percent of microprocessor
shipments, CPU designs are the glamour topic in microprocessor design.
High-end microprocessor designs are the focus of conferences, trade press,
technical publications, popular interest, and research. Ever since IBM
selected the lowly 8088 as the CPU in its PC, it has been an intolerable
affront to computer architects that the 80x86 can't be displaced on the
desktop by any of a plethora of clearly superior RISC architectures. Since the advent
of RISC, computer architects have produced many microprocessors more suited to
CPU applications than the 80x86 architecture. Every microprocessor
architecture announced since the invention of the acronym has been labeled
RISC. Applications for RISC CPUs have grown from zero in 1981 to domination of
the half-million-unit workstation market in 1993. In the meantime, the PC
market has grown to about 30 million units a year. Although there are other
personal computers, IBM-compatible PCs have about 90 percent of the market and
Apple about 10 percent with the 680x0-based Macintosh. The Motorola 680x0
family is another old CISC architecture, so the PC market belongs exclusively
to the old, ugly CISC architectures.
Won't the superior performance of RISC-based workstations help them displace
CISC-based PCs on the desktop? RISC advocates and the trade press have been
predicting for years that sales of RISC-based computers would take off very
soon and begin eating into IBM-compatible PC volumes. Folklore has emerged to
explain why workstations will soon begin to displace PCs. The biggest
advantages for workstations are in performance, price/performance, hardware,
and new developments. The biggest advantages for PCs are availability,
applications, and the installed base. Folklore suggests that cost and price
are about the same for workstations and PCs.


Workstation Advantages


The killer advantage workstations are thought to have is in absolute
performance or in price/performance. "Once users get their hands on
$&your.favorite.workstation and see its blazing speed, sales of
$&your.favorite.workstation will surge as users switch from the PC." That's
the theory, anyway. I think it's wrong.
Leaving aside the question of whether there's a significant difference in
performance, price/performance and absolute price have more influence on the
choice of a PC or workstation than absolute performance. Comparing the best
price/performer workstation to a fully configured, top-of-the-line, list-price
IBM or Compaq system is a mistake. It may be that workstations have a giant
advantage in price/performance at that workstation price (and, perhaps, almost
every other workstation price). I don't know, and I don't think it matters.
The only price point that matters is the lowest workstation price, because
it's the only price at which workstations and PCs can compete for the same
customer. The relevant comparison is price/performance of the cheapest
workstation compared to a similarly priced IBM-compatible PC clone. At the
lowest workstation price, PCs have better price/performance.
Workstations are supposed to have an enormous advantage in hardware:
architecture, implementation, technology, and time to market. The story goes
something like this:
All workstations use RISC microprocessors. RISC has inherent architecture
advantages over CISC. RISC implementations are cheaper and faster, and they
get to market quicker. Since product cycles are shorter, new ideas can be
implemented sooner and more product generations can be introduced in a fixed
time. Also, shorter design times mean RISC uses better technology (or gets
equivalent technology to the field sooner).
Advantages in architecture are unproven and probably swamped by effects of
operating systems, compilers, assemblers, languages, and system design. The
latest high-end microprocessors contained up to 3 million transistors. They
were all--RISC and CISC--complicated and difficult to design. Budgets for
next-generation microprocessors will be three to ten million transistors and will
use similar technology and have similar implementations. They'll all be
complicated and difficult to design. In a ten-million transistor design,
instruction-set architecture offers no significant shortcut to implementation.


PC Advantages


The killer advantages for the PC are software applications and the
100-million-unit installed base. Applications for the PC are plentiful and
cheap. The price/performance advantage of a workstation would have to be
gargantuan to overcome the inertia of the installed base. PC owners can count
on finding cheap applications to suit their needs, and they can count on
cheap, regular hardware, software, and operating-system upgrades. PCs are also
readily available. You can get whatever PC configuration you want today at
your local computer store for a competitive price. If you're willing to wait a
day or so, you can get the same PC for even less through mail-order.
Availability, applications, cost, and the installed base--that's a lot to
overcome.


Cost and Price


Folklore says cost and price for workstations can be about the same as for
PCs. Cost is how much the manufacturer pays to make a workstation or PC. Price
is how much you and I have to pay to get one. In an ideal manufacturer's
market, price might be five or six times cost. In an ideal consumer market,
price might be only slightly above cost. The PC market is a consumer market.
Workstations will have to have consumer pricing to compete for PC customers.
Let's assume PCs and the low-end workstations being designed to compete with
them have similar features and use common components (power supplies, glue
logic, hard disks, floppy drives, displays, keyboards, and the like). Assume
differences in volume discounts for PC and workstation manufacturers are small
(so workstation and PC manufacturers are paying about the same for their
components). But there's a difference in CPUs between PCs and workstations.
PCs are based (mostly) on 80x86 microprocessors, and workstations are based on
RISC microprocessors. Is there a difference in cost or price for the CPU?
It costs about $65.00 to build a current high-end microprocessor. It costs
about $600.00 to process a six-inch wafer which will yield 12 to 14 good
chips; that's about $45.00 per working microprocessor. Add $10.00 to package
the chip and $10.00 more to test it, and you get $65.00 per CPU. All the
high-end microprocessors are about the same size, so there shouldn't be a
significant difference in cost to build. But cost to build isn't the whole
story--figuring out what to build and drawing up the plans (designing the
microprocessor) can be significant. Development cost for a high-end
microprocessor runs between $30 and $100 million. If you spend $50 million
designing a microprocessor and sell only 50, you'll have to charge more than
$1 million for each just to recover your costs. If you can sell 50 million
parts, you only have to charge $66.00 to recover your costs. Figure 2 plots
cost per part against parts shipped for design costs of $30 to $100 million
(assuming $65.00 per CPU in fixed manufacturing cost).
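The curve in Figure 2 is nothing more than fixed manufacturing cost plus development cost amortized over shipments. A sketch using the numbers above (the function name is mine):

```python
def cost_per_part(dev_cost, parts_shipped, mfg_cost=65.0):
    """Break-even price per chip: the $65.00 manufacturing cost plus
    the development cost spread over every part shipped."""
    return mfg_cost + dev_cost / parts_shipped
```

With a $50 million design sold 50 times, the break-even price is just over $1 million per chip; sold 50 million times, it drops to $66.00--matching the figures in the text.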
The workstation market is fragmented. SPARC from Sun, MIPS from Silicon
Graphics, POWER from IBM, PA (Precision Architecture) from Hewlett Packard,
881x0 from Motorola, Alpha from DEC, and Clipper from Intergraph all compete
for shares of the half-million-unit workstation market. There are about 20
manufacturers making high-end RISC microprocessors for workstations.
Currently, only Intel makes high-end microprocessors for IBM-compatible PCs.
Costs are clearly not equal. If you have to do a new design every two years to
keep up with everyone else, you'll be getting a share of either a million
workstations (the number shipped in two years) or 60 million PCs. If you're
one of the 20 manufacturers making a microprocessor for workstations, your
share of the market is likely to put you well over on the left side of the
curve in Figure 2. Amortizing the development cost over your share of the
workstation market will drive the price of the chip--manufacturing cost won't
be a big factor. If you're Intel, you'll be operating well off the right side
of the chart--amortized development cost won't be a big factor in setting the
price of the chip. Intel is currently shipping four to five million high-end
microprocessors per quarter. RISC microprocessors cost a lot because high
development cost must be amortized over low workstation shipping volumes.
Costs for workstations and PCs are not equal. Workstations cost more because
development cost for the CPU must be amortized over significantly lower
volumes. Lower manufacturing volumes also lead to smaller discounts on other
component purchases. The more sophisticated workstation systems (cache, fast
memory, special I/O) inherently cost more than the relatively unsophisticated
PC. Workstations have traditionally been sold through more expensive
distribution channels than the PC, which also contributes to higher cost.
Without even considering software, which is probably the most important
determinant, I think the battle for the desktop is over. The PC is the
winner--it is grabbing applications from the workstation market. The
workstation market has been maintaining volume by pressing for ever higher
performance and capturing new applications. We've reached the steady state.
Workstations will continue to push for higher performance and specialized
markets where they can command the higher prices they require. The PC will
continue to chase the workstations out of their old market segments.
More Details.


Software


PC software is cheaper than workstation software. PC software is the stuff
everyone needs: word processors, editors, communications, spreadsheets.
Workstation software is specialized software designed for a particular market:
chip design, visualization, timing analysis. Software is in the same situation
as the CPU. For PC software, high volumes mean low amortized development cost, so
manuals and distribution probably dominate the cost. For workstation software,
low volumes and complex applications mean high amortized development cost.
The major applications for the PC have already been written. If you're a PC
software developer, do you have a chance of developing a new word processor
and capturing the market? Not likely. So what's Microsoft doing with all their
programmers? Over the past ten years, they've been working frantically on
major high-volume applications for the PC. Now they're done. About all they
need is two programmers and 30 documentation people per application to crank
out the annual updates to Word, Excel, Power Point, and so on. So what are the
other 6000 employees doing? When the applications that sell ten million copies
are done, they'll work on applications that sell one million copies. When the
applications that sell one million copies are done, they'll work on
applications that sell 100,000 copies. When the applications that sell
100,000 copies are done, Microsoft will start laying off programmers. It's my
guess that programmers at Microsoft are now working on applications that will
sell between 100,000 and one million copies. Engineering-design
applications, the traditional market for workstations, will be converted to
the PC by Microsoft programmers just before the big layoffs begin.


Conclusion


The battle for the desktop has reached a steady state: The PC is eating into
traditional workstation applications at about the same rate that workstations,
with ever-higher performance CPUs and more complex systems, are finding new
applications. This, however, leads to problems for the RISC CPU manufacturers
because workstation volumes are too low for the chip manufacturers to recover
their development costs. The technology spiral is driving them to more complex
CPUs with correspondingly higher development cost while their market is
staying about the same size. This has caused the RISC CPU manufacturers to
rediscover the embedded-control market. RISC CPU manufacturers have begun a
major marketing campaign to capture embedded-control applications. After
competing for a few years for shares of a half-million-unit market, it must
look as if there's room for everyone in a market of two billion. There isn't.
Most of the market volume for embedded control belongs to the 4- and 8-bit
microprocessors. That's the zero-cost segment. The zero-power and zero-delay
segments are also cost sensitive, and belong to companies with high-volume,
low-cost manufacturing. The zero-delay, embedded-control segment uses some
high-end microprocessors, but the average selling price of a 32-bit CPU for
embedded control is $65.00--about the same as the manufacturing cost for a
RISC CPU. There's no way to recover development cost if the base price is the
same as the manufacturing cost. That leaves the zero-volume segment. RISC CPUs
are capturing high-profile applications in the zero-volume segment. The
problem with the zero-volume segment is, as its name implies, that there's not
enough volume to recover development cost.
PC sales are stalled at 30 million a year, workstation sales are stalled at
half a million a year, and the ancient CISC CPUs own the embedded-control
market. The news is all bad for the makers of RISC CPUs. That's too bad,
because it's fate and has nothing to do with the intrinsic value of the
product (not that the intrinsic value is well known, given the state of
computer science--but that's another story). It's all tied up in the
technology spiral, the invention of the microprocessor, and the timing of the
invention of the personal computer, the computer architect, and RISC. If
you're a RISC advocate and this news has depressed you, here's something to
make you feel better: Perhaps the technology spiral will come back to bite
even the CISC microprocessors. The microprocessor was invented for embedded
control: It displaced modules with a programmed logic solution. Perhaps
reconfigurable or even self-configuring logic will displace the microprocessor
in embedded-control applications. After all, the microprocessor is only an
interim solution. Shouldn't those applications have self-configuring logic
modules?


Microprocessor Implementations


First-generation microprocessors, typified by the Motorola 6800, didn't use
pipelining. They didn't have to be fast for simple embedded-control
applications and, since integrated-circuit technology was new, chip area for
transistors was expensive. Early microprocessors used simple control and a
simple interface to external memory. The microprocessor fetched the
instruction, decoded it, and then executed it. When the microprocessor
finished the first instruction, it started on the second, and so on--no
pipelining, a simple controller. This execution model is shown in Figure 3.
The bottleneck in this simple, nonpipelined design is the controller. The
external bus is only used every third cycle for the fetch, unless the execute
cycle reads or writes an operand. The instruction decoder is only used every
third cycle. And the execution unit is only used every third cycle. The
pipeline can't stall since there isn't one.
In the late '70s, the next-generation microprocessors, typified by Motorola's
68000, used a simple, three-stage pipeline called instruction overlap. As the
first instruction is executed, the second instruction is decoded, and the
third is fetched. This execution model is shown in Figure 4.
Instruction overlap makes better use of microprocessor resources than a
nonpipelined version. The external bus, the instruction decoder, and the
execution unit are all used on every cycle, unless there's a conflict for
resources. It's possible for the processor to complete an instruction on every
cycle. Fetch takes one cycle, decode takes one, and execute may take one to
many cycles. If execute takes more than one cycle, the following instructions
are held in the fetch and decode stages until the current instruction finishes
execution. Only one instruction at a time is allowed to begin execution, so
there are no operand conflicts. The execute stage and the fetch stage may
contend for the external bus. In an add-memory-to-register instruction, for
example, the execute stage will compute the operand address, read the memory
operand, add the register and memory operands, and store the result in the
register. If the memory-to-register add is instruction 1 in Figure 4, its
execute phase would extend from cycle 3 through cycle 6, instruction 2 would
be held in Decode, and instruction 3 would be held in Fetch. Instructions 2,
3, and 4 would begin Execute, Decode, and Fetch, respectively, in cycle 7.
The first commercial RISC microprocessors introduced an extended pipeline. The
extended pipeline split the execute phase into address calculation, operand
access, execute, and write phases. Additional pipeline stages removed pipeline
delays caused by resource conflicts such as contention for access to external
memory. The extended-pipeline execution model is shown in Figure 5.
The extended pipeline still completes at most one instruction every cycle, but
with the additional stages, there are fewer delays due to resource contention.
There are costs, however. The four instructions past the decode stage have
potential operand conflicts to resolve. Additional pipeline stages require
additional resources to avoid conflicts. You can estimate resources by looking
at Cycle 6 in the figure. Since Cycle 6 represents the theoretical
steady-state instruction flow through the microprocessor, it should be able to
accommodate any combination of six instructions without resource conflicts.
The memory system, for example, must have at least two read ports and one
write port (for Instruction 1 write, Instruction 3 read, and Instruction 6
fetch) to avoid access conflicts. There must also be more ports to the
register file (for address, read, and write) and at least two arithmetic units
(one for address calculation, and one for execute).
While the Motorola 68040 uses a six-stage pipeline, there's nothing magical
about it. Intel's 80486 and MIPS' R3000 are five-stage pipelines, and the
newer MIPS R4000 is an eight-stage pipeline. (MIPS uses the pompous term
"superpipeline" to describe their eight-stage pipeline.) The original Fujitsu
SPARC gate array and the first custom Cypress SPARC use a four-stage pipeline.
Increasing the number of stages in the pipeline reduces resource conflicts and
may allow a faster clock. Throughput increases, but these pipelines still only
complete one instruction per cycle, since they only issue one instruction per
cycle.
A superscalar pipeline attempts to issue more than one instruction per clock.
Intel's 80960CA, announced in 1989, was the first microprocessor with a
superscalar pipeline. Figure 6 shows a six-stage pipeline capable of issuing
two instructions per cycle.
Instructions 1 and 2 start at the same cycle, instructions 3 and 4 start at
the same cycle, and so on. If we started three instructions per cycle, we
could potentially complete three instructions per cycle. But look at the
loaded pipeline represented by cycle 6 (as it was in the extended pipeline).
The microprocessor is processing 12 instructions at each cycle. There's
enormous potential for operand and address conflict. The register file and
memory system need at least four read ports and two write ports each. And
there must be at least four arithmetic units (two for address calculation, and
two for execute). Hardware resources for a superscalar pipeline are
substantial and grow as more instructions can be issued simultaneously. One
way to limit required resources is to restrict combinations of instructions
permitted simultaneous issue. DEC's new 21064 Alpha microprocessor, for
example, uses a seven-stage pipeline and can issue two instructions per cycle
with some restrictions on pairs that can issue simultaneously. HP's PA 7100
can issue a floating-point instruction and an integer instruction
simultaneously, but cannot issue two integer instructions during the same
cycle. TI's SuperSPARC and Motorola's 88110 allow simultaneous issue of two
integer instructions. Intel's Pentium and Motorola's 68060 will also sport
superscalar pipelines.
--N.T.





June, 1993
PROGRAMMING THE PENTIUM PROCESSOR


New strategies for high-performance architectures




Ramesh Subramaniam and Kiran Kundargi


Ramesh Subramaniam is a Strategic Marketing Manager in the Architecture and
Software Technology Group. Kiran Kundargi is Senior Software Engineer at Intel
Corporation. They can be reached at Intel, MS RN6-35, 2200 Mission College
Blvd., Santa Clara, CA 95052.


The advent of graphical user interfaces, digital audio and video, and
three-dimensional graphics has truly changed end-user computing on PC
platforms. Likewise, multitasking integrated environments like OS/2, in
combination with the widespread adoption of local area networks, open the PC
to previously uncharted territories such as cooperative computing. However,
each of these capabilities demands superior performance from the underlying
hardware--specifically the microprocessor, which is at the heart of the
platform. But it's not enough to introduce these features in the
microprocessor alone. Equally important is the availability of software tools
that enable applications to easily take advantage of these features.
Intel's Pentium microprocessor extends the performance curve of the 80386/486.
The Pentium is compatible with all existing x86 architectures from the 8086
through the 80486DX/DX2/SX. The Pentium, however, provides a superscalar
architecture along with multiple on-chip caches, a branch-prediction
mechanism, and other capabilities that result in a significant overall
performance improvement over earlier 80x86 processors. Consequently,
application developers can employ one of three strategies to enhance
performance: Use a faster processor with existing software, compile with a
processor-aware compiler, or migrate 16-bit applications to a 32-bit
environment. This article focuses on the first two possibilities:
architectural enhancements that improve performance of existing software and
recompilation with a processor-aware compiler.


Pentium Architecture Overview


Enhancements to the Intel 486 (when compared to the 80386) included an 8-Kbyte
unified code and data cache, an improved floating-point unit (FPU) that
supports a one-instruction/clock throughput for most frequently used
instructions, and greater support for multiprocessing.
Enhancements to the Pentium (when compared to the 80486) include higher clock
speeds (the Pentium is the first Intel processor to support 66-MHz operation
at both system and internal levels), a superscalar, pipelined architecture, a
pipelined FPU, a branch target buffer, and multiple on-chip caches.
The superscalar architecture (see Figure 1) consists of two pipelined integer
units, known as the U pipe and the V pipe. The two
units can operate in parallel, thus executing two integer instructions
concurrently. This makes it possible in most cases to obtain two results per
clock cycle, breaking the one-result-per-cycle barrier of existing
architectures. For some complex instructions, the internal microcode also
exploits its own parallelism and executes using both pipes. As a result, these
complex instructions execute in fewer cycles than in earlier 80x86
architecture-based processors. Figure 2 illustrates the parallel pipelined
execution.


Pipelined Floating-point Unit


Although an individual instruction may require multiple cycles, the FPU
pipeline architecture effectively results in one add result per clock or one
multiply result per two clocks. In many cases, if a floating-point instruction
requires multiple clocks (keeping the FPU pipe busy), an appropriate number of
integer instructions can still be executed in parallel in the integer pipes.
The combination of these features means that a floating-point intensive
application can execute on the Pentium with improved performance.
Another subtle, but important, enhancement to the FPU contributes to
substantial performance improvement in floating-point instruction execution.
Intel's existing numeric coprocessors (such as the 80387) contain a set of
eight floating-point registers accessible to the software. This set is used as
a classic stack and is therefore sometimes referred to as a "floating-point
stack." Most numeric instructions on these coprocessors implicitly use the top
of the stack (called TOS) as one of the operands and replace it with the
result. For example, the FCOS instruction replaces the value at the TOS by
COS(TOS). If an application needs to save the result during subsequent
floating-point instructions, the FXCH instruction is used to change the
designation of the TOS. This instruction takes four cycles on the 80486. On the
Pentium however, the FXCH instruction is executed in parallel while some other
floating-point instruction is being executed in the FPU. Such parallel
execution and the fact that the instruction takes only one cycle essentially
makes it a "free" (clock-less) instruction. Therefore, appropriately scheduled
instruction sequences using this mechanism can significantly improve
floating-point instruction performance. The total number of clock cycles
needed by individual FPU instructions has also been reduced in the Pentium
microprocessor. Table 1 illustrates this by comparing the FDIV instruction on
the 80486 and Pentium.
Table 1: Clock-cycle comparison for FDIV.

 Clock Cycles
 486 Pentium
 ----------------------

 Single 36 18
 Double 63 32
 Extended 73 38



Branch Target Buffer


The Pentium has a feature for dynamic branch prediction called the "branch
target buffer" (BTB). In any software application, it's common to see
conditional loops, function calls, unconditional branches, and so on. Depending
upon the result of the condition's evaluation, the application may take a
branch to a nonsequential location. This requires that the current prefetched
instruction stream be drained and a new instruction stream fetched from the
new location. This typically results in idling the processor. The existence of
an on-chip instruction cache can help to alleviate this problem, in that the
target location may be available in the cache; otherwise the processor remains
idle until the new instruction is available for execution.
The Pentium's BTB records an association of previously executed branch
instructions (from and to addresses). With this information, the processor
attempts dynamic prediction of the possible target and prefetches the
instruction stream at that location. If the prediction is correct, the branch
executes in a single clock, without additional delay. If the prediction fails,
the processor follows the normal routine of fetching the new instruction
stream either from the instruction cache or from memory. In the latter case,
two to three clock cycles are needed.


Caches


Memory accesses by the microprocessor tend to reduce overall system
performance because the processor idles while a memory operand is
being accessed. One way to reduce these accesses is
to provide on-chip caches. The 80486 provides one 8-Kbyte cache for both
instruction and data; the Pentium provides two separate 8-Kbyte caches, one
each for instruction and data.
Both the instruction and data caches are two-way set associative and the cache
line is 32 bytes wide. Cache misses for either instructions or data do not
interfere with the other cache. Additionally, each cache has its own
translation lookaside buffer (TLB); therefore, there's no conflict between
corresponding virtual-address mappings. As a result of the separate caches and
their corresponding TLBs, the memory accesses by the processor are
significantly reduced.



Page Size


The hardware page size on the 80486 is 4 Kbytes. For applications that need
large memory buffers (such as frame buffers for graphics and file I/O buffers
for file servers), a page size of 4K may require accessing multiple 4K
hardware pages. For example, the need for a 480K memory buffer would require
120 pages. Each hardware page corresponds to one page-table entry and
therefore one slot in the TLB. As the application's data access moves from one
page to another, new TLB entries corresponding to the new page need to be
loaded, possibly replacing an existing TLB entry. This may result in
thrashing within the TLB, with a corresponding performance overhead.
The Pentium addresses this by allowing a 4-Mbyte page size. Once the
application accesses such a large page, the corresponding virtual mapping for
that page is set up once in the TLB, and no further TLB updates need to take
place for that page as the application continues its data accesses within the
4-Mbyte memory space. This reduces memory accesses due to reduced TLB updates,
resulting in improved performance. Frame buffers, file servers, database
servers, and other applications that can take advantage of large memory
buffers (bigger than 4 Kbytes) without paying the penalty of unnecessarily
reloading TLB entries will find the large-page-size feature helpful.


Optimizing Compiler Technology


Earlier compilers generated machine code by first generating a lower-level
description in a generic "intermediate form" from a high-level language. This
description was then translated into machine code by a separate process. This
method allowed the development of compilers for different high-level languages
because these languages could share the "intermediate form." Also, translating
one intermediate form into a new machine language would port several
high-level languages to a new processor simultaneously. Early optimizing
compilers did more by reordering instructions for optimal memory use, and
performed more generic optimizations that improved compiled-code performance.
However, even these compilers still used a generic intermediate format that
doesn't map ideally onto the instruction set of any specific processor.
As hardware architectures continue to increase in complexity and
capability, less optimization is possible without processor-specific knowledge
built into the compiler. Therefore, there's a trend toward optimizing
processor-aware compilers that implement knowledge of the pipeline
architecture, branch handling, cache size, organization, and so on to provide
the optimal code for the target hardware implementation.
As Figure 2 shows, the Pentium provides two integer pipelines (U and V pipes)
for parallel execution of two integer instructions. Note, however, that two
instructions can execute in parallel only if there are no interdependencies
between them. Consider the assembler code in Example 1(a). Instruction I2
cannot be executed until I1 is complete because they are interdependent: The
eax register is written by I1 and read in I2. Therefore, I1 and I2 must
execute sequentially. This instruction sequence will only benefit from the
dual execution units of the Pentium if the code is recompiled using a
processor-aware optimizing compiler.
Example 1: (a) Instruction I2 is dependent on I1, so it can't be executed
until I1 is complete; (b) the same instruction stream can be rescheduled by a
Pentium-aware compiler, resulting in parallel execution.

 (a)

 mov eax, source ;I1: move source contents to eax
 add eax, ebx ;I2: add contents of ebx to eax
 instr1... ;I3: some other instruction-
 instr2... ;I4: -not dependent on eax, ebx or source
 instr3... ;I5: "

 (b)

 mov eax, source instr1...
 instr2... instr3...
 add eax, ebx

The processor-aware compiler recognizes such interdependencies and rearranges
(or reschedules) the instruction stream so that instructions can indeed
execute in parallel with minimum idle cycles. The above instruction sequence
can be reshuffled, as shown in Example 1(b).
The new sequence produces exactly the same results without unnecessary idle
cycles. On the Pentium, parallel execution is slightly more complex than in
the example just described because the two pipelines (U and V) are not exactly
equivalent. One pipe (U) can execute all the instructions, whereas the second
(V) executes only simple, hard-wired instructions.
The need to produce correct instruction pairing makes the instruction
scheduler all the more important.
For a specific code sequence in a program, it's generally possible to produce
different sequences of instruction streams that produce exactly the same
result. Some compilers provide user-selectable options for time (execution
performance) vs. space (code size) optimizations. A processor-aware compiler
analyzes the various possibilities and produces the instruction stream that
causes the fewest idle cycles on the Pentium. Figure 3 shows an
example of a simple loop that can result in different instruction streams,
each with a different cycle count.
Figure 3: Code with different instruction streams.

 Original Source

 static int a[10], b[10];
 int i;
 for (i=0; i<10; i++)
 {
     a[i] = a[i] + 1;
     b[i] = b[i] + 1;
 }

 Code Sequence 1

     xor eax, eax
 Loop:
     mov edx, eax
     shl edx, 2
     inc dword ptr [a + edx]
     mov edx, eax
     shl edx, 2
     inc dword ptr [b + edx]
     inc eax
     cmp eax, 10
     jl Loop

 Code Sequence 2

     xor eax, eax
 Loop:
     inc dword ptr [a + eax * 4]
     inc dword ptr [b + eax * 4]
     inc eax
     cmp eax, 10
     jl Loop

 Code Sequence 3

     mov eax, -40
 Loop:
     mov edx, [a + eax + 40]
     mov ecx, [b + eax + 40]
     inc edx
     inc ecx
     mov [a + eax + 40], edx
     mov [b + eax + 40], ecx
     add eax, 4
     jnz Loop



Floating-point Code


The majority of commonly used floating-point instructions are pipelined in the
Pentium, thus providing overlapped instruction execution. The floating-point
unit can accept a new instruction in each clock cycle. Such overlapped
execution hides the latencies of each of the individual instructions and in
many cases, effectively yields one FPU result per clock cycle. The compiler
takes advantage of the floating-point stack to save intermediate results. The
FXCH instruction is scheduled in the instruction stream so that the required
operand "happens to be" at the top of the floating-point stack when needed.
Since the FXCH instruction is essentially "no cost," this mechanism improves
the floating-point execution performance by a large margin. Note that the
integer instructions (in the U and V pipes) can execute simultaneously with
the floating-point instructions. Thus, in the case of floating-point
instructions that take multiple cycles (for example, FDIV), a number of
integer instructions can be scheduled in the two integer units.
Figure 4 shows how code generated using the FXCH instruction on the Pentium
can execute faster than on the 80486. Typically, floating-point code executing
on the Pentium can be as much as five times faster than code executing on the
80486.
Figure 4: Floating-point code using FXCH.

 Original Source

 subroutine da (x, y, z, n)
 dimension x(n), y(n)

 do 10 i = 1, n
 10 x(i) = x(i) + y(i)*z
 return
 end

 Code sequence without FXCH

     fld  dword ptr [esp + 8]
     fmul dword ptr [ebx + eax * 4]
     fadd dword ptr [ecx + eax * 4]
     fstp dword ptr [ecx + eax * 4]
     fld  dword ptr [esp + 8]
     fmul dword ptr [ebx + eax * 4 + 4]
     fadd dword ptr [ecx + eax * 4 + 4]
     fstp dword ptr [ecx + eax * 4 + 4]
     fld  dword ptr [esp + 8]
     fmul dword ptr [ebx + eax * 4 + 8]
     fadd dword ptr [ecx + eax * 4 + 8]
     fstp dword ptr [ecx + eax * 4 + 8]
     add  eax, 3
     cmp  eax, ebp
     jle  Loop

 Code sequence with FXCH

     fld  dword ptr [esp + 8]
     fmul dword ptr [ebx + eax * 4]
     fld  dword ptr [esp + 8]
     fmul dword ptr [ebx + eax * 4 + 4]
     fxch st(1)
     fadd dword ptr [ecx + eax * 4]
     fld  dword ptr [esp + 8]
     fmul dword ptr [ebx + eax * 4 + 8]
     fxch st(2)
     fadd dword ptr [ecx + eax * 4 + 4]
     fxch st(1)
     fstp dword ptr [ecx + eax * 4]
     fxch st(1)
     fadd dword ptr [ecx + eax * 4 + 8]
     fxch st(1)
     fstp dword ptr [ecx + eax * 4 + 4]
     fstp dword ptr [ecx + eax * 4 + 8]
     add  eax, 3
     cmp  eax, ebp
     jle  Loop



Blended Code


Because the 486 and Pentium processors share common architectural features,
applying Pentium optimizations also yields a performance gain when the code is
run on a 486 CPU. Slightly better performance can be obtained by tailoring
optimizations to one processor or the other, but an alternative is to choose a
set of optimizations balanced between the two CPUs. This approach, called
"blended optimization," uses only those optimizations that do not degrade
performance on either processor.
Preliminary studies indicate that blended optimization provides well-balanced
performance on both the 486 and Pentium processors. The data also indicate
that the performance variation when optimizing for different processors is 5
percent or less in all cases. Blended optimization does increase code size,
but only slightly: by 4 percent on the 486 and 8 percent on the Pentium.


The Memory-bandwidth Bottleneck



Instruction operands, if not immediately available from the on-chip cache,
have to be obtained from an external source, such as system memory. Because
memory is comparatively slow (bus bandwidth limits the rate at which data
transfers between memory and the processor, and there's the additional
overhead of the handshaking between the memory subsystem and the processor),
any memory access costs the processor a few idle cycles. Neither the
superscalar architecture nor any other performance-enhancing feature designed
into the microprocessor can alleviate the memory-bandwidth problem: If an
operand has to be accessed from memory, a certain number of cycles are lost in
obtaining it. Pentium compiler technology addresses this issue. If an operand
is available in the cache when needed, it's accessed without any penalty, at
the highest possible rate.
The 80486- and Pentium-aware optimizing compiler therefore provides
sophisticated loop-reorganization techniques and vectorizing algorithms that
maximize the reuse of cached operands. This automatically reduces the number
of memory accesses and feeds the Pentium pipelines at the maximum rate
possible.


Other Optimizations


In the past, the limited ability of traditional compilers to optimize
instruction ordering and memory utilization meant that programmers had to
optimize at the source level. Indeed, programmers often resorted to manual
assembly coding to maximize application speed. However, continuing advances in
compiler technology are reducing the impact of source-level optimization on
application performance. With the advent of processor-aware compiler
technology, programs benefit from source-level optimization on far fewer
occasions. However, some understanding of the possible compiler optimizations
is still necessary to configure the makefile in the best manner for any
particular source file. There are also some rare situations in which
source-level optimization may increase the degree of optimization that the
compiler can attain. For discussion purposes, we've grouped optimizations into
two categories: base-level optimizations, which reduce the need for
programmers to modify source code to obtain better performance; and
individually selectable optimizations, which programmers can configure for
improved application performance and which are product specific. Only
base-level optimizations are discussed in this article.
Base-level optimizations fall into two categories: generic and
processor-specific optimizations.
Generic optimizations are available in most optimizing compilers. Some of the
simpler generic optimizations, such as constant and expression evaluation, are
also available in simple compilers. See Table 2(a).
Table 2: (a) Generic optimizations; (b) processor-specific optimizations.

 Function         Description                       Coding Practice
                                                    Implications
 -------------------------------------------------------------------------

 (a) Constant     Replaces compiler-determinate     Program can contain
     Evaluation   expressions with simpler          the most appropriate
                  expressions or numeric            variables and
                  constants if possible.            expressions.

     Expression   Replaces compiler-determinate     Program can contain
     Evaluation   variables with numeric            the most appropriate
                  constants.                        variables and
                                                    expressions.

     Loop-        Moves code that does not          Statements that use
     Invariant    produce different results on      no loop-dependent
     Code Motion  each iteration of a loop          variables can occur
                  outside the loop.                 inside loops.

     Dead-Code    Removes unnecessary code          A single file can be
     Elimination  created by conditional            compiled to produce
                  compilation and other             various binaries that
                  optimization processes.           are related but not
                                                    identical.

 (b) Instruction  Replaces simple instructions      Performance-critical
     Selection    with faster, more complex         functions are less
                  instructions.                     likely to require
                                                    manual assembly
                                                    coding.

     Instruction  Reorders instructions to          Performance-critical
     Scheduling   improve pipeline throughput.      functions are less
                                                    likely to require
                                                    manual assembly
                                                    coding.

     Loop         Reduces branch stalls and         Loops do not need to
     Unrolling    allows scheduling of one          be unrolled manually.
                  iteration with instructions
                  from a subsequent iteration.

     Preloading   Loads memory addresses into       Programs do not need
                  the cache before multiple         to preload the cache.
                  writes to the same addresses
                  (writes aren't cached without
                  prior reads).

Processor-specific optimizations are possible to some degree even if the
compiler is not processor-aware. However, detailed knowledge of the processor
pipeline architecture, caching, and other hardware features enables far
greater performance boosts via these optimizations. See Table 2(b).


Taking Advantage of the New Technology


Programmers can choose how much effort to expend on targeting software
applications for the 80486 and Pentium. Their decision determines the degree
to which the code takes advantage of the processor's architecture to obtain
maximum performance improvement.
The simplest approach is to do nothing. Existing 80x86-based applications will
run on 80486- and Pentium-based systems without change and have faster
performance due to increased clock speeds and other architectural
enhancements. However, all the processor features in the 80486 and the Pentium
won't be used. For example, the floating-point code won't make the best use of
the "free" FXCH instruction on the Pentium.
The next alternative is to take existing source code and recompile it using a
processor-aware optimizing compiler. As described earlier, the compiler
technology will not only provide the "classic" optimizations but also do the
Pentium-specific optimizations and help alleviate the memory bottleneck where
possible. Thus, recompilation will undoubtedly apply the 80486- and
Pentium-specific optimizations to any software application. Numerous
processor-aware compilers are available from Intel's compiler partners.
To further take advantage of the architecture, you can modify existing 16-bit
code to 32-bit, then recompile it using a processor-aware optimizing compiler.
This exposes the program to all the fundamental benefits of 32 bits, in
addition to 80486 and Pentium advantages.
Software that processes huge amounts of data (database servers and file servers,
for instance) may also find it advantageous to update the software to use the
Pentium's large page size. For example, a database server could use a large
page to load a number of records and/or indexes, thereby avoiding TLB
thrashing. Thus, a slight amount of recoding and recompilation will provide
significant performance benefits at relatively low costs.
Finally, for applications written partially or entirely in assembly language,
a careful analysis and restructuring of the instruction stream may yield
substantial performance improvement. You'll need to recognize the various
instruction-pairing rules as well as the coupling between the FPU pipe and the
integer units. Taking advantage of the unused clock cycles in between existing
instruction streams could improve the execution performance significantly. The
exact range of improvement depends, of course, on the actual instructions used
and whether the program contains integer-only code or both integer and
floating-point code.


Conclusion


The Pentium features a superscalar, pipelined architecture with numerous
performance-improving capabilities that give existing software a significant
performance boost. Still, not all of the potential improvement can be obtained
by simply running existing code on the Pentium, because such code uses neither
all of the Pentium's features nor all of the 80486's. Recompiling with a
processor-aware compiler takes advantage of these processor-specific features
and delivers additional performance benefits.


Making Compilers Pentium Aware


John Dahms
John is a compiler writer for Watcom. He can be contacted there at 415 Phillip
Street, Waterloo, ON, Canada N2L 3X2 or at john@watcom.on.ca.
With a Pentium compiler writer's guide in one hand and a simulator listing in
the other, we set out to make Watcom C/C++32 into a Pentium-aware compiler. At
first glance, it looked to be easy, but as we got into it some interesting
challenges emerged. Here is some of what we learned.


Instruction Scheduling


Because of the Pentium's multiple execution pipes, instruction scheduling, or
reordering, is by far the most important optimization performed by the
compiler. A three-phase approach to scheduling emerged as the strategy of
choice given the framework of the code generator.
1. RISC-ification, or reduction of integer instructions to the RISC subset of
the instruction set. For example, we turn an add from memory into a load
followed by an add from register. The two instructions are a bit larger, but
no slower, and they give you more freedom to choose an optimal instruction
ordering.
2. Scheduling, or moving data-dependent instructions so that they're not
adjacent. Ideally, a result shouldn't be used until the instruction has had
enough time to complete all pipeline stages. This allows your code to take
full advantage of the superscalar architecture, and leads to considerable
80486 execution-speed improvement as well.
3. Re-CISC-ification, or building instructions back up to take advantage of
complex instructions. Often, data dependencies do not allow scheduling in
certain portions of the code. You might turn the load/add sequence created in
the first phase back into an add from memory if there was no scheduling
benefit.


Quirks


There are some interesting architectural issues to address which can
materially affect Pentium performance. For instance, instructions with prefix
bytes will take an extra clock cycle. In order to reference 16-bit items, an
operand size prefix has to be used. We do our best to eliminate these prefix
bytes, but it's still a good idea to stay away from short integers.
Floating-point instructions must also be scheduled to take full advantage of
the float pipe. The FXCH instruction was designed to make this easier. It can
execute simultaneously with other floating-point instructions. Since one
operand of a floating-point instruction must be ST(0), a "free" FXCH
instruction allows floating-point sequences to be scheduled for better
throughput. Two data-independent sequences are interleaved, and FXCH is
inserted when an intermediate result needs to be brought to the top of the
stack.
This sounds great, but before you rewrite your floating-point library to take
full advantage of the "free" FXCH instruction, you need to read the fine
print. Certain conditions must be met before instruction pairing occurs.
First, the FXCH must be followed by another floating-point instruction.
Secondly, it must be preceded by a floating-point instruction which does not
pop the stack. If these conditions are not met, your code may actually run
slower, since the FXCH instructions each take one clock cycle.
Finally, the processor documentation states that "performing a floating-point
operation on a memory operand instead of on a stack register adds no cycles."
This is true, assuming a cache hit. However, there is a clock penalty for
cache misses. Don't be fooled by the clock cycles. Keep integer and
floating-point values in registers wherever possible.


80386/486 Considerations


The 80486 only has one integer pipe and one float pipe, but it's still a
pipelined architecture. An optimal Pentium ordering is usually optimal, or
nearly so, on the 80486 as well. The scheduling avoids pipeline stalls,
greatly increasing 80486 performance.
There's one major compromise which must be made if code is to be optimized for
both the Pentium and the 80386/486. In this case, you don't use the FXCH
instruction to achieve parallelism. It may speed up Pentium code considerably,
but will not be free on an 80386/486.



16-bit Code


Remember to optimize your 16-bit code for the Pentium as well. All the same
rules apply, except that 32-bit operands now require a prefix byte. Since
Watcom C uses a common code generator to produce both 16- and 32-bit code,
Watcom C/C++ 16 became Pentium aware for "free". We recompiled a suite of
small benchmark programs, and were amazed when we saw a 40 percent average
speed increase, with improvements ranging up to a factor of two in execution
speed.



_PROGRAMMING THE PENTIUM PROCESSOR_
by Ramesh Subramaniam and Kiran Kundargi

Example 1: (a) Instruction I2 is dependent on I1, so it can't be
executed until I1 is complete; (b) the same instruction stream can
be rescheduled by a Pentium-aware compiler, resulting in parallel
execution.

(a)

mov eax, source ;I1: move source contents to eax
add eax, ebx ;I2: add contents of ebx to eax
instr1... ;I3: some other instruction-
instr2... ;I4: -not dependent on eax, ebx or source
instr3... ;I5: "

(b)

mov eax, source instr1...
instr2... instr3...
add eax, ebx
































June, 1993
PROCESSOR DETECTION SCHEMES


You don't have to write for the least-common denominator




Richard C. Leinecker


Rick was formerly the director of technology at IntraCorp, a company
specializing in entertainment software. Some of his recent titles include
GrandMaster chess, BridgeMaster, and Trump Castle 3. Rick was on the staff at
COMPUTE magazine and still writes a column and various features. He recently
released Graphics Guru, a graphics library that most certainly uses
processor-specific code. You can reach him through the DDJ offices; or on
CompuServe at 74676,457.


While many applications really require the power of 80386 PCs, we can't assume
(or demand) that users of our software have 386-class machines. Consequently,
our programs have to support everything from 8088s to Pentiums. This means
writing for the least-common denominator--and hoping Pentium users won't
notice the difference.
However, it is possible to write processor-specific code, taking advantage of
each processor's strengths using the techniques I present here. The hard part
is finding out which processor, or even which version of a processor, you're
running on; the easy part is knowing what to do. For example, my assembly code
usually has vector tables for each processor, and once I've identified the
CPU, I copy the appropriate vectors into a global table. From then on, it's
completely transparent--you simply make calls based on the vector table.
There are a few steps to finding out what CPU is under the hood. To get started,
I check bits 12-15 of the flags register. If they're always set, regardless of
how you alter them, you've got an 8088 or an 8086. Then, I use a
self-modifying code technique to find out if the prefetch queue is 4 or 6
bytes. If it's 4, the chip is an 8088; 6 means it's an 8086.
To distinguish between a 286 and 386/486s, you need to check for 386/486 flags
that are undefined for the 286. If it's a 386 or better, the next step is to
check the AC bit in the flags. If it can be set, you've got a 486; if it
remains unset, it's a 386.


Using Bits 12-15 of the Flags


Determining whether a processor is an 8088 or 8086 simply requires checking
bits 12-15 of the flags register. Although you can't move the flags directly
to a word register, you can push the flags and pop them into a register. To
move the value back into the flags, reverse the operation.
To test bits 12-15, mask them off by ANDing the working register with 0fffh.
The value must be moved into the flags, then back into the working register.
If bits 12-15 are all set, the processor is an 8088 or an 8086. The ; Is It an
8088? segment of Listing One (page 126) illustrates this.
Differentiating between 8088s and 8086s is trickier. The easiest way I've
found to do it is to modify code that's five bytes ahead of IP. Since the
prefetch queue of an 8088 is four bytes and the prefetch queue of an 8086 is
six bytes, an instruction five bytes ahead of IP won't have any effect on an
8086 the first time around. See the ; Is It an 8086? segment in Listing One.


Differentiating between the 286, 386, and 486


The easiest way to determine whether or not the CPU is an 80286 is to set the
NT (nested task flag) and IOPL (I/O privilege level) bits in the flags
register, then check to see if they're still set. These bits are undefined for
286s and will hold 0s no matter what you do. Since you can't set the flag bits
directly, you have to use ax and the stack to set the bits indirectly. The
; Is It an 80286? part of Listing One describes this.
Actually, it turns out there's one case where IsItA286 returns a bogus value,
indicating a 286 when in fact you're running on a 386. This is due to an
obscure bug in Windows 3.1, as Bob Moote of Phar Lap Software has graciously
pointed out. In a DOS box under Windows 3.1 enhanced mode, the pushf and popf
instructions are virtualized if the DOS box is running at IOPL 0. According to
Bob, when you execute the virtualized pushf, Windows doesn't remember that you
tried to set those bits, and it returns the real hardware bits, which are all
0s. Consequently, you think you're on a 286 when it's really a 386 (or later).
Keep in mind that this only happens under Windows 3.1 once out of every
several thousand runs because Windows doesn't use IOPL 0 that often. Phar Lap
gets around this problem in its 386DOS-Extender (once they know they're on at
least a 286) by doing an additional test before trying the flag-setting test.
They use the SMSW instruction to check if the PE (protected mode) bit is set;
if so, it's in V86 mode and must be a 386 or later. If the PE bit indicates
real mode, then they do the flag-setting test to see if it's a 286.
If you know you've got a 386 or higher, you have to look for a 486. The
easiest way I've found to do this is to set the 486 AC (alignment check) bit
in the flags and see if it remains set. If it does, it's a 486. The code
segment in Listing One labeled ; Is It an 80386 or 80486? shows how to do this.
For a Pentium-detection scheme, see the accompanying text box entitled
"Pentium Detection." For a cache-detection technique, see the text box "486
Cache Detection."


Conclusion


Now you have no more excuses. Write the best code for the processor you've
got, and give users their money's worth. Supporting 386s and higher is a
solution, but not one nearly as professional as multiprocessor support.


486 Cache Detection


Steve Heller
Steve is the author of Large Problems, Small Machines (Academic Press, 1992)
and president of Chrysalis Software, P.O. Box 0335, Baldwin, NY 11510. He can
be contacted on CompuServe at 71101,1702.
While Intel's 486 has an internal cache of 8 Kbytes, 486 clones from other
manufacturers have internal caches of varying sizes. Cyrix's 486SLC, for
instance, has a 1K cache, while IBM's 486SLC has one that's 16 Kbytes in size.
Furthermore, most 486-based PCs also have additional external caches. The
program presented here (see Listing Three, page 127) detects whether internal
or external (or both) caches are active so that you can turn them on or off,
if necessary.
This approach takes advantage of the fact that jumping to an instruction that
isn't aligned on a doubleword boundary is time consuming when there's no
caching, as the processor has to access additional memory words to extract the
next instruction to be executed. The program has two loops: one that executes
an indirect jump to a properly aligned target, and a second that executes the
same jump to a misaligned target. If the 486/66 DX2 internal cache is active,
the two loops take almost identical amounts of time. If only the external
cache is active, the misaligned loop takes about 50 percent longer to execute
than the aligned one. If neither is active, the misaligned loop takes about
twice as long to execute as the aligned one. Similar results occur on a 486/33
and on a Cyrix 486SLC/50. Even a 386/40 DX with an external cache displays a
noticeable effect, although of much smaller magnitude. Keep in mind that if
the internal cache is active, this program cannot determine the status of the
external cache, as it is completely masked by the internal caching.
The program was compiled using Borland C++ 3.1 and should be straightforward
to port to other compilers as long as they allow inline assembly language.
However, any changes must not disturb the alignment of the first loop and the
misalignment of the second.



Pentium Detection



Robert Moote
Robert is a cofounder of Phar Lap Software and can be contacted at 60 Aberdeen
Ave., Cambridge, MA 02138.
To determine whether or not the target system is Pentium based, first
ascertain that you are at least on an i486 chip, as Rick describes. Then call
the detect routine in Listing Two (page 127), which returns 4 if the processor
is an i486, 5 or greater if the processor is a Pentium or later chip.
In the Pentium chip, Intel finally designed in a software-detection method by
adding the CPUID instruction for obtaining information about the chip family,
stepping, and unique features, and by adding an ID bit to the EFLAGS register
to ascertain whether the processor supports the CPUID instruction. The code
first attempts to toggle the ID bit in the EFLAGS register; the Pentium
processor is the first in the 80x86 family to allow this bit to be set to both
1 and 0.
The CPUID instruction takes an input value in the EAX register. A 0 in EAX
returns the highest input value recognized by CPUID in EAX (this is 1 for the
Pentium chip), and a vendor identification string in EBX-EDX (GenuineIntel for
Intel's chip). Passing a 1 in EAX returns the chip information in EAX-EDX;
this is the form of CPUID used in the detect code. On the Pentium, the chip
family, model, and stepping are returned in EAX, and some bit flags
identifying processor features are returned in EDX. The detect routine returns
the chip family code, taken from bits 8-11 of EAX. The value 5 is used to
identify the Pentium chip family; if Intel takes the logical step of
incrementing this ID value for each subsequent chip in the 80x86 processor
line, this is the last 80x86 detect routine you'll have to write.
Of course, the code presented here requires an assembler that supports Pentium
opcodes. Note that the .586 directive is how you turn on Pentium support in
the Phar Lap assembler; other assemblers may use different directives, since
the chip isn't called a 586. You can also code the detect routine with any
assembler just by using the DB directive to hardcode the opcode for CPUID.


_PROCESSOR DETECTION SCHEMES_
by Richard C. Leinecker


[LISTING ONE]

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Detect the Processor Type -- by Richard C. Leinecker ;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

_PTEXT SEGMENT PARA PUBLIC 'CODE'
 ASSUME CS:_PTEXT, DS:_PTEXT

 public _Processor
; This routine returns the processor type as an integer value.
; C Prototype
; int Processor( void );
; Returns: 0 == 8088, 1 == 8086, 2 == 80286, 3 == 80386, 4 == 80486
; Code assumes es, ax, bx, cx, and dx can be altered. If their contents
; must be preserved, then you'll have to push and pop them.
_Processor proc far

 push bp ; Preserve bp
 mov bp,sp ; bp = sp

 push ds ; Preserve ds
 push di ; and di

 mov ax,cs ; Point ds
 mov ds,ax ; to _PTEXT

 call IsItAn8088 ; Returns 0 or 2

 cmp al,2 ; 2 = 286 or better
 jge AtLeast286 ; Go to 286 and above code

 call IsItAn8086 ; Returns 0 or 1

 jmp short ExitProcessor ; Jump to exit code
AtLeast286:
 call IsItA286 ; See if it's a 286

 cmp al,3 ; 3 and above for 386 and up
 jl short ExitProcessor ; Jump to exit code
AtLeast386:
 call IsItA386 ; See if it's a 386
ExitProcessor:
 pop di ; Restore di,
 pop ds ; ds,

 pop bp ; and bp

 ret ; Return to caller
_Processor endp

; Is It an 8088?
; Returns ax==0 for 8088/8086, ax==2 for 80286 and above
IsItAn8088 proc
 pushf ; Preserve the flags

 pushf ; Push the flags so
 pop bx ; we can pop them into bx

 and bx,00fffh ; Mask off bits 12-15
 push bx ; and put it back on the stack

 popf ; Pop value back into the flags
 pushf ; Push it back. 8088/8086 will
 ; have bits 12-15 set
 pop bx ; Get value into bx
 and bx,0f000h ; Mask all but bits 12-15

 sub ax,ax ; Set ax to 8088 value
 cmp bx,0f000h ; See if the bits are still set
 je Not286 ; Jump if still set (not a 286)

 mov al,2 ; Set for 286 and above
Not286:
 popf ; Restore the flags

 ret ; Return to caller
IsItAn8088 endp

; Is It an 8086?
; Returns ax==0 for 8088, ax==1 for 8086
; Code takes advantage of the 8088's 4-byte prefetch queue and the 8086's
; 6-byte prefetch queue. By self-modifying the code at a location exactly 5
; bytes away from IP, and determining if the modification took effect,
; you can differentiate between 8088s and 8086s.
IsItAn8086 proc

 mov ax,cs ; es == code segment
 mov es,ax

 std ; Cause stosb to count backwards

 mov dx,1 ; dx is flag and we'll start at 1
 mov di,OFFSET EndLabel ; di==offset of code tail to modify
 mov al,090h ; al==nop instruction opcode
 mov cx,3 ; Set for 3 repetitions
 REP stosb ; Store the bytes

 cld ; Clear the direction flag
 nop ; Three nops in a row
 nop ; provide dummy instructions
 nop
 dec dx ; Decrement flag (only with 8088)
 nop ; dummy instruction--6 bytes
EndLabel:

 nop

 mov ax,dx ; Store the flag in ax

 ret ; Back to caller
IsItAn8086 endp

; Is It an 80286?
; Determines whether processor is a 286 or higher. Going into subroutine, ax = 2.
; If the processor is a 386 or higher, ax will be 3 before returning. The
; method is to set ax to 7000h, which represents the 386/486 NT and IOPL bits.
; This value is pushed onto the stack and popped into the flags (with popf).
; The flags are then pushed back onto the stack (with pushf). Only a 386 or 486
; will keep the 7000h bits set. If it's a 286, those bits aren't defined and
; when the flags are pushed onto the stack these bits will be 0. Now, when ax
; is popped these bits can be checked. If they're set, we have a 386 or 486.
IsItA286 proc
 pushf ; Preserve the flags
 mov ax,7000h ; Set the NT and IOPL flag
 ; bits only available for
 ; 386 processors and above
 push ax ; push ax so we can pop 7000h
 ; into the flag register
 popf ; pop 7000h off of the stack
 pushf ; push the flags back on
 pop ax ; get the pushed flags
 ; into ax
 and ah,70h ; see if the NT and IOPL
 ; flags are still set
 mov ax,2 ; set ax to the 286 value
 jz YesItIsA286 ; If NT and IOPL not set
 ; it's a 286
 inc ax ; ax now is 3 to indicate
 ; 386 or higher
YesItIsA286:
 popf ; Restore the flags

 ret ; Return to caller
IsItA286 endp

; Is It an 80386 or 80486?
; Determines whether processor is a 386 or higher. Going into subroutine ax=3
; If the processor is a 486, ax will be 4 before leaving. The method is to set
; the AC bit of the flags via EAX and the stack. If it stays set, it's a 486.
IsItA386 proc
 mov di,ax ; Store the processor value
 mov bx,sp ; Save sp
 and sp,0fffch ; Prevent alignment fault
 .386
 pushfd ; Preserve the flags

 pushfd ; Push so we can get flags
 pop eax ; Get flags into eax
 or eax,40000h ; Set the AC bit
 push eax ; Push back on the stack
 popfd ; Get the value into flags
 pushfd ; Put the value back on stack
 pop eax ; Get value into eax
 test eax,40000h ; See if the bit is set

 jz YesItIsA386 ; If not we have a 386

 inc di ; Bump value to 4 to indicate 486
YesItIsA386:
 popfd ; Restore the flags
 .8086
 mov sp,bx ; Restore sp
 mov ax,di ; Put processor value into ax

 ret ; Back to caller
IsItA386 endp

_PTEXT ENDS
 END





[LISTING TWO]

.586
; Pentium detect routine. Call only after verifying processor is an i486 or
; later. Returns 4 if on i486, 5 if Pentium, 6 or greater for future
; Intel processors.
EF_ID equ 200000h ; ID bit in EFLAGS register
Pentium proc near

; Check for Pentium or later by attempting to toggle the ID bit in EFLAGS reg;
; if we can't, it's an i486.
 pushfd ; get current flags
 pop eax ;
 mov ebx,eax ;
 xor eax,EF_ID ; attempt to toggle ID bit
 push eax ;
 popfd ;
 pushfd ; get new EFLAGS
 pop eax ;
 push ebx ; restore original flags
 popfd ;
 and eax,EF_ID ; if we couldn't toggle ID, it's an i486
 and ebx,EF_ID ;
 cmp eax,ebx ;
 je short Is486 ;

; It's a Pentium or later. Use CPUID to get processor family.
 mov eax,1 ; get processor info form of CPUID
 cpuid ;
 shr ax,8 ; get Family field; 5 means Pentium.
 and ax,0Fh ;
 ret
Is486:
 mov ax,4 ; return 4 for i486
 ret
Pentium endp






[LISTING THREE]

#pragma inline
#include <stdio.h>
#include <time.h>
main()
{
 long start1;
 long end1;
 long start2;
 long end2;
 start1 = clock();
asm P386
asm mov eax,10000000
asm lea ebx,loop1
asm loop1:
asm dec eax
asm jz loop1e
asm jmp ebx
loop1e:
 start2 = end1 = clock();
asm mov eax,10000000
asm lea ebx,loop2
asm nop
asm loop2:
asm dec eax
asm jz loop2e
asm jmp ebx
loop2e:
 end2 = clock();

 printf("aligned loop time = %d, misaligned loop time = %d\n",
 (int)(end1-start1), (int)(end2-start2));
 return 0;
}




























June, 1993
DETECTING OUT-OF-RANGE REFERENCES


Protecting yourself from memory-access violations




Chan Y. Lee


Chan is a member of the technical staff in the Am29000 Systems Engineering
Embedded Processor Division for Advanced Micro Devices. He can be contacted at
chanl@cayman.amd.com.


An out-of-range reference is an access attempt to a location outside the
permitted memory range. Several conditions can cause this runtime error: an
out-of-bound array index, uninitialized pointer variables, or dynamic storage
overflows, for example. Out-of-range errors occur more frequently as more
program variables are bound at run time to facilitate efficient memory
utilization. Growing usage of indirect procedure calls in object-oriented
programming is another reason for this trend. Since there's no way to detect
an invalid binding of a pointer variable at compile time, out-of-range
references are becoming a major source of runtime errors in embedded-systems
programming.
Out-of-range errors are not a problem in conventional computing systems.
Typically, a protection mechanism verifies every memory reference.
Out-of-range references are detected and reported, thereby preventing system
crashes. With UNIX systems, for example, the familiar messages, "Bus
error--core dumped" and "Segment violation--core dumped" are generated upon an
invalid access attempt. A memory-management unit or boundary-checking logic is
responsible for this access protection and is crucial for maintaining the
conventional system's integrity.


Out-of-range References in Embedded Systems


Unlike conventional computing systems, most embedded systems are unprotected
from access violations. Protection mechanisms have not been a critical
requirement for embedded systems because embedded programs are thoroughly
debugged and frozen before product release. These "error-free" programs
eliminate the expensive production cost of extra protection hardware.
The basis of the embedded-systems design philosophy requires a thorough
debugging strategy that eliminates run-time errors before system release. In
the absence of built-in protection, embedded-systems developers use emulators
and logic analyzers to handle out-of-range reference errors. This debugging
process involves repeated attempts at setting the debug equipment triggers and
rerunning the target application. These attempts are time consuming and often
result in poor productivity. Out-of-range references are more frequent with
dynamic storage utilization and certain kinds of object-oriented programming
where indirect references are used extensively. Because of this time increase,
the debug-equipment based technique is no longer efficient for supporting
error-free programming strategy, especially on tightly scheduled projects.
Equipment cost is also a problem. Although stand-alone debug equipment is
essential in a development workbench, this equipment is often too expensive to
justify acquisition for a moderately funded project. Additionally, the
personality module for a newly introduced microprocessor may not be available.
When an access violation occurs in unprotected embedded systems, the outcome
might appear in a variety of ways, often confusing the developer.
Access-violation results range from no change in behavior to catastrophic
system failure, depending on whether the violating address is within the
physical-address boundary. The outcome also depends on the difference in the
memory-interface mechanism. The access error may not impact the system at all,
if the violating address points to a free memory location within the physical
boundary. For systems with a memory interface based on a fixed number of wait
cycles, the CPU accepts the instruction/data bus contents as long as the
address is not rejected by the system. In this case, the system gets garbage
on the bus, but the violating access may not cause an immediate crash.
However, the garbage may crash the system at a later stage, and this delayed
crash will take longer to solve. In both crash situations, the lack of a
built-in protection/detection mechanism will necessitate using debug equipment
to determine where the system crashed and what kind of error condition caused
the violating address. The address may be the direct result of either an
uninitialized pointer or a sequence of previous out-of-range references.
Out-of-range errors result in one of three typical outcomes: system crash,
illegal opcode or unexpected control flow, or no observable event.
System crash may occur if an access request cannot be granted or the violating
address cannot be decoded by the CPU. In the first situation, a memory-access
request is on hold if the memory interface cannot decode the address. After
issuing a request, the CPU waits for a Ready signal. If the memory interface
does not respond to the invalid access request, then the CPU can no longer
proceed, resulting in virtual deadlock. This type of error is the easiest to
track because the violating address can be observed immediately using a logic
analyzer. Ideally, the memory interface should signal the CPU for the address
exception, thereby preventing the CPU from waiting indefinitely for the Ready
signal.
In the second situation, the system will actually crash because either the CPU
cannot decode an out-of-range address, or the invalid address corrupted a
memory-mapped control register. For example, the Am29200 CPU partitions a
whole address space into a set of sub-spaces for different memory classes such
as ROM, RAM, and memory-mapped peripherals. A violating address will crash
this type of CPU if the address does not belong to any of the memory classes.
Illegal opcode traps and unexpected control-flow problems occur when an
indirect jump or a procedure call is taken with an uninitialized or corrupted
instruction address. Illegal opcode traps result when the invalid address
targets a data-memory location and the contents of the incoming instruction
bus cannot be decoded as a valid instruction. Since most embedded CPUs are
capable of trapping illegal opcodes, this error is easily detected. However,
discovering the problem's origin may be time consuming if the invalid
instruction address is not directly generated from a program variable.
Unexpected control-flow problems result from an invalid address pointing to an
incorrect instruction memory location. After a jump or a procedure call, the
system loses control because the program execution continues from the
incorrect location. This problem may be more difficult to resolve, as the
system behavior can only be analyzed by examining a long sequence of
instruction traces.
The result of an out-of-range data reference is unnoticeable if the reference
accidentally accesses a free memory location without corrupting other
allocated memory objects. As long as the accessed memory location remains
free, this particular out-of-range reference is completely hidden--even from a
production test suite. Although this reference sounds harmless, the error
makes the system unusable when the accidentally accessed location is no longer
free. For example, a data-pointer variable might go beyond the boundary
of heap space and grab a free location. In test cases not requiring the free
memory location, this error is never detected. However, when the operation of
the system claims the free memory location at run time, the system suddenly
stops working. Because this error may necessitate a costly software revision
after product release, non-observable errors may eventually be huge problems.


A Software-driven Detection Scheme


One way of coping with these problems is to develop a software-alternative
scheme comparable to the conventional system-hardware protection mechanism
that verifies address access. One approach to the software scheme is utilizing
an instruction-trace facility, if available, and letting the trace-trap
handler verify both the instruction and data address for every instruction
before execution. The trace-trap handler would first compare the contents of
program counter with the current code segment's boundary address. If the
program counter is within the boundary, then the handler fetches and decodes
the instruction to see if it involves any data access. If so, the handler then
verifies the data address against the allocated data boundary. This
straightforward scheme guarantees the detection of any out-of-range reference
at a cost of extra instruction cycles for executing the instruction trace-trap
handler. The cycles consumed by the handler significantly increase execution
time, since the trace trap is performed for every instruction to be executed.
This inherent overhead of the scheme is acceptable for testing most embedded
applications except certain real-time applications with strict timing
requirements. To keep the detection overhead as small as possible, it's
preferable for the scheme to use CPUs consuming few extra cycles before
entering the trap routine.
Listing One (page 128) shows an Am29200 RISC microcontroller-based,
out-of-range, detection-trap routine. This specific implementation executes
about 50 instructions to verify the instruction, stack, and RAM data
address. If the verified instruction is not a LOAD or STORE, the detection
routine consumes only 25 instruction cycles. Since the Am29200's trap latency
is only a few cycles, verifying every instruction and data access for a
particular program takes less than 50 times the total application execution
time. This increased execution time in testing is generally acceptable for
most embedded applications. For real-time applications requiring system
response within a few cycles, this detection scheme may not be directly
applicable. The scheme, however, may still be utilized with simulated
interfaces to real-time devices to maintain the system's normal course of
operation during the testing stage.
The instruction-trace trap-based scheme provides embedded systems with a
detection mechanism comparable to the hardware counterpart, yet it doesn't add
any production cost. The additional time allowed for extra instruction cycles
in the verification stage is the only cost for developers. Software-driven
detection is also inherently flexible, both in its setup and operation mode.
It may choose to verify the instruction address only, or it can check every
external reference, including stack, heap storage, and memory-mapped
peripheral devices. The detection scheme may also be enabled for necessary
access verification and then later disabled. The scheme can be enabled only
for a portion of program under development, rather than attempting
verification of the whole program. Additionally, the software-driven scheme
allows on-the-fly update of the boundary address. This address may be
constantly changing, depending on dynamic memory allocation and deallocation.
To effectively implement the software-driven detection scheme, the target CPU
must support these functions: instruction-trace facility, a very short trap
latency, and on-chip storage for the boundary address. These features are
mostly required to shorten the detection overhead, thus reducing the
access-verification time. Note that they not only benefit the detection scheme
but also help in building debug monitors and improving CPU performance. The
Am29200, for example, supports these features to allow fast context switching
as well as implementing the software-driven detection scheme.
An on-chip instruction-trace facility is necessary because the tracing event
activates the verification routine after executing every instruction. For CPUs
without the tracing capability, a timer can be used to activate the
verification routine at regular intervals. This timer-based scheme is only
able to verify the particular instruction executed at the time of the
interrupt. However, with reduced timer-interrupt intervals, this alternative
approach may still provide a good detection coverage on out-of-range
references.
The trap latency is important for the detection scheme since the delaying
period may greatly impact the overhead. Depending on the target CPU
architecture, the latency between an instruction-trace event and the start of
the trace-trap handler may range from a few cycles to hundreds of cycles. The
cycles are required to read the trap vector, jump to the handler, and save the
current context in memory for most CPUs. Because saving context uses many
memory cycles and increases the overhead of the scheme, it should be avoided
if the CPU architecture allows. For example, the Am29200 doesn't need context
saving because the critical registers composing the context are frozen upon a
trap. The frozen context will not be disturbed by the detection routine shown
in Listing One. By not saving context, the overhead of the scheme is greatly
reduced. In Listing One, an average of less than 50 instructions are executed
to verify both the instruction and data addresses.
If the boundary addresses are stored in memory, the detection routine must
spend many memory cycles accessing the boundary information. Since the routine
accesses at least four boundary addresses, the speed of the boundary-address
storage greatly impacts the time spent by the detection routine. The boundary
addresses are better kept in registers to avoid memory access. The Am29200
provides register storage for the boundary addresses, thereby reducing the
time spent by the detection routine. In Listing One, gr125/gr127
(general-purpose registers 125 and 127) are allocated for stack-boundary
checking, gr86 for the heap pointer, and gr114/gr115 for instruction-boundary
checking. The detection routine makes only one memory access: fetching the
instruction itself to see whether it is a LOAD or STORE. The boundary-address
registers also speed the boundary-updating process, because no extra memory
cycles are needed to update a boundary.


Implementation Issues


In addition to the CPU requirements, the following implementation-related
issues need to be carefully addressed in order to effectively utilize the
software-driven detection scheme.
Before starting the verification process, the boundary addresses must be
correctly initialized and updated to reflect any boundary changes. The most
convenient way to forward this boundary information is through a set of
procedure calls. Some boundary information may be initialized and updated
automatically. In an Am29200-based environment, the stack and heap boundaries
are managed by a C runtime kernel that adjusts the tops of those regions as
programs execute. The boundaries of the Am29200's instruction and static-data
regions, however, need to be initialized explicitly.
Fragmented memory in a given region may require multiple verifications against
multiple sets of boundaries to ensure that an access address is not out of
range. Because of this requirement, the target object code should be linked in
single, unfragmented regions.
To support multitasking environments, the verification process needs to manage
a table that stores the boundary information for all the active tasks. When a
task switch occurs, the operating system must inform the verification process
of the new task's identification. The verification process then updates the
boundary registers. Depending on the system architecture, a task switch may
automatically update the boundary registers accessed by the verification
process, in which case no management of boundary information is necessary.
Since an out-of-range reference error is normally unrecoverable, the only
choice is to abort the execution and report the exception to the developer for
troubleshooting. The error information may include the out-of-range address,
the address of the violating instruction, and possibly a dump of the register
contents. The simplest way to report an out-of-range error is to provide the
application program with an error-handling routine.
In Listing One, the violating address and instruction, together with the
memory region type, are transferred to the error-handling routine
Report_Out_Range. The routine then reports the error information. It can be
written in either assembly or C, but note that an adequate C runtime stack
environment must be provided when the error is reported by a C routine.
If supported, a more convenient way of error reporting is to use the SIGNAL
facility to take advantage of the standard signals such as SIGBUS or SIGSEGV.

_DETECTING OUT-OF-RANGE REFERENCES_
by Chan Y. Lee



[LISTING ONE]

; TraceTrap - detects out-of-range references
; This instruction-trace trap handler verifies both the instruction and data
; addresses to detect an out-of-range reference. It first checks the contents
; of 'pc0', which holds the address of the instruction to be executed when
; this trap returns. If 'pc0' is good, it then fetches the instruction from
; memory and decodes it to see if the instruction makes any memory reference.
; If so, the handler checks the data address. When it finds an access that is
; out of range, it invokes a user-provided procedure to report the error.

 .global TraceTrap
TraceTrap:
 ; Check for out-of-range instruction address. The 'pc0' contains address
 ; of the instruction to be fetched. Instruction address is verified
 ; against both ROM and RAM instruction boundaries. ROM boundary checking
 ; can be omitted if code under verification doesn't reside in ROM.

 mfsr treg1, pc0 ; obtain instruction address from program counter 0
 srl treg2, treg1, 28 ; check address qualifier first
 cpeq treg3, treg2, 0 ; see if it is an instruction ROM access

 jmpt treg3, CheckInstROM
 cpeq treg3, treg2, 4 ; see if it is an instruction RAM access
 jmpf treg3, InstOutRange ; error if neither ROM nor RAM qualified
CheckInstRAM:
 cpge treg2, treg1, RAM_pc_low
 jmpf treg2, InstOutRange
 cple treg2, treg1, RAM_pc_high
 jmpf treg2, InstOutRange
 nop
 jmp CheckRegStack
CheckInstROM:
 cpge treg2, treg1, ROM_pc_low
 jmpf treg2, InstOutRange
 cple treg2, treg1, ROM_pc_high
 jmpf treg2, InstOutRange
CheckRegStack:
 ; check for out-of-range register stack pointer (gr1).
 cpge treg2, gr1, gr126 ; gr126 is the Register Allocate Bound
 jmpf treg2, StackOutRange
 cple treg2, gr1, gr127 ; gr127 is the Register Free Bound
 jmpf treg2, StackOutRange

 ; Verification of instruction counter and reg stack pointer is done.
 ; Now fetch the instruction and decode it to see if it is a LOAD or STORE.
 ; If so, check data address contained in register operand of instruction.
 load 0, 0, treg2, treg1 ; fetch the instruction from memory
 srl treg3, treg2, 24
 cpeq treg3, treg3, 0x17 ; look for LOAD
 jmpt treg3, CheckDataAddress
 srl treg3, treg2, 24
 cpeq treg3, treg3, 0x1E ; look for STORE
 jmpf treg3, TrapExit
CheckDataAddress:
 sll treg2, treg2, 24 ; to reset all of the bits except register number
 jmpt treg2, LocalAddressReg ; if MSB is set, a local register has address
 srl treg2, treg2, 24 ; now treg2 has the reg number

 const treg3, GlobalRegTable
 jmp GetDataAddress
 consth treg3, GlobalRegTable
LocalAddressReg:
 andn treg2, treg2, 0x80 ; reset the MSB of local register number
 const treg3, LocalRegTable
 consth treg3, LocalRegTable
GetDataAddress:
 srl treg2, treg2, 3 ; table offset from reg number
 add treg2, treg2, treg3 ; jump location to get address
 const lr0, DataAddressObtained
 jmpi treg2
 consth lr0, DataAddressObtained
DataAddressObtained:
 ; 'treg2' now has the data address for either STORE or LOAD instruction.
 ; Verify the data address against the following regions:
 ; 1. Data RAM (0x4XXXXXXX)
 ; This includes the stack, heap and static data.
 ; 2. Peripheral Control Registers (0x800000XX)
 ; Access to the on-chip Peripheral Control Register.
 ; 3. Peripheral Interface Adaptor5 (0x950000XX)
 ; Access to the PIA5 region ranging from 0x95000000 to 0x950000FF.
 srl treg3, treg2, 28 ; to get the address qualifier
 cpeq treg3, treg3, 0x4 ; data RAM address?
 jmpt treg3, CheckDataRAM
 srl treg3, treg2, 28
 cpeq treg3, treg3, 0x8 ; peripheral control register?
 jmpt treg3, CheckPCR
 srl treg3, treg2, 24
 cpeq treg3, treg3, 0x95 ; peripheral adaptor region 5?
 jmpt treg3, CheckPIA5
 nop
 jmp DataOutRange ; report data address error if the qualifier doesn't
 ; match with any Am29200 region currently defined
CheckDataRAM:
 cpgt treg3, treg2, RAM_pc_high ; check base of RAM data
 jmpf treg3, DataOutRange
 cple treg3, treg2, gr86 ; check top of RAM data(heap ptr:gr86)
 jmpf treg3, DataOutRange
 nop
 jmp TrapExit ; Data RAM address is good
CheckPCR:
 const treg3, 0x800000FC
 consth treg3, 0x800000FC
 cpgt treg3, treg2, treg3 ; valid range: 0x80000000 to 0x800000FC
 jmpt treg3, DataOutRange
 nop
 jmp TrapExit ; Peripheral Control Register is good
CheckPIA5:
 const treg3, 0x950000FF
 consth treg3, 0x950000FF
 cpgt treg3, treg2, treg3 ; valid range: 0x95000000 to 0x950000FF
 jmpt treg3, DataOutRange
; TrapExit - Address verification is completed. Return to the application.
TrapExit:
 ; Clear TP(Trace Pending) bit from OPS(Old Processor Status) register
 mfsr treg3, ops
 const treg2, TP
 andn treg3, treg3, treg2

 mtsr ops, treg3
 iret


June, 1993
HIPPI AND HIGH-PERFORMANCE LANS


Supercomputer technologies for high performance




Andy Nicholson


Andy is a software engineer working on mobile computing at Microsoft. When he
originally wrote this article, Andy developed network software for Cray
Research. He can be contacted at One Microsoft Way, Redmond, WA 98052.


The high-performance parallel interface (HIPPI) is the first standard computer
network architecture to use crosspoint switches as the basis for a local area
network. Currently, HIPPI is focused on point-to-point, high-throughput data
connections useful for applications such as real-time graphics. A HIPPI LAN
not only offers very high throughput, but demonstrates production-quality
technology for building bandwidth-scalable networks using components available
today. Making the best use of this capability, however, requires that we
rethink some of our fundamental assumptions about the architecture of computer
networks and network applications. Ultimately, we must modify or replace the
client/server model with peer-to-peer oriented applications to take advantage
of the increased bandwidth offered by HIPPI and other crosspoint-switch based
networks.
HIPPI, which has its roots in supercomputer I/O interfaces, defines a
standard for 100-Mbyte/sec and 200-Mbyte/sec point-to-point simplex links
using 32-bit wide or 64-bit wide copper cabling. (An implementor's agreement
also exists for building fiber-optic extenders that use two or four fibers.) A
related standard defines the behavior of a cross-point switch used to support
multiple interconnects between HIPPI interfaces on different hosts. These
crosspoint switches are central to building a LAN with HIPPI technology.
There's also a proposed Internet standard for implementing an IP network on
HIPPI switches. These standards and commercially available components make it
possible to build a production HIPPI LAN. At Cray Research, we used these
standard components to build a sophisticated HIPPI LAN for production and
development work.


Crosspoint Switches and Network Scalability


A crosspoint switch (also known as an NxN or crossbar switch) connects hosts
into a LAN such that each host may have a dedicated connection to at least one
other host in the network. Each connection may transfer data at the full
throughput capability of the connection media. A 32x32 crosspoint switch
(Network Systems Corporation's PS32, the largest currently available) allows
each of 32 hosts to maintain a full 200-Mbyte/sec throughput data connection.
That's 6.4 Gbytes/sec (or 51.2 Gbits/sec) of total network bandwidth!
Two properties of crosspoint-switch based networks are very exciting compared
to bus- or ring-based networks such as Ethernet or FDDI. First, a
crosspoint-switch network has constant throughput per host. Second, a
crosspoint-switch network has increasing bandwidth that scales to the number
of hosts. Regardless of the number of hosts attached to the network, a
crosspoint-switch network delivers constant throughput capability to each
host. Because of this, the network bandwidth (the throughput times the number
of hosts) scales up as the network grows in size.
This scalability is the most valuable feature of a crosspoint-switch based
network. In a HIPPI network, it makes little sense to build a 200-Mbyte/sec
data interface if contention over a shared media (like a bus or ring) lowers
the average throughput to a fraction of what is possible. The interface is
simply too expensive to waste. For example, with the 32x32 HIPPI crosspoint
switch, all hosts can simultaneously transfer data at 200 Mbytes/sec.
Thirty-two hosts attached to a 100-Mbit/sec FDDI ring network will achieve (at
most) an average of 3 Mbits/sec throughput per host. Crosspoint-switch
networks, therefore, are the most efficient users of the network interface.
The fundamental difference between scalable crosspoint-switch networks and
traditional bus or ring networks is that, relative to the number of hosts, a
crosspoint-switch network is throughput limited and bandwidth scalable, and a
bus or ring is throughput decreasing and bandwidth limited. In short, the
network itself is a bottleneck in bus and ring networks.
Before HIPPI technology, there was no standard networking hardware available
that allowed us to build a bandwidth-scalable LAN. The availability of a
proposed Internet standard for using the IP protocol over a HIPPI
crosspoint-switch based LAN puts this technology into use in a production
environment. Still, the current network computing model is not sufficiently
sophisticated to take advantage of the greater capabilities of a
bandwidth-scalable LAN.


HIPPI and Client/Server Architectures


Software designers have traditionally treated the network as a shared resource
when creating network applications. Client/server distributed computing is well
suited to this network model because contention for the shared network
resource hides contention for the shared server resource. Only one server is
needed because only one host may access the network at any time. Building a
LAN with HIPPI technology, however, exposes the limitations of this approach.
Bandwidth-limited network. The client/server architecture encourages balance
on the bandwidth-limited network. The server must be capable of servicing
requests only according to the network's capacity to carry requests and
responses. Clients are limited to their fair share of the bandwidth available
in the network. Additional capacity in the server is wasted, and additional
capacity in the clients only increases contention over the network. Adding
more clients also increases network contention.
Network contention encourages some interesting solutions to increasing
aggregate system performance. One solution is to optimize use of the network.
The server should respond to requests with minimum latencies, and clients
should make a minimal number of requests. This has been accomplished in some
networks by adding expensive local disks to previously diskless workstations.
Of course, much of the same data is replicated (such as commonly used
programs) across all of the local disks.
Another solution is to partition networks. Instead of adding capacity to each
server, networks are broken up into multiple networks with a smaller number of
clients and individual, dedicated servers. A variation on this strategy is to
add multiple network interfaces to higher-capacity servers to support
multiple-client networks on each server. These strategies increase the load on
the servers because they become routers between client networks. Partitioning
is more expensive in terms of server hardware, network hardware, routers, and
network-management effort.
Crosspoint-switch network. A crosspoint-switch network has very different
properties. Foremost, the network media is not bandwidth limited and is not a
shared resource. The point of contention becomes the network interface at each
host. This is a liability in a client/server architecture. The server can
respond to requests only according to the capability of the network interface.
However, the network is capable of carrying requests according to the number
of clients on the network.
Not surprisingly, another solution to overloaded networks has appeared.
Localtalk, Ethernet, and FDDI switches are available from Tribe Computer Works
(Emeryville, California), Kalpana (Santa Clara, California), and Digital
Equipment Corp., respectively. These products are designed and marketed
specifically to offer scalability over typical partitioning approaches for
increasing aggregate network bandwidth. However, their potential is limited by
clients creating significant contention over single servers. The Localtalk,
Ethernet, or FDDI switch must distribute network load to several servers to
increase overall network bandwidth utilization.
Building a balanced client/server network using a crosspoint switch requires
that a server service requests according to the capacity of clients to
generate those requests. This requires a high-performance server, and it may
require multiple network interfaces in order to bypass the bottleneck of
contention for its network interface.
This is an advantageous approach to building a balanced system because it
scales more easily than a bandwidth limited network. Adding new clients to the
network requires only additional capacity at the server and not a wholesale
replacement or repartitioning of the network. However, we can go further in
balancing system resources in a bandwidth-scalable network.


Client/Server vs. Peer-to-Peer


Because a bandwidth-scalable network offers the same throughput to each
network interface no matter how many interfaces are active in the network, the
client/server model of distributed computing appears less useful. A
peer-oriented model makes much more sense. As a first step toward a
peer-to-peer model, the server functions should be split out to separate
servers with a capacity well balanced to the function.
A typical server may provide remote disk, e-mail gateway, printer, dial-up,
and wide area network (WAN) connectivity services in some combination. All of
these services will typically be using the same network interface, yet their
functions are not compatible with optimal use of the interface. For example,
dial-up services will generate many small packets, printer and e-mail will be
larger and bursty, disk will require a lot of large blocks, and WAN
connectivity will generate all sorts of traffic.
A crosspoint-switch network allows each of these services to live in a
separate server. Another possibility allows each service to reside behind a
different interface in one high-capacity server. The system designer has great
flexibility in designing a system that is balanced to make optimal use of
system resources. (See the text box entitled, "Peer Programming: It's a Matter
of Symmetry" for a discussion of peer-to-peer vs. client/server programming.)
Computer-supported collaboration (CSC) is a perfect example of an
intrinsically peer-to-peer application that takes advantage of the enhanced
capabilities of a bandwidth-scalable crosspoint-switch based LAN. The
high-throughput requirements of real-time audio and video are multiplied by
the participation of many people in a collaborative effort. The usefulness of
a scalable bandwidth is obvious when you consider the possibility of people
entering or leaving a collaborative session at random intervals. The network
can adapt to the changing needs of the application.


HIPPI LANs


HIPPI technology is not the only way to build bandwidth-scalable networks
using available products. Nevertheless, HIPPI is the first network medium in
which crosspoint-switch technology is the only method of building a LAN, and
the first to provide a standard method for using the crosspoint switch in
that LAN.
This is an important distinction. A Localtalk, Ethernet, or FDDI switch exists
to increase bandwidth in a bandwidth-limited network. The same networking
model is still in use and there is little incentive to directly connect hosts
to ports on the switch. The Ethernet or FDDI switch breathes new life into a
client/server model on bandwidth-limited networks.
On a HIPPI LAN, there is no such need to increase bandwidth. It is possible to
design new applications for bandwidth-scalable networks on HIPPI without the
existing media limitations. We must learn how to think about
bandwidth-scalable networks. Consider the history of the telephone network.
When telephone service first became available, it was offered in the form of a
party line. A party line was set up so many people would share a circuit to
the central office, and if someone was using the line when you wanted to talk,
you either had to wait or politely ask them to finish their conversation (or
be impolite or claim your call was an emergency!). Conference calling was the
norm in the early days of telephones, and only recently have we regained this
desirable form of communication. The party line is similar to the modern bus
or ring network.

Today's phone system, however, is different from a bus or ring network.
Instead of sharing a party line to the central office, nearly every telephone
customer has a dedicated line to the central-switching office. When you wish
to make a call to another telephone, you encode the destination of your call
inband on the phone circuits and the central switch connects you to that
destination after a short switching delay. You direct the switch to connect
you directly to your peer and you communicate as long as you and your peer
wish to. No one else can connect to you or your peer until you disconnect your
call. This is exactly the same as a crosspoint-switch network using HIPPI
technology--except that the HIPPI switch is very fast, switching on the order
of a microsecond.
The switching of dedicated lines revolutionized telephone service, and today
it is almost impossible to imagine going back to the old party-line system.
(The truly parsimonious, however, may long for the lower party-line rates!)
Crosspoint switches offer the same opportunity to revolutionize local area
computer networks. We must challenge and re-evaluate our fundamental
assumptions about how to use the network.
Consider what can be done with the technology developed for the largest
commercially available HIPPI switch, which supports 32 hosts at 200
Mbytes/sec. The bit rate on each of the 64 data lines is 25 Mbits/sec--a
pretty fast LAN by today's standards (Ethernet). Because it is a 32x32 switch,
there are 1024 switch points inside the switch. Because a 200-Mbyte/sec HIPPI
uses 85 wires, each switch point actually supports 85 switches in a ganged
fashion. What can you do with 85K switch points? You can build a serial N log
N switch supporting four thousand hosts! The hardware complexity is the same
order of magnitude as a fully configured 32-port switch.
My initial reaction is that this is all the switch you could ever use, because
it would be ridiculous to put that many hosts on one LAN. (We can't get
acceptable performance out of LANs with only a few hundred hosts!) But that
reaction rests on an implicit assumption: that throughput per host decreases
as the number of hosts increases, eventually driving performance to an
unacceptably low level, so we partition the network to get better
performance, and the subnetting then brings along a whole new class of
problems. This isn't true on a crosspoint-switch network. Whether you have
four hosts or four thousand hosts or even four million hosts, the throughput
per host remains constant. What could you do with a LAN of four thousand
hosts? No one has really bothered to think about this before. Now it is a
practical exercise to consider how we can better solve our computing problems
with this kind of structure.


Building a HIPPI LAN


At Cray we built a HIPPI LAN comprising almost a dozen CRI computers, a
frame buffer, and other commercial components; see Figure 1. We used a Network
Systems Corporation (NSC) PS8 switch as the basis for our production network
and an NSC PS32 as a development network. Not only were these networks
connected through CRI computers with multiple HIPPI interfaces, but we also
used direct switch-to-switch connections to send traffic between HIPPI
networks. Furthermore, NSC has built HIPPI interfaces for their routers, which
allowed us to route traffic from HIPPI networks to our campus-wide FDDI
backbone, and from there to the Internet. We also used a PsiTech Frame Buffer
for high-throughput video output.
This equipment is located in the computer center, but the reach of the HIPPI
LAN extends to the networking lab in the next building. Using a Broadband
Communications Products' BCP1200 HIPPI fiber-optic extender, we connected our
development network to another mini network using a 4x4 HIPPI switch. Using
VME Bus HIPPI interfaces, we even extended our HIPPI network into
workstation-class computers.
With all of these standard components and the protocol specified in the
proposed standard, "IP and ARP over HIPPI," we demonstrated sustained
single-stream TCP data rates between two hosts of over 75 Mbytes/sec, and
identified opportunities for improvement. The production network used these
high data rates to take advantage of the high disk-throughput that a Cray
Research file server can provide, as well as other high-throughput data
transfers between Cray computers.


Conclusion


HIPPI technology offers the opportunity to re-examine the way we build LANs.
We can build networks that have high throughput per host with scalable
bandwidth and no network performance degradation due to the number of hosts.
The client/server model as currently used must be modified or replaced with a
peer-to-peer model to best use this capability in a network. We must challenge
our fundamental assumptions about the organization of function and network
topology. HIPPI technology creates our first opportunity to build fast
networks with capabilities unavailable in the past. We can take full advantage
of this opportunity only by recasting our thinking process and imagining new
creative solutions to network computing problems unencumbered by the
limitations of bus- and ring-oriented networks.


Acknowledgments


I want to thank John Renwick for getting me involved with HIPPI.


References


Adams, George B. III, Dharma P. Agrawal, and Howard Jay Siegel. "A Survey and
Comparison of Fault-Tolerant Multistage Interconnection Networks." IEEE
Computer (June 1987).
Bhuyan, Laxmi N., Qing Yang, and Dharma P. Agrawal. "Performance of
Multiprocessor Interconnection Networks." IEEE Computer (February 1989).
High-Performance Parallel Interface--Mechanical, Electrical, and Signaling
Protocol Specification (HIPPI-PH), ANSI X3.183-1991.
High-Performance Parallel Interface--Framing Protocol (HIPPI-FP), ANSI
X3.210-199X.
High-Performance Parallel Interface--Encapsulation of IEEE 802.2 (IEEE Std
802.2) Logical Link Control Protocol Data Units (802.2 Link Encapsulation)
(HIPPI-LE), ANSI X3.218-199X.
High-Performance Parallel Interface--Physical Switch Control (HIPPI-SC), ANSI
X3.222-199X.
Kumar, V.P. and S.M. Reddy. "Augmented Shuffle-Exchange Multistage
Interconnection Networks." IEEE Computer (June 1987).
Renwick, John, and Andy Nicholson. "IP and ARP on HIPPI." Proposed Internet
Standard, Internet Working Group Request for Comments 1374, October 1992.


Peer Programming: It's a Matter of Symmetry


The primary difference between client/server and peer-to-peer network
programming is the symmetry between the communicating parties. Client/server
programming is asymmetrical and peer-to-peer is symmetrical. Peer programming
means writing only one version of a program that does everything the network
application requires.
When writing client/server programs, you first think in terms of the service
the server will provide, and how that service will be made available. Then you
write the server to implement the design. Then you think through how the
client will access the service. Finally, you implement the client program.
Peer programming is similar, except that you put both kinds of functionality
into the same program. It might seem more complex at first, because, when
thinking it through, you realize that meaningful communication is taking place
in two directions when peers are cooperating on the network. This is the same
as running a client/server program in both directions on the same network. In
this aspect, peer programming is simpler because you can ignore the
differences between client and server programs.
To illustrate this, I've written a simple network echo service. Listing One
(page 130) lists the header file for echo programs. The client/server version
has the typical client and server programs, which are clearly different. The
server program (Listing Two, page 130) creates a network-service endpoint and
binds to a well-known location. Then it waits for a client (Listing Three,
page 130) to come along and ask for service. The client also creates a network
endpoint, but binds to a wildcard address because its location is unimportant.
The client has to know where to find the server, but because the server is a
passive entity waiting for connections, clients don't need well-known
addresses. Once it has a valid network endpoint, the client connects to the
server and a service transaction takes place. In the case of this echo server,
the client simply sends a buffer that the server echoes back. Both the client
and server then break the connection. The client is finished, while the server
goes on looking for further client transactions.
The peer-oriented echo service (Listing Four, page 130) is longer, and appears
to be more complex partly because it is a contrived example. In this
peer-oriented program, the bulk of the code is used for connection setup, with
a very small module that performs the actual data communications. In a real
application, the data communication part of the program would be much larger.
All of this code is shared in the peer program, thus reducing the size and
complexity of the total package when compared with a client/server
implementation. Peer programming is not inherently more difficult than
client/server programming, but it does require that you think about the
program differently.
The order of events in the peer program is not very different from the
client/server programs. It starts out creating a passive TCP endpoint, just
like the server program. Once a passive endpoint is established, the program
enters a loop looking for incoming requests (like the server) and creating
active requests (like the client).
This program uses a simple mechanism to determine whether to make an active
connection to another peer or to respond to an incoming request. The program
accepts and processes incoming connections as they are discovered using the
select system call. An active connection is attempted if there are no incoming
connections. An active connection is not attempted until all incoming
connections are processed.
In this simple example, there is no synchronization and it is possible for a
deadlock to occur. If both peers attempt an active connection at the same
time, then each connection attempt will succeed; however, each will send data
and wait for the response before continuing and responding to the incoming
request. Because of this possible deadlock, it is necessary to fork a child to
handle either the passive or active side of the peer. In this example a child
is forked to handle the active side, which then frees the parent to process
the incoming request.
Synchronization plays a big part in creating a peer-to-peer program. In a
client/server environment, synchronization is straightforward. If the server
is not running, the client cannot connect to it. With a peer program,
synchronization becomes a design issue, and you must decide how to handle it.
Another difference in the peer program is that both sides of the connection
use the same code for data transfer and, therefore, both sides start out by
sending an echo-request message to the other side. They then look for an
incoming echo request, which must be echoed, and an echo reply for the request
sent out. Data is being transferred in both directions at the same time, and a
simple protocol has been implemented so that the peers can tell the difference
between requests and responses.
I've written multiple versions of the peer program while creating these
examples and have learned quite a bit. One of the fun things I learned is that
a peer program can talk to itself. While testing it, I fired up a copy in
loopback, expecting to start a second copy for the first copy to talk to.
Instead, it connected to itself, and away it went! Writing peer-oriented
programs will be a learning experience for all of us who have become
accustomed to writing client/server programs.
-- A.N.


_HIPPI AND HIGH-PERFORMANCE LANS_
by Andy Nicholson



[LISTING ONE]

/* echo.h - header file for the echo example programs */

#define BUFSIZE 512
#define ECHOPORT 7500

#define ECHO_REQUEST 1
#define ECHO_REPLY 2






[LISTING TWO]

/* echos.c -- The echo service server program */

#include <errno.h>
#include <stdio.h>
#include <strings.h>

#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

#include "echo.h"

main()
{
 char buf[BUFSIZE];
 int s, c;
 struct sockaddr_in sin, peer;
 int peerlen;
 int mlen;

 /* get a socket */
 if ((s = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP)) == -1) {
 perror("can't open socket");
 exit(1);
 }
 /* bind to local address and echo port */
 bzero(&sin, sizeof(sin));
 sin.sin_family = AF_INET;
 sin.sin_port = htons(ECHOPORT); /* port must be in network byte order */
 if (bind(s, (struct sockaddr *)&sin, sizeof(sin)) == -1) {
 perror("cannot bind");
 close(s);
 exit(1);
 }
 /* start listening for connections */
 if (listen(s, 5) == -1) {
 perror("cannot listen");
 close(s);
 exit(1);
 }
 /* accept and process incoming messages */
 while (1) {
 if ((c = accept(s, (struct sockaddr *)&peer, &peerlen)) == -1) {
 perror("cannot accept");
 close(s);
 exit(1);
 }
 /* simple processing, read message, echo it back */
 if ((mlen = read(c, buf, BUFSIZE)) == -1) {
 perror("read error");
 close(c);
 continue;
 }
 if (write(c, buf, mlen) != mlen) {
 perror("write error");
 close(c);
 continue;
 }
 printf("%d byte message received and echoed\n", mlen);
 close(c);
 }
} /* end of main */





[LISTING THREE]

/* echoc.c -- The echo service client program */

#include <stdio.h>
#include <stdlib.h>
#include <strings.h>
#include <unistd.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

#include "echo.h"

main(argc, argv)
int argc;
char **argv;
{
 char buf[BUFSIZE];
 int s;
 struct sockaddr_in sin;

 if (argc < 2) {
 printf("must specify a server in internet dot notation\n");
 exit(1);
 }
 /* get a socket */
 if ((s = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP)) == -1) {
 perror("can't open socket");
 exit(1);
 }

 /* bind to local address and available port */
 bzero(&sin, sizeof(sin));
 sin.sin_family = AF_INET;
 if (bind(s, (struct sockaddr *)&sin, sizeof(sin)) == -1) {
 perror("cannot bind");
 close(s);
 exit(1);
 }
 /* connect to echo server at echo port */
 bzero(&sin, sizeof(sin));
 sin.sin_family = AF_INET;
 sin.sin_port = htons(ECHOPORT);
 sin.sin_addr.s_addr = inet_addr(argv[1]);
 if (connect(s, (struct sockaddr *)&sin, sizeof(sin)) == -1) {
 perror("cannot connect");
 close(s);
 exit(1);
 }
 /* send a message to be echoed */
 bzero(buf, BUFSIZE); /* send a defined payload, not stack garbage */
 if (write(s, buf, BUFSIZE) != BUFSIZE) {
 perror("write error");
 exit(1);
 }
 /* read the echoed message back */
 if (read(s, buf, BUFSIZE) != BUFSIZE) {
 perror("read error");
 exit(1);
 }
 printf("%d byte message echoed by echo server\n", BUFSIZE);
 close(s);
 exit(0);
} /* end of main */





[LISTING FOUR]

/* echop.c -- The echo service peer program */

#include <stdio.h>
#include <stdlib.h>
#include <strings.h>
#include <unistd.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <sys/time.h>

#include "echo.h"

main(argc, argv)
int argc;
char **argv;
{
 int s;
 struct sockaddr_in sin;
 int status;

 int width;
 fd_set readfds, writefds, exceptfds;
 struct timeval timeout;
 if (argc < 2) {
 printf("must specify a peer in internet dot notation\n");
 exit(1);
 }

 /* setup passive socket */
 if ((s = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP)) == -1) {
 perror("can't open passive socket");
 exit(1);
 }
 /* bind to local address and echo port */
 bzero(&sin, sizeof(sin));
 sin.sin_family = AF_INET;
 sin.sin_port = htons(ECHOPORT);
 if (bind(s, (struct sockaddr *)&sin, sizeof(sin)) == -1) {
 perror("cannot bind");
 close(s);
 exit(1);
 }
 /* start listening for connections */
 if (listen(s, 5) == -1) {
 perror("cannot listen");
 close(s);
 exit(1);
 }
 /* initial select parameters */
 FD_ZERO(&readfds);
 FD_ZERO(&writefds);
 FD_ZERO(&exceptfds);
 width = s+1; /* only file descriptor we want to check */
 timeout.tv_sec = 0; /* set timeval for polling */
 timeout.tv_usec = 0;

 while (1) {
 /* check for and process incoming connection requests */
 FD_SET(s, &readfds);
 while ((status = select(width, &readfds, &writefds,
 &exceptfds, &timeout)) > 0) {
 do_respond(s);
 FD_SET(s, &readfds);
 }
 if (status == -1) {
 perror("select failed");
 }
 do_request(argv[1]);
 }
} /* end of main */
int
do_respond(s)
int s;
{
 int c;
 struct sockaddr_in peer;
 int peerlen = sizeof(peer);

 if ((c = accept(s, (struct sockaddr *)&peer, &peerlen)) == -1) {
 perror("accept failed");
 return(-1);
 }
 do_echo(c);
 close(c);
 return(0);

} /* end of do_respond */


int
do_request(dst)
char *dst;
{
 int s;
 struct sockaddr_in sin;
static pid_t pid = 0;
 int status;
 /* Wait for child if we have already forked. */
 if (pid) {
 wait(&status);
 pid = 0;
 }
 /* get a socket */
 if ((s = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP)) == -1) {
 perror("can't open socket");
 return(-1);
 }
 /* bind to local address and available port */
 bzero(&sin, sizeof(sin));
 sin.sin_family = AF_INET;
 if (bind(s, (struct sockaddr *)&sin, sizeof(sin)) == -1) {
 perror("cannot bind");
 close(s);
 return(-1);
 }
 /* connect to echo server at echo port */
 bzero(&sin, sizeof(sin));
 sin.sin_family = AF_INET;
 sin.sin_port = htons(ECHOPORT);
 sin.sin_addr.s_addr = inet_addr(dst);
 if (connect(s, (struct sockaddr *)&sin, sizeof(sin)) == -1) {
 perror("connection failed");
 close(s);
 return(-1);
 }
 /* fork a child process to perform the active request */
 if ((pid = fork()) == -1) {
 perror("Cannot fork");
 pid = 0;
 return(-1);
 }
 /* return if parent so we can go back and look for incoming
 * service requests. */
 if (pid > 0) {
 close(s);
 return(0);
 }
 /* child uses echo service */
 do_echo(s);

 close(s);
 exit(0);
} /* end of do_request */

int
do_echo(s)
int s;
{

 char buf[BUFSIZE];
 int recvlen;
 int sendlen = BUFSIZE;
 /* send a message to be echoed */
 buf[0] = ECHO_REQUEST;
 if (write(s, buf, sendlen) != sendlen) {
 perror("sending: write error");
 exit(1);
 }
 printf("sent echo request\n");
 /* read an incoming message */
 if ((recvlen = read(s, buf, BUFSIZE)) == -1) {
 perror("reading message: read error");
 exit(1);
 }
 /* echo message if it is an echo request */
 if (buf[0] == ECHO_REQUEST) {
 buf[0] = ECHO_REPLY;
 if (write(s, buf, recvlen) != recvlen) {
 perror("echoing: write error");
 exit(1);
 }
 printf("received echo request - sent echo reply\n");
 } else { /* must be reply */
 printf("received echo reply\n");
 }
 /* read another incoming message */
 if ((recvlen = read(s, buf, BUFSIZE)) == -1) {
 perror("reading message: read error");
 exit(1);
 }
 /* echo message if it is an echo request */
 if (buf[0] == ECHO_REQUEST) {
 buf[0] = ECHO_REPLY;
 if (write(s, buf, recvlen) != recvlen) {
 perror("echoing: write error");
 exit(1);
 }
 printf("received echo request - sent echo reply\n");
 } else { /* must be reply */
 printf("received echo reply\n");
 }

 return(0);

} /* end of do_echo */
















June, 1993
32-BIT FLOATING-POINT MATH


Taking full advantage of the 386 instruction set




Al Williams


Al is the author of DOS 6: A Developer's Guide (M&T Books, 1993) and DOS and
Windows Protected Mode (Addison-Wesley, 1993). You can reach Al at 310 Ivy
Glen Court, League City, TX 77573 or on CompuServe at 72010,3574.


It's no secret that floating-point calculations can slow program execution.
On PCs that have them, coprocessors can make a difference, but even then
integer arithmetic is still faster than floating point. Libraries that
emulate coprocessors are also available; however, these emulators are
notoriously slow.
In this article, I present an alternative approach to floating-point math that
takes advantage of 32-bit instructions. Even though I use the 80386 as an
example platform, these techniques can be applied to other 32-bit processors
as well. (You could even adapt the routines to run on 16-bit processors, but
you would see some loss of performance.)
To use FPM (floating-point math), you'll need MASM or TASM to assemble the
main source file. You can then use FPM from assembly, C, or C++ programs.
Since FPM takes advantage of several 386 instructions, you'll also need a 386-
or 486-based PC. One caveat: FPM is not as accurate as a coprocessor (or an
emulation library) that uses an 80-bit word. Still, it is accurate enough for
many programs.


Underlying Principles


When you do arithmetic by hand, you don't work directly with floating-point
numbers. Instead, you calculate an integer answer and decide where to put the
decimal point. To add, for example, you align the two numbers' decimal points,
then add. The answer's decimal point remains in the same position. When
multiplying by hand, you treat the numbers as integers, and place the answer's
decimal point by counting the number of digits to the right of the decimal
point in the two multiplicands. FPM uses similar procedures on binary numbers.
Using 32-bit integer arithmetic, the 386 can easily perform real-number math
with reasonable accuracy. Since FPM uses binary numbers, there is no "decimal"
point. To avoid confusion, I'll refer to the point in a FPM number as the
"radix point."
Figure 1 shows how FPM stores floating-point numbers. The val field stores a
32-bit binary number. The scale field determines the location of the assumed
radix point. The sign field is 0 for positive numbers, or 0xFF for negative
numbers.
FPM uses two routines to adjust numbers. RADJ shifts numbers right until bit 0
is a 1, and LADJ shifts them left until the leftmost bit is a 1. These
routines update the scale field by the amount of the shift so that the
number's value doesn't change--just its representation.
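The effect of RADJ and LADJ is easy to model in C. The sketch below is illustrative only: the names fpm_t, radj, and ladj are my own, and the real routines in Listing One replace the loops with the 386 bit-scan instructions.

```c
#include <stdint.h>

/* Hypothetical C model of an FPM number (mirrors Figure 1):
 * the represented value is num * 2^(-scale). */
typedef struct { int sign; int scale; uint32_t num; } fpm_t;

/* RADJ: shift right until bit 0 is 1; scale drops by the shift count,
 * so the value is unchanged--only the representation moves. */
void radj(fpm_t *x) {
    if (x->num == 0) { x->scale = 0; return; }
    while ((x->num & 1) == 0) { x->num >>= 1; x->scale--; }
}

/* LADJ: shift left until bit 31 is 1; scale grows by the shift count. */
void ladj(fpm_t *x) {
    if (x->num == 0) { x->scale = 0; return; }
    while ((x->num & 0x80000000u) == 0) { x->num <<= 1; x->scale++; }
}
```

For example, the value 5.0 stored as num=40, scale=3 (101.000 binary) right-adjusts to num=5, scale=0.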
Before adding two numbers, FPM must line up their radix points using the
algorithm in Figure 2(a). FPM uses RADJ on both numbers to allow for as many
significant digits as possible; see Figure 3. If the scales are then equal,
the addition proceeds. If they are not, FPM adjusts the number with the
smallest scale to the left until the scales are equal, or until a left shift
would cause a 1 bit to fall off the end of the word. If the scales are still
not equal, FPM will right shift the number with the largest scale. While some
bits on the right may be lost, these bits have less significance than the bits
on the left.
Figure 2: (a) Pseudocode for FPM addition; (b) pseudocode for FPM subtraction;
(c) pseudocode for FPM multiplication; (d) pseudocode for FPM division.

 (a)

 if (x.value==0) return y

 if (y.value==0) return x

 right_adjust(y)
 right_adjust(x)
 if x.scale!=y.scale then
 A=number with larger scale
 B=number with smaller scale
 sdiff=A.scale-B.scale
 idx=left_bit_index(B) /* find index of leftmost '1' bit */
 if 31-idx>=sdiff then
 B.value=B.value<<sdiff
 B.scale+=sdiff
 else
 B.value=B.value<<31-idx
 B.scale+=31-idx
 A.value=A.value>>sdiff-(31-idx)
 A.scale-=sdiff-(31-idx)
 endif
 endif
 if bit 31 of A.value or B.value is set then
 A.value=A.value>>1
 B.value=B.value>>1
 A.scale--
 B.scale--

 endif
 if A.sign==0xff A.value=-A.value
 if B.sign==0xff B.value=-B.value
 result.value=A.value+B.value
 if result.value<0 then
 result.value=-result.value
 result.sign=0xff
 endif
 result.scale=A.scale

 (b)

 if y.value!=0 y=-y
 jump to add routine

 (c)

 right_adjust(x)
 right_adjust(y)
 if x.value==0 or y.value==0 return 0
 product=x.value*y.value /* product is 64 bit */
 shift product left until top 32 bits are all zero
 x.scale=x.scale-amount_shifted
 if bit 31 of product set then
 product=product>>1
 x.scale--
 endif
 result.value=product
 result.scale=x.scale+y.scale
 result.sign=x.sign^y.sign /* exclusive-or */

 (d)

 left_adjust(x)
 right_adjust(y)
 if x.value==0 return 0
 if y.value!=1 then
 idx=left_bit_index(y) /* find index of leftmost '1' bit */
 divx=x.value /* divx is 64 bit */
 divx=divx<<idx
 x.scale+=idx
 result.value=divx/y.value
 else
 result.value=x
 endif
 result.scale=x.scale-y.scale
 result.sign=x.sign^y.sign /* exclusive-or */

Figure 3: Addition with and without right adjustment.

 Problem: Add 10110.100 and 10010.000
 (both binary) using 8 bits.

 With no adjustment:

 10110.100
 + 10010.000
 ___________
 101000.100


 Overflow!

 With right adjustment:

 0010110.1
 + 0010010.0
 ___________
 0101000.1
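
The alignment-and-add procedure can be sketched in C. This is a simplified sketch, not the article's implementation: the names fpm_t and fpm_add are my own, and for brevity it right-shifts the finer-grained operand immediately rather than reproducing the full precision-preserving adjustment of Figure 2(a), and it omits the bit-31 overflow guard.

```c
#include <stdint.h>

/* Hypothetical struct mirroring Figure 1: value = num * 2^(-scale). */
typedef struct { int sign; int scale; uint32_t num; } fpm_t;

fpm_t fpm_add(fpm_t x, fpm_t y) {
    fpm_t r;
    /* Align radix points: shift the number with the larger scale
     * (more fraction bits) right until the scales match. */
    while (x.scale > y.scale) { x.num >>= 1; x.scale--; }
    while (y.scale > x.scale) { y.num >>= 1; y.scale--; }
    /* Apply signs, add in 64 bits, then restore sign-magnitude form. */
    int64_t a = x.sign ? -(int64_t)x.num : (int64_t)x.num;
    int64_t b = y.sign ? -(int64_t)y.num : (int64_t)y.num;
    int64_t sum = a + b;
    r.sign = (sum < 0) ? 0xff : 0;
    if (sum < 0) sum = -sum;
    r.num = (uint32_t)sum;
    r.scale = x.scale;
    return r;
}
```

With one fraction bit (scale=1), 2.5 (num=5) plus 1.5 (num=3) yields num=8, scale=1, i.e., 4.0.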

Subtraction is simple; see Figure 2(b). FPM inverts the sign of the second
argument, calls the addition routine, and restores the second argument's
original sign.
The 386's multiply instruction can directly act on the two numbers in the val
fields. The scale of the result is the sum of the two original scales. Figure
2(c) outlines the multiplication algorithm. The MUL instruction places a
64-bit result in the EDX:EAX register pair. FPM shifts any significant bits
out of EDX to the EAX register (and adjusts the answer's scale, of course).
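The same shift-the-overflow-into-EAX step can be modeled in C with a 64-bit product. This is a sketch under my own names (fpm_t, fpm_mul); the assembly version does the drain in one SHRD using a BSR count rather than a loop.

```c
#include <stdint.h>

/* Hypothetical struct mirroring Figure 1: value = num * 2^(-scale). */
typedef struct { int sign; int scale; uint32_t num; } fpm_t;

/* Figure 2(c) in miniature: multiply magnitudes into 64 bits, shift any
 * bits that overflow 32 bits back down, and sum the scales. */
fpm_t fpm_mul(fpm_t x, fpm_t y) {
    fpm_t r;
    uint64_t p = (uint64_t)x.num * y.num;
    int shift = 0;
    while (p >> 32) { p >>= 1; shift++; }   /* drain the high dword */
    r.num = (uint32_t)p;
    r.scale = x.scale + y.scale - shift;    /* right shifts lower the scale */
    r.sign = x.sign ^ y.sign;
    return r;
}
```

For example, 2.5 * 1.5 (num=5 and num=3, each with scale=1) gives num=15, scale=2, i.e., 3.75.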
The DIV instruction divides a 64-bit number (in EDX:EAX) by a 32-bit number.
To maximize precision, FPM shifts the divisor to the right until its bit 0 is
a 1 (adjusting the scale, of course). If the divisor is equal to 1, the
division is complete. Otherwise, FPM shifts the 64-bit dividend left by the
number of significant bits in the divisor. This assures a 32-bit result with
the maximum number of significant digits.
Next, the DIV instruction divides the two numbers. FPM then subtracts the
divisor's scale from the dividend's scale to locate the result's radix point.
Figure 2(d) illustrates this process.
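The division steps can also be sketched in C. Again the names (fpm_t, fpm_div) are my own; for brevity this version skips the left/right adjustment and zero checks and finds the divisor's top-bit index with a loop instead of BSR.

```c
#include <stdint.h>

/* Hypothetical struct mirroring Figure 1: value = num * 2^(-scale). */
typedef struct { int sign; int scale; uint32_t num; } fpm_t;

/* Figure 2(d) in miniature: widen the dividend to 64 bits, pre-shift it
 * left by the divisor's top-bit index to gain quotient bits, then divide.
 * Assumes y.num != 0. */
fpm_t fpm_div(fpm_t x, fpm_t y) {
    fpm_t r;
    uint64_t dividend = x.num;
    int idx = 0;                            /* index of y's leftmost 1 bit */
    while ((y.num >> (idx + 1)) != 0) idx++;
    dividend <<= idx;                       /* more dividend bits in play */
    r.num = (uint32_t)(dividend / y.num);
    r.scale = (x.scale + idx) - y.scale;    /* pre-shift raises x's scale */
    r.sign = x.sign ^ y.sign;
    return r;
}
```

For example, 10 / 2.5 (num=10, scale=0 over num=5, scale=1) pre-shifts by 2, divides 40 by 5, and yields num=8, scale=1, i.e., 4.0.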


Implementation


Listing One (page 76) contains FPM.ASM, the core functions for FPM. FPM.H
(Listing Two, page 78) and FPMC.C (Listing Three, page 78) simplify using FPM
from C programs.
The faddxy(), fsubxy(), fmulxy(), and fdivxy() functions provide the basic
mathematical operations. To round out the library, two routines (fixtofloat()
and floattofix()) perform conversions between FPM numbers and standard C
floats.
Listing Four (FPMGRF.C, page 80) shows an example of FPM at work and pits FPM
against the standard floating-point library. FPMGRF plots a line twice using a
real-number coordinate system.
First, the standard library plots the line, then fpmline() plots the same
line. Variations in color show where FPM and the math library compute
different results. Table 1 contains the results of running FPMGRF.C on
different machines.
Table 1: FPMGRF.C timing results. Values are in clock ticks (about 1/18 of a
second) and represent averages over 12 executions.

                        Math Library   FPM
  -----------------------------------------

  25-MHz 386
   without 80387             26          7

  25-MHz 386
   with 80387                 4          7



Optimizations


FPM takes full advantage of the 386 instruction set. Of course, the 32-bit
register width is a significant advantage, but other 386 instructions help,
too.
FPM often shifts words until their rightmost (or leftmost) bit is a 1 (see
LADJ and RADJ in Listing One). A conventional DOS program might use the code
in Example 1(a) to right-adjust EAX. The 386 bit-scan instructions, BSF and
BSR, can considerably speed up this operation. BSF starts at bit 0 and returns
the index of the first 1 bit that it finds. BSR is similar to BSF, but it
starts with bit 31 (for a 32-bit operand). If there are no 1 bits in the word,
the bit-scan instructions set the zero flag (ZF). Example 1(b) shows how to
rewrite the right-adjust routine using BSR and BSF. This avoids the expensive
looping and jumping that the first method requires.
Example 1: (a) Conventional DOS code that right-adjusts EAX; (b) using the
386's BSF and BSR instructions to right-adjust EAX.

 (a)

 SLOOP: TEST EAX,1
 JZ DONE
 SHR EAX,1
 JMP SLOOP
 DONE:

 (b)

 BSF ECX,EAX
 SHR EAX,ECX

FPM provides a C-language interface to the bit-scan instructions (bitscan() in
Listing One). The prototype is: int bitscan(int dir, unsigned long lword);. If
the dir flag is 0, bitscan() uses the BSF instruction. Otherwise, it uses BSR.
The lword parameter is a 32-bit integer to scan. Unlike BSF and BSR, bitscan()
numbers the bits from 1 (rightmost) to 32 (leftmost). If no bits are set in
lword, bitscan() returns 0.
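A portable C stand-in for bitscan() makes its contract concrete. The name bitscan_c is my own; like the assembly version, it treats lword as 32 bits, numbers the bits 1 through 32, and returns 0 for a zero word.

```c
/* Hypothetical portable equivalent of the assembly bitscan() in Listing One:
 * dir == 0 scans from bit 0 up (like BSF); nonzero scans from bit 31
 * down (like BSR). Returns bit position 1-32, or 0 if lword is zero. */
int bitscan_c(int dir, unsigned long lword) {
    int i;
    if (dir) {
        for (i = 31; i >= 0; i--)
            if (lword & (1UL << i)) return i + 1;
    } else {
        for (i = 0; i < 32; i++)
            if (lword & (1UL << i)) return i + 1;
    }
    return 0;   /* no bits set */
}
```

For instance, 0xF0 has its lowest 1 bit at position 5 and its highest at position 8.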
FPM uses other 386 instructions in several places. For instance, Example 2
shows how the SHLD instruction shifts an operand left, filling the rightmost
bits from its other operand and leaving 10FAH in EAX.
Example 2: Using the 386 SHLD instruction.


 MOV EAX, 10H
 MOV EBX, 0FA000000H
 SHLD EAX, EBX,8



Conclusions


If you are a C++ programmer, you have probably already thought about writing a
class wrapper for FPM. By using C++'s operator overloading, you could make FPM
calls with conventional C operators. In fact, an FPM class is simple to
construct. Unfortunately, the constructor overhead that C++ imposes degrades
FPM's performance. I have experimented with several class wrappers for
FPM--all ran more slowly than the normal C float type. If FPM isn't faster
than the math library, there is little point in using it.
FPM can provide your programs with a fast alternative for real-number
arithmetic. It also shows that you don't have to run in protected mode to
benefit from the power of a 386 or 486 CPU.

_32-BIT FLOATING-POINT MATH_
by Al Williams



[LISTING ONE]

;##########################################################
;# File: FPM.ASM #
;# 386 Floating point package by Al Williams #
;##########################################################
.MODEL SMALL,C

; enable 386 instructions
.386
.DATA

; Number format
FPM_NUM STRUC
SIGN DB ? ; 0 for + FF for -
SCALE DB ? ; position of decimal point
NUM DD ? ; unsigned magnitude
FPM_NUM ENDS

.CODE
; C routines
PUBLIC faddxy,fsubxy,fmulxy,fdivxy,bitscan

;************************************************
; Helper for C routines bitscan(dir,lword)
; if dir==0 do BSF on num
; if dir!=0 do BSR on num
; Returns index 1-32 or 0 if word was zero
bitscan PROC
 ARG DIR:WORD, LWORD:DWORD
 MOV AX,DIR
 OR AX,AX
 MOV EAX,LWORD
 JZ SHORT SCANFWD
 BSR EAX,EAX
 JMP SHORT BITSC0
SCANFWD: BSF EAX,EAX
BITSC0: JNZ SHORT BITSC1
 XOR AX,AX
 RET
BITSC1: INC AX
 RET

bitscan ENDP

;************************************************
; Adjust temp register @BX so that rightmost bit is 1
RADJ PROC
; Find last set bit
 BSF ECX,[BX].NUM
; If all zeros, jump to ZOUT
 JZ SHORT ZOUT
; Shift right and adjust scale
 SHR [BX].NUM,CL
 SUB [BX].SCALE,CL
 RET
ZOUT:
 MOV [BX].SCALE,0
 RET
RADJ ENDP

;************************************************
; Adjust temp register @BX so leftmost bit is 1
LADJ PROC
; Find first bit set
 BSR ECX,[BX].NUM
; If all zero's goto zout (above in proc RADJ)
 JZ ZOUT
; 31-Bit_index -> # places to shift
 MOV CH,31
 SUB CH,CL
 MOV CL,CH
; Shift number and adjust scale
 SHL [BX].NUM,CL
 ADD [BX].SCALE,CL
 RET
LADJ ENDP

;************************************************
; Add x+y->x
faddxy PROC USES SI
 ARG ANSOFF:WORD,ANSSEG:WORD,XARG:FPM_NUM,\
 YARG:FPM_NUM
; Check for zero addition (x=0)
 MOV EAX,XARG.NUM
 OR EAX,EAX
 JNZ SHORT YZA
 MOV EAX,YARG.NUM
 MOV CH,YARG.SIGN
 MOV CL,YARG.SCALE
 JMP DOADD3
; Check for y=0
YZA: MOV EAX,YARG.NUM
 OR EAX,EAX
 JNZ SHORT NZA
 MOV EAX,XARG.NUM
 MOV CH,XARG.SIGN
 MOV CL,XARG.SCALE
 JMP DOADD3
NZA:
 LEA BX,YARG
 CALL RADJ

 LEA BX,XARG
 CALL RADJ
; if eq then ok
 MOV CL,XARG.SCALE
 MOV CH,YARG.SCALE
 CMP CL,CH
 JE SHORT DOADD
; if xscale>yscale ...
 LEA SI,YARG
 JG SHORT PLUSADJY
 XCHG CH,CL
 XCHG BX,SI
PLUSADJY:
; amt=xscale-yscale
 SUB CL,CH
; find top bit in y
 BSR EAX,[SI].NUM
 MOV CH,31
 SUB CH,AL
 CMP CH,CL
; if 31-topbit>=amt shift y left by amt; yscale+=amt
 JL SHORT PLUSELSE
 SHL [SI].NUM,CL
 ADD [SI].SCALE,CL
 JMP SHORT DOADD
PLUSELSE:
; else ...
; shift y left by 31-topbit; yscale+=31-topbit
 XCHG CH,CL
 SHL [SI].NUM,CL
 ADD [SI].SCALE,CL
 XCHG CH,CL
 SUB CL,CH
; shift x right by amt-(31-topbit); xscale-amt-(31-topbit)
 SHR [BX].NUM,CL
 SUB [BX].SCALE,CL
DOADD:
; Make certain bit 31 of each number is off and keep scales equal
 MOV EAX,XARG.NUM
 MOV EBX,YARG.NUM
 MOV CL,XARG.SCALE

 MOV EDX,EAX
 OR EDX,EBX
 JNS SHORT SETSIGN
 SHR EAX,1
 SHR EBX,1
 DEC CL
SETSIGN:
 XOR DH,DH ; Set positive flag
 MOV DL,XARG.SIGN
 OR DL,DL
 JZ SHORT DOADD1
 NEG EAX
 INC DH
DOADD1: MOV DL,YARG.SIGN
 OR DL,DL
 JZ SHORT DOADD2
 NEG EBX

 INC DH
DOADD2:
 ; Assume positive sign in CH
 XOR CH,CH
 ADD EAX,EBX
 JNS SHORT DOADD3
 OR DH,DH
 JZ SHORT DOADD3 ; Large positive number
 NEG EAX
 NOT CH
DOADD3:
 PUSH DS
 LDS BX,DWORD PTR ANSOFF
; Store sign, scale, and value
 MOV [BX].SIGN,CH
 MOV [BX].SCALE,CL
 MOV [BX].NUM,EAX
 POP DS
 RET
faddxy ENDP

;************************************************
fsubxy PROC
 PUSH BP
 MOV BP,SP
; If X-0 then compute X+0
 MOV EAX,[BP+14].NUM
 OR EAX,EAX
 JZ SHORT SUBBY0
; else compute X+-1*Y
 NOT [BP+14].SIGN
SUBBY0:
 POP BP
; Let + do the work
 JMP faddxy
fsubxy ENDP


;************************************************
fmulxy PROC
 ARG ANSOFF:WORD,ANSSEG:WORD,XARG:FPM_NUM,\
 YARG:FPM_NUM
; Right adjust X
 LEA BX,XARG
 CALL RADJ
; Right adjust Y
 LEA BX,YARG
 CALL RADJ
; Check for X*0 or 0*Y
 MOV EAX,XARG.NUM
 OR EAX,EAX
 JZ SHORT RETZM
 MOV EBX,YARG.NUM
 OR EBX,EBX
 JZ SHORT RETZM
; Do it
 MUL EBX
TSTZLP:
; Find bits in EDX (overflow of 32 bits)

 BSR ECX,EDX
; If none, goto DXZ
 JZ SHORT DXZ
; Shift bits to EAX and adjust scale
 INC CL
 SHRD EAX,EDX,CL
 SUB XARG.SCALE,CL
DXZ:

 OR EAX,EAX
 JNS SHORT NOOF
; keep answer positive
 SHR EAX,1
 DEC XARG.SCALE
NOOF:
; store answer
 PUSH ES
 LES BX,DWORD PTR ANSOFF
 MOV ES:[BX].NUM,EAX
; compute new scale
 MOV AL,YARG.SCALE
 ADD AL,XARG.SCALE
 MOV ES:[BX].SCALE,AL
; compute new sign
 MOV AL,YARG.SIGN
 XOR AL,XARG.SIGN
 MOV ES:[BX].SIGN,AL
 POP ES
 RET
; Jump here to return a zero
; shared by MUL & DIV
RETZM: PUSH ES
 LES BX,DWORD PTR ANSOFF
 XOR EAX,EAX
 MOV ES:[BX].NUM,EAX
 MOV ES:[BX].SCALE,AL
 MOV ES:[BX].SIGN,AL
 POP ES
 RET
fmulxy ENDP

;************************************************
fdivxy PROC
 ARG ANSOFF:WORD,ANSSEG:WORD,XARG:FPM_NUM,\
 YARG:FPM_NUM
; LEFT ADJUST X
 LEA BX,XARG
 CALL LADJ
; RIGHT ADJUST Y
 LEA BX,YARG
 CALL RADJ
; CHECK FOR DIVIDE BY ZERO
 MOV EAX,XARG.NUM
 OR EAX,EAX
 JZ RETZM
; CHECK FOR DIVIDE BY 1
 MOV EBX,YARG.NUM
 CMP EBX,1
; DON'T DIVIDE BY 1 -- BESIDES IF WE ROTATE EDX:EAX LEFT,

; / BY 1 WOULD RESULT IN OVERFLOW
 JZ SHORT DODIV
; SHIFT X BITS INTO EDX BASED ON SIZE OF Y
 XOR EDX,EDX
 BSR ECX,EBX
 SHLD EDX,EAX,CL
 SHL EAX,CL
; ADJUST SCALE
 ADD XARG.SCALE,CL
 DIV EBX
DODIV:
 PUSH ES
 LES BX,DWORD PTR ANSOFF
 MOV ES:[BX].NUM,EAX
; COMPUTE ANSWER'S SCALE
 MOV AL,XARG.SCALE
 SUB AL,YARG.SCALE
 MOV ES:[BX].SCALE,AL
; FIND ANSWER'S SIGN
 MOV AL,YARG.SIGN
 XOR AL,XARG.SIGN
 MOV ES:[BX].SIGN,AL
 POP ES
 RET
fdivxy ENDP

 END





[LISTING TWO]

/***************************************************
 * FPM.H - C language header for FPM *
 ***************************************************/
#ifndef FPMHEADER
#define FPMHEADER
/* fixed point type */
typedef struct
 {
 char sign;
 char scale;
 unsigned long num;
 } FIXED;

/* Scale factor
 use - 1000.0 for 3 places, 100.0 for 2 places etc. */
#define SFACTOR (10000.0)

#define CVTTYPE float

#ifdef __cplusplus
extern "C"
{
#endif

/* Assembly core functions (from FPM.ASM) */
extern FIXED faddxy(FIXED,FIXED),fsubxy(FIXED,FIXED),

 fmulxy(FIXED,FIXED),fdivxy(FIXED,FIXED);
extern int bitscan(int dir,unsigned long lword);

/* C helpers (from FPMC.C) */
FIXED floattofix(CVTTYPE f);
CVTTYPE fixtofloat(FIXED f);
int fixtoint(FIXED f);

#ifdef __BORLANDC__
/* same prototypes as the generic declarations above */
extern FIXED faddxy (FIXED,FIXED),
 fsubxy (FIXED,FIXED),
 fmulxy (FIXED,FIXED),
 fdivxy (FIXED,FIXED);
#else
/* Microsoft definitions */
FIXED fpmmsc_rv;
extern void faddxy (FIXED _far *, FIXED,FIXED),
 fsubxy (FIXED _far *, FIXED,FIXED),
 fmulxy (FIXED _far *, FIXED,FIXED),
 fdivxy (FIXED _far *, FIXED,FIXED);
#define faddxy(a,b) (faddxy(&fpmmsc_rv,a,b),fpmmsc_rv)
#define fsubxy(a,b) (fsubxy(&fpmmsc_rv,a,b),fpmmsc_rv)
#define fmulxy(a,b) (fmulxy(&fpmmsc_rv,a,b),fpmmsc_rv)
#define fdivxy(a,b) (fdivxy(&fpmmsc_rv,a,b),fpmmsc_rv)
#endif

#ifdef __cplusplus
}
#endif

#endif





[LISTING THREE]

/***************************************************
 * FPMC.C - C language routines for FPM *
 ***************************************************/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>
#include "fpm.h"

/* convert float to FPM */
FIXED floattofix(CVTTYPE f)
 {
 FIXED ans;
 unsigned long i,whole,frac=0L;
 CVTTYPE fp=.5;
 ans.sign=0;
/* special case */
 if (f==0.0)
 {
 ans.scale=0;

 ans.num=0L;
 return ans;
 }
 if (f<0.0)
 {
 ans.sign=0xff;
 f=-f;
 }
 whole=(unsigned long)f;
 f-=whole;
/* scale is 32 minus the number of bits in the whole part */
 ans.scale=32-bitscan(1,(unsigned long)whole);
 ans.num=(unsigned long)whole<<ans.scale;
 if (ans.scale)
/* compute fractional part */
 for (i=1L<<(ans.scale-1);i!=0&&f!=0.0;fp/=2)
 {
 if (f>=fp)
 {
 f-=fp;
 frac=i;
 }
 if (!frac&&!whole)
 ans.scale++;
 else
 i>>=1;
 }
 ans.num=frac;
 return ans;
 }
/* convert FPM number to float */
/* Uses SFACTOR for rounding (see FPM.H) */
CVTTYPE fixtofloat(FIXED f)
 {
 unsigned long i;
 CVTTYPE fp=0.5,res,sg=1.0;
 if (f.sign)
 {
 sg=-1.0;
 }
 if (f.scale<0)
 res=(CVTTYPE)(f.num<<-f.scale);
 else
 {
 res=(CVTTYPE)(f.num>>f.scale);
 while (f.scale>>5)
 {
 f.scale--;
 fp/=2.0;
 }
 for (i=1L<<f.scale-1;i!=0;i>>=1,fp/=2)
 if (f.num&i) res+=fp;

 }
 return floor(sg*res*SFACTOR+.5)/SFACTOR;
 }
/* convert FPM number to int (truncate fraction) */
int fixtoint(FIXED f)
 {

 int r;
 if (f.scale>=0) r=f.num>>f.scale; else r=0;
 if (f.sign) r=-r;
 return r;
 }






[LISTING FOUR]

/***************************************************
 * FPMGRF.C - Graphical demo of FPM with benchmark *
 ***************************************************/
#include <stdio.h>
#include <stdlib.h>
#include <dos.h>
#include <conio.h>
#include <time.h>
#include "fpm.h"

/* Macro to call BIOS */
#define INT10(r) int86(0x10,&r,&r)
/* Parameters for line (y=Mx+B) */
#define M .291 /* Slope */
#define B .08 /* y-intercept */

main(int argc,char *argv[])
 {
 clock_t s,e;
 long fpmtime,floattime;
/* Set CGA mode */
 setmode(4);
/* mark time */
 s=clock();
/* draw line with math library */
 floatline();
 e=clock();
/* compute time */
 floattime=e-s;
/* wait for a key */
 printat(1,10,"Press any key to continue");
 if (!getch()) getch();
 printat(1,10," ");
/* mark time again */
 s=clock();
/* draw line with FPM */
 fpmline();
 e=clock();
/* compute time */
 fpmtime=e-s;
/* wait for a key */
 printat(1,10,"Press any key to continue\n");
 if (!getch()) getch();
 setmode(3);
/* print results */
 printf("Float time=%ld ticks\nFPM time=%ld ticks\n",floattime,fpmtime);

 }
/* Set video mode */
setmode(int n)
 {
 union REGS r;
 r.x.ax=n;
 INT10(r);
 return 0;
 }
/* plot a floating point pixel */
floatplot(float x,float y)
 {
 int x1,y1;
 x1=x*320;
 y1=y*200;
 plot(x1,y1,2);
 return 0;
 }
/* plot a point with FPM numbers */
fpmplot(FIXED x,FIXED y)
 {
 static FIXED xscale;
 static FIXED yscale;
 static int first=0;
 if (!first)
 {
 xscale=floattofix(320.0);
 yscale=floattofix(200.0);
 first=1;
 }
 plot(fixtoint(fmulxy(x,xscale)),fixtoint(fmulxy(y,yscale)),3+128);
 }
/* plot a hardware pixel */
plot(int x,int y,int color)
 {
 union REGS r;
 r.h.ah=0x0c;
 r.h.al=color;
 r.h.bh=0;
 r.x.cx=x;
 r.x.dx=y;
 INT10(r);
 return 0;
 }
/* Floating point library line drawing routine */
floatline()
 {
 float x,y;
 for (x=0.0;x<1.0;x+=.001)
 {
 y=M*x+B;
 floatplot(x,y);
 }
 }
/* FPM line drawing routine */
fpmline()
 {
 FIXED x,y,m,b,incr;
 m=floattofix(M);

 b=floattofix(B);
 x=floattofix(0.0);
 incr=floattofix(.001);
 while (fixtoint(x)==0)
 {
 y=faddxy(fmulxy(m,x),b);
 fpmplot(x,y);
 x=faddxy(x,incr);
 }
 }
/* print string via BIOS at x,y */
printat(int x,int y,char *s)
 {
 union REGS r,r1;
 r.h.ah=2;
 r.h.bh=0;
 r.h.dh=y;
 r.h.dl=x;
 INT10(r);
 r.h.ah=0xe;
 r.h.bl=7;
 while (r.h.al=*s++)
 int86(0x10,&r,&r1);
 }





Figure 1: FPM number representation

struct fpmnum
/------------------------------------------------\
|                                                |
|  val: contains the number's magnitude          |
|                                                |
\------------------------------------------------/
/-----------\
|  scale:   |
|  number's |
|  exponent |
|-----------|
|  sign:    |
|  0 if +   |
|  FF if -  |
\-----------/

Examples:

If:
val=10101 (binary)
scale=1
sign=0
Then:
number=1010.1 (binary) = 10.5 (decimal)

If:
val=11 (binary)
scale=-1
sign=0
Then:
number=110 (binary) = 6 (decimal)
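The decoding rule behind both examples is value = val * 2^(-scale), negated when sign is 0xFF. The helper below is illustrative only (fpm_value is my own name, not part of the FPM library) and checks both worked examples.

```c
#include <math.h>

/* Decode the Figure 1 fields into a double:
 * value = num * 2^(-scale), negated when sign is nonzero (0xFF). */
double fpm_value(int sign, int scale, unsigned long num) {
    double v = ldexp((double)num, -scale);   /* num * 2^(-scale) */
    return sign ? -v : v;
}
```

With val=10101 binary (21 decimal) and scale=1 this gives 10.5; with val=11 binary (3) and scale=-1 it gives 6.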

















June, 1993
 DISTRIBUTED COMPUTING NOW: APPLICATION ENVIRONMENTS


Solving the problems of concurrency, distribution of procedures, and
integration




Lowell S. Schneider and Stephen S. Murray


Lowell, a cofounder of Ellery Systems, has 25 years of experience in software
development, most of them spent working with distributed databases and
computing systems. He can be contacted at lss@esi.com. Stephen is associate
director for the High Energy Astrophysics Division of the Smithsonian
Astrophysical Observatory. He can be contacted at ssm@cfa.harvard.edu.


What problems can distributed computing solve? This question first arose in
the context of distributed databases in the 1970s. While technologists were
focusing on the problem of how to take existing, centralized databases and
transparently break them into distributed and possibly heterogeneous pieces,
the real problem was just the opposite. We already had distributed,
heterogeneous databases--what we needed was a way to integrate them.
Today the issues appear to be much the same when it comes to doing distributed
computing or downsizing and dealing with legacy code. The real problem is not
how to break up a monolithic program into procedures, some of which are local
and others remote, but how to integrate disparate procedures that are already
distributed. One distributed computing approach, Ellery Open Systems (EOS),
which runs as middleware on top of the OSF Distributed Computing Environment
(DCE), solves this problem by allowing existing programs to run as servers
with little or no change, so you can build new, integrated applications that
call them as remote procedures.
Our DDJ articles will address some of the problems that distributed computing
is intended to solve: concurrency, distribution of procedures, and
integration. In this, our first installment, we'll focus on an application
that requires--and illustrates the power of--distributed computing. In the
future, we'll look under the hood and examine the tools and techniques used to
create distributed computing applications like that described here.


The ADS


The Astrophysics Data System (ADS) is a NASA-sponsored program that puts the
disparate data collected by NASA space missions, stored on a heterogeneous
collection of platforms at research centers and university facilities across
the nation, into the hands of the scientific community efficiently and
effectively. The ADS offers a novel approach to dealing with large
quantities of geographically distributed and heterogeneous data from a broad
range of astrophysics missions: It provides convenient, uniform access to
relevant and current data regardless of where or how it is physically
maintained. The ADS also provides a variety of tools and services to inform
users of available data and to facilitate its manipulation and analysis. The
NASA ADS is built on EOS and runs over the Internet, including the NSFnet and
the NASA Science Internet. At this time, UNIX workstations from Sun, Digital
Equipment, and Hewlett-Packard are used throughout the system.
The ADS consists of operations sites, nodes, and users. The operations sites
provide project management, authentication, routing, systems management and
operations, electronic software distribution, user support, and other project
services.
Nodes are suppliers of data and/or services. Current nodes primarily provide
RDBMS-located catalogs of astronomical data (such as star catalogs) and
observation catalogs (lists of targets, for example). These catalogs are
typically maintained in commercial DBMSs such as Ingres or Sybase. Nodes also
provide data-processing facilities such as coordinate conversion and
visualization services. Services coming online in 1993 include connections to
foreign astronomical-data services, additional imagery capabilities, and
bibliographic abstract access--all of which are currently in beta testing.
These nodes are located at various research centers and universities which
receive NASA funding to archive space-astronomy data.


Distributed Users


Most importantly, the ADS has a distributed set of users. Client software is
distributed over the network to users, who install it on their local
workstations. The installation is purposely made simple: Any local user can
install the software, without needing the UNIX root-user password or any
special skills, save minimal UNIX knowledge and basic skills with a text
editor to change a single configuration file that drives the system. These
users are located at universities, small research groups, large astronomical
research facilities, and in government and commercial laboratories.
The intent of the project has always been to provide user-friendly data access
and processing abilities. With the use of Motif widgets and pull-down menus,
an astronomer with little computer experience can sit down and be doing
meaningful research on multispectral data located all across the country in
just a few minutes.
As an example, basic searches can be done with just a few simple widgets. On
the first widget, the astronomer chooses which catalogue or catalogues to
search by clicking on their names. A positional search is selected from a
pull-down menu on the widget. A second widget materializes that contains
sliders for setting both the center point of the search in the sky and the
search radius in the north-south and east-west directions. With the sliders
set, the astronomer types a local filename in which to store the results and
presses the OK button. The client software then looks up the selected
catalogues in a cross-reference database, converts the user-entered
coordinates to the appropriate units for each catalogue, and builds SQL
statements. Remote procedure calls (RPCs) are used to invoke processes at the
remote sites where the data archives are located. Once the connection to these
remote servers is established, the SQL queries are sent to the remote database
servers for processing by the remote RDBMS. When the remote RDBMS returns
results, data-format translation is done on the node side to a canonical
format, which is transported back to the user's computer system. To keep from
overloading the network, only the first 25 rows are returned to the astronomer
and presented in a third widget for examination. If the first results look
useful, the user can request the rest of the results by pushing the Page Down
button on the widget. If the results are not interesting, another query can be
sent. Cleanup at the nodes is done automatically when the client software
quits.
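The 25-row paging policy is simple enough to sketch. The ResultPager class below is a hypothetical stand-in for the ADS client code, showing only the windowing behavior described above; the names are mine, not the project's.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Sketch of the client-side paging policy: only one page of query results
// is presented at a time; "Page Down" advances to the next page. The class
// and member names are illustrative, not the actual ADS client API.
class ResultPager {
    std::vector<std::string> rows;  // rows already in canonical format
    std::size_t cursor = 0;
public:
    static const std::size_t kPageSize = 25;
    explicit ResultPager(std::vector<std::string> r) : rows(std::move(r)) {}
    // Return the next page of at most 25 rows (empty when exhausted).
    std::vector<std::string> next_page() {
        std::size_t end = std::min(cursor + kPageSize, rows.size());
        std::vector<std::string> page(rows.begin() + cursor, rows.begin() + end);
        cursor = end;
        return page;
    }
};
```

In the real system the unfetched rows would stay on the node side, so an uninteresting query costs the network only 25 rows of traffic.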
The entire application was written by student and professional programmers,
most of whom knew little about Motif programming, RPC application programming,
or SQL. The interpreted Motif capability of EOS and the ability to build
server code that could be debugged from the local shell has resulted in rapid
prototyping and deployment of new services, without needing a cadre of
programming "gurus." Much of the application programming was done within the
project at the Infrared Processing and Analysis Center at Caltech, the Center
for Astrophysics and Space Astronomy at the University of Colorado, Boulder,
and the Smithsonian Astrophysical Observatory at Harvard.
The system has been well received within the community, especially with the
introduction of the Motif interface. Each day, hundreds of users are doing
cross-country research and instructing students with live data and hands-on
tools. In reviews by other astronomers on behalf of NASA, the ADS scores
highly as an important and useful tool. As the first production-level, open,
distributed-processing application, it is regarded as a technological
achievement. The future looks even more exciting.
"Virtual observatories" are on the horizon. An astronomer at a participating
college or university will be able to issue queries that will first look for
available data in archives of existing observations. Then, if the desired data
does not exist, the system will automatically generate observing requests of
ground-based telescopes or space-borne observatories to do the observation and
return the data to the user and then archive it automatically for future use.
The technology to do this is available today. Whether it's done first in the
United States is up to the American electorate.


The National Information Infrastructure Testbed


One mechanism for developing the technology needed for the "virtual
observatory" is the National Information Infrastructure Testbed (NIIT). The
NIIT is an industry-funded collaboration between a number of large- and
small-industry companies, Sandia National Laboratories, and several university
science organizations. It was established to implement a nationwide,
high-performance distributed-computing and network communications testbed in
1993. It is a practically focused initiative intended to give industry,
government, and university participants hands-on experience with
implementing, operating, and maintaining very large-scale, DCE-based,
distributed-computing applications over high-performance fiber ATM
networks. The initial reference
application being deployed for roll-out in November 1993 is a multi-user
collaborative and interactive multi-media data-browsing and analysis
application of the Hewlett-Packard and Ellery Systems Earth Data System.


The EDS


The Earth Data System (EDS) is a suite of environmental-science,
distributed-computing applications developed by a number of leading scientists
around the United States in collaboration with Ellery Systems and
Hewlett-Packard.
Like the ADS, the EDS consists of distributed data holdings and applications
programs. A multimedia, interactive, collaborative data-browsing and analysis
application is being built by the EDS collaborators for deployment as a
reference application on the NIIT. This application will be composed of a
number of existing application programs and datasets that will be integrated
as DCE services using EOS. When the application is completed in October 1993,
EDS users in different time zones will be able to collaboratively browse,
correlate, and analyze images, data, and meta-data as though they were sitting
side by side in front of a common monitor. It will be built, in large part, by
the scientists themselves, just as ADS was. EOS will be both the application's
development and the runtime environment.


Technology Sharing


The biggest technological difference between EDS and ADS is that EDS needs
much higher communications bandwidth and computational speed because its basic
units of data, the images, are typically quite large. That's why NIIT and the
future National Information Infrastructure are so important. Accelerating the
acquisition of practical experience with existing distributed-computing and
high-performance network technology will enable accelerated deployment of the
technology, while driving the cost down to generally affordable levels. It is
the lack of experience with the technology that inhibits its deployment.
Funding the development and implementation of the EDS over the last two years
has enabled Ellery, HP, and the participating scientists to gain practical
experience with distributed computing application solutions to complex,
heterogeneous data access and analysis problems. As the technology-transfer
liaison, the NASA Astrophysics Data System project office has provided
technical and scientific application technology and experience. It's extremely
exciting to see how distributed computing facilitates multidisciplinary data
system development and implementation. If properly designed,
distributed-computing solutions save enormous amounts of time and money while
providing significantly more powerful and flexible research and analysis
capabilities. We've also learned that the "cultural" issues associated with
implementing distributed-computing applications--that is, those involving
coordination and cooperation among individual special-interest groups
working toward a common good--are generally more challenging than the
technical issues. Distributed-computing applications development is a
different kettle of fish from older-generation centralized-computing
solutions, and we look forward to continuing to share the practical
experience we've gained with industry and other communities now seriously
looking to implement distributed-computing applications.
The similarities between the ADS and EDS will allow much of the experience
gained in building the ADS to be put to use for the EDS. At the same time,
some of the new capabilities developed for EDS will become important to the
ADS as it begins to bring more of the large data sets from current satellite
missions online.

The ADS and the EDS are being built mainly by people whose regular job is not
programming. In the case of the National Information Highway, the required
level of programming expertise is a critical issue. If you have to be a
computer scientist to gain access to it, it may not be worth building. It won't
create any new jobs, and the research and education community, who really need
it, won't be able to afford it. But if average programmers can access it, then
small businesses, who create most new jobs but can only afford inexperienced
programmers, can become competitive service and information providers; the
research community will have access to these services as well as contributing
new information and services that result from research; and the education
community will be able to drive access down to the K-12 level.
When that happens, the National Information Infrastructure will truly deserve
its grand title.




























































June, 1993
OBJECT-ORIENTED FINITE-ELEMENT SOFTWARE


Al Vermeulen


Al, who holds a degree in systems design, is a developer for Rogue Wave
Software. He can be contacted at P.O. Box 2328, Corvallis, OR 97339 or at
alv@roguewave.com.


Object-oriented programming is emerging as the dominant software methodology
of the '90s. Even the traditional scientific-computing community is emerging
from its Fortran shell to investigate the promise of C++ and other OO
languages. Why? Because object-oriented languages allow you to code at a
higher level of abstraction than traditional, procedural languages.
Consequently, object-oriented code ends up being easier to read, write,
maintain, and debug than traditional C or Fortran code.
Coding at a higher level of abstraction is, of course, relative. Many of us
can remember when almost all scientific programs were coded in assembler (or
even directly, via switches). In those days, numerical programmers had to
translate mathematics down to the bit, byte, and register level. Then along
came Fortran, and you could code formulas involving real and complex numbers
directly, without having to translate to assembler. Today, object-oriented
languages like C++ let you create even higher-level abstractions, and treat
them as if they were built-in data types. You can, for example, model vectors
and matrices, functions, and nonlinear systems of equations. Then you can use
these abstractions (or classes) as building blocks for still-higher levels of
abstraction.
In this article, I'll discuss code developed at the University of Waterloo to
research spline-based, finite-element models. To keep the discussion
focused, I'll concentrate on the specific problem of determining the
load-deflection behavior of a simple bicycle-wheel model like that shown in
Figure 1.


The Finite-Element Method


The goal of the application is to simulate how a bicycle wheel behaves when
force is applied to it--the stress in the spokes, the shape of the wheel, and
the force it can withstand before buckling. All of these can be computed once
you know how the applied force deforms the wheel. Mathematically, the
deformation of a structure due to a load is modeled by partial differential
equations; the equations used to model the spokes are shown in Figure 2.
Unfortunately, for all but the simplest of cases, the exact solutions to the
differential equations aren't known, and so we must be satisfied with an
approximate solution.
One way of constructing an approximate solution is the finite-element method.
I'll explain the key points of the method using a simple example: exerting
downward force on a yardstick standing on end; see Figure 3. As you push on the
yardstick, it compresses and the tip moves down slightly. As you continue to
push harder and harder, the yardstick eventually bows outward, and the tip
moves down much more easily. This phenomenon is called buckling. Since most
structures cannot withstand buckling, an engineer would often like to know the
precise relationship between weight and maximum deflection; this relationship
can be plotted on a force-deflection diagram like the one in Figure 3. (As you
can see from this diagram, the yardstick buckles at about 16 pounds of force;
feel free to test this out at home). Ideally, you'd create this plot by
directly solving the differential equations for each different weight, except
that this is usually too difficult. It turns out that for this simple example
you can solve the equations exactly, but let's ignore that for the time being
and focus on the finite-element method.
The key idea behind the finite-element method is to restrict the possible
deformations to a simple form that approximates the true deflection. For
example, you might restrict the deformation of the yardstick to be a cubic
polynomial. In the variant of the finite-element method we're interested in,
the deformation is restricted to be a piecewise polynomial spline function. A
spline, as shown in Figure 5, is a smooth curve whose shape is determined by
control points (the Xs in the figure). You can use a mathematical technique
(such as the Rayleigh-Ritz or Bubnov-Galerkin method) to convert the governing
equations to a new set of equations with the spline control points as
unknowns. Looking at the yardstick example, you can represent the deflection
of the yardstick as a spline with, say, five control points. The Rayleigh-Ritz
method is used to convert the governing differential equation into a nonlinear
system of equations with the control points as unknowns.
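For the curious, the basis functions underlying such a spline can be evaluated with the Cox-de Boor recursion. The sketch below is my own illustration, not the project's code: it evaluates a B-spline as a weighted sum of basis functions over its control points, which is exactly the representation whose control points become the unknowns.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Evaluate the B-spline basis function N_{i,p}(x) on a knot vector via the
// Cox-de Boor recursion. (Illustrative sketch, not the Waterloo code.)
double bspline_basis(const std::vector<double>& knots,
                     std::size_t i, int p, double x) {
    if (p == 0)
        return (knots[i] <= x && x < knots[i + 1]) ? 1.0 : 0.0;
    double left = 0.0, right = 0.0;
    double d1 = knots[i + p] - knots[i];
    if (d1 > 0.0)
        left = (x - knots[i]) / d1 * bspline_basis(knots, i, p - 1, x);
    double d2 = knots[i + p + 1] - knots[i + 1];
    if (d2 > 0.0)
        right = (knots[i + p + 1] - x) / d2 * bspline_basis(knots, i + 1, p - 1, x);
    return left + right;
}

// A spline deflection is then sum_i c_i * N_{i,p}(x), with control points c_i.
double spline_value(const std::vector<double>& knots,
                    const std::vector<double>& ctrl, int p, double x) {
    double s = 0.0;
    for (std::size_t i = 0; i < ctrl.size(); ++i)
        s += ctrl[i] * bspline_basis(knots, i, p, x);
    return s;
}
```

A handy sanity check is the partition-of-unity property: with all control points set to 1, a clamped spline evaluates to 1 everywhere inside the knot range.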
You've now approximated the governing system of partial differential equations
with a system of nonlinear algebraic equations whose unknowns are control
points in a spline representing the deflection. Unfortunately, these nonlinear
equations are still too complicated to solve directly, so you need to use some
iterative solution technique like Newton's method. To apply such a technique,
you need to be able to compute two things:
Given a force and a deflection, compute the out-of-balance forces. When the
out-of-balance forces are 0, you've found a point on the force-deflection
curve--an equilibrium point.
Given a deflection, compute the changes in out-of-balance forces due to
infinitesimal changes in the deflection. This is represented by a matrix,
called the tangent-stiffness matrix, because it describes the resistance of
the structure to forces. In mathematical terminology, this matrix is called
the Jacobian matrix.
Once you can compute the out-of-balance forces and tangent-stiffness matrix
for any deflection, you can use a nonlinear equation solver to obtain
equilibrium points, or a continuation method to obtain the force-deflection
plot.
If you're familiar with the finite-element method, you may be wondering why
we're interested in using splines to approximate the deformation, rather than
the conventional approach of breaking the structure down into small pieces
(elements), assuming polynomial deflection on each element, and then stitching
the elements together. One reason is efficiency: The use of splines can lead
to considerably more efficient solutions, giving better accuracy for fewer
degrees of freedom; this can be shown both in practice and in theory. A
second reason for preferring splines is simplicity: In computer-aided design
systems, the geometry of structures is often already defined using splines. If
so, why not use that geometry to guide the deformation analysis, rather than
having to discretize the structure into elements?
Still, a drawback of the spline-based method as opposed to the conventional
element-based method is that the spline formulation leads to more complex
equations for the out-of-balance forces and tangent-stiffness matrix--and one
of the reasons for using object-oriented programming is to help manage this
complexity.


Wheel Model


If you recall, the goal in this article is to construct the load-deflection
curve for the two-dimensional bicycle-wheel model shown in Figure 1. The wheel
consists of a hub, a rim, and 36 spokes. The hub is considered to be rigid--no
amount of force will deform it--and fixed in space. The rim and spokes are
modeled as nonlinear, thin, shallow beams. (The adjectives refer to
assumptions made in deriving their governing equations.) One end of each spoke
is fixed in space by the hub; the other is pin-jointed to the rim. The wheel
is loaded by a point force pushing in on the rim. This force could represent
the force of a wall smashing into the wheel, or the floor pushing up on the
wheel. (If you're concerned about a world where moving walls collide with
stationary wheels, note that by changing reference frames you get the
physically identical, but more intuitive, situations of a moving wheel hitting
a stationary wall, or a rider's weight pushing the wheel against the ground.)
Deflections of the components which deform--the rim and spokes--are determined
using the spline finite-element method. To apply this method, you need to be
able to construct an out-of-balance force vector and tangent-stiffness matrix
for each component. The complete formulas for the components of these vectors
and matrices are long and complex. Still, the formula for one component in the
spoke's tangent-stiffness matrix is shown in Figure 4. This formula is
typical: An entry in the matrix is obtained as the integral of an expression
involving the current deformation function and the spline-basis functions.
I'll describe shortly how to write C++ code to compute this expression.
For a structure (like the bicycle wheel) with more than one part, the
finite-element models for each individual part must somehow be tied together
into a model for the whole. This is done by imposing constraints on the
control points of the individual pieces so that the deformation is consistent
between pieces according to how the pieces are joined. In the case of the
wheel, the end-control point of each spoke indicates the deformation at that
end--this control point is constrained to be equal to the deformation of the
rim where the spoke is attached. The constraints between all the spokes and
the rim are represented using a sparse matrix; this matrix is used to
amalgamate the out-of-balance force vector and tangent-stiffness matrix of the
component parts into a global out-of-balance force vector and tangent-stiffness
matrix.
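The amalgamation step can be sketched as a scatter-add: each component's local out-of-balance forces accumulate into a global vector through a local-to-global index map, a dense stand-in for the sparse constraint matrix mentioned above. This illustrates the general finite-element assembly pattern under my own naming, not the project's code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Accumulate one component's local out-of-balance forces into the global
// vector. dof_map[i] gives the global index of local entry i; shared indices
// (a spoke end pinned to the rim) simply receive both contributions.
void scatter_add(std::vector<double>& global_f,
                 const std::vector<double>& local_f,
                 const std::vector<std::size_t>& dof_map) {
    for (std::size_t i = 0; i < local_f.size(); ++i)
        global_f[dof_map[i]] += local_f[i];
}
```

The same mapping, applied to rows and columns, assembles the component tangent-stiffness matrices into the global one.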


Choosing Abstractions


Now comes the tough part: choosing which abstractions (classes) to use to
model the problem. You do this in three stages: First try to identify
components of other software packages which you can reuse in this design; then
create components custom built for this type of application; and finally build
any application-specific classes you need, and put the pieces together into an
application. The key components of the design are shown in Figure 5. You may
find it helpful to occasionally refer to this figure to see where the modules
described fit in.


Vectors and Matrices


Perhaps the most fundamental module needed by any scientific code is a set of
classes for storing and manipulating arrays of numbers. Neither C nor Fortran provides
good abstractions for arrays: In C, you can't treat an array as a first-class
object, and Fortran lacks the capabilities for dynamic resizing of arrays.
Neither language lets you express operations on arrays in a high-level
fashion: You must use explicit loops to do even basic computations. These
drawbacks are easily remedied using C++ classes. Operator overloading and a
well-thought-out class interface provide excellent expressive capabilities. In
fact, even if you use no classes beyond a set of vector, matrix, and array
types, a good array class library can make the transition from C or Fortran
worthwhile.
When work on the spline finite-element project began in 1988, there were only
a handful of people doing numerics in C++; consequently almost no class
libraries existed. Once we began writing and using our own basic vector
classes, one of the truths of C++ became clear: While using C++ classes is
easy, writing them is not. It demands an excellent knowledge of the language
and more skill than we had at the time. One of the few other C++ numerical
projects was the Data Analysis and Interactive Modeling System (DAIMS) at the
University of Washington's Oceanography department, led by Thomas Keffer.
They, too, had been struggling with building vector and matrix classes. After
a lengthy exchange of ideas and experiences over the Internet, ideas from our
library were incorporated into the DAIMS classes. The result was a
high-performance, flexible library of vector and matrix classes. Tom went on
to found Rogue Wave Software, and the vector and matrix classes evolved into
the core of Math.h++, the first commercially available C++ class library.


Functions


The concept of a function is fundamental to mathematics, and yet the only
support available for this concept, in either C or Fortran, is the rather
primitive pointer-to-function data type. Just to be sure we're on the same
wavelength, I'm talking here about mathematical functions like f(t), not
subroutines in a computer program. In our finite-element code, functions are
used to describe just about all aspects of the problem: geometry, material
properties, loading, and deformation. Since no function class library existed
back in 1988, we developed classes to support the notion of univariate
function, various special types of functions like polynomials and splines, and
things related to functions, such as integrators.
The most important class in this module is Func, which embodies the idea of a
univariate function. The key part of the header for Func is shown in Listing
One (page 91). (In the interest of clarity, member functions unnecessary for
this application have been omitted.) By overloading the function-call
operator, (), to do evaluation, using instances of Func becomes very natural.
Consider, for example, the simple plot subroutine in Listing Two (page 91) that
uses the Func class. Func is an example of an abstract (pure virtual, in C++
talk) class. You can't instantiate a Func itself, since Func represents an
abstract idea, not a concrete one; instead you must instantiate a class
derived from Func with an actual evaluation method. Two classes derived from
Func are needed in the bicycle-wheel application: Poly to represent
polynomials, and NUBFunc to represent a nonuniform B-spline function.
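A derived class might look like the Poly sketch below, written against a pared-down version of the Func interface from Listing One (the DoubleVec overloads are omitted). Horner's rule is my choice of evaluation strategy, not necessarily the project's.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Pared-down Func: an abstract univariate function whose call operator
// forwards to a pure virtual evaluate(), as in Listing One.
class Func {
public:
    virtual ~Func() {}
    virtual double evaluate(double x) const = 0;
    double operator()(double x) const { return evaluate(x); }
};

// A concrete function: a polynomial, evaluated by Horner's rule.
class Poly : public Func {
    std::vector<double> c;  // coefficients, constant term first
public:
    explicit Poly(std::vector<double> coeffs) : c(std::move(coeffs)) {}
    double evaluate(double x) const {
        double r = 0.0;
        for (std::size_t i = c.size(); i-- > 0; )
            r = r * x + c[i];
        return r;
    }
};
```

Because Poly is-a Func, any code written against Func (the plot routine of Listing Two, an integrator) works on polynomials without modification.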
Besides evaluating, you need to be able to combine functions in expressions,
like f*g+h. The problem is that in order to evaluate this expression later,
you need to keep references to its constituent functions: f, g, and h. The
solution is to use a smart pointer class, FuncHandle, for referring to
functions on the heap. The FuncHandle takes care of reference-counting the
functions and deleting them when they are no longer needed. Once you have
handles referring to functions, the expression classes can keep handles to
their constituent functions, and we can use expressions. Example 1 shows how
you might use this to plot the function sin(x)cos(x). (Note that the CFunc
class is used to encapsulate subprogram-style functions.)
Example 1: Plotting the function sin(x)cos(x).


 FuncHandle s(new CFunc (sin));
 FuncHandle c(new CFunc (cos));
 plot (s*c);
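The handle idea can be sketched with std::shared_ptr standing in for FuncHandle's hand-rolled reference counting; a Product expression class keeps handles to its constituents so the expression can be evaluated later. All names here are illustrative, not the article's actual library.

```cpp
#include <cassert>
#include <cmath>
#include <memory>

class Func {
public:
    virtual ~Func() {}
    virtual double evaluate(double x) const = 0;
    double operator()(double x) const { return evaluate(x); }
};

// std::shared_ptr stands in for the reference-counted FuncHandle.
using FuncHandle = std::shared_ptr<const Func>;

// Wraps an ordinary C-style function, as CFunc does in the article.
class CFunc : public Func {
    double (*f)(double);
public:
    explicit CFunc(double (*fn)(double)) : f(fn) {}
    double evaluate(double x) const { return f(x); }
};

// An expression node: keeps handles to its constituents so f*g stays valid.
class Product : public Func {
    FuncHandle a, b;
public:
    Product(FuncHandle x, FuncHandle y) : a(std::move(x)), b(std::move(y)) {}
    double evaluate(double x) const { return (*a)(x) * (*b)(x); }
};

FuncHandle operator*(FuncHandle a, FuncHandle b) {
    return std::make_shared<Product>(std::move(a), std::move(b));
}
```

Because the Product node holds owning handles, the expression s*c can outlive the scope where s and c were created, which is exactly what hand-rolled reference counting bought the original library.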

To form our finite-element, out-of-balance force vectors and tangent-stiffness
matrices, and for many other applications throughout science and engineering,
you need to be able to integrate functions. One way to think of integration is
as a functional: a function of a function. The Integrator class embodies this
idea: It is an abstract base class, like Func, but to evaluate it requires not
a real number, but a function. The GQuad class is a subclass of integrator for
doing integration using Gauss quadrature. Listing Three (page 91) shows how to
use the integrator and the ability to form expressions of functions to compute
an entry in a finite-element tangent-stiffness matrix using the formula in
Figure 4. The code is beautiful: It mirrors the mathematics almost exactly.
Think about how this would look in Fortran!
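A minimal integrator along these lines is easy to sketch. The two-point Gauss-Legendre rule below is my own reduction of what a GQuad-like class might do, taking a plain callable rather than a Func; two-point Gauss quadrature is exact for polynomials up to degree 3.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Two-point Gauss-Legendre quadrature of f over [a, b]. The nodes on
// [-1, 1] are +/- 1/sqrt(3) with unit weights; we map them onto [a, b]
// and scale by the interval half-length.
double gauss2(const std::function<double(double)>& f, double a, double b) {
    const double node = 1.0 / std::sqrt(3.0);
    double mid = 0.5 * (a + b), half = 0.5 * (b - a);
    return half * (f(mid - half * node) + f(mid + half * node));
}
```

A real GQuad class would offer more points and accept Func expressions directly, so a tangent-stiffness entry becomes one line, as in Listing Three.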


Nonlinear-equation Support


Often the structures you're interested in (including our bicycle-wheel model)
are governed by a system of nonlinear equations. Unlike linear equations,
nonlinear equations usually cannot be explicitly inverted, so you rely on some
sort of approximation technique to solve them. The nonlinear-equations module
provides classes for representing and solving nonlinear equations. Although
the classes in this module were carefully designed to be useful in other
contexts, I'll use structural analysis words, like "forces" and
"displacements," to fit in with the rest of this article.
A general system of nonlinear equations can be written as f[i](x[j]), where
(in our context) x[j] are control points representing displacement, and f[i]
are out-of-balance forces at that displacement. We are usually interested in
the equilibrium position of a structural system, which is the displacement
where there are no out-of-balance forces, f[i](x[j]) = 0. The nonlinear
equation is represented by the abstract base class NLSys ( Listing Four, page
91). (In the interest of clarity, member functions unnecessary for the
application have been omitted.) There are two things which we require of an
NLSys object: Given a vector of variables x[j], we must be able to evaluate
the out-of-balance forces, f[i], and solve the tangent system Ju = f,
where J is the Jacobian matrix (tangent-stiffness matrix) of the system.
The point of representing a nonlinear system using the NLSys class is to be
able to solve systems of equations and plot load-deflection curves. Since
these sorts of algorithms have many parameters, you encapsulate them into
objects. The NRSolve class uses Newton-Raphson iteration to find a solution to
the equations given an initial guess. The YSContinue class uses Yang and
Shieh's continuation method to compute a complete load-deflection curve.
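Stripped to one dimension, the Newton-Raphson loop inside a solver like NRSolve looks roughly like this sketch: evaluate the out-of-balance force, solve the tangent system for a correction, and iterate. The names, tolerances, and iteration cap are my assumptions, not the library's.

```cpp
#include <cassert>
#include <cmath>

// One-dimensional Newton-Raphson: f is the out-of-balance force, df its
// derivative (the 1-D "tangent stiffness"), x the initial guess. Iterate
// until the residual vanishes or the iteration budget runs out.
template <class F, class Fprime>
double newton_raphson(F f, Fprime df, double x, int max_iter = 50,
                      double tol = 1e-12) {
    for (int i = 0; i < max_iter; ++i) {
        double r = f(x);                // out-of-balance force at x
        if (std::fabs(r) < tol) break;  // equilibrium reached
        x -= r / df(x);                 // solve the tangent system for the step
    }
    return x;
}
```

In the vector case, the division becomes a call to solveTangentSystem() with the current residual, which is exactly the division of labor the NLSys interface encodes.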


Structural Components


At this point, the list of non-application-specific classes you could buy,
beg, borrow, steal, or (as a last resort) write is just about exhausted. The
next step is to use them to write classes for the domain of interest:
finite-element structural analysis. These domain-specific classes are
typically a little easier to write than the domain-independent classes, since
you don't have to worry quite so much about other people using them for
purposes you hadn't thought of.
Most of the domain-specific classes in the spline finite-element project
represent types of structural models. Although we've written a large number of
these classes, I'll consider only the two needed for the wheel: NLBeam and
NLArch. NLArch represents a finite-element model of a shallow, thin, nonlinear
curved beam. We'll use it to model the wheel's rim. The NLBeam class is a
special case of NLArch where the curvature is allowed to be infinite, thus
making it a straight beam. This class is used to represent the spokes. The job
of these classes is to represent the nonlinear system of equations governing
the control points in the spline approximation.
A domain-specific class needed (other than a structural component) is the
StructureConstraint class. This is used to represent the constraints between
substructures in a model. In this case, you need this class to act as the glue
between the spokes and the rim.


The Bicycle Wheel


With all the component parts in hand, all you need to do is snap them together
to build the application. An instance of the bicycle-wheel class itself,
Wheel, contains an array of 36 spoke objects (of class NLBeam), one NLArch
object representing the rim, and a StructureConstraint object to tie the
spokes to the rim.
The final step is to use the wheel model to generate a load-deflection curve.
First you need to interface the Wheel object to the nonlinear equation solver
module's NLSys object. This is done using multiple inheritance: A new object,
WheelSys, is defined which inherits from both Wheel and NLSys. This is a
common, simple way to tie pieces of an application together in C++. The
WheelSys class uses a DoublePDFact object internally to maintain a
factorization of the current tangent-stiffness matrix; this is used to
implement the NLSys solveTangentSystem() member function. The response of the
wheel to a head-on collision force is shown in Figure 7, where you can see how
strong a well-built bicycle wheel can be--buckling does not even begin to
occur until past 25,000 pounds of force have been applied.
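Schematically, the multiple-inheritance tie-in looks like the toy classes below: WheelSys picks up the physical model from Wheel and satisfies the solver's interface from NLSys. The bodies are placeholders of my own invention; only the shape of the class hierarchy is the point.

```cpp
#include <cassert>

// Toy stand-in for the wheel model: 36 spokes with a made-up stiffness.
struct Wheel {
    double stiffness() const { return 36.0 * 50.0; }
};

// Toy stand-in for the solver interface of Listing Four, reduced to 1-D.
struct NLSys {
    virtual ~NLSys() {}
    virtual double value(double x) const = 0;  // out-of-balance force at x
};

// Multiple inheritance ties the model to the solver interface: WheelSys
// is-a Wheel (the physics) and is-a NLSys (what the solver needs).
struct WheelSys : public Wheel, public NLSys {
    double load = 0.0;
    double value(double x) const { return stiffness() * x - load; }
};
```

Any solver written against NLSys can now drive the wheel model without knowing anything about spokes or rims.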

_OBJECT-ORIENTED FINITE-ELEMENT SOFTWARE_
by Al Vermeulen


[LISTING ONE]

class Func {
public:
 virtual ~Func() {};
 virtual double evaluate(double x) const =0;
 virtual double evaluate(double x, int d) const;
 double operator()(double x) const {return evaluate(x);}
 double operator()(double x,int d) const {return evaluate(x,d);}
 virtual DoubleVec operator()(const DoubleVec& x) const;
 virtual DoubleVec operator()(const DoubleVec& x, const IntVec& d) const;
 virtual DoubleVec operator()(double x, const IntVec& d) const;
};





[LISTING TWO]

void plot(const Func& f, double a, double b, double step)
{
 moveto(a,f(a));
 for(double x=a+step; x<=b; x+=step) {
 drawto(x,f(x));
 }
 drawto(b,f(b));
}





[LISTING THREE]

FuncHandle dudx = diff(u,x);
FuncHandle dwdx = diff(w,x);
FuncHandle dNidx = diff(Ni,x);
FuncHandle integrand = EA*(dudx+dwdx*dwdx/2)*dNidx;
double fi = I(integrand); // I is an instance of the GQuad class





[LISTING FOUR]

class NLSys
{
 public:
 virtual DoubleVec value(const DoubleVec& x) =0;
 // Calculate K(x).
 virtual int setKt(const DoubleVec& x) =0;
 // Evaluate the tangent stiffness matrix at the point "x".
 // Subsequent calls to solveTangentSystem should use the
 // tangent stiffness matrix as set here. If the tangent
 // stiffness matrix is singular at this point then return a
 // 0 and do nothing, otherwise return a 1.
 virtual DoubleVec solveTangentSystem(const DoubleVec& b) =0;
 // Solve the system Ku=b for u where K is the last calculated
 // tangent stiffness matrix. u and b are increments.
};
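The NLSys interface above is exactly the contract a Newton-Raphson driver needs: evaluate the residual, set the tangent stiffness, solve for an increment, repeat. A minimal sketch of that loop follows, in Python rather than the article's C++, with a scalar toy system standing in for the wheel model (all names here are mine, not the article's):

```python
# Minimal sketch of the Newton iteration an NLSys-style interface supports.
# Scalar case for clarity: value() is the residual K(x) - f, set_kt()
# evaluates the tangent (here just the derivative), and
# solve_tangent_system() solves Kt * du = b for the increment du.

class ScalarNLSys:
    """Toy nonlinear system: K(x) = x**3, external load f."""
    def __init__(self, f):
        self.f = f
        self.kt = None

    def value(self, x):                  # residual K(x) - f
        return x ** 3 - self.f

    def set_kt(self, x):                 # tangent stiffness dK/dx = 3x^2
        self.kt = 3.0 * x * x
        return 0 if self.kt == 0.0 else 1   # 0 signals a singular tangent

    def solve_tangent_system(self, b):   # solve kt * du = b
        return b / self.kt

def newton_solve(system, x, tol=1e-10, max_iter=50):
    for _ in range(max_iter):
        r = system.value(x)
        if abs(r) < tol:
            break
        if not system.set_kt(x):
            raise ZeroDivisionError("singular tangent stiffness")
        x -= system.solve_tangent_system(r)   # x_new = x - Kt^{-1} * r
    return x

root = newton_solve(ScalarNLSys(f=8.0), x=1.0)   # solves x**3 = 8
```

In the article's WheelSys, the factored tangent-stiffness matrix (the Double PDFact object) plays the role of the scalar division above.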

























June, 1993
EXTENDING A VISUAL LANGUAGE FOR SIMULATION


Discrete-event simulation made possible via DLLs




Peter D. Varhol


Peter is an assistant professor of computer science and mathematics at Rivier
College in Nashua, New Hampshire.


As a college teacher, I'm constantly on the lookout for ways to supplement a
purely mathematical treatment of system simulation and modeling. My mostly
adult professional students are usually more interested in answering the
questions "how and where is this useful" rather than "how or why is this
done," necessitating a practitioner's approach to the subject.
The standard languages for system simulation, such as GPSS and Simscript, are
old and don't take full advantage of many of the PC's capabilities. Using them
also means learning a new programming language just to do simulation, and the
system being modeled must be expressed in terms of that language. Further,
even with educational discounts, these packages are priced beyond the reach of
the students I serve. Thus, I was understandably optimistic when introduced to
VisSim, a simulation package based on a visual programming language.
With VisSim, the user constructs simulations by manipulating graphical blocks
representing different high-level functions useful in simulation. These blocks
are then connected to create the flow of data through the simulation. These
characteristics make the software easy to learn and highly visual, giving
students better insight into the simulation process. VisSim currently runs in
Microsoft Windows and on several UNIX/X platforms. According to the vendor,
Visual Solutions, the UNIX port was done by emulating the Windows API on top
of X, making simulations portable between the platforms.
For most of my needs, however, VisSim has one major drawback--it lacks the
ability to perform discrete-event simulation, which is in far more common use
than continuous simulation. Discrete-event simulation models systems in which
"customers" enter the system, receive "service," and leave. While this may
seem more appropriate for modeling business-oriented systems, I've used this
approach to successfully model systems such as job scheduling on multiuser
computers, failure prediction on system components, and reliability analysis
of automated processes. "Customers" can readily refer to people, computer
processes, automobiles on a highway, failed electrical components, or hundreds
of other discrete tasks.
In addition to being enormously useful, discrete-event simulation is both
conceptually and computationally easier than continuous simulation, and
requires little math beyond algebra. These characteristics make it easier for
students to understand, and more useful in many of their professions.


Customizing VisSim Using DLLs


Fortunately, the VisSim menus include what is called a "user-defined block," a
gateway to user-written dynamic link libraries (DLLs). These DLLs can be
written for practically any purpose, and Visual Solutions supports C, Pascal,
and Fortran. (The use of Fortran requires C calling conventions.) The ability
to use Fortran is a nice feature for those who have existing Fortran code that
can be included as part of a VisSim simulation.
Calling a user-defined function (in C) or procedure (in Pascal) in a DLL is
simple. Selecting the userFunction from the menu and clicking in the window
creates a rectangular block similar to predefined function blocks. The user
can press the right mouse button to open a window to name both the DLL and the
function. The DLL is loaded at that time, so it has to be in place before you
create the block on the screen. The block can now be treated like any
predefined block in VisSim, enabling it to be used as an operation in a larger
simulation.
There are certain limitations in the DLL gateway that make the development of
discrete-event procedures somewhat challenging. VisSim is designed so that the
gateway between the application and the DLL consists of a three-element array
of double-precision reals. While this data type may be useful for continuous
simulations, discrete-event procedures rely more heavily on integer
operations. This means that many type conversions have to be performed in the
DLL. It also means that no more than three values can be passed to a DLL at
one time.


Choosing a Development Environment


I decided to work in Turbo Pascal for Windows, for several reasons. First, I
already had experience writing DLLs in Turbo C++, and I wanted to be able to
compare the techniques used there with another language. Specifically, I
thought that C DLLs were unnecessarily complex (what purpose does the WEP
function really serve?) and was looking for an easier way. Second, I wanted my
students to be able to examine and modify the DLLs, and Pascal is still a more
universally accepted academic language.
Writing DLLs is easier in Pascal than in C, although Pascal's stricter type
checking meant that I had to be creative in my type conversions. Turbo Pascal
requires only that the program be declared as a library, that the appropriate
Windows "user" files be included, and that the procedures be exported at the
end of the file. As in writing DLLs using C, there's no main program as such.


Designing the DLLs


The discrete-event DLLs I needed can be divided into two categories. First, I
had to design a set of procedures that would calculate the appropriate
statistics resulting from the discrete-event processes. These statistics
included expected time spent in the system, expected number of customers in
the system, and the probability of a given number of customers in the system.
These are relatively straightforward, with simple equations involving math no
more difficult than algebra. They are computations that summarize the results
of a simulation, and will likely be used just prior to a VisSim output block.
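The article doesn't reproduce the equations, but for the single-server (M/M/1) case the textbook formulas behind such a statistics DLL look like this; the sketch below is illustrative Python (the DLLs themselves are Pascal) and the function name is mine:

```python
# Textbook M/M/1 queueing statistics of the kind a statistics DLL computes:
# expected number in system, expected time in system, and the probability
# of finding exactly n customers in the system.

def mm1_stats(arrival_rate, service_rate):
    """Return (L, W, p) for a stable M/M/1 queue; p(n) gives P[N = n]."""
    rho = arrival_rate / service_rate          # server utilization
    if rho >= 1.0:
        raise ValueError("queue is unstable: arrivals outpace service")
    L = rho / (1.0 - rho)                      # expected customers in system
    W = 1.0 / (service_rate - arrival_rate)    # expected time in system
    p = lambda n: (1.0 - rho) * rho ** n       # P[exactly n in system]
    return L, W, p

L, W, p = mm1_stats(arrival_rate=2.0, service_rate=5.0)
# rho = 0.4, so L = 0.4/0.6 = 2/3 and W = 1/(5-2) = 1/3
```

Little's law (L = arrival rate times W) ties the two expectations together, which makes a handy consistency check on simulation output.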
The second and more complex part of the problem was that I had to generate
random events according to distributions that are most often associated with
discrete processes. VisSim gives me random numbers that follow the uniform and
normal distributions, which were useful, but I really needed the Poisson and
exponential distributions. These values would represent the arrival rates and
service times of the system.
There are two ways of doing this. First, I could use VisSim's uniform
distribution generator, pass it to a DLL, filter the values so that the
resulting random numbers followed the Poisson distribution, and finally return
the filtered values back to VisSim. On the other hand, I could use the
random-number generator in Turbo Pascal.
As Listing One (page 95) shows, I opted for the latter approach. Though I was
not sure whether there was a performance penalty in passing large quantities of random
numbers to my DLL for filtering, it seemed more efficient to both create and
parameterize my customer arrival rates and service rates from Pascal. The
input to this event-generation DLL will be the mean of the desired probability
distribution, and the output will be the random values generated by that
distribution.
I ended up with three DLLs. The first created random numbers and filtered them
so that they followed the Poisson distribution, and returned these values back
to VisSim. The second did the same thing for the exponential distribution. The
third calculated system statistics based on the interaction between the two
series of random numbers, and returned these statistics to VisSim.
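The generation step the first two DLLs perform can be sketched with standard algorithms: inverse-transform sampling for the exponential distribution and Knuth's product method for the Poisson. This is illustrative Python (the actual DLLs are Turbo Pascal, shown in Listing One), with function names of my choosing:

```python
import math
import random

# Sketch of the random-event generation the first two DLLs perform.
# Exponential: inverse-transform sampling, -mean * ln(1 - U).
# Poisson: Knuth's method -- multiply uniforms until the running product
# drops below e^(-mean); the count of extra factors is the variate.

def exponential_variate(mean, rng=random.random):
    """Service time with the given mean (rate = 1/mean)."""
    return -mean * math.log(1.0 - rng())

def poisson_variate(mean, rng=random.random):
    """Event count per interval with the given mean."""
    limit = math.exp(-mean)
    k, product = 0, rng()
    while product > limit:
        k += 1
        product *= rng()
    return k

random.seed(1)
samples = [poisson_variate(4.0) for _ in range(10000)]
# the sample mean should settle near the requested mean of 4.0
```

As in the DLL design, the only input is the desired mean; everything else is internal to the generator.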


A Template for Queueing Models


Figure 1 shows an application of my discrete-event DLLs. I've kept it general,
so that it can be used both as a template for simulating more specific systems
and as a part of a more-complex simulation model.
The queueing system itself has three inputs: the mean customer-arrival rate,
the mean service time, and the number of servers. The mean customer-arrival
rate is used as input to my Poisson random-number filter DLL, designated by
the step-distribution icon. This returns customer interarrival times that
follow the Poisson distribution, which are the customer inputs to the queueing
system.
The mean service time, which follows the exponential distribution, is set by
the slider block and filtered through my exponential random-number generator
DLL to produce individual customer-service times. This DLL is designated by
the negative exponential curve icon. These service times for individual
customers are passed into the queueing system.
The last input is the number of servers, which comes from a variable block.
Note that the number of servers can easily be changed between simulation runs
by changing the input variable to the "Number of servers" block. Likewise, the
arrival rate and service time can be adjusted by moving the sliders that
contain the respective values. Currently, the sliders for both the arrival
rate and service time can be adjusted at 0.2 intervals between 0 and 20, but
that range is also easily changed.
The block labeled "Queueing system" is a compound block that contains several
different operations. It simulates customers being served and calculates
several different statistics on the queueing process. The aforementioned
statistics are calculated through values passed into another DLL. The actual
values of three of the statistics are displayed in the final blocks at the far
right of the model.
I've used this basic template to model problems from real life, such as jobs
being run on a multitasking computer or cars crossing a bridge. Any system
that follows a Markovian process can use these types of discrete operations.
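Inside the "Queueing system" compound block, the simulation itself reduces to a short event loop: draw each customer's interarrival and service times, track when the server next becomes free, and accumulate time-in-system. A single-server Python sketch under the same Markovian assumptions (the names and structure are mine, not VisSim's):

```python
import math
import random

# Minimal single-server discrete-event loop of the kind the "Queueing
# system" compound block simulates: exponential interarrival and service
# times, FIFO discipline, average time-in-system reported.

def simulate_mm1(arrival_rate, service_rate, n_customers, seed=42):
    rng = random.Random(seed)
    expo = lambda rate: -math.log(1.0 - rng.random()) / rate
    clock = 0.0              # arrival time of the current customer
    server_free_at = 0.0     # when the server finishes its current job
    total_time = 0.0         # accumulated time-in-system
    for _ in range(n_customers):
        clock += expo(arrival_rate)                # next arrival
        start = max(clock, server_free_at)         # wait if server is busy
        server_free_at = start + expo(service_rate)
        total_time += server_free_at - clock       # waiting plus service
    return total_time / n_customers                # average time in system

w = simulate_mm1(arrival_rate=2.0, service_rate=5.0, n_customers=50000)
# the analytic M/M/1 value for these rates is 1/(5-2) = 1/3
```

Extending the loop to multiple servers means tracking one "free at" time per server, which is also where per-server service rates (the enhancement mentioned above) would slot in.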
For greater applicability, this template can be enhanced still further. The
current approach assumes that each of the servers in a multiserver system has
the same average service time. This is adequate for most purposes, but would
make it difficult to model a distributed computing system where the CPUs had
different clock speeds, for example. Another enhancement would be to display
some of the statistics, such as the average number of customers in the system,
in a meter or plot so that users can view the activity of the system as it
occurs.



A Little About VisSim


The time I put into customizing VisSim is in no way a reflection of any
failings in the product itself. VisSim comes with several dozen predefined,
high-level operations, divided into 11 categories. These categories include
arithmetic, Boolean operations, integration, transcendental functions, and
nonlinear operations. This toolbox makes it possible to create a wide variety
of simulations useful to both engineers and scientists. Its extensive
collection of example simulations covers a number of different practical
applications.
The visual modeling approach is well thought out and easily learned by anyone
interested in simulating processes. A particularly nice feature is the ability
to construct hierarchical block diagrams, representing hierarchical levels of
a simulation. Like my DLLs, hierarchical function blocks are highly reusable
between simulations.
Practitioners will appreciate VisSim (as opposed to traditional simulation
languages) because they can visualize the simulation as it progresses. This
can be done with one or more output blocks, which include simple value
displays along with more graphical meters and plots. This allows the user to
change parameter values on-the-fly and observe the results of the changes
without having to save the data and display it in a spreadsheet.
VisSim also supports the use of bit-mapped graphic icons in place of the
simple rectangular blocks for user-defined functions. The tool includes
examples with graphical icons of DC motors and digital controls, which
represent fully functional models of those devices. The icons themselves have
no functional purpose, but help communicate the layout of the
simulation. With my limited artistic ability, I used Windows Paintbrush to
create my own set of icons, three of which are used in my queueing system in
Figure 1.


Other Approaches to Discrete-event Simulation


It's possible to perform discrete-event simulation using VisSim without adding
DLLs. In its extensive collection of examples, Visual Solutions includes a
simulation of a "widget distribution system," shown in Figure 2. In this
example, widgets are produced, distributed from a central location to regional
warehouses, and sold from there. VisSim incorporates a time delay into a
reservoir model to emulate the flow of widgets between distribution locations.
However, this is not the way most professionals are used to thinking about
discrete-event simulation, and it adds an unnecessary layer of complexity onto
a relatively simple process. My approach, using DLLs for generating the
necessary random events, provides simpler and more understandable tools for
this category of simulation problem.


Extending and Enhancing Commercial Software: The Time has Come


In general, the user-defined DLL is probably one of the most important
features in a Windows application. An increasing number of commercial
applications provide this capability, which both creates a thriving market
for add-ons and lets technical users customize the products for their specific
needs. While the interface between VisSim and the DLL is restrictive, many
DLLs can still be written to make the product useful to more professionals.
VisSim also supports a DDE block, so that a simulation block can act as a data
server or client for another application. This means that a simulation can be
used to draw data from a spreadsheet, for example, and can also update the
spreadsheet with the results of a simulation run. Developing simulations using
this hot link to other applications and data sources is next on my list of
simulation projects.
What advantage does the visual approach to simulation have over a traditional
simulation language? First, although VisSim has a learning curve, the simulation is
a closer conceptual representation of the system being modeled, so the learning
curve is not as steep as with a simulation language. Second, a properly
constructed simulation, complete with bitmapped icons representing components,
can be more instructive, since the relationships between components are more
explicit. These features can benefit not only the student, but also the
practitioner who seeks a greater understanding of any complex system.

_EXTENDING A VISUAL LANGUAGE FOR SIMULATION_
by Peter D. Varhol


[LISTING ONE]

{*****************************************************}
{ Borland Turbo Pascal }
{ VisSim DLL for Windows }
{ Poisson distribution random number generator }
{ This procedure of the DLL generates a random number }
{ using the Turbo Pascal random number generator, }
{ then filters the values to pass only those that }
{ follow the Poisson distribution. }
{*****************************************************}

library Queue; {rather than program}
uses WinTypes, WinProcs;
type
VisSimArg = array[0..2] of double;

Function fact(X:integer) : longint;
{factorial; longint result, since 8! already overflows a 16-bit integer}

var
i : integer;
total : longint;

begin
if X <= 0 then fact := 1 {0! = 1; also guards against negative input}
else
 begin
 total := 1;
 for i := 1 to X do
  total := total * i;
 fact := total
 end
end;

Procedure poisson(var param, X, result:VisSimArg);
export;
var
testX_int : integer;
testX, testY, poissonX : double;

begin
 {X[0] contains the Poisson mean}
 Randomize; {set random seed}
 repeat
  testX := Random; {uniform candidate value}
  testY := Random; {uniform acceptance test}
  testX_int := trunc(testX * X[0] * 2);
  {scale the candidate to an integer in 0..2*mean}
  poissonX := Exp(-X[0] + testX_int * Ln(X[0])) / fact(testX_int);
  {Poisson probability of the candidate: e^(-mean) * mean^k / k!}
 until testY < poissonX; {accept in proportion to that probability}
 result[0] := testX_int
end;

exports
poisson index 1; {export by name and index number}

begin {procedure not called in main program}
end.






























June, 1993
NEURAL NETWORKS AND CHARACTER RECOGNITION


High-level, interactive tools are less complex but just as powerful




Ken Karnofsky


Ken is market-segment manager for signal-processing and neural-network tools
at Math Works. He's also worked in the areas of speech-recognition research,
interactive speech-processing software, and data analysis for industrial
applications. Ken can be contacted at the Math Works, Cochituate Place, 24
Prime Park Way, Natick, MA 01760.


Neural-network technology is being applied to solve a wide variety of
scientific, engineering, and business problems and perform complex functions
such as noise cancellation, adaptive filtering, pattern recognition,
non-linear controls, and econometric forecasting. Neural networks are composed
of many simple interconnecting elements, or neurons, working in parallel to
solve a problem. Like biological nervous systems, artificial neural nets can
be trained to find their own solutions, based solely on the form of
interconnections and the data presented to the network. In contrast to
classical approaches in fields such as statistics and control theory, neural
nets require no explicit model or limiting assumptions of normality or
linearity. This property is useful where system complexity makes formal
analysis extremely difficult or impossible. The ability to adapt to incoming
data conditions makes neural networks particularly adept at handling noisy or
rapidly changing systems.
Neural-net research and development has traditionally been carried out using
languages such as C, C++, Fortran, or Lisp. In recent years, however,
commercial software tools have used pre-existing algorithms to simplify
network development. In most cases, modifying algorithms or integrating with
non-neural systems continues to require advanced programming techniques.
In this article, I present an overview of neural-network properties, then
examine an application that illustrates them using the Matlab Neural Network
Toolbox, an add-on that facilitates the design and simulation of many kinds of
neural networks. Matlab (short for "matrix laboratory") is a technical
computing environment that provides a high-performance, interactive system for
numerical computation and visualization. Matlab, which runs on platforms
ranging from Microsoft Windows to the X Window System, integrates matrix
computation, numerical analysis, data analysis, and graphics with a language
that expresses problems and solutions as they are written
mathematically--without traditional programming.
Matlab's basic element is a matrix that doesn't require dimensioning. Along
with optimized numerical routines and a fast interpreter, this allows many
numerical problems to be solved in a fraction of the time that it would take
to write a program in Fortran or C. Applications written in Matlab can draw on
high-level programming constructs and a library of over 500 numerical,
analytical, and graphics functions. Application-specific add-on libraries
("toolboxes") of Matlab functions can be used to solve particular classes of
problems in the areas of signal processing, control-system design, and
optimization as well as neural nets.
By drawing on Matlab's computation features and matrix-oriented, high-level
language, the Neural Network Toolbox handles networks efficiently using a
matrix representation of network models. Those models may be easily compared
to, and interact with, more traditional problem-solving approaches. In
addition, a concise graphical notation simplifies the description and
understanding of network architectures.


Neuron Model and Neural-network Architectures


The neuron model and the architecture of a neural net determine how a network
transforms its input into an output. This transformation can be viewed as a
computation. The neuron model and the network architecture each place
limitations on what a particular neural network can compute.
In the simplest case, a single input neuron, the input p is transmitted
through a connection that multiplies its strength by the weight w, to form the
product w*p. This product is the argument to the transfer function F, which
produces an output a. The neuron may also have a bias, b, which is simply
added to the product w*p; therefore, the net input is the sum of the weighted
input w*p and the bias b. The complete neuron model is a=F(w*p+b). Note that w
and b are both adjustable parameters of the neuron.
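The complete neuron model is small enough to write out directly. An illustrative Python sketch (the article works in Matlab; the names here are mine), using a hard-limit step function as F:

```python
# The complete neuron model: a = F(w*p + b). Here F is a hard-limit
# (step) transfer function, which maps the net input to 0 or 1.

def hardlim(n):
    return 1.0 if n >= 0.0 else 0.0

def neuron(p, w, b, F=hardlim):
    return F(w * p + b)     # weighted input plus bias, then transfer

a0 = neuron(p=1.0, w=2.0, b=-3.0)   # F(2*1 - 3) = F(-1) = 0
a1 = neuron(p=2.0, w=2.0, b=-3.0)   # F(2*2 - 3) = F(1)  = 1
```

Adjusting w and b moves the threshold at which the output flips, which is exactly what training does.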
In general, each neuron may have multiple inputs, and any number of neurons
may operate in parallel to form a layer. Figure 1 illustrates the components
of a neuron layer: the input, the weights, the bias, the transfer
function, and the output. In the Matlab Neural Network Toolbox, the inputs,
outputs, and bias are represented as vectors, and the weights are represented
as a matrix (n inputs by m neurons). The resulting matrix equation, shown in
Figure 1, provides a concise notation for neural networks. Because Matlab is
optimized for matrix manipulation, it provides a natural foundation for
efficient implementation of network computations.
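The matrix equation for a whole layer, a = F(W*p + b) with F applied elementwise, can be sketched in a few lines. This is illustrative Python rather than the Matlab the article assumes (in Matlab it is a one-line matrix expression), with names of my choosing:

```python
import math

# One neuron layer in matrix form: a = F(W*p + b), where W is an
# (m neurons) x (n inputs) weight matrix and F is applied elementwise.

def logsig(n):
    """Log-sigmoid transfer function: compresses input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-n))

def layer(W, p, b, F=logsig):
    net = [sum(w_ij * p_j for w_ij, p_j in zip(row, p)) + b_i
           for row, b_i in zip(W, b)]       # W*p + b
    return [F(n) for n in net]              # elementwise transfer function

# Two neurons, three inputs; biases chosen so both net inputs are zero.
W = [[0.5, -0.2, 0.1],
     [0.0,  1.0, -1.0]]
a = layer(W, p=[1.0, 2.0, 3.0], b=[-0.4, 1.0])
# logsig(0) = 0.5, so both outputs are 0.5
```

A multilayer net is then just this function composed with itself, feeding one layer's output vector in as the next layer's p.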
Transfer functions may be linear or nonlinear, according to the purpose of the
net; typically, they limit the output to a desired numerical range. For
example, by limiting output to 0 or 1, the step function is useful for
classification decisions. Sigmoid functions are smooth, differentiable
nonlinear functions that compress the input into the range of 0 to 1; see
Figure 2. They can be used for function approximation and a variety of other
tasks.
A network can have any number of layers, but networks with more than three are
uncommon. In multilayer networks, transfer functions condition the output of
one layer before passing it on to the next layer. This prevents the
propagation of very large and very small numbers and the associated numerical
problems.
In summary, the number of inputs, the number of neurons per layer, the number
of layers, and the type of transfer function all affect what the network can
compute. The application establishes the number of inputs and outputs, but the
number of neurons in the intermediate layers is up to the designer. In
general, more neurons make a network more powerful, but at the expense of additional
computational complexity.


Training


The weight and bias values to be used in a neural network are obtained by a
process of training. Some neural nets are trained by presenting the network
with a sequence of example input/output pairs. The net adjusts its parameters
until it produces output vectors that match the desired outputs,
called target vectors, as closely as possible. Such nets are called supervised
networks.
Networks trained in a supervised manner can be used to model and control both
linear and nonlinear dynamic systems, to classify input vectors in the
presence of noise, and for other useful tasks. The natural ability of neural
networks to deal with nonlinearities allows them to solve complex problems.
For example, a neural controller has been implemented by Derrick Nguyen and
Bernard Widrow for backing a truck and trailer to a dock. It can perform this
task successfully even when starting with the truck and trailer jackknifed and
facing the wrong way.
Other neural nets are trained by presenting input vectors and letting the net
adjust itself. Such networks are said to be unsupervised since no specific
behavior is predefined. Instead, the network "discovers" the relationships
within the data. Unsupervised networks can be used to find a classification
scheme for data as it is presented. This classification scheme can then be
used on future data.


Character Recognition


Optical character recognition has many practical applications, from check
processing to converting faxed text into editable format on a computer. Such
systems can be very cost effective, because they can save time, avoid errors,
and relieve humans from performing repetitive tasks.
Statistical methods are typically used to model patterns and to compensate for
noise corruption. In these systems, fine-tuning system performance can often
mean exhaustive research to develop algorithms that handle real-world
conditions. Frequently these algorithms are computationally intensive. With
neural networks, on the other hand, once the basic requirements are
understood, the experimentation and algorithm development is relatively
straightforward. While the training may require complex computations, the
execution is typically quite efficient.
In this example, the task is to recognize the 26 letters of the English
alphabet. I'll simulate an imaging system that digitizes each letter centered
in the system's field of vision, ideally resulting in a 5x7 grid of Boolean
values. Figure 3(a) shows the letter G scanned under ideal conditions.
However, a typical scanning system is not perfect; it often produces a noisy
image of the letter, such as Figure 3(b).
The goal of the neural net is to correctly identify the image of each letter.
The problem requires perfect classification of ideal inputs and reasonably
accurate classification of noisy inputs. Specifically, the network should make
as few mistakes as possible when classifying inputs with noise of mean 0 and
standard deviation 0.2 or less.
The inputs are represented as a set of 26 vectors of 35 elements each (imagine
each 5x7 grid "unrolled" into a column vector)--one for each letter. These
vectors will in turn form the columns of an input matrix called "alphabet."
The output of the net is to be trained to match a set of target vectors that
represent the positions of the letters of the alphabet. Each target vector is
a 26-element vector with a 1 in the position of the letter it represents, and
0s everywhere else. For example, the letter A is represented by a 1 in the
first element (as A is the first letter of the alphabet), and 0s in elements 2
through 26.
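Constructing these one-hot target vectors is mechanical. An illustrative Python sketch (in the article's Matlab, the columns of eye(26) serve the same purpose; the function name here is mine):

```python
# Target vectors: column i of a 26x26 identity-like layout has a 1 in
# position i (letter i of the alphabet) and 0s everywhere else.

def target_vector(letter):
    """One-hot 26-element target for an uppercase letter."""
    index = ord(letter) - ord('A')
    return [1 if i == index else 0 for i in range(26)]

targets = [target_vector(chr(ord('A') + i)) for i in range(26)]
# 'A' -> 1 in the first element, 0s in elements 2 through 26
```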
In this example, the training and target vectors are generated artificially,
which would be appropriate in the early prototyping stages of a system. As
development progresses, you could easily substitute actual scanned images
using Matlab's I/O and image-processing capabilities.


Network Architecture and Implementation


Since it must map a 35-pixel image to one of the 26 letters in the alphabet,
the neural net will need 35 inputs to represent the image grid, and 26 neurons
in its output layer. Because two-layer nets are known to have better
function-approximating properties, the network has an additional hidden
layer with ten neurons. Picking this number takes experience and some
guesswork. Too few neurons in the hidden layer can lead to underfitting, while
too many can contribute to unstable oscillations between the fitted points. If
the network has trouble learning, you can add neurons to this layer to improve
the fit. Figure 4 is a diagram of this network.
For both layers, use the log-sigmoid transfer function. The output range (0,1)
of the log-sigmoid transfer function is perfect for learning to output Boolean
values. It is also well suited to handle the nonlinear relationship between
the patterns in the input and output vectors.

The net will be trained to output a 1 in the correct position of the output
vector and to fill the rest of the output vector with 0s. However, noisy input
vectors may cause the network to produce outputs that are close to, but not
exactly 1s and 0s. To handle this problem, the network output will be passed
through a post-processing transfer function. You'll use the competitive
transfer function, which outputs a 1 in the position of the largest value and
0s elsewhere. This will assure that the element representing the letter most
like the noisy input vector takes on a value of 1 and all others have a value
of 0. This post-processing produces the final, clean output that will actually
be used in classification.
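The competitive post-processing step is a one-liner in spirit: find the largest element and emit a 1 there, 0s elsewhere. An illustrative Python sketch, named after the toolbox's competitive transfer function:

```python
# Competitive post-processing: output a 1 in the position of the largest
# value and 0s elsewhere, turning a vector of near-Boolean activations
# into a clean, unambiguous classification.

def compet(a):
    winner = max(range(len(a)), key=lambda i: a[i])
    return [1 if i == winner else 0 for i in range(len(a))]

noisy_output = [0.12, 0.85, 0.07, 0.91, 0.10]
clean = compet(noisy_output)   # the 0.91 in position 4 wins
```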
Implementation of the network with the Matlab Neural Network Toolbox is
performed in three phases:
Initialization of network inputs and parameters. This is performed by
assigning constants or the return value of built-in functions. External data
can be easily imported, scaled, and otherwise preprocessed using Matlab's
data-manipulation functions.
Network architecture specification and training. This is typically done with a
call to a toolbox function that accepts the input vectors and the weights,
biases, and transfer functions for each layer as arguments. Training data may
be presented one set at a time, or multiple training sets may be batched
together. Optional graphs help monitor the training progress.
Network simulation and testing. For each test case, a new input is presented
as an argument to the network function, along with the trained weights and
biases. Again, data can be acquired from external sources and preprocessed as
part of the training or simulation procedure. Results may be visualized using
built-in graphs or with custom plots devised by the user.
Each of these steps may be done interactively or automatically to explore
alternative strategies, parameter values, and architectures. Once an approach
has been selected, the same commands may be saved in a script file to automate
procedures that will be used repeatedly.


Training the Network with Backpropagation


You train the network using the backpropagation learning rule. Backpropagation
is perhaps the most commonly used supervised training technique because it
produces networks that are capable of approximating any reasonable function.
It has a nice generalization property that makes it possible to train a
network on a representative set of input/target pairs and get good results for
new inputs without training the network on all possible input/output pairs.
First, you initialize the weights and biases with constrained initial
conditions, using the Nguyen-Widrow method, to improve the learning time; see
Listing One, page 103. This method takes advantage of how a network
approximates functions to speed convergence. Creating a net that handles noisy
input vectors works best when training with both ideal and noisy vectors.
Start out by training on ideal vectors until you've reached a low sum-squared
error (the sum of the squared differences between the network targets and
actual outputs).
After initialization, you apply the backpropagation technique to ideal vectors
as follows. First, the 35x26 input matrix P representing the alphabet
is presented to the network. The net's overall output A2 is calculated in
response to this vector. Next, the difference between the output and the
target vector is calculated. This is the error of the network.
The derivative of the sum of the squared errors with respect to the second
layer's output vector is calculated. This, in turn, is used to calculate the
derivative of sum-squared error with respect to the hidden layer's outputs.
Propagating the derivatives of error backward through the network is the
origin of the term "backpropagation."
The backpropagation learning rule adjusts the weights and biases of the
network so as to minimize the sum-squared error. This is done by iteratively
changing the values of the network weights and biases in the direction of
steepest descent with respect to error, effectively reducing the error by a
small amount each time. This is called a gradient descent procedure. Changes
in each weight and bias are proportional to that element's effect on the
sum-squared error of the network. Once the sum-squared error of the network
has reached an acceptably small value, or a maximum number of iterations has
been performed, the training is complete.
Among the known pitfalls of backpropagation are false error minima and
excessive training time. These problems can be overcome with techniques
provided in the Neural Network Toolbox. Adding a momentum factor decreases
backpropagation's sensitivity to small variations in the error surface, which
helps the net avoid mistaking shallow local minima for the true lowest-error
solution. Training time can be reduced by applying the Nguyen-Widrow initial
conditions, as mentioned earlier. Another approach is to use an adaptive
learning rate, which increases the learning rate without sacrificing stability.
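A hedged sketch of how these two refinements modify plain gradient descent follows. The parameter defaults mirror Listing Two's momentum, lr_inc, lr_dec, and err_ratio values, but the function names are mine, not the Toolbox's:

```python
def momentum_update(prev_change, grad_step, lr, mc=0.95):
    # Blend the previous weight change into the new one, smoothing over
    # small variations in the error surface (helps skip shallow minima)
    return mc * prev_change + (1.0 - mc) * lr * grad_step

def adapt_learning_rate(lr, new_err, old_err,
                        lr_inc=1.05, lr_dec=0.7, err_ratio=1.04):
    # Grow the learning rate while error keeps falling; back off sharply
    # if error jumps by more than err_ratio, preserving stability
    if new_err > old_err * err_ratio:
        return lr * lr_dec
    if new_err < old_err:
        return lr * lr_inc
    return lr
```

With momentum near 1, the weight trajectory coasts through small bumps in the error surface instead of reacting to each one.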


Training Without Noise


The network is initially trained without noise for a maximum of 5000 epochs
(training iterations) or until the network sum-squared error falls beneath
0.1. The row vector TP contains training parameters such as the acceptable
error, learning rate, and so on. These values are useful for adjusting a
network's ability to handle noise.
Listing Two (page 103) implements the training procedure, using the Neural
Network Toolbox function trainbpx, which uses both adaptive learning rate and
momentum to improve the training procedure.


Adding Noise


Next, you train the network on ten sets of ideal and noisy vectors. Each set
has two copies of the noise-free alphabet and two noisy vectors. The target
vectors consist of four copies of the vectors in the 26x26 matrix target. The
noisy vectors have noise of mean 0 and standard deviation 0.1 and 0.2 added to
them. This forces the neurons to learn how to properly identify noisy letters,
while requiring that the network still respond well to ideal vectors. To train with
noise, the maximum number of epochs is reduced to 300 and the error goal is
increased to 0.6, to reflect the higher expected error. See Listing Three
(page 103).
Unfortunately, after the training described above, the net learns to classify
some difficult noisy vectors at the expense of properly classifying a
noise-free vector. So you train the network again on just ideal vectors. This
will help ensure that the network will respond perfectly when presented with
an ideal letter.
The values (W1, B1, W2, and B2) returned by the function trainbpx represent
the final parameters of the network you'll actually use.


Simulating Network Operation


You measure the reliability of the neural-net pattern-recognition system by
testing the network with hundreds of input vectors with varying quantities of
noise. Noise with a mean of 0 and standard deviation from 0 to 0.5 is added
to the input vectors.
At each noise level, 100 presentations of different noisy versions of each
letter are made and the net's output is calculated. The output is then passed
through the competitive transfer function, which returns a 1 in the position
of the largest element and 0s elsewhere. As a result, only one of the 26
outputs, each representing a letter of the alphabet, has a value of 1.
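The competitive transfer function itself is tiny; here is an equivalent Python sketch (the function name follows the Toolbox's compet, but this implementation is mine):

```python
import numpy as np

def compet(a):
    # Return 1 in the position of the largest element, 0s elsewhere
    out = np.zeros_like(a)
    out[np.argmax(a)] = 1.0
    return out
```

For example, compet applied to a noisy output like [0.1, 0.8, 0.3] selects the middle element, classifying the input as the second letter.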
Listing Four (page 103) shows the code for such a test on one letter (M),
using noise with standard deviation of 0.2. Figure 2 displays the system
performance as a function of noise for each of the two networks. The solid
line on the graph shows the reliability for the network trained with noise.
The reliability of the same network trained without noise is shown with a
dotted line. As you can see, training the network on noisy input vectors
significantly reduced its error rate in classifying noisy vectors. The network
trained with noise did not make any errors for vectors with noise of standard
deviation 0.00 or 0.05. When noise of standard deviation 0.2 was added to the
vectors, the network made a mistake 3.62 percent of the time.
If you require higher accuracy, the net could be trained for a longer time or
retrained with more neurons in its hidden layer by changing the values of the
appropriate variables. Also, the resolution of the input vectors could be
increased to, say, a 10x14 grid. Finally, the network could be trained on input
vectors with greater amounts of noise if greater reliability were needed at
higher levels of noise.


Summary


This article has examined how a simple pattern-recognition system can be
designed and simulated using neural networks. Training the network on
different sets of noisy vectors forced it to learn how to deal with noise, a
common problem in pattern-recognition applications.
The backpropagation technique used here has been proven to be quite versatile.
Matlab users have used it in medical-image processing, geoscience,
control-systems design, and other fields where the ability to handle noise and
nonlinearities is important. A typical application is the control of an
antenna to track a moving signal.
However, different network architectures might be more appropriate for other
applications. For example, associative networks produce associations between
inputs and neurons that are more localized than backpropagation networks;
self-organizing networks adapt to correlations or irregularities in the
inputs; and recurrent networks contain a feedback element that makes it
possible to model delays or effects that vary with time. A neural-network tool
should provide all of these network types.

_NEURAL NETWORKS AND CHARACTER RECOGNITION_
by Ken Karnofsky


[LISTING ONE]

R = 35; % number of inputs
S1 = 10; % number of neurons in layer 1
S2 = 26; % number of neurons in layer 2

[W1,B1] = nwlog(S1,R); % initial conditions for layer 1
W2 = rands(S2,S1)*0.01; % initial conditions for layer 2
B2 = rands(S2,1)*0.01; % (assumed: trainbpx in Listing Two requires B2)






[LISTING TWO]

P = alphabet; % input matrix - 26x35
T = targets; % target matrix - 26x26
disp_freq = 20; % define training parameters
max_epoch = 5000; % "
err_goal = 0.1; % "
lr = 0.01; % "
lr_inc = 1.05; % "
lr_dec = 0.7; % "
momentum = 0.95; % "
err_ratio = 1.04; % "

% All training parameters are packed into the vector TP
TP = [disp_freq, max_epoch, err_goal, lr, lr_inc, lr_dec, momentum, err_ratio];

% The function trainbpx trains the network using the specified initial
% conditions and transfer functions. It returns the trained network
% weights and biases, the number of epochs (iterations) required, and a
% record of errors and learning rate at each epoch.
[W1,B1,W2,B2,epochs,TR] = trainbpx(W1,B1,'logsig',W2,B2,'logsig',P,T,TP);







[LISTING THREE]


max_epoch = 300; % define training parameters
err_goal = 0.6; % "
TP = [disp_freq ... AS ABOVE ... err_ratio];
Q = 26; % number of letters (columns of alphabet)
for pass = 1:10,
 P = [alphabet, alphabet, ... % clean training vectors
 (alphabet + randn(R,Q)*0.1), ... % and with varying
 (alphabet + randn(R,Q)*0.2)]; % amounts of noise
 T = [targets, targets, targets, targets]; % 4 target vectors batched together
 [W1,B1,W2,B2,TE,TR] = ... % training procedure
 trainbpx(W1,B1,'logsig',W2,B2,'logsig',P,T,TP);
end







[LISTING FOUR]


% Define an input example with random noise added.
noisyM = alphabet(:,13) + randn(35,1)*0.2;

% Output A2 is a function of the input vector, the transfer function for each
% layer, and the trained weights and biases.
A2 = logsig(W2*logsig(W1*noisyM,B1),B2);

% Because A2 may be noisy (not exactly 1 or 0), the competitive function
% picks the element closest to one. The FIND function returns its index.
answer = find(compet(A2) == 1);




















































June, 1993
A GUI ENVIRONMENT FOR FORTRAN DEVELOPMENT


VShell as the bridge between Visual Basic front ends and Fortran DLLs




Vinod Anantharaman


Vinod is a graduate student in the EECS department at the University of
Michigan. He can be contacted at 1803 Willowtree Lane, Ann Arbor, MI 48105.


Fortran end users have long put up with primitive command-line user interfaces
restricted by READ and WRITE, the only ANSI standard I/O statements provided
in the language. While many such users wish to take advantage of modern
graphical user interfaces, creating these graphical environments--dialog
boxes, command buttons, scroll bars, icons, graphs, pull-down and pop-up
menus--can mean considerable work for Fortran programmers. Using the C-based
Windows Software Development Kit (SDK), for instance, is often a tedious
process that means looking the nitty-gritty of the Windows API straight in
the eye.
This is where Microsoft's Visual Basic comes into the picture. For a graphical
environment such as Windows, a visual development environment like Visual
Basic is not merely a means for interactive interface design--it also makes
the complexity of the Windows API transparent to the user. Fortran routines in
dynamic link libraries (DLLs) can be called from the Basic code that processes
the user's actions. This way, all the real computing can be done in Fortran,
with Visual Basic taking care of the GUI front end.
The Fortran Visual Shell (VShell) presented in this article is an icon-based
visual tool that simplifies (via drag and drop) the process of creating
Fortran DLLs accessible from Visual Basic. VShell is itself developed with
Visual Basic and features a Windows-hosted GUI in which objects such as
Fortran-related files and project folders are icons. The user executes a task
on an object by double-clicking the object or dragging and dropping it on the
task icon. A Fortran project complete with a Windows GUI can be developed from
scratch under VShell without any complicated interfacing between Fortran and
Visual Basic.


The VShell/Fortran Development Cycle


VShell has a set of windows, called "forms," each with a set of controls:
icons, menus, buttons, file-system controls, and the like. The VShell Main
form (see Figure 1) displays icons for a file browser, command icons, and
folder icons. Command icons include Edit, Folder, Compile, DLL-VB Frontend,
Run, Programmer's Workbench (PWB), and Visual Basic.
Begin by creating/editing Fortran source files. To create a new file, run an
editor by double-clicking the Edit icon on the Main form. To edit an existing
file, either select the file from the file browser and drag and drop its icon
on Edit or, if the file has a .FOR extension, double-click on it in the
browser. Both methods run an editor with the chosen file open. The editor can
be selected the first time Edit is used; VShell displays the Editor form with
an icon for each available editor; see Figure 2. The selected editor is then
used for all editing.
Figure 3 shows the Folder form used to create a Fortran project. On the Main
form, double-click on the Folder Command icon to bring up this form. This form
has a file browser that lets you explore the file system for Fortran source
files (.FOR extension). To select a file and include it in the folder,
drag-and-drop its associated icon from the file list box into the folder
contents box. The contents box displays files (with full paths) currently
selected for the folder's project. To set the compiler flags for the project,
use the Fortran Compiler Options entry in the Options menu--this displays the
Fortran Compiler Flags form, which has option buttons for various Fortran
compiler flags, as seen in Figure 4. Once a folder has been created and given
a name, an icon that represents it is added to the set of folder icons on the
Main form when the Folder form is closed.
To edit files in an existing project folder, drop the folder's icon on the
Edit Command icon, and the File_to_Edit form shown in Figure 5 will pop up
with the names of all the Fortran files in that folder. Select one of these
files, and the editor window comes up with the selected file open. To examine
or change the contents of a folder, either drag-and-drop on the Folder Command
icon or double-click on the Folder's icon to bring up the Folder form with the
desired folder's contents.
Once a project folder has been created, drop its icon on the Compile Command
icon to compile it and produce a QuickWin executable. The compiler flags set
for the project, along with some VShell default flags (discussed next) are
used. If compilation is successful, the .EXE file created is added to the
project's folder, and the folder's icon on the Main form is tagged as an .EXE
folder icon.
To run the executable in a compiled folder, drag and drop it on the Run
Command icon. The Run form in Figure 6 pops up and prompts for command-line
arguments, if any. The executable in the folder is then run inside a window.
Executable files dropped directly from the file browser onto the Run Command
icon may not be run within a window unless they were originally compiled as
QuickWin applications.
To create a Visual Basic/Fortran mixed-language application, edit Fortran DLL
source file(s) and create a folder, as described previously. The DLL code must
consist only of functions or subroutines because the Visual Basic side will
constitute the main program. Instead of compiling the folder, create a Fortran
DLL by dragging and dropping the folder icon on the DLL-VB Frontend icon.
VShell prompts you for the name of a Visual Basic declarations-module file
into which it writes Basic Declare statements for accessing the Fortran DLL's
exported functions and subroutines. The user can then drag and drop this Basic
file from the file browser on the Visual Basic Command icon to run Visual
Basic and load the declarations file. Now the stage is set for the user to
design and implement the Windows user interface using Visual Basic as the
front end, calling the Fortran DLL routines as needed by using the Basic Call
statement.
This completes a typical project-development cycle under VShell. There is one
further Command icon in VShell, the PWB Command used to run Microsoft's PWB.
Double-click the icon to run PWB, or drop a project folder icon on the PWB
icon to make PWB load the folder's files as well.


VShell Internals


Both the VShell Main form and the Folder form have file browsers that allow
the user to explore the file system. Visual Basic's specialized file-system
controls--the drive list box, the directory list box, and the file list
box--are combined and synchronized to display the system's drives,
directories, and files, all with a few mouse clicks.
For folders compiled to produce an executable, we use the /MW option that
instructs FL (Microsoft's Fortran 5.1 compiler is assumed) to compile a
QuickWin application. Also by default, the executable file produced by FL is
given the base name of the first file in the folder contents box. The user can
override the following default compiler settings: the warning level (set to
/W1, to display warning messages), the debug option (set to none), the
processor (set to /G0, the 8086/8088 instruction set), the floating-point
option (set to /FPi, to generate inline instructions and select the emulator
math package), and the optimization desired (set to /Ox by default, full
optimization). The set of compiler flags assumed for folders used to create
DLLs is different and is discussed in the section on accessing Fortran DLLs
from Visual Basic.
Every folder has a folder-description file (FDF) with an .FOL extension name
and the same base name as the folder name. The FDF is a text file that
contains a description of the folder: the files it contains, their paths, and
the compiler flags for the project. The folder.vsh file in the VShell
directory contains the names of all the project folders in existence and their
types: plain (yet uncompiled, with only Fortran source files), executable, or
DLL. When VShell is started, the Main form is loaded with folder icons for
each of these project folders. The folder.vsh file gets updated on quitting a
VShell session.
Both the Compile and DLL-VB Frontend Command icons make use of Microsoft's
Program Maintenance Utility (NMAKE) to simplify project management. When these
Command icons are run, VShell automatically creates a suitable NMAKE
"description file" for the project. The description file has the target, the
target's dependents, and the commands to build the target. NMAKE ensures that
a project's files are recompiled only if a dependent has been modified at the
same time as or later than the target. If you attempt to run an
outdated executable from a folder, VShell will prompt you to rebuild the
folder. An outdated DLL in a folder remains outdated until a new DLL is
explicitly created using the DLL-VB Frontend icon.
System commands and other executable programs (like editors, NMAKE, PWB, and
so on) are run from Visual Basic using the Shell function. However, when
VShell calls Shell to run a program such as NMAKE, the VShell code immediately
following the Shell call will normally continue executing even as NMAKE is
running. Therefore, VShell requires some way to infer the termination status
of NMAKE, and a clean way to branch on the completion of the NMAKE run,
depending on the termination status. VShell solves this problem by using a DOS
batch file and a Visual Basic timer control, as shown in Listing One. As you can see,
compile.bat first calls NMAKE using the description file created by VShell.
Next, it tests the exit code returned by NMAKE using the DOS batch command IF
ERRORLEVEL. The exit code is 0 if NMAKE ran successfully, and higher if it
encountered an error. The exitdump program dumps a dummy value (0) into a
temporary file (temp1.tmp) if the error level is 0. Next, a dummy value is
written into another temporary file (temp2.tmp) whether NMAKE had fatal errors
or not. The Shell call runs the batch file. A program information file (PIF),
compile.pif, created for this batch file, sets the Display Usage option to
Windowed--the batch file is set to run in a window.
The two statements before the Shell call ensure that if either of the
temporary files that compile.bat may create are already present, they are
deleted. After the Shell call is made, a Visual Basic timer control,
Compile_Timer, is enabled, and the VShell Main form, Shellform, is disabled.
Compile_Timer is programmed to check, at regular intervals, if the temporary
file temp2.tmp has been created; this happens after NMAKE has completed. When
Compile_Timer finds the file temp2.tmp, it checks if the file temp1.tmp has
been created. If it has, then NMAKE completed successfully, and the
Compile_Timer code adds the .EXE file created to the project's folder. If
temp1.tmp has not been created, Compile_Timer displays a message box
indicating that NMAKE encountered a fatal error. In either case, the timer is
disabled and the VShell Main form is enabled again.
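The sentinel-file protocol reduces to a check that any language can express. Here it is as a Python sketch; the file names come from the article, while the function name and the "status string" interface are my own illustration:

```python
import os

def build_status(done_file="temp2.tmp", ok_file="temp1.tmp"):
    # Mirrors Compile_Timer's poll: temp2.tmp appears once NMAKE has
    # finished (success or failure), and temp1.tmp appears only if the
    # exit code indicated success.
    if not os.path.exists(done_file):
        return "running"
    return "success" if os.path.exists(ok_file) else "failed"
```

A timer calling this function at regular intervals can branch on the outcome of the asynchronous build, exactly as Compile_Timer does.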


Accessing Fortran DLLs from Visual Basic


To create Fortran DLLs accessible from Visual Basic, drag a folder icon (with
the Fortran DLL code) and drop it on the DLL-VB Frontend Command icon. VShell
takes care of the following: compiling the Fortran code with the appropriate
compiler flags; creating the definitions file with the DLL export functions;
linking with the appropriate runtime library to produce the DLL; and creating
the Visual Basic declarations module containing the declarations for the
Fortran DLL routines imported by Visual Basic.
What appears as a simple drag-and-drop event involves the following steps:
First, a new form, the DLL form (Figure 7), is loaded. In this form, you're
prompted for the name of the Visual Basic declarations file for the front end.
Next, the DLL source files are compiled using NMAKE, producing an object file.
The /Zi compiler flag, for producing debugging information with the Microsoft
Codeview debugger, is specified in the NMAKE description file for reasons that
will become apparent later. Because a DLL gets its stack from the caller, the
DLL's code cannot assume the stack segment is equal to the data segment; the
/Aw option generates code that takes this into account. VShell assumes that
all the functions and subroutines in the DLL code may be imported by Visual
Basic. Hence, the /Gw option is used for all the source files; this generates
appropriate prologue and epilogue code.
For VShell to know whether compilation was successful or not, the DOS
interface is implemented along the same lines as in Listing One. A timer
control, Object_Timer, runs code that checks for temporary files created by
the batch file that calls NMAKE and determines if NMAKE ran successfully or
not. If it was unsuccessful, VShell displays a message box indicating that
NMAKE encountered a fatal error, and terminates the DLL creation process;
otherwise, NMAKE produces a .OBJ file from the DLL source files.
For the rest of the DLL creation process to proceed without user input,
symbolic information regarding the names of all the DLL subroutines and
functions and their parameter and function-return types is required. This is
why compilation used the /Zi flag--the .OBJ file has symbolic debugging
information, and Microsoft's Object/Library Decoder is used to extract it. The
output from the Decoder is directed to a file with the same base name as the
.OBJ file, but with the extension .OMF (for Object Module Format). VShell
makes one pass through this .OMF file to extract the names of all the
information it requires about the DLL subroutines and functions. With this
information, VShell constructs two files: the definitions file (.DEF), which
has all the names of all subroutines and functions in the DLL in the EXPORT
section, along with the default WEP DLL termination routine; and the Visual
Basic declarations file, which has a Declare statement with the special
keyword Lib and the full path to the .DLL file for each of the DLL functions
and subroutines. The parameter types extracted from the .OMF file are
converted to their Basic equivalents in the Visual Basic declarations module.
Since the default calling protocols in Fortran and Visual Basic are similar
(almost everything is passed as a far pointer), interfacing Visual Basic with
Fortran is straightforward. Owing to a limitation in the version of Decode
used by VShell, passing arrays and structures as parameters between Visual
Basic and Fortran is not permitted at this point.
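The Declare statements VShell writes follow a mechanical pattern; the sketch below is a hypothetical Python rendering of the type mapping and string construction (compare its output with Listing Three — the mapping table and function names are assumptions, not VShell's actual code):

```python
# Assumed Fortran-to-Basic parameter-type mapping (REAL*4 -> Single,
# INTEGER -> Long); everything is passed by far pointer, so no ByVal.
FORTRAN_TO_BASIC = {"REAL*4": "Single", "INTEGER": "Long"}

def basic_declare(kind, name, param_types, dll_path, ret=None):
    # kind is "Sub" or "Function"; ret is the Basic return type, if any
    params = ", ".join("Var%d As %s" % (i + 1, FORTRAN_TO_BASIC[t])
                       for i, t in enumerate(param_types))
    decl = 'Declare %s %s Lib "%s" (%s)' % (kind, name, dll_path, params)
    return decl + (" As " + ret if ret else "")
```

For RANDOMXY, with six REAL*4 parameters, this reproduces the corresponding Declare Sub line in Listing Three.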
Once the .OBJ and the .DEF files have been created, run the linker to create
the .DLL file using the runtime library LDLLFEW.LIB. The interface with DOS
for calling the linker is implemented the usual way: The link command is part
of a batch file that Visual Basic runs using a Shell call, and a timer
control, Complete_Timer, checks if the DLL creation process completed
successfully or not. If successful, the .DLL file gets added to the folder,
and the folder's icon on the Main form is tagged as a DLL folder. This
completes the DLL creation process. If the linker encountered any fatal
errors, Complete_Timer detects this, and displays a "linker error" message.
Once the DLL and the Visual Basic declarations file have been created,
dropping the declarations file (from the browser) on the Visual Basic Command
icon runs Visual Basic and loads the declarations file. With this, the user
can proceed with designing Visual Basic forms and controls, writing code to
handle user-interface events, and calling the Fortran routines as needed.


A Monte Carlo Integrator



Among other projects, I used VShell to develop a Visual Basic/Fortran
mixed-language program for Monte Carlo integration. To integrate a real-valued
function over an interval, this method takes a rectangular region that
includes the interval, plus a method of determining whether a random point in
the rectangular region is inside or outside the function. Monte Carlo
integration evaluates the function at a random sample of points. It estimates
the integral as the ratio of the points in the random sample that fell inside
the function to the total number of random points, multiplied by the area of
the rectangular region.
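The estimate just described can be sketched in Python as an illustrative stand-in for the RANDOMXY/MCARLO pair; the function name, sample count, and seeding are my assumptions:

```python
import random

def monte_carlo_integral(f, xs, xe, ys, ye, n=100000, seed=1):
    # Sample n random points in the rectangle [xs,xe] x [ys,ye]. Points
    # under the curve and above the X axis count +1; points above the
    # curve and below the axis count -1 (cf. MCARLO's ISIN values).
    rng = random.Random(seed)
    inside = 0
    for _ in range(n):
        x = xs + (xe - xs) * rng.random()
        y = ys + (ye - ys) * rng.random()
        fx = f(x)
        if fx >= 0 and 0 <= y <= fx:
            inside += 1
        elif fx < 0 and fx <= y <= 0:
            inside -= 1
    # Ratio of (signed) hits to samples, scaled by the rectangle's area
    return (inside / n) * (xe - xs) * (ye - ys)
```

Integrating f(x) = x over [0, 1] inside the rectangle [0, 1] x [-1, 1], for instance, yields an estimate close to the true value of 0.5.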
The program offers a choice of five different functions to integrate
(including linear, polynomial and trigonometric functions). The Fortran DLL
side (available electronically; see "Availability," page 7) has the following
routines:
Subroutine RANDOMXY generates random points within a specified rectangular
boundary.
Functions F1 to F5, one for each of the five mathematical functions, return
the value of that mathematical function at a specified point, given the values
of the function's coefficients.
Subroutine MCARLO, given the mathematical function selected, its coefficients,
and a random point (generated by RANDOMXY) determines whether or not that
point was within the function.
The main components of the Windows user interface (designed and implemented on
the Visual Basic side) are: option buttons to select the function to
integrate; a picture box to draw the graph of the selected function; scroll
bars to adjust the X-coordinate boundaries of the integration; option buttons
to choose the number of random points for the integration; a menu option that
enables the user to adjust the function's coefficients; another option for
turning the display of the random points on or off; and various command
buttons. The user interface is shown in Figure 8 and Figure 9 .
The VShell-created Visual Basic declarations file for accessing the DLL and
the main Visual Basic code that calls the DLL routines are both available
electronically; see "Availability," page 7.


Conclusions


Using VShell, programmers can preserve their investment in useful Fortran code
and make it part of a GUI application. Under VShell, the entire cycle of
Fortran program development--from editing files to running compiled
projects--is fully "visual," making Fortran development easy and intuitive.

_A GUI ENVIRONMENT FOR FORTRAN DEVELOPERS_
by Vinod Anantharaman


[LISTING ONE]
Branching based on the termination status of an executable program run via a
Shell call.

 batfile% = OpenFile("\vshell\bat\compile.bat", FILEWR) 'Open batch file.
 If batfile% Then
 <Code to create MakeFile, the NMAKE description file for the Folder>

 Print #batfile%, "NMAKE /F " + MakeFile 'Batch file runs NMAKE.
 Print #batfile%, "IF NOT ERRORLEVEL 1 exitdump temp1.tmp 0"
 'If NMAKE runs successfully
 'then write to temp1.tmp.
 Print #batfile%, "exitdump temp2.tmp 0" 'Write to temp2.tmp.
 Close batfile%

 'If the temporary files temp1.tmp or temp2.tmp exist, delete them.
 If Len(Dir$("temp1.tmp")) > 0 Then Kill "temp1.tmp"
 If Len(Dir$("temp2.tmp")) > 0 Then Kill "temp2.tmp"
 x = Shell("c:\vshell\bat\compile.bat", 1) 'Now run the batch file.

 Compile_Timer.Enabled = TRUE 'Enable Compile_Timer.
 Shellform.Enabled = FALSE 'Disable the Main form.
 End If


[LISTING TWO]
The file "mcarlo.for" with the Fortran DLL routines for Monte Carlo
integration.


 SUBROUTINE MCARLO(INDF,C1,C2,C3,C4,XVAL,YVAL,ISIN)
c
c Takes INDF, the index of the function to integrate, its
c coefficients (C1 to C4), and a random point (XVAL, YVAL) and
c determines if the random point was inside the
c function. If the point is inside the function, MCARLO sets ISIN
c to 1 or -1 according as the point is on the positive or
c negative side of the abscissa (the X coordinate axis).
 INTEGER INDF
 REAL*4 C1,C2,C3,C4
 REAL*4 XVAL, YVAL
 INTEGER ISIN
 REAL*4 TRUEPT
 EXTERNAL FUNC,F1,F2,F3,F4,F5

 ISIN = 0
 TRUEPT = FUNC(INDF,XVAL,C1,C2,C3,C4)
 IF ((TRUEPT .GE. 0) .AND. (YVAL .LE. TRUEPT)
 + .AND. (YVAL .GE. 0)) THEN
 ISIN = 1
 ELSE IF ((TRUEPT .LT. 0) .AND. (YVAL .GE. TRUEPT)
 + .AND. (YVAL .LE. 0)) THEN
 ISIN = -1
 END IF
 RETURN
 END

 REAL*4 FUNCTION FUNC(INDF, X,C1, C2, C3, C4)
c
c Calls one of the functions F1-F5, depending on the value of the
c index INDF.

 INTEGER INDF
 REAL*4 X, C1, C2,C3, C4
 SELECT CASE (INDF)
 CASE (1)
 FUNC = F1(X,C1,C2)
 CASE (2)
 FUNC = F2(X,C1,C2)
 CASE (3)
 FUNC = F3(X,C1,C2,C3)
 CASE (4)
 FUNC = F4(X,C1,C2,C3,C4)
 CASE (5)
 FUNC = F5(X,C1,C2,C3)
 CASE DEFAULT
 FUNC = -999
 END SELECT
 RETURN
 END

 REAL*4 FUNCTION F1(X,C1,C2)
c
c Linear function

 REAL*4 X, C1, C2
 F1 = (C1*X) + C2
 RETURN
 END

 REAL*4 FUNCTION F2(X,C1,C2)
c
c Square root function

 REAL*4 X, C1, C2
 F2 = C1*SQRT(X) + C2
 RETURN
 END

 REAL*4 FUNCTION F3(X,C1,C2,C3)
c
c Quadratic function

 REAL*4 X, C1, C2, C3

 F3 = C1*X**2 + C2*X + C3
 RETURN
 END

 REAL*4 FUNCTION F4(X,C1,C2,C3,C4)
c
c Third degree polynomial function

 REAL*4 X, C1, C2, C3, C4
 F4 = C1*X**3 + C2*X**2 + C3*X + C4
 RETURN
 END

 REAL*4 FUNCTION F5(X,C1,C2,C3)
c
c Trigonometric function

 REAL*4 X, C1, C2, C3
 F5 = C1*SIN(X) + C2*COS(X) + C3
 RETURN
 END

SUBROUTINE RANDOMXY(XS,XE,YS,YE,RANX, RANY)
c
c Given the end points of a rectangular region, returns a random point
c (RANX,RANY) within that region.

 REAL*4 XS,XE,YS,YE
 REAL*4 RVAL, RANX, RANY
 CALL RANDOM(RVAL)
 RANX = (XE - XS)*RVAL + XS
 CALL RANDOM(RVAL)
 RANY = (YE - YS)*RVAL + YS
 RETURN
 END



[LISTING THREE]
The Visual Basic declarations file "mcarlo.bas" created by VShell to access
the routines in the Fortran DLL "mcarlo.dll". This DLL was itself created by
VShell from the Fortran source code in "mcarlo.for" shown in Listing Two.



Declare Sub MCARLO Lib "c:\vb\mcarlo.dll" (Var1 As Long, Var2 As Single, Var3 As Single, Var4 As Single, Var5 As Single, Var6 As Single, Var7 As Single, Var8 As Long, Var9 As Long)
Declare Function FUNC Lib "c:\vb\mcarlo.dll" (Var1 As Long, Var2 As Single, Var3 As Single, Var4 As Single, Var5 As Single, Var6 As Single) As Single
Declare Function F1 Lib "c:\vb\mcarlo.dll" (Var1 As Single, Var2 As Single, Var3 As Single) As Single
Declare Function F2 Lib "c:\vb\mcarlo.dll" (Var1 As Single, Var2 As Single, Var3 As Single) As Single
Declare Function F3 Lib "c:\vb\mcarlo.dll" (Var1 As Single, Var2 As Single, Var3 As Single, Var4 As Single) As Single
Declare Function F4 Lib "c:\vb\mcarlo.dll" (Var1 As Single, Var2 As Single, Var3 As Single, Var4 As Single, Var5 As Single) As Single
Declare Function F5 Lib "c:\vb\mcarlo.dll" (Var1 As Single, Var2 As Single, Var3 As Single, Var4 As Single) As Single
Declare Sub RANDOMXY Lib "c:\vb\mcarlo.dll" (Var1 As Single, Var2 As Single, Var3 As Single, Var4 As Single, Var5 As Single, Var6 As Single)


[LISTING FOUR]
Visual Basic front end code that calls routines in "mcarlo.dll" and displays
the result of the Monte Carlo integration.


 NumIn = 0 'NumIn is the effective number of random points inside the
           'function. Points on the negative side have a weight of -1.
 For j% = 1 To Nr 'Nr is the number of random points used for the integration.


 Call RANDOMXY(Xstart,Xend,Ystart,Yend,Xvalue!,Yvalue!)
 'This DLL routine returns (Xvalue!,Yvalue!), a random point in the
 'rectangular region between (Xstart,Ystart) and (Xend,Yend).
 Call MCARLO(FnIndex,Coeff1,Coeff2,Coeff3,Coeff4,Xvalue!,Yvalue!,IsIn&)
 'The DLL routine to determine if the random point is inside the
 'function. The index of the function being integrated and its
 'coefficients are parameters.
 NumIn = NumIn + IsIn&
 Next j%

 TotalAreaLabel.Visible = TRUE
 IntegralLabel.Visible = TRUE
 Integral# = (NumIn / Nr) * (Xend - Xstart) * (Yend - Ystart)
 'The integral value estimated by Monte Carlo integration.
 TotalAreaLabel.Caption = "Sampled area: " + Str$((Xend - Xstart) * (Yend - Ystart))
 'Display the total area sampled for random points.
 IntegralLabel.Caption = "Value of integral (colored): " + Str$(Integral#)
 'Display the estimated integral value.














































June, 1993
EXAMINING MFC 2.0


This class library has all the elements you need to build a graphical periodic
table




Michael Yam


Michael is an independent consultant and has served New York's financial
district since 1984. He can be reached on CompuServe at 76367,3040.


The Microsoft Foundation Class library (MFC) provides a wrapper for most, but
not all, Windows API functions. The result is a condensed, object-oriented API
that's at least familiar to seasoned SDK programmers. You're in for a
surprise, however, when you start looking for the SDK's WinMain, WndProc, or a
message loop because MFC doesn't need the skeletal code required by the
Windows API. This could be disconcerting for experienced SDK programmers
because it suggests a loss of control. That's not the case, however.
MFC 1.0, originally introduced with Microsoft C/C++ 7.0, took a minimalist
approach to supporting the implementation of user-interface elements within
Windows. MFC 2.0, released with Visual C++, is more robust. While MFC 1.0
contained about 60 C++ classes, MFC 2.0 provides over 100. According to
Microsoft, any application written under earlier versions of MFC will run
virtually unmodified under 2.0--well, almost. I've found that the
functionality of some classes has been either merged or completely replaced.
Although the changes are minor, they can be "gotchas" when moving your code to
MFC 2.0. Furthermore, a redesign of your application may be required to take
full advantage of the more recent version of MFC.


A Look at PT


As a "Petzoldian" and a chemistry major of long ago (in my time, only 105
elements existed), I decided that writing a periodic table was a good way to
examine MFC. PT,
a periodic table like that in most chemistry textbooks, is a Windows
application I originally wrote using MFC 1.0. See Table 1(a) for a list of the
files that make up PT. PT uses a modeless dialog box as its main window. Doing
so takes advantage of the built-in functionality offered by dialog boxes, such
as allowing the user to move among controls with the Tab key. It's also easier
to get and set text in edit controls. And in terms of program size and memory
requirements, a dialog box as a main window is thriftier than a standard frame
window.
Table 1: (a) Files that make up PT; (b) data members in the CWinApp class; (c)
MFC global functions to access WinMain() variables.

 (a)

 PT.H
 PT.CPP
 PT.RC
 PT.DEF
 MAKEFILE
 DIALOGS.DLG
 RESOURCE.H
 PT.ICO

 (b)

 Data Members Description
 ---------------------------------------------------------------------

 m_pszAppName Specifies the name of the application.
 m_hInstance Corresponds to the hInstance parameter passed
 by Windows to WinMain().
 m_hPrevInstance Corresponds to the hPrevInstance parameter
 passed by Windows to WinMain().
 m_lpCmdLine Corresponds to the lpCmdLine parameter passed
 by Windows to WinMain().
 m_nCmdShow Corresponds to the nCmdShow parameter passed
 by Windows to WinMain().
 m_pMainWnd Holds a pointer to the application's main
 window.

 (c)

 Functions Description

 AfxGetApp Obtain a pointer to the CWinApp object.

 AfxGetInstanceHandle Obtain a handle to the current application
 instance.
 AfxGetResourceHandle Obtain a handle to the application's resources.
 AfxGetAppName Obtain a pointer to a string containing the
 application's name.

I arranged the display of the periodic table to resemble the map in your
chemistry class; see Figure 1. The idea is to point to the element of choice,
click the mouse, and have the program display edit fields containing the name,
symbol, atomic number, and atomic weight of the specified element. You can
also retrieve information by typing data into the appropriate edit field and
pressing Enter. As a convenience for those who can't remember the spelling of,
say, "Molybdenum" or recall its symbol, the name and symbol edit fields are
implemented as combo boxes so that data may be chosen from the list-box
portion.


Browsing MFC


Microsoft developed MFC to simplify the Windows API, enable object-oriented
(C++) techniques, and provide a degree of portability from 16- to 32-bit
versions of Windows. The more than 100 MFC classes can be divided into two
groups: general purpose and Windows specific. The former manages file
services, persistent objects, exception handling, strings, and collections.
The latter should appeal to SDK programmers because it supports GDI, MDI, OLE,
menus, dialogs, controls, and the like. Figure 2 shows the complete hierarchy.
The class CWinApp wraps all the functionality of WinMain() in class members;
see Table 1(b). An MFC program only needs to declare an instance of CWinApp
and the constructor takes care of the usual WinMain() responsibilities (class
registration and main message-loop processing). And just as an SDK program can
have only one WinMain(), an MFC program can only have one instance of CWinApp.
With 2.0 message maps, you write individual functions to handle each event
instead of using switch/case statements to process messages in Windows
procedures. A good comparison is MS Basic's ON KEY(n) GOSUB line statement; if
a function key or cursor key is pressed, then GOSUB line is executed. With
MFC, your individual function is executed if an event (like a function key,
button press, or paint message) is detected. A side benefit of the message map
is that it provides portability to Win32 by eliminating the need to decode
wParam and lParam.
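The dispatch idea behind a message map can be sketched in portable C++ as a
table of message IDs mapped to member-function pointers. This is only a
conceptual illustration (the names Dialog, Msg, and Handler are invented
here, and MFC's real macros generate static tables the framework walks for
you), not MFC's actual implementation:

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical message IDs standing in for WM_COMMAND values.
enum Msg { MSG_ABOUT = 1, MSG_CLOSE = 2, MSG_UNKNOWN = 99 };

class Dialog {
public:
    std::string Dispatch(Msg m) {
        // Look the message up in this class's "map"; unmatched
        // messages fall through to default processing.
        std::map<Msg, Handler>::const_iterator it = map_.find(m);
        return (it != map_.end()) ? (this->*(it->second))() : OnDefault();
    }
private:
    std::string OnAbout()   { return "about"; }
    std::string OnClose()   { return "close"; }
    std::string OnDefault() { return "default"; }
    typedef std::string (Dialog::*Handler)();
    // One entry per message, mirroring ON_COMMAND(id, handler) lines.
    std::map<Msg, Handler> map_{ { MSG_ABOUT, &Dialog::OnAbout },
                                 { MSG_CLOSE, &Dialog::OnClose } };
};
```

Because each handler is a named function rather than a case label, no
wParam/lParam decoding appears in user code at all.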
MFC 2.0 also supports OLE 1.0 (unlike MFC 1.0). Nine classes provide for
client and server applications, and integrate OLE into MFC's document
architecture and view classes. The client and server classes should not be
thought of as two different categories of OLE items, but as two different
interfaces to the same OLE item. Since these interfaces communicate through
OLE system DLLs, a client application should never directly call member
functions of a server class. Likewise, a server application should never
directly call member functions of a client class. Maintaining the integrity of
the OLE interprocess communication allows client applications to accept items
from any server; you won't need to write code to handle the specific contents
of an incoming item. This separation of client and server, now artificially
imposed by the framework, will be extended and integrated into the upcoming
OLE 2.0.
The server classes allow for the creation of full and mini-servers. A full
server can be a stand-alone application or an application that has been
launched by a client. A full server handles both embedded and linked objects,
can own documents and write them to disk, and typically has a multiple
document interface. A mini-server, on the other hand, is launched by a client
and handles only embedded objects. A mini-server does not own documents, but
rather accesses those of the client. Additionally, a mini-server typically has
a single-document interface. The MS-Draw and Graph components of Microsoft
Word for Windows are mini-servers.
Also new to MFC 2.0 is support for Visual Basic 1.0 custom controls or VBX
controls. VBX controls are stored in Windows DLLs and the class CVBControl
allows you to load the controls, and get and set their properties.
Compatibility is good, although some VBX features, such as drag-and-drop and
control arrays, are not supported. Naturally, any VBX control that relies on
the internal or undocumented features of Visual Basic isn't guaranteed to
work.


Dialog Box as the Main Window


As previously mentioned, PT uses a modeless dialog box as its main window.
From Microsoft's CDialog class, PT derives its own class, CPTDialog (as shown
in Listing One, page 132). To conserve space here, many of the functions for
the 109 elements are available electronically; see page 7. Using an object of
this class as a main window requires a further look at two components of
CWinApp: an overrideable member function, InitInstance(), and a public data
member, m_pMainWnd. InitInstance() creates a window object and m_pMainWnd
holds the pointer to that window object. Therefore, to make a dialog object
the main window, you assign a newly constructed CPTDialog to the main-window
pointer. From CWinApp, PT derives an application class, CTheApp, and inside
CTheApp::InitInstance(), places the statement m_pMainWnd = new CPTDialog().


Custom Icons


The code inside the constructor should reassure SDK programmers that MFC can
be overridden; besides creating the dialog object, CPTDialog() replaces the
default icon, a white box, with its own icon. MFC provides two mechanisms for
loading a custom icon; unfortunately, neither works for a dialog box. Still,
the techniques are worth knowing, so I'll mention them briefly. The first
mechanism is through a function, AfxRegisterWndClass(), which accepts the icon
as one of its arguments. Other arguments include the class style, the mouse
cursor, and the background. AfxRegisterWndClass() generates a class name to be
passed into a Create() member function, typically of the CWnd class. The
problem is that the Create() member function of the CDialog class doesn't
accept a class name.
The second mechanism involves overriding reserved MFC icon IDs defined as
AFX_IDI_STD_FRAME and AFX_IDI_STD_MDIFRAME (see Figure 3). But as the names of
the identifiers suggest, this works only for standard frame windows and MDI
frame windows, not for dialogs. Loading a custom icon for a dialog box wasn't
obvious from the documentation, possibly because a user doesn't usually need
to minimize a dialog box. The solution was to register my own PT dialog class,
and load my icon there.
Figure 3: Overriding icon IDs reserved by MFC.

 AFX_IDI_STD_FRAME ICON custom.ico
 AFX_IDI_STD_MDIFRAME ICON custom.ico

Inside CPTDialog, you'll find a private function member: RegisterPTClass()
(shown in Listing Two, page 132). It fills in the WNDCLASS structure and
resembles some of the initialization code found in WinMain(). I set the icon,
as well as the cursor, background, and menu, and designated the class name as
PTDLGCLASS. This class name is referenced by the dialog-box template described
in the resource file DIALOGS.DLG (available electronically; see page 7).
Notice that RegisterPTClass() uses "unwrapped" Windows functions such as
::LoadIcon(), ::LoadCursor(), and ::RegisterClass(). The C++ scope resolution
operator, ::, indicates that the name refers to the Windows function and not
to a class-member function. RegisterPTClass() also needs the application's
instance handle and calls an MFC global function AfxGetInstanceHandle() to
retrieve it. MFC global functions access the WinMain() variables, of which
there are four; see Table 1(c).


Mapping Messages


For simplicity, I've assigned each element its own button control. This totals
109 element buttons. (Don't forget that 255 controls per dialog is the limit
set by the Windows API.) Each button is entered into the message map and
associated with a function to display the element's data. To establish a
message map, I first included DECLARE_MESSAGE_MAP inside the CPTDialog class.
Only one declaration is allowed per class. The code fragment in Figure 4
belongs outside the class declaration and outside the scope of any function.
Figure 4: Declaring a message map.

 BEGIN_MESSAGE_MAP (CPTDialog, CDialog)
 ON_WM_CLOSE ()
 ON_COMMAND (IDM_ABOUT, OnAbout)
 ON_CBN_SETFOCUS (IDD_ELEMENTNAME, OnNameSetFocus)
 ON_COMMAND (H, OnH)
 // other messages
 END_MESSAGE_MAP()

Listing Two contains the map. Four categories of messages can be trapped:
WM_COMMAND messages generated by menu selections, WM_COMMAND messages
generated by keys, notification messages from child windows, and general
WM_messages such as WM_PAINT or WM_CLOSE. Messages to be trapped are placed
between the macros BEGIN_MESSAGE_MAP and END_MESSAGE_MAP(). To help direct the
flow of messages, BEGIN_MESSAGE_MAP requires two arguments: the derived class
and the base class. When a message is sent to a CPTDialog object, it is
compared against the CPTDialog message map. If an entry is found, the
associated function is executed. Otherwise, the search continues with the
message map in the parent class. As long as the message isn't matched, it
continues to flow up the class hierarchy until it reaches the CWnd class--the
mother of all window classes in MFC. Among its member functions is
DefWindowProc(). Just as in the SDK Windows procedure, this is where unmatched
messages go for default processing.
The first message in Figure 4 traps WM_CLOSE, overriding CWnd::OnClose(). This
is generally not necessary for a standard window frame, but is required when
using a dialog box. Recall that a modeless dialog box can only be closed with
DestroyWindow(). Yet when using MFC, DestroyWindow() by itself is not enough.
The function only destroys, or closes, the dialog window you see on the
screen; it does not destroy the dialog object. Any associated data structures
and resources remain in memory. MFC 1.0 provided an elegant solution to this
problem: code CPTDialog::OnClose() with the C++ statement delete this. This
not only destroys the CPTDialog object (freeing resources), but also
transparently calls DestroyWindow() to close the dialog window. Had I not
trapped WM_CLOSE, the message would have flowed up to CWnd and triggered the
destructor, ~CWnd. Since ~CWnd just calls DestroyWindow(), it would have left
the dialog object as orphaned memory. However, using this technique with MFC
2.0 is no longer recommended by Microsoft. Instead, you should call
DestroyWindow() and move delete this to an overridden PostNcDestroy() member
function (see Listing One).
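The recommended teardown order is easier to see with stub classes that mimic
the call sequence (StubDialog and the flags below are hypothetical stand-ins,
not MFC classes): OnClose() destroys only the window, and the C++ object
deletes itself last, in PostNcDestroy():

```cpp
#include <cassert>

// Flags record that the window went away before the object did.
static bool g_windowDestroyed = false;
static bool g_objectDeleted   = false;

struct StubDialog {
    void OnClose() { DestroyWindow(); }    // MFC 2.0 style: no delete here
    void DestroyWindow() {
        g_windowDestroyed = true;          // the on-screen window closes
        PostNcDestroy();                   // framework's final callback
    }
    void PostNcDestroy() { delete this; }  // object freed only now
    ~StubDialog() { g_objectDeleted = true; }
};
```

Deferring delete this to PostNcDestroy() guarantees the dialog object is
never freed while Windows still holds a live HWND for it.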
The second message in the map corresponds to a menu message;
CPTDialog::OnAbout() is executed whenever IDM_ABOUT is detected. OnAbout() in
turn opens a modal dialog box to display information about the periodic table.
A modal dialog box, About, is then constructed and accepts two parameters: the
dialog template name (see Listing Three, page 134) and a pointer that
identifies the current object, CPTDialog. DoModal() is the member function
that runs the dialog box.
The third message in the map is just one of four combo-box/edit-field
notification messages trapped by PT. By keeping track of which field has the
input focus, PT can determine what kind of information the user wants to
search on: the element name, symbol, atomic number, or atomic weight. The
search is performed by CPTDialog::GetPTData() in Listing Two.
The fourth message starts a long list of element IDs and their associated
display functions. I've identified the dialog button ID as H, instead of IDD_H
(per Microsoft naming convention). In the context of PT the IDD_ prefix seemed
redundant and would not enhance code readability. Also, it would have been
natural to assign each element's button ID to its respective atomic number,
starting with one for Hydrogen and ending with 109 for Une, but I couldn't.
Microsoft reserves ID numbers below 100 (for example, IDOK and IDCANCEL are
one and two, respectively), so I defined the element-button IDs as "atomic
number + 100."

PT data is stored in memory as an array of structures, as shown in Figure 5.
The "+1" in the fields indicates they are null terminated. Setting the 0th
element to Null, I filled the array by atomic number, starting with
information for Hydrogen as _atom[1] and ending with Une as _atom[109].
CPTDialog::Display() accepts the element button ID as its argument and
displays the information corresponding to _atom[elementid - 100].
Figure 5: Structure used to store PT data.

 struct
 {
 unsigned char number;
 char symbol [3+1];
 char element [12+1];
 char weight [9+1];
 }_atom [110];
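The offset lookup that Display() performs can be sketched portably; the
condensed Atom table and Lookup() function below are illustrative stand-ins
reproducing only the first few rows of the real 110-entry array:

```cpp
#include <cassert>
#include <cstring>

// Button IDs are atomic number + 100, so the lookup subtracts 100
// before indexing the array (entry 0 is the null element).
struct Atom { unsigned char number; const char *symbol; const char *element; };

static const Atom atom[] = {
    { 0, "",   ""         },
    { 1, "H",  "Hydrogen" },
    { 2, "He", "Helium"   },
    { 3, "Li", "Lithium"  },
};

const Atom &Lookup(int buttonId) {
    return atom[buttonId - 100];  // undo the "+100" that dodges IDOK/IDCANCEL
}
```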



Conclusion


I've covered only a small, but fundamental, part of MFC. If you've had the
patience and curiosity to pursue the SDK, you won't have a problem coping with
MFC. The most difficult aspects were accepting that WinMain() and Windows
procedures were gone, and trusting that significant control hadn't been lost.
Also, locating MFC member functions which paralleled the SDK functions was
difficult because, while the SDK Programmer's Reference laid out functions in
alphabetical order (flat), the MFC Class Libraries Reference stored member
functions in class descriptions (hierarchical); MFC is, after all, object
oriented.
Conventional wisdom states that to learn Windows programming and C++, you
should study one, then the other, but not both at the same time. The combined
concepts, from the Windows API to the syntax of C++ to object-oriented
programming, would overwhelm most people. But because MFC and C++ go hand in
hand, it now makes good sense to learn and take advantage of both.


References


Chiverton, Bob. "C/C++ Questions & Answers." Microsoft Systems Journal
(October, 1992).
Microsoft C/C++ Class Libraries Reference. Microsoft Corp., 1991.
"Reference Tables for Chemistry." New York State Department of Education.

_EXAMINING MFC 2.0_
by Michael Yam


[LISTING ONE]

//----- PT.H - Declares class interfaces for Periodic Table -----
#ifndef __PT_H__
#define __PT_H__

#define PT_MAXELEMENTS 109 // Hydrogen (1) through Une (109)

class CPTDialog : public CDialog
{
private:
 static BOOL bRegistered;
 static BOOL RegisterPTClass();
 void Display (int iAtomicNumber);
 void GetPTData (int nID);
 char CB_Focus[4];
public:
 CPTDialog();
 void OnClose()
 {
 DestroyWindow();
 }
 virtual void PostNcDestroy()
 {
 delete this;
 }
 void OnAbout();
 void OnHelp();
 void OnOK();
 void OnCancel();

 void OnNameSetFocus();

 void OnSymbolSetFocus();
 void OnNumberSetFocus();
 void OnWeightSetFocus();

 void OnH()
 {
 Display (H);
 }
 void OnHe()
 {
 Display (He);
 }
 void OnLi()
 {
 Display (Li);
 }
 void OnBe()
 {
 Display (Be);
 }
 void OnB()
 {
 Display (B);
 }
 void OnC()
 {
 Display (C);
 }
 void OnN()
 {
 Display (N);
 }
 void OnO()
 {
 Display (O);
 }
 void OnF()
 {
 Display (F);
 }
 void OnNe()
 {
 Display (Ne);
 }
 void OnNa()
 {
 Display (Na);
 }
 void OnMg()
 {
 Display (Mg);
 }
 void OnAl()
 {
 Display (Al);
 }
 void OnSi()
 {
 Display (Si);

 }
 void OnP()
 {
 Display (P);
 }
 void OnS()
 {
 Display (S);
 }
 void OnCl()
 {
 Display (Cl);
 }
 void OnAr()
 {
 Display (Ar);
 }
 void OnK()
 {
 Display (K);
 }
 void OnCa()
 {
 Display (Ca);
 }
 void OnSc()
 {
 Display (Sc);
 }
 void OnTi()
 {
 Display (Ti);
 }
 void OnV()
 {
 Display (V);
 }
 void OnCr()
 {
 Display (Cr);
 }
 void OnMn()
 {
 Display (Mn);
 }
 void OnFe()
 {
 Display (Fe);
 }
 void OnCo()
 {
 Display (Co);
 }
 void OnNi()
 {
 Display (Ni);
 }
 void OnCu()
 {

 Display (Cu);
 }
 void OnZn()
 {
 Display (Zn);
 }
 void OnGa()
 {
 Display (Ga);
 }
 void OnGe()
 {
 Display (Ge);
 }
 void OnAs()
 {
 Display (As);
 }
 void OnSe()
 {
 Display (Se);
 }
 void OnBr()
 {
 Display (Br);
 }
 void OnKr()
 {
 Display (Kr);
 }
 void OnRb()
 {
 Display (Rb);
 }
 void OnSr()
 {
 Display (Sr);
 }
 void OnY()
 {
 Display (Y);
 }
 void OnZr()
 {
 Display (Zr);
 }
 void OnNb()
 {
 Display (Nb);
 }
 void OnMo()
 {
 Display (Mo);
 }
 void OnTc()
 {
 Display (Tc);
 }
 void OnRu()

 {
 Display (Ru);
 }
 void OnRh()
 {
 Display (Rh);
 }
 void OnPd()
 {
 Display (Pd);
 }
 void OnAg()
 {
 Display (Ag);
 }
 void OnCd()
 {
 Display (Cd);
 }
 void OnIn()
 {
 Display (In);
 }
 void OnSn()
 {
 Display (Sn);
 }
 void OnSb()
 {
 Display (Sb);
 }
 void OnTe()
 {
 Display (Te);
 }
 void OnI()
 {
 Display (I);
 }
 void OnXe()
 {
 Display (Xe);
 }
 void OnCs()
 {
 Display (Cs);
 }
 void OnBa()
 {
 Display (Ba);
 }
 void OnLa()
 {
 Display (La);
 }
 void OnCe()
 {
 Display (Ce);
 }

 void OnPr()
 {
 Display (Pr);
 }
 void OnNd()
 {
 Display (Nd);
 }
 void OnPm()
 {
 Display (Pm);
 }
 void OnSm()
 {
 Display (Sm);
 }
 void OnEu()
 {
 Display (Eu);
 }
 void OnGd()
 {
 Display (Gd);
 }
 void OnTb()
 {
 Display (Tb);
 }
 void OnDy()
 {
 Display (Dy);
 }
 void OnHo()
 {
 Display (Ho);
 }
 void OnEr()
 {
 Display (Er);
 }
 void OnTm()
 {
 Display (Tm);
 }
 void OnYb()
 {
 Display (Yb);
 }
 void OnLu()
 {
 Display (Lu);
 }
 void OnHf()
 {
 Display (Hf);
 }
 void OnTa()
 {
 Display (Ta);

 }
 void OnW()
 {
 Display (W);
 }
 void OnRe()
 {
 Display (Re);
 }
 void OnOs()
 {
 Display (Os);
 }
 void OnIr()
 {
 Display (Ir);
 }
 void OnPt()
 {
 Display (Pt);
 }
 void OnAu()
 {
 Display (Au);
 }
 void OnHg()
 {
 Display (Hg);
 }
 void OnTl()
 {
 Display (Tl);
 }
 void OnPb()
 {
 Display (Pb);
 }
 void OnBi()
 {
 Display (Bi);
 }
 void OnPo()
 {
 Display (Po);
 }
 void OnAt()
 {
 Display (At);
 }
 void OnRn()
 {
 Display (Rn);
 }
 void OnFr()
 {
 Display (Fr);
 }
 void OnRa()
 {

 Display (Ra);
 }
 void OnAc()
 {
 Display (Ac);
 }
 void OnTh()
 {
 Display (Th);
 }
 void OnPa()
 {
 Display (Pa);
 }
 void OnU()
 {
 Display (U);
 }
 void OnNp()
 {
 Display (Np);
 }
 void OnPu()
 {
 Display (Pu);
 }
 void OnAm()
 {
 Display (Am);
 }
 void OnCm()
 {
 Display (Cm);
 }
 void OnBk()
 {
 Display (Bk);
 }
 void OnCf()
 {
 Display (Cf);
 }
 void OnEs()
 {
 Display (Es);
 }
 void OnFm()
 {
 Display (Fm);
 }
 void OnMd()
 {
 Display (Md);
 }
 void OnNo()
 {
 Display (No);
 }
 void OnLr()

 {
 Display (Lr);
 }
 void OnUnq()
 {
 Display (Unq);
 }
 void OnUnp()
 {
 Display (Unp);
 }
 void OnUnh()
 {
 Display (Unh);
 }
 void OnUns()
 {
 Display (Uns);
 }
 void OnUno()
 {
 Display (Uno);
 }
 void OnUne()
 {
 Display (Une);
 }
 DECLARE_MESSAGE_MAP()
};
class CTheApp : public CWinApp
{
public:
 BOOL InitInstance();
};
// Data stored in memory as an array of structures: _atom[]. Weights in
// parens correspond to atoms of most stable isotope. Data retrieved from
// "Reference Tables for Chemistry," SUNY, State Education Dept., Albany,
// NY 12234.
struct
{
 unsigned char number; // atomic number
 char symbol[3+1]; // three char symbol plus null
 char element[12+1]; // full name plus null
 char weight[9+1]; // atomic weight
}_atom[] = {
 0, "", "", "",
 1, "H" , "Hydrogen", "1.0079",
 2, "He", "Helium", "4.00260",
 3, "Li", "Lithium", "6.941",
 4, "Be", "Beryllium", "9.01218",
 5, "B" , "Boron", "10.81",
 6, "C" , "Carbon", "12.011",
 7, "N" , "Nitrogen", "14.0067",
 8, "O" , "Oxygen", "15.9994",
 9, "F" , "Fluorine", "18.998403",
 10, "Ne", "Neon", "20.179",
 11, "Na", "Sodium", "22.98977",
 12, "Mg", "Magnesium", "24.305",
 13, "Al", "Aluminum", "26.98154",
 14, "Si", "Silicon", "28.0855",

 15, "P" , "Phosphorus", "30.97376",
 16, "S" , "Sulfur", "32.06",
 17, "Cl", "Chlorine", "35.453",
 18, "Ar", "Argon", "39.948",
 19, "K" , "Potassium", "39.0983",
 20, "Ca", "Calcium", "40.08",
 21, "Sc", "Scandium", "44.9559",
 22, "Ti", "Titanium", "47.90",
 23, "V" , "Vanadium", "50.9414",
 24, "Cr", "Chromium", "51.996",
 25, "Mn", "Manganese", "54.9830",
 26, "Fe", "Iron", "55.847",
 27, "Co", "Cobalt", "58.9332",
 28, "Ni", "Nickel", "58.70",
 29, "Cu", "Copper", "63.546",
 30, "Zn", "Zinc", "65.38",
 31, "Ga", "Gallium", "69.72",
 32, "Ge", "Germanium", "72.59",
 33, "As", "Arsenic", "74.9216",
 34, "Se", "Selenium", "78.96",
 35, "Br", "Bromine", "79.904",
 36, "Kr", "Krypton", "83.80",
 37, "Rb", "Rubidium", "85.4678",
 38, "Sr", "Strontium", "87.62",
 39, "Y" , "Yttrium", "88.9059",
 40, "Zr", "Zirconium", "91.22",
 41, "Nb", "Niobium", "92.9064",
 42, "Mo", "Molybdenum", "95.94",
 43, "Tc", "Technetium", "(97)",
 44, "Ru", "Ruthenium", "101.07",
 45, "Rh", "Rhodium", "102.9055",
 46, "Pd", "Palladium", "106.4",
 47, "Ag", "Silver", "107.868",
 48, "Cd", "Cadmium", "112.41",
 49, "In", "Indium", "114.82",
 50, "Sn", "Tin", "118.69",
 51, "Sb", "Antimony", "121.75",
 52, "Te", "Tellurium", "127.60",
 53, "I" , "Iodine", "126.9045",
 54, "Xe", "Xenon", "131.30",
 55, "Cs", "Cesium", "132.9054",
 56, "Ba", "Barium", "137.33",
 57, "La", "Lanthanum", "138.9055",
 58, "Ce", "Cerium", "140.12",
 59, "Pr", "Praseodymium", "140.9077",
 60, "Nd", "Neodymium", "144.24",
 61, "Pm", "Promethium", "(145)",
 62, "Sm", "Samarium", "150.4",
 63, "Eu", "Europium", "151.96",
 64, "Gd", "Gadolinium", "157.25",
 65, "Tb", "Terbium", "158.9254",
 66, "Dy", "Dysprosium", "162.50",
 67, "Ho", "Holmium", "164.9304",
 68, "Er", "Erbium", "167.26",
 69, "Tm", "Thulium", "168.9342",
 70, "Yb", "Ytterbium", "173.04",
 71, "Lu", "Lutetium", "174.97",
 72, "Hf", "Hafnium", "178.49",
 73, "Ta", "Tantalum", "180.9479",

 74, "W" , "Tungsten", "183.85",
 75, "Re", "Rhenium", "186.207",
 76, "Os", "Osmium", "190.2",
 77, "Ir", "Iridium", "192.22",
 78, "Pt", "Platinum", "195.09",
 79, "Au", "Gold", "196.9665",
 80, "Hg", "Mercury", "200.59",
 81, "Tl", "Thallium", "204.37",
 82, "Pb", "Lead", "207.2",
 83, "Bi", "Bismuth", "208.9804",
 84, "Po", "Polonium", "(209)",
 85, "At", "Astatine", "(210)",
 86, "Rn", "Radon", "(222)",
 87, "Fr", "Francium", "(223)",
 88, "Ra", "Radium", "226.0254",
 89, "Ac", "Actinium", "(227)",
 90, "Th", "Thorium", "232.0381",
 91, "Pa", "Protactinium", "231.0359",
 92, "U" , "Uranium", "238.029",
 93, "Np", "Neptunium", "237.0482",
 94, "Pu", "Plutonium", "(244)",
 95, "Am", "Americium", "(243)",
 96, "Cm", "Curium", "(247)",
 97, "Bk", "Berkelium", "(247)",
 98, "Cf", "Californium", "(251)",
 99, "Es", "Einsteinium", "(254)",
 100, "Fm", "Fermium", "(257)",
 101, "Md", "Mendelevium", "(258)",
 102, "No", "Nobelium", "(255)",
 103, "Lr", "Lawrencium", "(260)",
 104, "Unq","Unq", "(261)",
 105, "Unp","Unp", "(262)",
 106, "Unh","Unh", "(263)",
 107, "Uns","Uns", "(262)",
 108, "Uno","Uno", "?.?",
 109, "Une","Une", "?.?",
};
#endif // __PT_H__






[LISTING TWO]

//----- PT.CPP - Periodic Table for Windows -------
#include <afxwin.h>
#include <stdlib.h>
#include <math.h>
#include <string.h>

#include "resource.h"
#include "pt.h"

BOOL CPTDialog::bRegistered = FALSE;

//------ CPTDialog -- Registers our class if necessary, constructs the
// dialog object, and populates the combo boxes.

CPTDialog::CPTDialog()
{
 int i;
 if (!bRegistered)
 bRegistered = RegisterPTClass();
 Create ("pt");
 for (i=1; i<=PT_MAXELEMENTS; ++i)
 {
 CWnd::SendDlgItemMessage (IDD_ELEMENTNAME, CB_ADDSTRING,
 0, (LONG)(LPSTR)_atom[i].element);
 CWnd::SendDlgItemMessage (IDD_ELEMENTSYMBOL, CB_ADDSTRING,
 0, (LONG)(LPSTR)_atom[i].symbol);
 }
 // No combo-box has focus.
 memset (CB_Focus, 0, sizeof (CB_Focus));
}
//----- RegisterPTClass -- Register PT dialog class and replace the default
// white box icon with our own.
// Returns: nonzero if class is registered. 0 if class is not registered.
BOOL CPTDialog::RegisterPTClass()
{
 WNDCLASS wndclass;
 wndclass.style = 0;
 wndclass.lpfnWndProc = DefDlgProc; /* use default dialog proc */
 wndclass.cbClsExtra = 0;
 // This field MUST be set to DLGWINDOWEXTRA, or the class we're
 // registering won't work properly with our dialog boxes.
 wndclass.cbWndExtra = DLGWINDOWEXTRA ;
 // Use MFC global function to retrieve app's instance.
 wndclass.hInstance = AfxGetInstanceHandle();
 // Load custom icon, cursor, set background, load menu
 wndclass.hIcon = ::LoadIcon (AfxGetInstanceHandle(), "PTICON");
 wndclass.hCursor = ::LoadCursor (NULL, IDC_ARROW) ;
 wndclass.hbrBackground = COLOR_WINDOW + 1 ;
 wndclass.lpszMenuName = "PTMenu";
 // Need unique name here. Name must be used in dialog box
 // template in DIALOGS.DLG to force dialog box to use this class.
 wndclass.lpszClassName = "PTDLGCLASS";
 return ::RegisterClass(&wndclass);
}
//--- GetPTData -- Retrieve and display data based on location of input focus:
// name, symbol, atomic number, or atomic weight. Accepts control ID as input.
void CPTDialog::GetPTData (int nID)
{
 char buffer[20];
 char *pWeight;
 double fBuffer;
 int i, bytes;
 bytes = CWnd::GetDlgItemText (nID, (LPSTR)buffer, sizeof(buffer));
 if (bytes > 0)
 {
 switch (nID)
 {
 case IDD_ELEMENTNAME:
 for (i=1; i<=PT_MAXELEMENTS; ++i)
 {
 if (_stricmp (buffer, _atom[i].element) == 0)
 break;

 }
 break;
 case IDD_ELEMENTSYMBOL:
 for (i=1; i<=PT_MAXELEMENTS; ++i)
 {
 if (_stricmp (buffer, _atom[i].symbol) == 0)
 break;
 }
 break;
 case IDD_ATOMICNUMBER:
 i = atoi (buffer);
 break;
 case IDD_ATOMICWEIGHT:
 // Atomic weights stored as strings. Convert
 // to floating point to do comparisons.
 fBuffer = atof (buffer);
 for (i=1; i<=PT_MAXELEMENTS; ++i)
 {
 // Some weights are in parens.
 // Skip over them if detected.
 if (_atom[i].weight[0] != '(')
 pWeight = _atom[i].weight;
 else
 pWeight = &_atom[i].weight[1];

 if (atof (pWeight) >= fBuffer)
 break;
 }
 break;
 default:
 i = 0;
 break;
 }
 // i contains the index into the array of structures: _atom[i].xxxx
 if (i <= PT_MAXELEMENTS && i > 0)
 Display (i+100);
 else
 MessageBox ("Selected element not found in periodic table.",
 "Error",
 MB_OK | MB_ICONEXCLAMATION);
 }
}
// OnNameSetFocus -- OnSymbolSetFocus -- OnNumberSetFocus -- OnWeightSetFocus
// These functions keep track of which field has input focus. Fields are
// mapped to array CB_Focus[]. If the user clicks on an edit field or types
// in any data, blank out any existing data with Display (100).
void CPTDialog::OnNameSetFocus()
{
 Display (100);
 memset (CB_Focus, 0, sizeof(CB_Focus));
 CB_Focus[0] = 1;
}
void CPTDialog::OnSymbolSetFocus()
{
 Display (100);
 memset (CB_Focus, 0, sizeof(CB_Focus));
 CB_Focus[1] = 1;
}
void CPTDialog::OnNumberSetFocus()

{
 Display (100);
 memset (CB_Focus, 0, sizeof(CB_Focus));
 CB_Focus[2] = 1;
}
void CPTDialog::OnWeightSetFocus()
{
 Display (100);
 memset (CB_Focus, 0, sizeof(CB_Focus));
 CB_Focus[3] = 1;
}
// OnCancel -- User pressed the "Cancel" button. Terminate program.
void CPTDialog::OnCancel()
{
 delete this;
}
// OnOK -- User pressed the "OK" button. Retrieve element info
// based on the edit box which has the input focus.
void CPTDialog::OnOK()
{
 // CB_Focus tracks which edit box user entered data. GetPTData
 // searches for that data.
 if (CB_Focus[0])
 GetPTData (IDD_ELEMENTNAME);
 else if (CB_Focus[1])
 GetPTData (IDD_ELEMENTSYMBOL);
 else if (CB_Focus[2])
 GetPTData (IDD_ATOMICNUMBER);
 else
 GetPTData (IDD_ATOMICWEIGHT);
}
// OnAbout -- User selected "About" from menu. Open a modal dialog
// box and tell user about this program.
void CPTDialog::OnAbout()
{
 CModalDialog about( "AboutBox", this );
 about.DoModal();
}
// OnHelp -- User selected help from the menu. Open a modal dialog
// box and display help info.
void CPTDialog::OnHelp()
{
 CModalDialog version ("HelpBox", this);
 version.DoModal();
}
// Display -- Display element info in the edit boxes using "SetDlgItemText"
void CPTDialog::Display(int iAtomicNumber)
{
 char szNumber[4];
 // Atomic numbers range from 1-109; to avoid conflict with IDOK and IDCANCEL
 // (1 & 2), 100 is added to atomic numbers. Adjust before indexing array.
 iAtomicNumber -= 100;
 // Convert atomic number from numeric to string.
 if (iAtomicNumber <= 0) /* too small */
 memset (szNumber, 0, sizeof (szNumber));
 else if (iAtomicNumber > PT_MAXELEMENTS) /* too big */
 sprintf (szNumber, "%3d", PT_MAXELEMENTS);
 else /* juuust right */
 sprintf (szNumber, "%3d", _atom [iAtomicNumber].number);

 CWnd::SetDlgItemText (IDD_ELEMENTNAME, _atom [iAtomicNumber].element);
 CWnd::SetDlgItemText (IDD_ELEMENTSYMBOL, _atom [iAtomicNumber].symbol);
 CWnd::SetDlgItemText (IDD_ATOMICNUMBER, szNumber);
 CWnd::SetDlgItemText (IDD_ATOMICWEIGHT, _atom [iAtomicNumber].weight);
}
// MESSAGE MAP
BEGIN_MESSAGE_MAP (CPTDialog, CDialog)
 ON_WM_CLOSE ()
 ON_COMMAND (IDM_ABOUT, OnAbout)
 ON_COMMAND (IDM_HELP, OnHelp)
 ON_COMMAND (IDCANCEL, OnCancel)
 ON_COMMAND (IDOK, OnOK)
 ON_CBN_SETFOCUS (IDD_ELEMENTNAME, OnNameSetFocus)
 ON_CBN_SETFOCUS (IDD_ELEMENTSYMBOL, OnSymbolSetFocus)
 ON_EN_SETFOCUS (IDD_ATOMICNUMBER, OnNumberSetFocus)
 ON_EN_SETFOCUS (IDD_ATOMICWEIGHT, OnWeightSetFocus)
 ON_COMMAND (H, OnH)
 ON_COMMAND (He, OnHe)
 ON_COMMAND (Li, OnLi)
 ON_COMMAND (Be, OnBe)
 ON_COMMAND (B, OnB)
 ON_COMMAND (C, OnC)
 ON_COMMAND (N, OnN)
 ON_COMMAND (O, OnO)
 ON_COMMAND (F, OnF)
 ON_COMMAND (Ne,OnNe)
 ON_COMMAND (Na, OnNa)
 ON_COMMAND (Mg, OnMg)
 ON_COMMAND (Al, OnAl)
 ON_COMMAND (Si, OnSi)
 ON_COMMAND (P, OnP)
 ON_COMMAND (S, OnS)
 ON_COMMAND (Cl, OnCl)
 ON_COMMAND (Ar, OnAr)
 ON_COMMAND (K, OnK)
 ON_COMMAND (Ca, OnCa)
 ON_COMMAND (Sc, OnSc)
 ON_COMMAND (Ti, OnTi)
 ON_COMMAND (V, OnV)
 ON_COMMAND (Cr, OnCr)
 ON_COMMAND (Mn, OnMn)
 ON_COMMAND (Fe, OnFe)
 ON_COMMAND (Co, OnCo)
 ON_COMMAND (Ni, OnNi)
 ON_COMMAND (Cu, OnCu)
 ON_COMMAND (Zn, OnZn)
 ON_COMMAND (Ga, OnGa)
 ON_COMMAND (Ge, OnGe)
 ON_COMMAND (As, OnAs)
 ON_COMMAND (Se, OnSe)
 ON_COMMAND (Br, OnBr)
 ON_COMMAND (Kr, OnKr)
 ON_COMMAND (Rb, OnRb)
 ON_COMMAND (Sr, OnSr)
 ON_COMMAND (Y, OnY)
 ON_COMMAND (Zr, OnZr)
 ON_COMMAND (Nb, OnNb)
 ON_COMMAND (Mo, OnMo)
 ON_COMMAND (Tc, OnTc)

 ON_COMMAND (Ru, OnRu)
 ON_COMMAND (Rh, OnRh)
 ON_COMMAND (Pd, OnPd)
 ON_COMMAND (Ag, OnAg)
 ON_COMMAND (Cd, OnCd)
 ON_COMMAND (In, OnIn)
 ON_COMMAND (Sn, OnSn)
 ON_COMMAND (Sb, OnSb)
 ON_COMMAND (Te, OnTe)
 ON_COMMAND (I, OnI)
 ON_COMMAND (Xe, OnXe)
 ON_COMMAND (Cs, OnCs)
 ON_COMMAND (Ba, OnBa)
 ON_COMMAND (La, OnLa)
 ON_COMMAND (Ce, OnCe)
 ON_COMMAND (Pr, OnPr)
 ON_COMMAND (Nd, OnNd)
 ON_COMMAND (Pm, OnPm)
 ON_COMMAND (Sm, OnSm)
 ON_COMMAND (Eu, OnEu)
 ON_COMMAND (Gd, OnGd)
 ON_COMMAND (Tb, OnTb)
 ON_COMMAND (Dy, OnDy)
 ON_COMMAND (Ho, OnHo)
 ON_COMMAND (Er, OnEr)
 ON_COMMAND (Tm, OnTm)
 ON_COMMAND (Yb, OnYb)
 ON_COMMAND (Lu, OnLu)
 ON_COMMAND (Hf, OnHf)
 ON_COMMAND (Ta, OnTa)
 ON_COMMAND (W, OnW)
 ON_COMMAND (Re, OnRe)
 ON_COMMAND (Os, OnOs)
 ON_COMMAND (Ir, OnIr)
 ON_COMMAND (Pt, OnPt)
 ON_COMMAND (Au, OnAu)
 ON_COMMAND (Hg, OnHg)
 ON_COMMAND (Tl, OnTl)
 ON_COMMAND (Pb, OnPb)
 ON_COMMAND (Bi, OnBi)
 ON_COMMAND (Po, OnPo)
 ON_COMMAND (At, OnAt)
 ON_COMMAND (Rn, OnRn)
 ON_COMMAND (Fr, OnFr)
 ON_COMMAND (Ra, OnRa)
 ON_COMMAND (Ac, OnAc)
 ON_COMMAND (Th, OnTh)
 ON_COMMAND (Pa, OnPa)
 ON_COMMAND (U, OnU)
 ON_COMMAND (Np, OnNp)
 ON_COMMAND (Pu, OnPu)
 ON_COMMAND (Am, OnAm)
 ON_COMMAND (Cm, OnCm)
 ON_COMMAND (Bk, OnBk)
 ON_COMMAND (Cf, OnCf)
 ON_COMMAND (Es, OnEs)
 ON_COMMAND (Fm, OnFm)
 ON_COMMAND (Md, OnMd)
 ON_COMMAND (No, OnNo)
 ON_COMMAND (Lr, OnLr)
 ON_COMMAND (Unq, OnUnq)
 ON_COMMAND (Unp, OnUnp)
 ON_COMMAND (Unh, OnUnh)
 ON_COMMAND (Uns, OnUns)
 ON_COMMAND (Uno, OnUno)
 ON_COMMAND (Une, OnUne)
END_MESSAGE_MAP()

// Create the application object
CTheApp theApp;
// InitInstance -- Make CPTDialog object the main window by assigning
// m_pMainWnd to constructor. Returns: TRUE
BOOL CTheApp::InitInstance()
{
 m_pMainWnd = new CPTDialog();
 m_pMainWnd->ShowWindow( m_nCmdShow );
 m_pMainWnd->UpdateWindow();

 return TRUE;
}





[LISTING THREE]

//----- PT.RC - Resources for Periodic Table -----
#include <windows.h>
#include <afxres.h>
#include "resource.h"

PTICON ICON pt.ico
PTMenu MENU
{
 POPUP "&Help"
 {
 MENUITEM "&General Information", IDM_HELP
 MENUITEM "&About Periodic Table", IDM_ABOUT
 }
}
rcinclude dialogs.dlg // file generated by MS Dialog Editor
//------- End of PT.RC -------






Figure 3: Overriding icon IDs that have been reserved by MFC

AFX_IDI_STD_FRAME ICON custom.ico
AFX_IDI_STD_MDIFRAME ICON custom.ico



Figure 4: Declaring a message map


BEGIN_MESSAGE_MAP (CPTDialog, CDialog)
 ON_WM_CLOSE ()
 ON_COMMAND (IDM_ABOUT, OnAbout)
 ON_CBN_SETFOCUS (IDD_ELEMENTNAME, OnNameSetFocus)
 ON_COMMAND (H, OnH)
 // other messages
END_MESSAGE_MAP()



Figure 5: Structure used to store PT data

struct
{
 unsigned char number;
 char Symbol[3+1];
 char element[12+1];
 char weight[9+1];
}_atom[110];
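As a quick sketch of how a table like the one in Figure 5 might be used, the fragment below initializes a few entries and looks one up by symbol. The data values and the `find_by_symbol` helper are illustrative, not part of the article's listing:

```c
#include <assert.h>
#include <string.h>

/* Mirror of the Figure 5 structure; the entries here are sample data,
   not the program's actual periodic-table contents. */
struct {
    unsigned char number;
    char symbol[3 + 1];
    char element[12 + 1];
    char weight[9 + 1];
} atom[] = {
    { 1, "H",  "Hydrogen", "1.0079"  },
    { 2, "He", "Helium",   "4.00260" },
    { 3, "Li", "Lithium",  "6.941"   },
};

/* Linear search by element symbol; returns the index or -1. */
int find_by_symbol(const char *sym)
{
    int i;
    for (i = 0; i < (int)(sizeof atom / sizeof atom[0]); i++)
        if (strcmp(atom[i].symbol, sym) == 0)
            return i;
    return -1;
}
```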






June, 1993
A CROSS-PLATFORM PLUG-IN TOOLKIT


Creating dynamically extendable applications


 This article contains the following executables: XPIN.ARC XPIN.SIT


Ramin Firoozye


Ramin heads rp&s Inc. in San Francisco, California and can be reached on the
Internet at rpa@netcom.com or at 70751,252 on CompuServe.


Plug-ins, which have been popularized in recent years on the Macintosh by
programs like Adobe Photoshop, may well represent the ultimate embodiment of
modular software--they allow you to update or add functionality without the
need to patch your application. This is possible because a plug-in, much like
a resource, is loaded by the host application at run time. Photoshop uses this
approach to provide additional support for special effects and import/export
filters.
The plug-in concept is not specific to the Macintosh. But as useful as
plug-ins are, they introduce platform-specific elements into your application.
Because my work often takes me from the Macintosh to the PC, I developed the
Cross-Platform Plug-in Toolkit (XPIN) presented in this article. The toolkit
consists of two parts: a caller API and a plug-in skeleton. The caller API
comprises a set of routines that make an application plug-in aware. The
plug-in skeleton is a set of declarations that can be used to build anything
from simple to fully functional plug-ins. The XPIN toolkit takes care of all the
underlying bookkeeping necessary for looking up, managing, and invoking
plug-ins. The complete source code for XPIN is available electronically; see
"Availability," page 7.
The toolkit has been tested against a number of C compilers on both
the Windows (Borland and Microsoft) and Macintosh platforms (Think and MPW C),
and ports to Windows NT and SunOS are under way.


Plugging In


A plug-in is a small, self-contained program that cannot be invoked by itself.
A plug-in aware program (or caller) acts as a host for a plug-in. At run time,
the caller looks for all plug-in files in a given directory and registers
their presence. The caller can then invoke a plug-in at any time during
execution. However, the caller has no idea what it may encounter at run time.
The only predefined constants are the directory to search and the file type to
locate. When found, the plug-ins become integral parts of the main program.
Since the act of locating the plug-ins occurs at run time, no special registry
or patching mechanism is required: Just copy the plug-in file into the
directory and restart the program.
I had two general goals in creating the XPIN toolkit. The first was
source-level compatibility across multiple platforms. The caller API would be
identical across Macintosh and Windows, as would the plug-in skeleton. You
take the same plug-in skeleton from one system to another and recompile it
without any changes. The contents of the plug-in, however, are as portable as
you make them. The toolkit provides some compile-time portability mechanisms
to help isolate system dependencies. The second goal was simplicity in the
interface. The caller API consists of six C-language functions that look and
behave identically across all platforms.
As an example, Figure 1 shows what the menu bar for a sample caller program
looks like when executed. Everything up to this point looks incredibly
ordinary! Figure 2 shows the same program. This time, however, two plug-in
files have been copied into the application's directory. Note that the
application has detected the presence of the plug-ins, obtained their labels
(an internal value that has nothing to do with the filename), and added them
to the menu bar. The user can now invoke each of these plug-ins as if they
were integral parts of the original code.
Adding this functionality to the sample caller program requires about 30 extra
lines of code (see Example 1), primarily having to do with handling the menu.
This source code was directly copied from a Macintosh to a PC and recompiled.
The platform-specific functions of the plug-in were isolated using the
appropriate #ifdefs, but were built without further code modification.
Example 1: Adding functionality to the sample caller program.

 #include "XPIN.h"
 #ifdef OS_MAC
 #include <Dialogs.h>
 #endif
 XPIN(xblk)
 {
 int localVarsGoHere;
 DescribeXPIN(xblk, "Dialog", "Description of Dialog", 1, 0);
 #ifdef OS_WIN
 MessageBox(NULL, "This is a sample dialog box...",
 "Dialog Plug-in Message",
 MB_ICONSTOP | MB_YESNO);
 #elif defined(OS_MAC)
 ParamText("\pThis is a sample dialog box...",
 NIL, NIL, NIL);
 Alert(128, NIL);
 #endif
 XSETSTATw(xblk, 0); // Return 0 as status
 EndXPIN();
 }



Plug-in Identification


The toolkit defines six functions for use by the caller; see Table 1. To
distinguish plug-in files from other files, a "type" is assigned to each
plug-in. I decided to let the operating system help identify files of this
type. This was faster and simpler. On a Macintosh every file has a creator and
a type associated with it. These values allow the Macintosh Finder to
automatically find and launch a program when one of its data files is
double-clicked. On the Macintosh, all plug-ins of one category must have a
given finder type, for example, all Photoshop import filters of type 8BAM.
There are routines in the Macintosh Toolbox that allow a directory to be
scanned for all files of a given type, effectively performing a wildcard
search. A happy side effect of this mechanism is that the Macintosh Finder
assigns individual icons to each file type. Thus, plug-ins can have icons that
visually distinguish them from other files.
Table 1: Caller functions.


 Function    Description
 -------------------------------------------------------------------------

 XPINInit    Initializes internal data and searches for plug-ins. Those
             found are asked to provide a label and a description. These
             are cached in the caller data space for quick access.

 XPINCount   Returns the number of plug-ins found.

 XPINLabel   Returns the label associated with a given plug-in.

 XPINDesc    Returns the description string associated with a given
             plug-in. This could be used for an "About" message or a
             simple help screen.

 XPINCall    Calls a single plug-in. A general-purpose parameter block is
             used to pass arguments to the plug-in.

 XPINDone    Called before the program exits. The internal data structures
             are deallocated. Each plug-in is also sent a "done" message
             to allow it to deallocate any space it may have allocated.
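The calling sequence Table 1 implies might look like the sketch below. The function signatures and stub bodies are assumptions on my part (a fake two-entry registry stands in for a real directory scan), not the toolkit's actual declarations:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical signatures inferred from Table 1; the real XPIN headers
   may differ. The bodies are stand-in stubs so the calling sequence
   can be shown end to end. */
static const char *labels[] = { "Dialog", "Beep" };

int XPINInit(const char *path)    { (void)path; return 0; }  /* 0 = OK */
int XPINCount(void)               { return 2; }
const char *XPINLabel(int idx)    { return labels[idx]; }
const char *XPINDesc(int idx)     { (void)idx; return "sample"; }
int XPINCall(int idx, void *xblk) { (void)xblk; return idx; }
void XPINDone(void)               { }

/* A caller enumerates the plug-ins once, then invokes them on demand. */
int demo_caller(void)
{
    int i, n;
    if (XPINInit("$HOME") != 0)
        return -1;
    n = XPINCount();
    for (i = 0; i < n; i++)
        printf("plug-in %d: %s -- %s\n", i, XPINLabel(i), XPINDesc(i));
    XPINCall(0, NULL);  /* invoke the first plug-in */
    XPINDone();         /* release the toolkit's internal structures */
    return n;
}
```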

No such intrinsic type can be associated with a file on the PC. You do,
however, have the three-letter file extension. Many DOS programs (including
the command processor) use this file extension to locate special-use files
(for example, COMMAND.COM allows files of type .EXE, .COM, and .BAT to be
executed by simply typing the filename). Moreover, there are corresponding DOS
functions that return files based on filename wildcards.
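On the PC side, the test for "is this a plug-in file?" reduces to a filename-extension match. The sketch below shows the idea; the ".XPN" extension is a made-up example, not necessarily the toolkit's actual choice:

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if the filename ends with the given extension.
   DOS filenames are case-insensitive; a real scan would fold case
   before comparing. */
int has_plugin_ext(const char *name, const char *ext)
{
    const char *dot = strrchr(name, '.');
    if (dot == NULL)
        return 0;
    return strcmp(dot, ext) == 0;
}
```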
Since it is possible for a non-plug-in file to be accidentally assigned a
plug-in type value, the toolkit validates all files it locates to make sure
they are legitimate plug-ins. The toolkit effectively ignores all "bad"
plug-ins.
The toolkit does not apply any architectural limits to the types and numbers
of plug-ins a single program can handle. To stop runaway searches, the toolkit
stops the directory search after 100 files, a constant defined in a header
file. This limit could be easily updated or removed altogether. An application
program can also support multiple plug-in types. Each type can be located in a
different directory (or they can all be located together). The overhead for
the plug-in structure is small. The main internal tracking structure takes
about 120 bytes of space. Each plug-in takes about 300 additional bytes. The
number of internal structures is limited by available memory. The plug-in
structure is allocated off the heap and released when XPINDone is called.


The Pathname Dilemma


The way directory pathnames are identified under each operating system is a
problem when writing cross-platform software. For example, the directory-name
separator under DOS is the backslash character. The Macintosh uses the colon,
and UNIX uses the forward slash. There are other syntactic and semantic
differences in pathnames that go beyond the scope of this article. Although
more polished solutions are possible, XPIN requires you to make sure the right
pathname was passed to the toolkit on each platform. This seems like a
cop-out, but it turns out that there are two "blessed" locations under both
operating systems: the application's home directory and the environment's
system directory. On the Mac-OS, this is the System Folder; under Windows,
it's the directory in which Windows is installed. So the toolkit supports two
"aliases" as a replacement for the plug-in search path. The $HOME and $SYSTEM
names are aliases that are replaced with absolute paths at run time. They can
be used on their own (for maximum portability) or incorporated into an
absolute path (for example, $HOME\PLUGIN\ or $HOME:PlugIn:). The other
advantage of using aliases is that $HOME allows the user to freely move the
entire application with plug-ins from one directory to the other, and the
$SYSTEM alias simplifies booting off different volumes. The application and
its plug-ins can be located easily, since the aliases are resolved at run
time.
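Alias resolution of this kind is a simple prefix substitution, as the sketch below shows. The absolute paths are examples only; the toolkit would obtain the real home and system directories from the operating system at run time:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Replace a leading "$HOME" or "$SYSTEM" alias with an absolute path.
   Anything else is copied through unchanged. */
void resolve_alias(const char *in, const char *home, const char *sys,
                   char *out, size_t outlen)
{
    if (strncmp(in, "$HOME", 5) == 0)
        snprintf(out, outlen, "%s%s", home, in + 5);
    else if (strncmp(in, "$SYSTEM", 7) == 0)
        snprintf(out, outlen, "%s%s", sys, in + 7);
    else
        snprintf(out, outlen, "%s", in);
}
```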


Passing Parameters


The toolkit takes care of much of the background work, like locating and invoking
the plug-ins. However, you must consider other issues. An application needs to
be able to exchange information between itself and the plug-in. In the case of
plug-ins, the compiler does not have the luxury of knowing the parameter types
being passed. In a static program, the compiler and linker know the type and
number of parameters being exchanged between a program and a function. The
compiler often takes care of all typecasting and parameter conversion.
Parameters are usually passed to a function using the stack. The compiler
generates the instructions necessary to pass the right number of parameters
onto the stack and to restore the stack after the function returns. The
function being called expects the stack to be in a well-defined state. The
only exception to this rule is the varargs mechanism (first popularized under
UNIX). Despite its flexibility, I chose not to use this mechanism because of
implementation discrepancies across platforms.
At run time, the compiler has no idea what it may encounter in a dynamic
situation like the one faced with plug-ins. The solution to this problem is
the parameter block (called the XBlock in the toolkit). The number and type of
parameters between the caller and the plug-in function are predefined.
However, one of the parameters is a pointer to an XBlock structure. The caller
is free to load anything it wants into the XBlock as long as the plug-in is
designed to expect the proper values in the right order.
The alternative to using a parameter block was to devise a dynamic runtime
parsing mechanism that supported a stream of variable-length arguments, each
identified by their data type and size. Despite its flexibility, I decided
against this "parameter-stream" approach due to its overhead. I will probably
revisit the issue at some future time; for now, however, the XBlock mechanism
appears to be sufficient. The XBlock is a fixed-size block of arguments. (The
number of slots can be changed in a header file.) Each slot can be either a
byte, word, long, or pointer (a far pointer under Windows). The values in the
slots can be easily accessed through a series of XSET# and XGET# macros (where
# is b, w, l, or p). Example 2 shows the definition of the XBlock structure.
Example 2: Definition of the XBlock structure.

 #define XBLOCK_MAXARGS 5
 union Arg {
 UPtr p; // Pointer
 Ubyte b; // Byte
 Uword w; // Word
 Ulong l; // Longword
 };
 typedef union Arg Arg;
 struct XBlock {
 Uword action; // action code (XOP)
 Arg args [XBLOCK_MAXARGS]; // array of args
 Arg status; // result sent back from XPIN
 };
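The slot accessors might be spelled as below; these macro definitions are my plausible reconstruction from the XSET#/XGET# naming described in the text, and the real toolkit headers may differ:

```c
#include <assert.h>

/* Types and structure from Example 2. */
typedef void          *UPtr;
typedef unsigned char  Ubyte;
typedef unsigned short Uword;
typedef unsigned long  Ulong;

#define XBLOCK_MAXARGS 5

typedef union Arg {
    UPtr  p;  /* Pointer  */
    Ubyte b;  /* Byte     */
    Uword w;  /* Word     */
    Ulong l;  /* Longword */
} Arg;

struct XBlock {
    Uword action;                /* action code (XOP)            */
    Arg   args[XBLOCK_MAXARGS];  /* array of args                */
    Arg   status;                /* result sent back from XPIN   */
};

/* Each macro reads or writes one slot as a particular width. */
#define XSETw(xb, n, v)  ((xb)->args[n].w = (Uword)(v))
#define XGETw(xb, n)     ((xb)->args[n].w)
#define XSETl(xb, n, v)  ((xb)->args[n].l = (Ulong)(v))
#define XGETl(xb, n)     ((xb)->args[n].l)
#define XSETSTATw(xb, v) ((xb)->status.w = (Uword)(v))
```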

The XBlock also has space for an action code and a status value. You can
define the action code. It is most useful for defining behavior common to all
plug-ins. Internally, the toolkit reserves two action codes to get the
plug-ins to return their label and description fields (XOP_INIT) and to
perform cleanup action (XOP_DONE). Other popular attempts at plug-ins either
require fixed parameters (Adobe Photoshop) or the use of parameter blocks
(HyperCard).
Coming up with a flexible specification for the XBlock is one of the crucial
tasks of application programmers who want to create plug-in aware programs. If
not correctly specified, ill-defined plug-ins will instantly throw the main
program into spasms. It behooves you to define how the application intends to
pass parameters to the plug-in (and to properly document it). The published
specification can be used by enterprising third-parties to write undoubtedly
amazing plug-ins.

What I've been calling "plug-ins" are, in reality, stand-alone code resources
under the Macintosh environment and dynamic link libraries (DLLs) under
Windows. I've just simplified their interfaces. Under the hood, however, they
have to obey the rules and restrictions of their host operating environment.
These are not debilitating restrictions. The main thing to consider is that
using plug-ins requires an understanding of how programs use memory,
particularly in the form of the stack and the heap. (It was illuminating to
discover that Macintosh CODE resources and Windows DLLs have similar but
opposite memory requirements.)
Windows plug-ins (DLLs) have no stack of their own. When loaded, they share
the stack of the calling application. This means that the plug-in should not
assume the presence of a large amount of stack space. What's left is the space
the application started with minus the space used before invoking the plug-in.
If the plug-in accidentally exceeds the stack space, the heap will almost
certainly become corrupted. Allocating space on the heap, however, carries
much less risk. The well-behaved plug-in should check the status returns from
all memory-request calls and gracefully handle errors. You should also be
aware that Windows keeps only one copy of a DLL in memory at any given time.
This can cause problems with static variables declared inside a DLL.
Macintosh CODE resources, on the other hand, can have a stack. However, they
must take care in how they access global variables (A5-globals) and the heap (which
is accessed as an offset from A5). Both the Apple MPW and the Symantec Think
development environments provide ways to access globals. If you think this
might not apply to you, I should mention that QuickDraw uses A5-globals. If
you are not careful, the first use of a QuickDraw function inside the
plug-in will lead to the gallows. The plug-in skeleton code takes care of
access to A5-globals but you should be aware of the limitation the Mac OS
applies to stand-alone CODE resources.
One final rule about memory management and plug-ins: Leave the world the way
you found it. If a plug-in allocates any memory, it should deallocate it
before returning. Unless previously arranged, allocated memory cannot be
retained by plug-ins. A better solution is for the calling program to allocate
enough memory and pass its address to plug-ins through the XBlock. The
plug-ins can be assured of a safe, clean area to operate in. To have
persistent memory, the plug-in can return a value to the caller through the
XBlock that indicates that the block of memory should be preserved by the
caller.


Presentation


How should plug-ins be shown to the user? There seems to be some sort of a
secret agreement that plug-ins should be added and invoked through menus. In
fact, there's no need to limit plug-ins to the menu bar. The menu just happens
to be a user-interface element that can be dynamically changed at run time. It
is, of course, up to you to associate plug-ins with the appropriate
user-interface elements in your applications.
The toolkit associates a string label and a description string with each
plug-in. These are maintained internally by each plug-in and have nothing to
do with filenames. The plug-in returns these values to the caller when the
toolkit first finds and initializes all plug-ins. The developer is free to use
the label in any way. Internally, each plug-in is identified by an index
(starting at 0). It is up to you to devise a way to associate the menu (or
whatever) items with the index for each plug-in. For example, the first
element of the plug-in menu can be associated with the 0th plug-in index. When
the user selects the first item in the menu bar, the application program is
required to translate that command to a call to the plug-in toolkit with 0 as
the plug-in index. The second menu item may be associated with index 1, and so
forth.
This isn't as hard as it may sound. A simple lookup table or a clever
menu-numbering scheme will do nicely. In the sample caller application
provided with the toolkit, all plug-in menu items start at a base value (1000)
and go up sequentially. When the application gets a menu command with an ID
above the base value, it simply subtracts 1000 to get the 0-based plug-in
index and makes the call to the toolkit with that index.
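The base-value scheme just described can be reduced to a few lines. The 1000 base matches the sample caller; the helper name is mine:

```c
#include <assert.h>

/* Plug-in menu items get command IDs starting at a base value;
   subtracting the base recovers the 0-based plug-in index. */
#define PLUGIN_MENU_BASE 1000

/* Returns the plug-in index for a menu command, or -1 if the
   command is not a plug-in item. */
int menu_to_plugin(int cmd_id, int plugin_count)
{
    int idx = cmd_id - PLUGIN_MENU_BASE;
    if (idx < 0 || idx >= plugin_count)
        return -1;
    return idx;
}
```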


Missing Features


I deliberately left a number of features out of this version of the XPIN
toolkit, the most important being callbacks. A callback is a function inside
the main application that can be called by a plug-in. Callbacks allow plug-ins
and the application to maintain a two-way dialog and exchange information
freely. I kept callbacks out because of serious cross-platform
incompatibilities they would introduce into the toolkit, especially in the use
of stacks and globals. A future version may add this feature if an appropriate
scheme can be found. Designing generic callbacks would have required making
assumptions that vary between applications and would have unnecessarily
complicated the design. An application that requires the plug-in to call it back can pass the
address of the callback entry point inside the XBlock. I leave it to the
developer to devise a callback mechanism that works with the individual
application.
As mentioned before, each plug-in has a label and a description associated
with it. This brings up the other feature that was left out: icons. I would
have liked to have associated icons with each plug-in. These could be used,
for example, in button strips or toolbars. There is a certain attraction to a
plug-in that carries all its baggage in a single package. However, icons are
among those system-dependent entities that require special treatment. Again,
simplicity was used as the driving goal, and icons were dropped. It is
possible, however, for an application to associate icons with each plug-in
index, just as it was possible to associate a label with a menu item.


Cross-platform Support


To be able to write portable code, you need to know the environment and
platform on which you're running. And on each platform, there are a number of
popular development tools, each with its own idiosyncrasies. Determining
which system and compiler are being used can either be done when compiling or
at run time. For performance reasons, I chose the compile-time option. C
provides #ifdefs to allow the compiler to strip out code that does not meet
certain conditions. To help break the code into common and
platform-independent code, the toolkit provides the XCONFIG.H #include file.
Including this file at the top of every source file enables the proper
#defines that can be used by subsequent #ifdefs.
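The compile-time detection XCONFIG.H performs can be sketched from each compiler's predefined macros. The mappings below are illustrative (and necessarily incomplete); the real header covers more environments:

```c
#include <assert.h>
#include <string.h>

/* Derive one toolkit-wide flag name from the compiler's own
   predefined macros, in the spirit of XCONFIG.H. */
#if defined(_MSC_VER)
#  define COMPILER_NAME "COMPILER_MSC"
#elif defined(__BORLANDC__)
#  define COMPILER_NAME "COMPILER_BORLAND"
#elif defined(__GNUC__)
#  define COMPILER_NAME "COMPILER_GNU"
#else
#  define COMPILER_NAME "COMPILER_UNKNOWN"
#endif

const char *compiler_flag(void)
{
    return COMPILER_NAME;
}
```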
The toolkit takes advantage of certain predefined settings and combinations of
values to help define a common set of constants. This eliminates the need for
a large amount of hand editing of #include files or special "configuration"
programs often needed when supporting cross-platform programming. The values
set by XCONFIG.H (a file that's available electronically, see "Availability"
on page 7) are shown in Table 2. I found these settings to be sufficient for
most cases. In fact, I used these settings for the toolkit itself. The entire
finished source code for the XPIN-toolkit library can be copied between the PC
and Macintosh systems and recompiled without modification. (It was surprising
how much of the toolkit worked on both systems and did not require the use of
conditional compilation.)
Table 2: Environment-settings flags.

 OS_SUPPORTED       Platform supported
 ---------------------------------

 OS_MAC             Macintosh
 OS_WIN             Windows
 OS_NT              Windows/NT
 OS_UNIX            UNIX

 COMPILER_MSC       Microsoft C/C++
 COMPILER_BORLAND   Borland C++
 COMPILER_THINK     Think-C
 COMPILER_MPW       MPW C/C++
 COMPILER_GNU       GNU C/C++

 LANGUAGE_C         C language
 LANGUAGE_CC        C++ language



Conclusion


Writing portable programs requires planning and re-examination of your
assumptions about a given platform or environment. The XPIN plug-in toolkit
should introduce a larger audience in the developer community to the benefits
of plug-ins. At the same time, it should remove a technical barrier to
developing robust cross-platform programs. Publishing plug-in interfaces is a
good way to encourage other programmers to extend your application and make it
more attractive to end users.


References


Inside Macintosh, volumes I-VI. Apple Computer.
Klein, Mike. Windows Programmer's Guide To DLLs and Memory Management. Carmel,
IN: Sams, 1992.

Macintosh Technical Note #256: Stand-Alone Code. Apple Computer, August 1990.
Rollin, Keith. "Another Take on Globals in Standalone Code." develop
(December, 1992).
Think C Reference Manual, Version 5.0, Symantec.







June, 1993
PROGRAMMING PARADIGMS


Three Days of the Ponderer




Michael Swaine


This month's column is the day-to-day log of a software-development project
that I assigned myself in March.
Why am I inflicting this on you? This is not an example of fine-and-fastidious
programming. It's not even an example of quick-and-dirty programming. It's not
really programming at all. What it is is software development by wiring up
off-the-shelf objects, which, if memory serves, was the promise of
object-oriented programming.
But despite the claims of OOP boosters, the world has not seen a lot of
software development by wiring up off-the-shelf objects. Perhaps when you read
this column you'll know why, but I hope you'll get something else out of it.
My goal is to show, in a detailed case study, exactly what a weekend
programmer can do with off-the-shelf objects and a tight deadline.


The Project


In March I decided I needed to write a BBS for subscribers to my HyperTalk
newsletter. When I considered HyperPub's publishing schedule, my own work
schedule, and various other constraints, I concluded that I needed to do it in
three days.
That is, I wanted something that I could start shaking down in three days.
I was willing to settle, in this shakedown version, for the simplest possible
BBS interface: one-character mnemonic items in a hierarchical, command-line
menu system. At the top level, the user could type F to go to a Files area, M
to a Mail area, and N to a News area; at lower levels, typing a numeral should
select among files, news items, or mail messages. Typing H at any level should
display context-sensitive help text; typing X at any level should exit that
level, logging off at the top level. The system should respond with
appropriate welcome messages, prompts that indicate the available options at
any point, and a logoff message telling the time online and the logoff time. I
also wanted the logon to be tied to our customer database, so that subscribers
to HyperPub would be let in and anyone else would be locked out, but that
feature could wait.
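The menu hierarchy above amounts to a small state machine. Here is an illustrative C sketch of the transitions (my own, not Swaine's Serius wiring, and with the level names invented for the example):

```c
#include <assert.h>

/* One-character BBS menu: F/M/N descend from the top level, X exits
   a level, and X at the top level logs the user off. */
enum level { TOP, FILES, MAIL, NEWS, LOGGED_OFF };

enum level dispatch(enum level cur, char c)
{
    if (c == 'X')
        return (cur == TOP) ? LOGGED_OFF : TOP;
    if (cur == TOP) {
        if (c == 'F') return FILES;
        if (c == 'M') return MAIL;
        if (c == 'N') return NEWS;
    }
    return cur;  /* H (help) and unknown keys leave the level unchanged */
}
```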
A major purpose of the BBS would be file downloading and uploading, and I
didn't want to reinvent that wheel; I wanted a module that I could drop in to
handle file transfer. I was willing to settle initially for a very simple mail
system that would let users read mail addressed to them, type in messages
online, and send them to the sysop, other users, or all users. I was
particularly interested in creating a news system, a sort of customized and
curtailed NewsBytes service, in which I would supply the news items.
That's where I stood at the beginning of day one. Here's my log of those three
days.


Day 1: Assessing Feasibility


The first decision I have to make is what tool I'm going to use.
Although I could buy a BBS package for under $100.00, I want to get some
knowledge of BBS design out of this project, and I also want the ability to
tweak the system later on. The time frame and my abilities, though, limit me
to high-level tools: I won't be writing this from scratch in C. And I'll be
running the BBS on a Mac, so that narrows my options further.
I'm familiar enough with HyperTalk, and all the necessary tools do exist in
that environment, but I doubt my ability to tie them together. I've done some
work using Serius Developer, and it has an impressive-looking communications
library, so I'm drawn to it. And Serius will let me write my own objects for
future tweaking, although for now I can just hook up supplied objects.
At least that's the theory. I check the documentation to see if Serius has
what I need, and it looks promising. Serius delivers several specialized
object libraries, including a communications library with three communications
objects for managing file transfer and so forth. But there's only so much I
can learn from the docs, so I prepare to install the product.
First I have to set up the hardware, which takes way too long and casts a pall
over the whole project. If it takes me an hour and a half to figure out what's
wrong with the internal modem in one of the machines, what hope do I have of
getting this whole project done in three days? I'm talking HayesSpeak before
I'm finished (that Atlanta dialect in which every word starts with AT). Almost
as much fun is the session under the desk, figuring out what I can unplug so
that I can have two machines next to each other and still plug in the external
modem that one of them requires.
After this, the software installation, including the various necessary
components of system software, is a treat.
The communications package includes a sample communications application. My
hope is to use as much of this as I can, so I launch it to see what it does.
The sample application uses all three of the communications objects that come
in the Serius communications library: a connection object, which manages
connections on various channels, a terminal object, which lets the machine
emulate a terminal, and a file-transfer object, which, I hope, will manage
file transfers for me.
The connection object has functions for opening a communication channel and
waiting for text to come in, which is what a BBS spends much of its time
doing. Its capabilities are invoked in the sample application by menu choices,
so I'll need to change that.
By the time I figure I know what this sample application is doing, I've used
up half my first day.
After lunch I examine the code--although code is not the right word. Serius is
a visual programming system, and at the level I'm using it, it's a tool for
wiring up predefined objects.
What these two facts mean in practice is that you "write" programs by dragging
icons around the screen. Serius has a scripting system, though, called
ObjectTalk, and I print out the ObjectTalk version of the sample application
to study it offline. I immediately regret doing so. I have forgotten that
ObjectTalk is just an alternative way to supply parameters to functions. The
ObjectTalk printout is useful as documentation, but I find it useless for
studying the program's design. This is a visual programming system, through
and through. I go back to the computer.
Like any GUI application, this one has a lot of functionality tied up in UI
stuff: maintaining menus, closing or opening windows, and so on. This part is
easy to follow.
I get confused, though, trying to see what the connection object does with
incoming text. The documentation, both online and in the manual, is a little
confusing. The connection object normally passes any text that it receives to
the terminal object. It can, however, intercept the text and save it in a text
object. So far, clear enough. There's a perfectly clear discussion about how
you can use this mechanism to buffer input, but it's not clear if I actually
need to do this in order to trap my one-character user inputs. What, exactly,
are the responsibilities of the connection and terminal objects regarding this
incoming text? The documentation is not clear.
This is a familiar problem for anyone who has constructed any hardware system
from components. At the interface, nobody wants to take responsibility. But
Serius is the supplier of both components. I need to understand how this
works, since recognizing and responding to user inputs without missing a byte
is essential to the proper functioning of the BBS. The sample application
doesn't do anything with incoming text, so it's no help either. The
documentation and the sample application having failed me, I resort to
thinking. This desperate tactic works, and I resolve that problem and go on to
the next.
I end the day thinking, without having created any code. I go over each
component of my BBS, trying to anticipate problems. I anticipate none. Piece
of cake.


Day 2: Getting Something Running


I spend the morning modifying the sample application to see if it will do what
I need. The connection object works the way I reasoned it should yesterday. It
provides me with a configuration dialog, which is how one controls what it
does. Here I can specify text strings I want it to detect in the data stream
from the modem and signals I want it to send when it detects them. That's all
I need to do with the connection object itself. I'll then associate function
chains with each of these signals to implement the behavior each user input
string is supposed to evoke.
Here's what I actually do to accomplish this:
1. Choose a function. I click on an object's name in a scrolling list of all
available objects, and this gets me a window full of icons for the functions
associated with this object. I grab one function icon and drag it to my work
window.
2. Fill in its parameters. I observe the parameter slots above the function's
icon and supply parameters, either by typing in string constants or by
dragging from the parameter slot to an object in another window. Serius
type-checks the parameter and refuses to let me supply a parameter of an
inadmissible type.
3. Wire the function into the program. I link the function into a function
chain by dragging to its icon either from an object icon or from another
function icon. When I do this, a line appears connecting the icons, and a
label appears on the line. The line represents a signal, and if there is more
than one signal that the object or function could be sending, I can click on
the label and all the signal names pop up so that I can choose one.
This is visual programming. It's easy.
It is also messy, but I assume that's just because I'm not properly
modularizing this design. The screen is getting awfully cluttered awfully
fast, though. I wonder how small I'll have to make my modules in order to make
them understandable.

Serius compiles a program of this size pretty fast, so I am not tempted into one
of my other programming sins--a tendency to write too much between tests.
By noon I have a skeleton of a BBS running. I can dial up, and it greets me,
explains the commands and menu hierarchy, responds in some way to each command
I type, and lets me upload or download a file. One specific file. When I type
X it sends me a logoff message, breaks the connection, and hangs. Oops.
Apparently I get one logon per restart. Well, it's something.
After lunch I start from scratch, sort of, building a new application and
pulling in pieces of the sample application (and most of its structure). This
time I try to modularize properly.
Writing modular applications using Serius means using subjects. Serius calls
an application under development a "project," and modules of a project
"subjects"--that is, sub(pro)jects. Each subject has its own work window, and
objects can be shared among subjects.
I create subjects to handle logon and logoff, the Files area (uploading and
downloading), the Mail area (messages between users or between users and
sysop), the News area (displaying brief news items), Help (context-sensitive),
and a User list display (which eventually will be fairly complicated because
of the need to get data from our existing customer database).
As I drag and drop function icons, I can see that this is the right level of
modularity. The windows are not too cluttered.
I am now considering a question that I dodged yesterday. The sample
application lets the user set the parameters for the connection object by
invoking the standard dialog for that object. Since this isn't programming,
there is no API--there's an "application configurer's interface," which in
this case is a dialog box. But the connection object controls more
communications devices and channels than I am interested in, and there's no
way to turn off part of this configuration dialog. Unless there's another way
to set connection object parameters, I have the choice of providing the sysop
with either too much or too little control.
It turns out, after I dig into the documentation, that it is possible to get
the entire connection object configuration as text and to set any part of it,
so I can in fact craft my own dialog for setting exactly the parameters I want
the sysop to deal with. Later for that.
This one has turned out not to be a problem, but it is the sort of thing one
worries about in high-level programming: When the tools do a lot for you, they
may not do exactly what you want. So far, I haven't had that problem with
Serius.
By the end of the day, I have this second version hobbling along. It gives the
user a choice of files to download and news items to read, resets itself so it
can answer the phone again after a proper logoff or any broken connection, and
responds to all user commands, if not always correctly. The user list is bogus
(I haven't begun to think about how to link into my customer database), and
help is still noncontextual. Also, I haven't got the mail system running yet.
But it's forgotten how to download files. The problem seems to be that I'm not
properly supplying a parameter to one of the functions. This should be easy to
check, but the Serius documentation is a little vague regarding parameters,
supplying neither an unambiguous BNF-type specification nor examples. I think
the function is dissatisfied with the pathname I'm giving it, but it sure
looks like a pathname to me. Grumpf.


Day 3: Facing Reality


This is it: the final day. I'd better prioritize. What do I absolutely have to
get done today? Fix the pathname bug, for sure. I brute-force my way to a
resolution of that one.
I can't worry now about baud rate. The whole world is not 1200 baud, but the
part of it that wants to communicate with this BBS had better be.
Next I clean up the user interface, especially the Help system. Making it
contextual takes about half an hour: 15 minutes to change the signals and add
the text objects, and 15 minutes to type in the text of the contextual help
messages.
I've been using dummy files to download in the Files area and dummy news items
for the News area, so I replace these with the actual data I want to make
available to my users. It's clear to me that I'm not going to get the Mail
area implemented today, so I punt that.
I spend a couple of hours tidying up the icons and adding comments so I'll
understand what I did when I come back to this thing. Figuring I've got enough
time for one more tweak, I add the time online and the logoff time to the
logoff message. Doing this requires that I quit Serius in order to add the
time object to my project. Actually adding the times to the logoff message
takes all of two minutes.
At 6:00 I take the BBS online and invite my beta testers to have at it.


Lessons


Since that three-day crunch, I've done more work on the BBS. I chickened out
on trying to connect it directly with our customer database, opting to just
read in a TDF file of subscribers exported periodically from the database.
Getting the Mail area working only took one more day, although it's still
rudimentary. It doesn't handle replies, threads, or offline message composing.
It's clear, though, that none of these things will strain Serius when I get
another day or two free to add them.
What did I learn? A fair amount about building a BBS, although nothing of any
depth about file transfer or managing a communication channel. Building an
application from existing objects is pretty simple when one supplier provides
all the objects and some of the objects represent big chunks of the task.
Oh, and I'd better keep my day job.






June, 1993
C PROGRAMMING


Flattening D-Flat++ and Typing with the Joystick




Al Stevens


I'm giving D-Flat++ a minor overhaul. As I rewrote the code in C++, I carried
the original display logic along to the new version. One of the things that I
do not like about D-Flat is its performance on older, slower machines. The
display code is the culprit. Or, more to the point, the display philosophy is
at fault. The system examines every screen write to see if the character
position of the target window is in view. That's more complicated than it
sounds. An application can legitimately write to a window that is partially or
totally obscured. The target window might be overlapped by another window; it
might be partially off screen; the window size might provide a smaller
viewport than the display requires; or the window might be partially or
completely outside the borders of its parent. Every one of these conditions is
tested in most circumstances, resulting in a display rate that is, at best,
marginally acceptable on older machines and even some current laptops.
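The cascade of visibility tests described above boils down to repeated rectangle arithmetic. Here is a minimal portable sketch of the idea; the struct and function names are mine, not D-Flat's actual identifiers:

```c
#include <assert.h>
#include <stdbool.h>

/* A character-cell rectangle: left/top and right/bottom, all inclusive. */
typedef struct { int lf, tp, rt, bt; } Rect;

/* Is the point (x, y) inside r? */
static bool inside(const Rect *r, int x, int y)
{
    return x >= r->lf && x <= r->rt && y >= r->tp && y <= r->bt;
}

/* A screen write at (x, y) is visible only if it survives every clip:
   the window's own viewport, the window's parent, and the physical
   screen. Running tests like these for every character written is what
   drags the display rate down on slow machines. */
static bool char_visible(int x, int y,
                         const Rect *window,
                         const Rect *parent,
                         const Rect *screen)
{
    return inside(window, x, y) &&
           inside(parent, x, y) &&
           inside(screen, x, y);
}
```

Windows overlapped by siblings add still more rectangle tests, one per window above this one in the Z-order.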
What to do? D-Flat's application model has an application window with menus, a
status bar, multiple document windows, and modal and modeless dialog boxes.
The overhead for all of that is borne by every application, whether it uses
all of the features or not. Looking back, I recall that one of my objectives
for D-Flat was to let an application use the compiler to pare out the unused
features. I did not squarely hit that target. Achieving that goal requires a
complex system of compile-time conditionals that weave in and out and all over
the code. As I added features, the conditionals became unmanageable. There
were too many of them, and the number of possible configurations increased
exponentially, which made comprehensive tests impossible.
There are two issues here: features and style of code. Not one of the
applications I developed with D-Flat comes close to using all of its features,
and so I decided that D-Flat++ does not need that much power. There are other
C++ class libraries that provide it, and we do not need another one. What we
need is a class library with fewer features and better performance.
The second issue has to do with how you go about building a program that does
not have more code than it needs. I have the same goals as before, but I am
trying to use the C++ inheritance mechanism to achieve them. If you don't need
a radio button, for example, then your program does not instantiate a
radio-button object, and the linker does not link the radio-button object
module. This is one of those cases where using object-oriented design makes
you wish you had always done so, even when the language was procedural. With
more care I might have made the D-Flat library work that way. The C language
doesn't promote or encourage it, however, and I am not sure I could have kept
it up as the program grew with features, no matter how good my intentions.
Back to the first issue, the features themselves: D-Flat++ will support an
application model similar to that of D-Flat, but without the multiple-document
interface, which I never used except in the example program. An application
window still has the menus and status bar. And although it can have its own
control windows, they should be unmovable by the user and not overlapping. The
application can spawn dialog boxes too, each with its own set of controls. The
This model supports a large number of applications and is faster and smaller
than a full-featured CUA application. If it turns out that we need that extra
stuff, it can be added by using class inheritance. Let's see how successful I
am at reaching these lofty goals a second time.
I hope to have a working version by the time you read this. You can download
D-Flat++ from the CompuServe DDJ forum or from M&T Online. You can also get it
by sending a stamped, self-addressed diskette mailer and a formatted diskette
to me at Dr. Dobb's Journal, 411 Borel Avenue, San Mateo, CA 94402. I'll
include the latest version of D-Flat as well. The software is free but if you
wish, include a dollar for my Careware charity, the Brevard County Food Bank.


JOYKEY


Several years ago my friend Jerry was alone one evening, listening to his
police-band scanner. Jerry, a single parent, has custody of Valerie, his
then-teenaged daughter, who had gone out on her motorscooter to run an errand.
Jerry works part-time as a volunteer dispatcher with the local police, so he
keeps a scanner at home. The scanner landed on a police report describing an
accident under investigation. A child had been hit by a car. He listened while
the police officer spelled Jerry's last name with the phonetic alphabet that
radio operators use. He says now that words cannot describe how he felt when
he realized that the report was about Valerie. They were reading her name from
her driver's license and broadcasting it ahead to the hospital to locate her
parents.
What followed in the years since the tragedy is a story in itself. Valerie is
now in a wheelchair, a paraplegic. Jerry's life is devoted to caring for her.
He is always on the lookout for some way to help her. Valerie's mind is not
totally intact, but a good bit of who she was survived the accident. However,
her limited motor skills impair her ability to communicate. She cannot talk or
write but she can move one hand.
Jerry is a computer nut like most of us, although not a programmer, and we
were talking about computers one evening. He explained that Valerie had shown
an interest in his computers before the accident. I wondered then if she could
use one of those mouse-driven keyboard simulators that I've seen posted in the
Handicap forum on CompuServe. Jerry doubted Valerie's ability to manage the
mouse. Her hand movements tend to be spastic when what she is holding is not
fastened down. She can maneuver her wheelchair somewhat with a joystick,
though, so we kicked around an idea: Maybe she could handle the joystick on a
computer. It's the kind of problem you want to solve.
I went home and downloaded one of the mouse-driven keyboard simulator programs
to see how it worked. The program is NO-KEYS, written by David Leithauser. It
is a TSR that displays a simulated keyboard on the screen on top of your
application. You select keys with the mouse, and the NO-KEYS program tricks
the application into thinking that you are using the keyboard. It's useful to
anyone who can handle a mouse but has difficulty using a keyboard. NO-KEYS
showed me how such a program should work. It did not come with source code,
but its operation was simple enough, so I set out to build a similar program
for the standard PC joystick port. The result is Listing One, joykey.c, on
page 165.
JOYKEY is yet another TSR, albeit a benign one. It could just as well be a
device driver, but I decided to load it from the command line so that it would
be easier to unload. Here's what it does.
When you load JOYKEY, it attaches the keyboard BIOS interrupt, displays a
simulated keyboard on the screen, and watches for programs to read the keys.
Most well-behaved programs read the keyboard by calling BIOS, checking the
keyboard status and reading any keystrokes that have been entered. Figure 1
shows how the simulated keyboard looks with the joystick cursor on the "a" key.
The program works only in DOS text mode, although a Windows version would be
an interesting project.
I wanted JOYKEY to be as unobtrusive as possible. For example, the real
keyboard should work along with the simulated one so that Jerry and Valerie
could use the same screen to converse. Furthermore, since the simulated
keyboard writes itself as a 5x46 text screen rectangle on top of the
application screen, they need to move the keyboard around to see what's under
it.
Operating JOYKEY is simple enough, although tedious for the nonchallenged
typist; you go at the rate of a word every now and then. The program uses a
reverse box cursor just like the standard text-mode cursor. Bang the joystick
to the left, and the cursor begins to move leftward. Same thing for up, down,
right, and combinations. Release the joystick to its center position, and the
cursor stops in its tracks. This movement is similar to the joystick behavior
on some CAD/CAM systems that I have seen, although the PC joystick lacks the
precision that those high-end devices have. You have to run the cursor pretty
slowly to be able to stop it where you want. You need a delay before it starts
tracking, too. Otherwise, it gets away from you.
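The start-up delay and the steady tracking rate can be modeled as two countdowns driven by the 18.2-per-second timer tick. This is a portable sketch of the scheme, not the listing's exact code; the lag and speed fields mirror the /D and /T parameters described later:

```c
#include <assert.h>

/* Per-axis movement pacing while the stick is deflected. */
typedef struct {
    int lag;      /* ticks to wait after the first step (/D) */
    int speed;    /* ticks between steady steps (/T)         */
    int lagctr;   /* countdown until steady movement begins  */
    int tickctr;  /* countdown to the next steady step       */
    int started;  /* has the first step been taken?          */
} Tracker;

static void tracker_reset(Tracker *t, int lag, int speed)
{
    t->lag = lag;   t->speed = speed;
    t->lagctr = lag; t->tickctr = speed;
    t->started = 0;
}

/* Called once per timer tick; returns nonzero when the cursor
   should take a step. */
static int tracker_tick(Tracker *t)
{
    if (!t->started) {           /* first deflection: move at once  */
        t->started = 1;
        return 1;
    }
    if (t->lagctr > 0) {         /* then hold still for the delay   */
        t->lagctr--;
        return 0;
    }
    if (--t->tickctr <= 0) {     /* then step every 'speed' ticks   */
        t->tickctr = t->speed;
        return 1;
    }
    return 0;
}
```

The single immediate step followed by a pause is what makes one-character cursor moves practical for a user with limited motor control.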
If you click the joystick button while the cursor is outside the keyboard,
the keyboard repositions itself at the cursor location, taking care to stay
completely on the screen.
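Keeping the window entirely on screen is a clamp in each axis. A sketch, assuming the 80x25 text screen and the 5x46 keyboard rectangle the column mentions:

```c
#include <assert.h>

#define SCREEN_W 80
#define SCREEN_H 25
#define KB_W 46   /* simulated keyboard width in columns */
#define KB_H 5    /* simulated keyboard height in rows   */

/* Clamp v into [lo, hi]. */
static int clamp(int v, int lo, int hi)
{
    return v < lo ? lo : v > hi ? hi : v;
}

/* Move the keyboard's top-left corner toward the cursor, but keep the
   whole 5x46 window on the 80x25 screen. */
static void reposition(int curx, int cury, int *kbx, int *kby)
{
    *kbx = clamp(curx, 0, SCREEN_W - KB_W);
    *kby = clamp(cury, 0, SCREEN_H - KB_H);
}
```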
To "type," you click the joystick button while the cursor is inside the
simulated keyboard on the key you want to press. All the keys of a standard
94-key keyboard are there, so JOYKEY should work with most applications--if
the program is well-behaved with respect to keyboard input, that is. For
example, Microsoft's EDIT program, the text editor that comes with DOS, is not
so well-behaved. Somehow, it captures the keyboard when you pop down a menu
and freezes JOYKEY out. You'd expect Microsoft programmers to obey the rules,
wouldn't you? D-Flat and D-Flat++ applications work fine. I've tested JOYKEY
with several DOS text-mode word processors, including XyWrite, which is known
to be finicky, and JOYKEY gets along with them just fine. It even works with
the early, cantankerous versions of Sidekick.
The Shf, Alt, and Ctl keys on JOYKEY's simulated keyboard are toggles. When
you press one, it highlights. When a toggle is highlighted, the program
returns an appropriate keystroke value for subsequent keystrokes as if the
highlighted key were pressed, too. Press the toggled key a second time, and the
highlight turns off. When the Shf key is highlighted, the keyboard's key
display changes to show the uppercase letters and shifted special characters.
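In the listing, each key's kdef entry carries four scan-code/value pairs, one per shift state; the toggles simply select a column. A sketch mirroring that table layout (the precedence when two toggles are lit at once is my assumption, not spelled out in the column):

```c
#include <assert.h>

/* Each key carries four scan-code/key-value pairs, one per shift
   state, exactly as in the kdef table in Listing One. */
struct keycode { unsigned char sc, ky; };
struct simkey  { struct keycode sk[4]; };  /* normal, shift, ctrl, alt */

/* Pick the pair the toggles call for. Alt and Ctrl take precedence
   over Shift here, matching the column order of the table. */
static struct keycode select_code(const struct simkey *k,
                                  int shift, int ctrl, int alt)
{
    int col = alt ? 3 : ctrl ? 2 : shift ? 1 : 0;
    return k->sk[col];
}
```

The values in the test below are the "a" row of the listing's table: 0x61 plain, 0x41 shifted, 0x01 for Ctrl-A, and an extended (zero) key value for Alt-A.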
The Unl key to the right of the spacebar is not a standard keystroke. It's
there to let you unload JOYKEY from memory. To avoid accidental hits--a likely
mishap for a poorly coordinated user--the key is a toggle. The first time you
hit the key, it highlights. The second time you hit it, the program removes
itself from memory. If you hit anywhere else, the highlight turns off.
Unloading becomes a simple double-click for those of us fortunate enough to be
physically nonchallenged.
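The Unl confirmation is a tiny two-state machine: the first hit arms it, a second consecutive hit unloads, and any other key disarms it. A sketch (the 0xfc pseudo scan code comes from the Unl entry in Listing One):

```c
#include <assert.h>
#include <stdbool.h>

enum { KEY_UNL = 0xfc };  /* pseudo scan code for Unl in Listing One */

/* Feed every simulated keystroke through this. Returns true only when
   Unl has been hit twice in a row; 'armed' holds the toggle state
   (and drives the on-screen highlight). */
static bool unl_pressed(int sc, int *armed)
{
    if (sc == KEY_UNL) {
        if (*armed)
            return true;  /* second consecutive hit: really unload */
        *armed = 1;       /* first hit: arm and highlight          */
    } else {
        *armed = 0;       /* any other key cancels the request     */
    }
    return false;
}
```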
I used JOYKEY to write some of this column. It makes you empathize with the
problems of the handicapped. The first paragraph took forever to write. My
deadline loomed near, so I switched back to the real keyboard. Let someone use
this program for a while and then tell me that they can still grab a
handicapped parking space with a clear conscience. Here in Florida, we have
senior citizens who patrol parking lots. They call the police when they catch
a healthy person parking in one of those spaces. Even if you have the little
wheelchair-icon bumper sticker, you'd better be physically impaired or picking
up an obviously impaired passenger. Otherwise one of these gray-panther
vigilantes is likely to crease your hood with an umbrella.
JOYKEY has command-line parameters that let you control which joystick button
to use, the color of the simulated keyboard, and the reaction of the joystick
device to a user's hand movements. I have a FlightStick joystick, and the
default values of these parameters work well with it. Joysticks are different,
and users' abilities vary, so these parameters exist to compensate. For example,
the FlightStick has two buttons, one on top in the thumb position and the
other in front in the trigger position. Which one you use depends on the
characteristics of your impairment. A joystick movement moves the cursor one
position, delays, and then starts a steady movement across the screen. The
delay makes single-character position moves easier. The command-line options
can modify the delay rate and the speed of the steady cursor movement, as well
as the sensitivity of the joystick, which is a measure of how much movement is
needed before the program decides that you have moved the joystick.
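The sensitivity setting amounts to a dead zone: raw readings within /S units of the calibrated center count as no movement at all. A portable sketch of that per-axis decision (the function name is mine):

```c
#include <assert.h>

/* Direction of stick deflection on one axis: -1, 0, or +1.
   'center' is the value read at startup (the stick's trimmed rest
   position); 'sens' is the /S dead-zone radius. */
static int axis_direction(int raw, int center, int sens)
{
    if (raw < center - sens) return -1;
    if (raw > center + sens) return  1;
    return 0;
}
```

A larger /S value tolerates more hand tremor before the cursor starts moving, at the cost of requiring a more deliberate deflection.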
Table 1 lists the command-line options. You can put them in any sequence and
use upper or lower case.
Table 1: JOYKEY command-line options.

 Command Meaning
 -------------------------------------------------------------------------

 /1 Use the trigger joystick button. The thumb position is the default.

 /Dn Set the number of clock ticks that JOYKEY delays moving the
 cursor after the first move. The n can be 0 to 20. The default value is 5.

 /Sn Set the joystick sensitivity, which is the amount of lateral
 movement before the joystick senses a move; n can be 0 to
 100. The default value is 10.

 /Tn Set the number of clock ticks between cursor movements. This
 parameter makes the cursor move faster or slower in increments
 of about 1/18 second; n can be 0 to 10. The default value is 1.

 /RED
 /GREEN

 /BLUE
 /WHITE Set the color of the JOYKEY window. You can use the first
 letter of the parameter. The default is WHITE.

 /? Display a small help screen to remind you about the
 command-line parameters.



JOYKEY's Code


JOYKEY is a small study in DOS systems programming. The code uses the Borland
C++ low-level extensions to the C language to become a TSR and to access the
hardware. I didn't jump through hoops to make the executable as small as
possible. You can do that, but you have to rewrite the startup assembly
language code and do tricks with the linker. (I wrote about such cybernastics
in the August 1992 "Examining Room.") Valerie is not a power user; she can
spare the 15K it takes and, if not, she can load it high.
The first part of the code to look at is the set of arrays that define the simulated
keyboard. The first two arrays define how the keyboard displays on the screen,
one for the unshifted keys and one for the shifted keys. The other array
defines to the program the key positions within the display and the BIOS scan
code and key value to return for each one. I did not type that table in when I
built it. Instead, I wrote a program that noted each key press and built the
corresponding values into a text file that I later modified with my editor.
The main function processes the command-line arguments and sets up the TSR. It
reads the joystick port and adjusts its settings. Most joysticks have trimming
pots, and the program assumes that whatever values it reads on startup are
relative to the 0/0 center position of the stick. The program uses one of
Andrew Schulman's Undocumented DOS features to get the address of what is
called the inDOS flag. This flag usually indicates that the user is sitting at
the command line when the keyboard BIOS is read. It has other purposes, but
that's a side effect. It lets JOYKEY refuse to unload itself if another
program is running above it. The inDOS flag would not be set in that
situation. Of course, if the other program is using DOS to read the keyboard,
which few reasonable programs do, then this test does not work reliably.
Before becoming resident, the program hooks the timer and BIOS keyboard
interrupt vectors. The timer interrupt supports the program's timed operations
for the joystick cursor speed controls. The keyboard interrupt lets the
program substitute joystick-selected keys for keyboard keys.
When a program calls BIOS to read the keyboard or its status, JOYKEY's hooked
interrupt service routine gets control. If the simulated keyboard is not
currently displayed, the program displays it. If the caller is waiting for a
key rather than testing keyboard status, the program loops while calling the
same interrupt but asking for status. The loop continues until the call
returns an indicator that a key was typed, either at the keyboard or from the
joystick. This call for status gets hooked, too, because it uses the same
interrupt vector. This way, JOYKEY always operates from the hooked interrupt
call that asks for status. Most interactive programs spend most of their time
in a loop waiting for the next keystroke. That loop is where JOYKEY works.
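The blocking "read a key" service can be rebuilt on top of the "test status" service, which is exactly where JOYKEY does its work. A portable model, with a pluggable status probe standing in for the hooked INT 16h call:

```c
#include <assert.h>

/* Status probe: returns nonzero when a key is ready and stores it.
   In JOYKEY this role is played by the hooked INT 16h status service,
   which also repaints the keyboard and polls the joystick every call. */
typedef int (*status_fn)(int *key);

/* BIOS-style blocking read, built from the status probe. Because the
   wait loop calls status repeatedly, the hook gets control
   continuously while an application sits waiting for a keystroke. */
static int read_key(status_fn status)
{
    int key;
    while (!status(&key))
        ;  /* idle loop: the hook runs once per pass */
    return key;
}

/* Demonstration stand-in for the probe: reports "no key" twice, then
   delivers 'a'. Purely illustrative; the real probe is the interrupt. */
static int demo_calls;
static int demo_status(int *key)
{
    if (++demo_calls < 3)
        return 0;
    *key = 0x61;
    return 1;
}
```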
The program displays the simulated keyboard and then remembers that it was
displayed. But the user's actions can change the screen, so, once every loop
cycle the program looks at video memory to see if the keyboard is still there.
If not, it displays it again. Next, it tests to see if the user has pressed
the joystick button. If so, the button-down function tests to see if the
cursor is inside the simulated keyboard display. If not, the function moves
the simulated keyboard to where the cursor is positioned. If the cursor is
inside the keyboard, the function uses that big table to determine which key
was hit. It gets the appropriate scan code and key value from the table and
returns it.
If the user selected the Unl button, the program tests to see if it can unload
itself. If the inDOS flag is set and no other program has hooked the interrupt
vectors away from JOYKEY, it unloads by freeing the memory that DOS has
allocated to it.
Other simulated keystrokes are returned to the original caller of the BIOS
keyboard interrupt, and that's how JOYKEY substitutes the joystick for the
keyboard.


Epilogue


In this column I discussed issues related to the handicapped, or the
physically challenged, or whatever the current socially correct terminology
is. If I slipped up and used a term that is out of favor in your part of the
country, it is not due to my insensitivity to those issues but rather my
inability to keep pace with vogue. If I seem to take a light view of all of
this, it is because I have been personally involved for three decades with a
so-called "challenged" son.
I wish that I had a happy ending for this story. I wish that I could tell you
that Valerie took JOYKEY and wrote her doctoral dissertation or a best-selling
novel or even that she was now happily communicating with her dad. But the
small dramas in real life are not like a TV movie. We can't encapsulate the
whole story in one installment. It takes years and whatever tools a dedicated
person can muster and improvise to fit each unique situation. Success and
progress are measured in small increments stretched out over a lifetime.
They're working on it.


[LISTING ONE]

/* ------------------ joykey.c -------------- */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <dos.h>
#include <ctype.h>
#include <conio.h>

/* ------- the interrupt function registers -------- */
typedef struct {
 int bp,di,si,ds,es,dx,cx,bx,ax,ip,cs,fl;
} IREGS;

#define KEYBOARD 0x16
#define DOS 0x21
#define TIMER 0x1c
#define ZEROFLAG 0x40
#define CARRYBIT 1
#define READKEY 0
#define KEYSTATUS 1

#define TRUE 1
#define FALSE 0
#define HT 5
#define WD 46
#define HL WHITE
#define FG (BG == WHITE ? BLACK : LIGHTGRAY)
#define TRVL 8
#define BUTTONMASK (Use1 ? 0x10 : 0x20)


int LagTime = 5;
int Sensitivity = 10;
int SpeedCtr = 1;
int SpeedTimer = 0;

static void movecursor(void);
static void (interrupt *oldtimer)(void);
static void (interrupt *old16)(void);
static void interrupt newtimer(void);
static void interrupt int16(IREGS);
static int unload(void);
static unsigned highmemory;
static unsigned sizeprogram;
static int up;

static int Use1;
static int vsave[HT * WD];
static int BG = WHITE;

extern unsigned _heaplen = 1;
extern unsigned _stklen = 512;

static char *nk[2][5] = {
 {
 {"F1 F2` 1 2 3 4 5 6 7 8 9 0 - = \\ ĳEsc "},
 {"F3 F4 q w e r t y u i o p [ ] Hom  PUp"},
 {"F5 F6Ctl a s d f g h j k l ; '    \x1a "},
 {"F7 F8Shf z x c v b n m , . / End  PDn"},
 {"F9 F0Alt space Unl Ins Del"}
 },
 {
 {"F1 F2~ ! @ # $ % ^ & * ( ) _ + Esc "},
 {"F3 F4 Q W E R T Y U I O P { } Hom PUp"},
 {"F5 F6Ctl A S D F G H J K L : \"  \x1a "},
 {"F7 F8Shf Z X C V B N M < > ? End PDn"},
 {"F9 F0Alt space Unl Ins Del"},
 }
};

struct kdef {
 int x, y; /* position within window */
 int wd; /* width of screen token */
 struct {
 char sc; /* scan codes */
 char ky; /* key values */
 } sk[4]; /* normal, shift, ctrl, alt */
};
static struct kdef kds[] = {
//x,y,wd Normal Shift Ctrl Alt
//- - -- --------- --------- --------- ---------
{ 0,0,2,{{0x3b,0x00},{0x54,0x00},{0x5e,0x00},{0x68,0x00}}}, // F1
{ 3,0,2,{{0x3c,0x00},{0x55,0x00},{0x5f,0x00},{0x69,0x00}}}, // F2
{ 6,0,1,{{0x29,0x60},{0x29,0x7e},{0x29,0x60},{0x29,0x60}}}, // ~
{ 8,0,1,{{0x02,0x31},{0x02,0x21},{0x02,0x31},{0x78,0x00}}}, // 1
{10,0,1,{{0x03,0x32},{0x03,0x40},{0x03,0x00},{0x79,0x00}}}, // 2
{12,0,1,{{0x04,0x33},{0x04,0x23},{0x04,0x33},{0x7a,0x00}}}, // 3
{14,0,1,{{0x05,0x34},{0x05,0x24},{0x05,0x34},{0x7b,0x00}}}, // 4
{16,0,1,{{0x06,0x35},{0x06,0x25},{0x06,0x35},{0x7c,0x00}}}, // 5

{18,0,1,{{0x07,0x36},{0x07,0x5e},{0x07,0x1e},{0x7d,0x00}}}, // 6
{20,0,1,{{0x08,0x37},{0x08,0x26},{0x08,0x37},{0x7e,0x00}}}, // 7
{22,0,1,{{0x09,0x38},{0x09,0x2a},{0x09,0x38},{0x7f,0x00}}}, // 8
{24,0,1,{{0x0a,0x39},{0x0a,0x28},{0x0a,0x39},{0x80,0x00}}}, // 9
{26,0,1,{{0x0b,0x30},{0x0b,0x29},{0x0b,0x30},{0x81,0x00}}}, // 0
{28,0,1,{{0x0c,0x2d},{0x0c,0x5f},{0x0c,0x1f},{0x82,0x00}}}, // -
{30,0,1,{{0x0d,0x3d},{0x0d,0x2b},{0x0d,0x3d},{0x83,0x00}}}, // =
{32,0,1,{{0x2b,0x5c},{0x2b,0x7c},{0x2b,0x1c},{0x2b,0x5c}}}, // \
{34,0,2,{{0x0e,0x08},{0x0e,0x08},{0x0e,0x7f},{0x0e,0x08}}}, // BS
{37,0,3,{{0x01,0x1b},{0x01,0x1b},{0x01,0x1b},{0x01,0x1b}}}, // Esc
{ 0,1,2,{{0x3d,0x00},{0x56,0x00},{0x60,0x00},{0x6a,0x00}}}, // F3
{ 3,1,2,{{0x3e,0x00},{0x57,0x00},{0x61,0x00},{0x6b,0x00}}}, // F4
{ 6,1,1,{{0x0f,0x09},{0x0f,0x00},{0x0f,0x09},{0x0f,0x09}}}, // Tab
{ 9,1,1,{{0x10,0x71},{0x10,0x51},{0x10,0x11},{0x10,0x00}}}, // q
{11,1,1,{{0x11,0x77},{0x11,0x57},{0x11,0x17},{0x11,0x00}}}, // w
{13,1,1,{{0x12,0x65},{0x12,0x45},{0x12,0x05},{0x12,0x00}}}, // e
{15,1,1,{{0x13,0x72},{0x13,0x52},{0x13,0x12},{0x13,0x00}}}, // r
{17,1,1,{{0x14,0x74},{0x14,0x54},{0x14,0x14},{0x14,0x00}}}, // t
{19,1,1,{{0x15,0x79},{0x15,0x59},{0x15,0x19},{0x15,0x00}}}, // y
{21,1,1,{{0x16,0x75},{0x16,0x55},{0x16,0x15},{0x16,0x00}}}, // u
{23,1,1,{{0x17,0x69},{0x17,0x49},{0x17,0x09},{0x17,0x00}}}, // i
{25,1,1,{{0x18,0x6f},{0x18,0x4f},{0x18,0x0f},{0x18,0x00}}}, // o
{27,1,1,{{0x19,0x70},{0x19,0x50},{0x19,0x10},{0x19,0x00}}}, // p
{29,1,1,{{0x1a,0x5b},{0x1a,0x7b},{0x1a,0x1b},{0x1a,0x5b}}}, // [
{31,1,1,{{0x1b,0x5d},{0x1b,0x7d},{0x1b,0x1d},{0x1b,0x5d}}}, // ]
{37,1,3,{{0x47,0x00},{0x47,0x37},{0x77,0x00},{0x00,0x07}}}, // Home
{41,1,1,{{0x48,0x00},{0x48,0x38},{0x48,0x00},{0x00,0x08}}}, // UP
{43,1,3,{{0x49,0x00},{0x49,0x39},{0x84,0x00},{0x00,0x09}}}, // PgUp
{ 0,2,2,{{0x3f,0x00},{0x58,0x00},{0x62,0x00},{0x6c,0x00}}}, // F5
{ 3,2,2,{{0x40,0x00},{0x59,0x00},{0x63,0x00},{0x6d,0x00}}}, // F6
{ 6,2,3,{{0xff,0x00},{0xff,0x00},{0xff,0x00},{0xff,0x00}}}, // Ctrl
{10,2,1,{{0x1e,0x61},{0x1e,0x41},{0x1e,0x01},{0x1e,0x00}}}, // a
{12,2,1,{{0x1f,0x73},{0x1f,0x53},{0x1f,0x13},{0x1f,0x00}}}, // s
{14,2,1,{{0x20,0x64},{0x20,0x44},{0x20,0x04},{0x20,0x00}}}, // d
{16,2,1,{{0x21,0x66},{0x21,0x46},{0x21,0x06},{0x21,0x00}}}, // f
{18,2,1,{{0x22,0x67},{0x22,0x47},{0x22,0x07},{0x22,0x00}}}, // g
{20,2,1,{{0x23,0x68},{0x23,0x48},{0x23,0x08},{0x23,0x00}}}, // h
{22,2,1,{{0x24,0x6a},{0x24,0x4a},{0x24,0x0a},{0x24,0x00}}}, // j
{24,2,1,{{0x25,0x6b},{0x25,0x4b},{0x25,0x0b},{0x25,0x00}}}, // k
{26,2,1,{{0x26,0x6c},{0x26,0x4c},{0x26,0x0c},{0x26,0x00}}}, // l
{28,2,1,{{0x27,0x3b},{0x27,0x3a},{0x27,0x3b},{0x27,0x3b}}}, // ;
{30,2,1,{{0x28,0x27},{0x28,0x22},{0x28,0x27},{0x28,0x27}}}, // '
{32,2,3,{{0x1c,0x0d},{0x1c,0x0d},{0x1c,0x0a},{0x1c,0x0d}}}, // Enter
{38,2,1,{{0x4b,0x00},{0x4b,0x34},{0x73,0x00},{0x00,0x04}}}, // LF
{44,2,1,{{0x4d,0x00},{0x4d,0x36},{0x74,0x00},{0x00,0x06}}}, // RT
{ 0,3,2,{{0x41,0x00},{0x5a,0x00},{0x64,0x00},{0x6e,0x00}}}, // F7
{ 3,3,2,{{0x42,0x00},{0x5b,0x00},{0x65,0x00},{0x6f,0x00}}}, // F8
{ 6,3,3,{{0xfe,0x00},{0xfe,0x00},{0xfe,0x00},{0xfe,0x00}}}, // Shf
{11,3,1,{{0x2c,0x7a},{0x2c,0x5a},{0x2c,0x1a},{0x2c,0x00}}}, // z
{13,3,1,{{0x2d,0x78},{0x2d,0x58},{0x2d,0x18},{0x2d,0x00}}}, // x
{15,3,1,{{0x2e,0x63},{0x2e,0x43},{0x2e,0x03},{0x2e,0x00}}}, // c
{17,3,1,{{0x2f,0x76},{0x2f,0x56},{0x2f,0x16},{0x2f,0x00}}}, // v
{19,3,1,{{0x30,0x62},{0x30,0x42},{0x30,0x02},{0x30,0x00}}}, // b
{21,3,1,{{0x31,0x6e},{0x31,0x4e},{0x31,0x0e},{0x31,0x00}}}, // n
{23,3,1,{{0x32,0x6d},{0x32,0x4d},{0x32,0x0d},{0x32,0x00}}}, // m
{25,3,1,{{0x33,0x2c},{0x33,0x3c},{0x33,0x2c},{0x33,0x2c}}}, // ,
{27,3,1,{{0x34,0x2e},{0x34,0x3e},{0x34,0x2e},{0x34,0x2e}}}, // .
{29,3,1,{{0x35,0x2f},{0x35,0x3f},{0x35,0x2f},{0x35,0x2f}}}, // /
{37,3,3,{{0x4f,0x00},{0x4f,0x31},{0x75,0x00},{0x00,0x01}}}, // End

{41,3,1,{{0x50,0x00},{0x50,0x32},{0x50,0x00},{0x00,0x02}}}, // DN
{43,3,3,{{0x51,0x00},{0x51,0x33},{0x76,0x00},{0x00,0x03}}}, // PgDn
{ 0,4,2,{{0x43,0x00},{0x5c,0x00},{0x66,0x00},{0x70,0x00}}}, // F9
{ 3,4,2,{{0x44,0x00},{0x5d,0x00},{0x67,0x00},{0x71,0x00}}}, // F10
{ 6,4,3,{{0xfd,0x00},{0xfd,0x00},{0xfd,0x00},{0xfd,0x00}}}, // Alt
{17,4,5,{{0x39,0x20},{0x39,0x20},{0x39,0x20},{0x39,0x20}}}, // space
{32,4,3,{{0xfc,0x00},{0xfc,0x00},{0xfc,0x00},{0xfc,0x00}}}, // Unl
{37,4,3,{{0x52,0x00},{0x52,0x30},{0x52,0x00},{0x52,0x00}}}, // Ins
{43,4,3,{{0x53,0x00},{0x53,0x2e},{0x53,0x00},{0x53,0x00}}}, // Del
{0,0,0,0,0,0,0,0,0,0,0}
};
static int Shift;
static int Ctrl;
static int Alt;
static int Unl;
static int sx = 1;
static int sy = 1;
static int mx = 40;
static int my = 12;
static int nomx, nomy;
static int ff;
static struct kdef *kd;
static char far *inDOS;
/* ------- write a string to video ---------- */
static void wputs(char *str, int x, int y)
{
 int wx = wherex();
 int wy = wherey();
 gotoxy(x+1, y+1);
 cputs(str);
 gotoxy(wx, wy);
}
/* ------ position the joystick cursor ----- */
static void Cursor(int x, int y)
{
 static int mchr;
 gettext(x+1, y+1, x+1, y+1, &mchr);
 mchr ^= 0x7700;
 puttext(x+1, y+1, x+1, y+1, &mchr);
}
/* ---- display the simulated keyboard ----- */
static void DisplayKB(void)
{
 int y;
 textbackground(BG);
 textcolor(FG);
 gettext(sx+1,sy+1,sx+WD,sy+HT,vsave);
 for (y = 0; y < HT; y++)
 wputs(nk[Shift][y], sx, sy+y);
 textcolor(HL);
 if (Ctrl)
 wputs("Ctl", sx+6, sy+2);
 if (Shift)
 wputs("Shf", sx+6, sy+3);
 if (Alt)
 wputs("Alt", sx+6, sy+4);
 if (Unl)
 wputs("Unl", sx+32, sy+4);
}

#define DoParm(p, mx) min(atoi(p),mx)
void main(int argc, char *argv[])
{
 int ch;
 /* --- process the command line parameters --- */
 while (argc > 1) {
 --argc;
 argv++;
 if (**argv == '/') {
 (*argv)++;
 ch = toupper(**argv);
 (*argv)++;
 switch (ch) {
 case 'D':
 LagTime = DoParm(*argv, 20);
 break;
 case 'S':
 Sensitivity = DoParm(*argv, 100);
 break;
 case 'T':
 SpeedCtr = DoParm(*argv, 10);
 break;
 case '1':
 Use1 = TRUE;
 break;
 case 'R':
 BG = RED;
 break;
 case 'G':
 BG = GREEN;
 break;
 case 'B':
 BG = BLUE;
 break;
 case 'W':
 BG = WHITE;
 break;
 default:
 puts("Usage: JoyKey /<switch>");
 puts(" /1 = Use other button");
 puts(" /RED, /GREEN, /BLUE, /WHITE");
 puts(" /Dn = delay (n = 0 to 20)");
 puts(" /Sn = sensitivity (n = 0 to 100)");
 puts(" /Tn = ticks (n = 0 to 10)");
 return;
 }
 }
 }
 /* -------- test and center the joystick ------ */
 _AH = 0x84;
 _DX = 1;
 geninterrupt(0x15);
 nomx = _AX;
 nomy = _BX;
 if (nomx && nomy) {
 /* --- get address of InDOS flag --- */
 unsigned seg, off;
 _AH = 0x34;
 geninterrupt(DOS);

 seg = _ES;
 off = _BX;
 inDOS = MK_FP(seg, off);
 /* ----- attach interrupt vectors ------ */
 old16 = getvect(KEYBOARD);
 oldtimer = getvect(TIMER);
 setvect(KEYBOARD, int16);
 setvect(TIMER, newtimer);
 /* ------ compute program size ------- */
 highmemory = _SS + ((_SP+8) / 16);
 sizeprogram = highmemory - _psp + 1;
 /* ----- terminate and stay resident ------- */
 keep(0, sizeprogram);
 }
}
/* ----- timer interrupt service routine ------- */
static void interrupt newtimer(void)
{
 if (SpeedTimer > 0)
 --SpeedTimer;
 (*oldtimer)();
}
/* -- adjust js vectors by nominal values and sensitivity -- */
static int adjust(int v, int nomv, int mv, int maxtr)
{
 if (mv < maxtr)
 if (v > nomv + Sensitivity)
 return mv+1;
 if (mv > 0)
 if (v < nomv - Sensitivity)
 return mv-1;
 return mv;
}
/* --- move the joystick cursor based on joystick motion --- */
static void movecursor(void)
{
 int x, y;
 if (SpeedTimer == 0 && up) {
 _AH = 0x84;
 _DX = 1;
 geninterrupt(0x15);
 x = _AX;
 y = _BX;
 x = adjust(x, nomx, mx, 79);
 y = adjust(y, nomy, my, 24);
 if (x != mx || y != my) {
 if (ff == LagTime || ff == 0) {
 Cursor(mx, my);
 Cursor(x, y);
 mx = x;
 my = y;
 }
 if (ff > 0)
 --ff;
 }
 else
 ff = LagTime;
 SpeedTimer = SpeedCtr;
 }

}
/* ---- the joystick button was pressed ---- */
static int buttondown(void)
{
 int newkey = 0;
 Cursor(mx, my);
 puttext(sx+1,sy+1,sx+WD,sy+HT,vsave);
 if (mx < sx || mx > sx+WD-1 || my < sy || my > sy+HT-1) {
 /* ---- hit outside of the window ----- */
 sx = mx;
 sy = my;
 if (sx > 79-WD) /* keep within the screen area */
 sx = 79-WD;
 if (sy > 25-HT)
 sy = 25-HT;
 Unl = 0;
 }
 else {
 /* ---------- hit inside the window ----------- */
 kd = kds;
 /* --- see if a key was hit ---- */
 while (kd->wd != 0) {
 if (my == kd->y+sy)
 if (mx >= kd->x+sx && mx <= kd->x+sx+kd->wd-1)
 break;
 kd++;
 }
 if (kd->wd != 0) {
 /* ---- hit a key in buttondown ---- */
 int i = 0, scan;
 if (Shift)
 i = 1;
 else if (Ctrl)
 i = 2;
 else if (Alt)
 i = 3;
 scan = kd->sk[i].sc & 255; /* get scan code */
 switch (scan) {
 case 0xfc: /* Unl */
 Unl++;
 break;
 case 0xfd: /* Alt */
 Alt ^= TRUE;
 break;
 case 0xfe: /* Shift */
 Shift ^= TRUE;
 break;
 case 0xff:
 Ctrl^=TRUE; /* Ctrl */
 break;
 default:
 /* --- get key mask --- */
 newkey = (scan << 8) | (kd->sk[i].ky & 255);
 Unl = 0;
 break;
 }
 }
 else
 Unl = 0;

 }
 DisplayKB();
 Cursor(mx, my);
 return newkey;
}
/* ----- Keyboard BIOS ISR ------- */
static void interrupt int16(IREGS ir)
{
 int sw, i, j;
 int func = (ir.ax >> 8) & 0xff;
 static unsigned int newkey = 0;
 static int buttonup = FALSE;
 static int sample[WD];
 char *cp;
 int *ip;

 /* --- display the keyboard if it is not up --- */
 if (!up) {
 DisplayKB();
 Cursor(mx, my);
 up = TRUE;
 }
 /* --- for read key bios call, loop until key pressed --- */
 if (func == READKEY) {
 int flg = ZEROFLAG;
 while (flg & ZEROFLAG) {
 _AH = KEYSTATUS;
 geninterrupt(KEYBOARD); /* this will call myself */
 flg = _FLAGS;
 }
 }
 /* ---- test to see if keyboard is really displayed ----- */
 gettext(sx+1,sy+1,sx+WD,sy+1,sample);
 cp = nk[Shift][0];
 ip = sample;
 for (i = 0; i < WD; i++)
 if (((*cp++) & 255) != ((*ip++) & 255))
 break;
 if (i < WD) {
 /* ---- keyboard is not on the screen ---- */
 for (j = 0; j < sy; j++) {
 /* ----- test to see if screen has scrolled ---- */
 gettext(sx+1,j+1,sx+WD,j+1,sample);
 cp = nk[Shift][0];
 ip = sample;
 for (i = 0; i < WD; i++)
 if (((*cp++) & 255) != ((*ip++) & 255))
 break;
 if (i == WD) {
 /* --- restore the video buffer --- */
 Cursor(mx, my-(sy-j));
 puttext(sx+1,j+1,sx+WD,j+HT,vsave);
 break;
 }
 }
 if (i < WD)
 Cursor(mx, my);
 DisplayKB();
 Cursor(mx, my);

 }
 /* ----- Test joystick Button ---- */
 _AH = 0x84;
 _DX = 0;
 geninterrupt(0x15);
 sw = _AL;
 if ((sw & BUTTONMASK) == 0) {
 /* ---- button is down ---- */
 if (buttonup) {
 newkey = buttondown();
 buttonup = FALSE;
 }
 }
 else
 buttonup = TRUE;
 if (Unl > 1) {
 /* --- the Unl simulated function key --- */
 if (up) {
 puttext(sx+1,sy+1,sx+WD,sy+HT,vsave);
 up = FALSE;
 }
 if (unload()) {
 _AX = ir.ax; /* pass into keyboard BIOS */
 geninterrupt(KEYBOARD);
 ir.ax = _AX;
 ir.fl = _FLAGS;
 }
 Unl = 0;
 return;
 }
 if (func != READKEY)
 movecursor();
 if (newkey) {
 /* --- user selected a keystroke with joystick --- */
 ir.ax = newkey;
 if (func == READKEY)
 newkey = 0;
 else
 ir.fl &= ~ZEROFLAG;
 }
 else {
 _AX = ir.ax;
 (*old16)();
 ir.ax = _AX;
 ir.fl = _FLAGS;
 }
 if (func == READKEY && up) {
 int ch = ir.ax & 0xff;
 int sc = ir.ax & 0xff00;
 if (sc == 0 || ch < ' ' || ch > 127) {
 Cursor(mx, my);
 puttext(sx+1,sy+1,sx+WD,sy+HT,vsave);
 up = FALSE;
 }
 }
}
/* --------- unload the resident program --------- */
static int unload(void)
{

 if (getvect(KEYBOARD) == int16) {
 if (getvect(TIMER) == newtimer) {
 if (*inDOS) {
 unsigned far *env = MK_FP(_psp, 0x2c);
 /* --- restore interrupt vectors --- */
 setvect(KEYBOARD, old16);
 setvect(TIMER, oldtimer);
 /* ---- free memory ---- */
 freemem(*env);
 freemem(_psp);
 return TRUE;
 }
 }
 }
 return FALSE;
}














































June, 1993
ALGORITHM ALLEY


Telephonic Mnemonics and the Chocolate Coefficient


 This article contains the following executables: ALLEY.ARC


Tom Swan


Flying with a computer is no fun anymore. I remember when just whipping out my
laptop at 30,000 feet would cause all sorts of excitement. Now nobody
notices--except, that is, when I playfully bring up an air-traffic control
simulation, which almost guarantees a double take from the flight crew.
Not long ago, I was on an especially tedious flight--one of those that takes
off and lands so many times you get the idea the pilot can't decide whether to
fly the plane or drive it all the way to LA. The joker next to me wanted to
talk; I didn't, so I buried my nose in the "complimentary magazine that I'm
welcome to take with me."
That's when I saw an advertisement that led to this column. It listed a
company's mnemonic telephone number, something like 1-800-555-BOAT. The
mnemonic, of course, is supposed to help you remember the number by
association.
Maybe that's true, but I always find these telephonic mnemonics hard to dial.
So, having little else to do while the pilot perfected his landing technique,
I decided to set up my laptop and write a program to generate the number for a
telephone "name." But why stop there? I figured I could also write a program
to output all possible names for any given number. The problem to solve was:
What are all the possible words formed by a given telephone number? The
solution took me back to my early programming days when I spent hours poking
around with short but interesting algorithms known as permutations.


Permuting Computing


Permutation algorithms almost always use recursion, not because it's required
(it never is), but because without it the algorithms are more difficult to
understand. Example 1, Algorithm #2, shows one of the simplest permutation
algorithms.
Example 1: Pseudocode for Algorithm #2 (Permutation One).

 procedure Permute(n: Integer);
 begin
 if n = 1 then Show array else
 begin
 Permute(n-1);
 for i <-- 1 to n-1 do
 begin
 Swap a[i] and a[n];
 Permute(n-1);
 Swap a[i] and a[n];
 end;
 end;
 end;

I found Algorithm #2 buried in an illustration in Niklaus Wirth's book,
Algorithms+Data Structures=Programs. (Wirth doesn't explain the method.) The
technique uses an array a of n integer values, initialized as a[1]=1,
a[2]=2,..., a[n]=n. Passing n to Permute calls another procedure, Show array,
for every possible permutation of the n values.
The algorithm works by making two recursive calls--unusual in recursive
algorithms, which typically call themselves only once per recursion. On any
level, if n equals 1, the procedure has reached the first array position and
it displays the current permutation. Otherwise, a For loop works in the
opposite direction, swapping pairs of values from 1 to n-1, and again calling
Permute recursively for each iteration. The process resembles the way most
people would permute an array of values manually.
Listing One (page 168) implements Algorithm #2 in Pascal. For a better
understanding of how the algorithm works, set MAX to 3 and single-step the
program in a debugger.
Example 2, Algorithm #3, shows a permutation method from Robert Sedgewick's
book, Algorithms in C++ (Addison-Wesley, 1992). The original algorithm uses an
input array defined as array[0 .. N]. My slightly modified version works with
an array[1 .. N]. Listing Two (page 168) implements Algorithm #3. The method
requires the input array to be initialized to all 0s. Essentially, the
algorithm uses 0 as a flag that indicates more work to be done on each level
of the recursion. Passing 1 to Permute causes the procedure to "visit" values
from 1 to n as though they were nodes in a graph. In this sense, Algorithm #3
generates permutations by searching all paths to all nodes.
Example 2: Pseudocode for Algorithm #3 (Permutation Two).

 procedure Permute(n: Integer);
 begin
 pos <-- pos + 1;
 a[n] <-- pos;
 if pos = MAX then Show array;
 if pos <> MAX then
 for i <-- 1 to MAX do
 if a[i] = 0 then Permute(i);
 pos <-- pos-1;
 a[n] <-- 0;
 end;

The disadvantages of Algorithm #3 are the use of a unique flag value and the
requirement that array values be integers. Algorithm #2 can permute any data,
not only integers. The main advantage of Algorithm #3 is the single recursive
call, which probably reduces stack use over Algorithm #2, although I haven't
tested that assumption. So, what's all this got to do with telephones and
chocolate? Patience, patience.



The Chocolate Coefficient


The permutations produced by Algorithms #2 and #3 solve the relatively simple
problem of how to arrange a sequence in all possible ways. The count of these
arrangements equals the factorial of the number of items. There are 24 ways to
arrange the four-element sequence ABCD, and of course, the factorial of 4 is
24.
More complex permutations, such as telephone "names," require different
techniques. In his delightful book, Innumeracy (Vintage, 1988), John Allen
Paulos illustrates a similar problem with an ice-cream shop. Given 28 flavors,
how many triple-decker ice-cream cones can the shop claim to offer?
The simple answer equals the number of possible ways to arrange 28 items in
groups of threes, or 28*27*26. If you don't understand this, consider how many
pairs of letters there are in the four-letter sequence ABCD. Disallowing
duplicate pairs such as AA and BB, there are 12 (4*3) groups of two letters:
AB, AC, AD, BA, BC, BD, CA, CB, CD, DA, DB, and DC. Likewise, there are 24
(4*3*2) combinations of three letters. If we could have 28 letters from which
to choose, we could make 19,656 groups of three, or 28*27*26.
But wait a sec. If A is chocolate, B is vanilla, and C is strawberry, is it
proper to consider cone ABC to be different from CBA? Probably not, unless
you're the fussy type who insists on always having chocolate on top. What we
really want is the number of unique ways to arrange 28 flavors in groups of
three.
Deriving this value takes additional work. Given any set of three flavors A,
B, and C, there are six possible permutations: ABC, ACB, BAC, BCA, CAB, and
CBA. As just mentioned, this count equals the factorial of the number of items
(6 is the factorial of 3). As ice-cream cones, all six combinations are
equivalent, so the number of unique triple-deckers you can make from 28
flavors equals the number of nonunique combinations (19,656) divided by the
factorial of 3, or 3276. This final value is called the "combinatorial
coefficient." (If I ever open an ice-cream shop, I'll name it The Chocolate
Coefficient. The first customer to guess the meaning of the name eats free.)
Example 3, Algorithm #4, and Listing Three (page 168) put these ideas to the
test. Run the program and enter 3, then 28 to compute the number of nonunique
and unique combinations of three items selected from a set of 28.
Example 3: Pseudocode for Algorithm #4 (Combinatorial Coefficient).

 answer <-- elements;
 for i <-- 1 to selections-1 do
 begin
 elements <-- elements-1;
 answer <-- answer * elements;
 end;
 Write("Nonunique combinations = ",
 answer);
 answer <-- answer / Factorial(selections);
 Writeln("Unique combinations = ", answer);



Telephonic Mnemonics


Unfortunately, none of the methods I've explained so far exactly solves the
problem of generating all possible names for a given telephone number. In this
case, we need a permutation technique that restricts letter groups to
positions dictated by a telephone number. (Back at the ice-cream shop, you'd
have a similar restriction if, say, you permitted the middle scoop to be only
chocolate, vanilla, or strawberry.)
And there's another complication. Most, but not all, telephone digits are
associated with three letters. The digit 2 is assigned ABC; 3 is given DEF,
and so on. Digits 0 and 1 have no letters. Letters Q and Z are missing
altogether. AT&T adopted this configuration in 1918 as a means of identifying
the burgeoning list of directories. Digit 0 was reserved for quick access to
the operator; digit 1, to signal another directory. Letters Q and Z, rarely
used in directory names, were deleted to make the remaining 24 letters evenly
divisible by the remaining eight numbers.
To keep the program simple, I pretend that digits 0 and 1 are represented by
three spaces each. To associate letters and digits, I use an array of
three-character strings indexed by telephone digits 0 to 9. Given that
structure, it's easy to display the number for any telephone "name." Listing
Four (page 168) shows the result. There's no corresponding algorithm because
the program merely substitutes numbers for letters.
It's not as simple, however, to display all possible names for a given number.
Example 4, Algorithm #5, shows the algorithm I used to solve this problem. As
in the other permutation methods, this one relies on recursion to arrange a
given set of letters in an output string.
Example 4: Pseudocode for Algorithm #5: Telephonic Mnemonic.

 procedure Permute(n: Integer);
 begin
 digit <-- ValueOfChar(inString[n]);
 for i <-- 1 to 3 do
 begin
 outString[n] <-- TelDial[digit][i];
 if (n = Length(inString)) then
 Write(outString)
 else
 Permute(n + 1);
 end;
 end;

Algorithm #5 uses an array TelDial of three-character strings. The expression
TelDial[3] equals the string DEF. Treating this array as having two
dimensions, a second subscript identifies a single letter. Thus TelDial[3][2]
is the second letter in the string associated with the telephone digit 3--in
this case, the letter E.
The method uses a For loop to set digit n in an output string to each of the
possible letters for a given telephone digit. The procedure calls itself
recursively for the next digit n+1, until reaching the end of the input
string, at which point the code outputs the current permutation.
Listing Five (page 168) implements Algorithm #5 in a program that displays all
possible names for any number. Enter your number or another to see what it can
spell. (Telephone numbers with no 0s or 1s give the best results.)
The number of possible names for a single telephone number is not very large.
Because there are only three letters per digit, a seven-digit number has only
3^7 or 2187 unique names (ignoring the problem of spaces for digits 0 and 1).
A four-digit number generates only 3^4 or 81 names.
The total number of names for all possible numbers, however, is much more
interesting. Using the formula for a combinatorial coefficient (hold the
chocolate), all possible seven-digit telephone numbers have a whopping 1.7
billion permutations--that is, nonunique, seven-letter combinations of 24
letters. Adding a three-digit area code to the mix takes the total skyward
toward the half-trillion mark. If there is a word that's not in there
somewhere, I probably can't pronounce it anyway.


Your Turn


After entering your name or another phrase into Listing Four, try feeding the
resulting number back into Listing Five to find all the other names your
number-name can spell. For example, one of my telephonic-mnemonic aliases is
VON-RYAN, which has a nice ring to it, don't you think? Perhaps there's some
cryptographic value in these permutations, but for heaven's sake, don't use
telephone number-names for online passwords.
Oh, the lengths I'll go to fight boredom at high altitudes.


_ALGORITHM ALLEY_
by Tom Swan


[LISTING ONE]

{ perm1.pas -- Algorithm #2: Permutation One by Tom Swan }
program Perm1;

const
 MAX = 4;

var
 i: Integer;
 a: array[1 .. MAX] of Integer;

{ Display contents of global array a }
procedure ShowArray;
var
 i: Integer;
begin
 for i := 1 to MAX do
 Write(a[i]:3);
 Writeln
end;

{ Arrange global array a in all possible ways }
procedure Permute(n: Integer);
var
 i, temp: Integer;
begin
 if n = 1 then ShowArray else
 begin
 Permute(n - 1);
 for i := 1 to n - 1 do
 begin
 temp := a[i]; { Swap a[i] and a[n] }
 a[i] := a[n];
 a[n] := temp;
 Permute(n - 1);
 temp := a[i]; { Restore a[i] and a[n] }
 a[i] := a[n];
 a[n] := temp
 end
 end
end;

begin
 for i := 1 to MAX do
 a[i] := i;
 Permute(MAX)
end.







[LISTING TWO]

{ perm2.pas -- Algorithm #3: Permutation Two by Tom Swan }
program Perm2;

const
 MAX = 4;

var
 pos: Integer;
 i: Integer;
 a: array[1 .. MAX] of Integer;

{ Display contents of global array a }
procedure ShowArray;
var
 i: Integer;
begin
 for i := 1 to MAX do
 Write(a[i]:3);
 Writeln
end;

{ Arrange global array a in all possible ways }
procedure Permute(n: Integer);
var
 i: Integer;
begin
 pos := pos + 1;
 a[n] := pos;
 if pos = MAX then ShowArray;
 if pos <> MAX then { Optional }
 for i := 1 to MAX do
 if a[i] = 0 then Permute(i);
 pos := pos - 1;
 a[n] := 0
end;

begin
 pos := -1;
 for i := 1 to MAX do
 a[i] := 0;
 Permute(1)
end.







[LISTING THREE]

{ coco.pas -- Algorithm #4: Combinatorial Coefficient by Tom Swan }
program Coco;

var
 i: Integer;
 selections: Integer;

 elements: Integer;
 answer: Real;

function Factorial(n: Integer): Real;
var
 result: Real;
begin
 result := 1;
 while (n > 1) do
 begin
 result := result * n;
 n := n - 1
 end;
 Factorial := result
end;

begin
 Write('Number of selections? ');
 Readln(selections);
 Write('Out of how many elements? ');
 Readln(elements);
 answer := elements;
 for i := 1 to selections - 1 do
 begin
 elements := elements - 1;
 answer := answer * elements
 end;
 Writeln('Nonunique combinations = ', answer:0:0);
 answer := answer / Factorial(selections);
 Writeln('Unique combinations = ', answer:0:0)
end.







[LISTING FOUR]

{ telname.pas -- Display number for telephone "name" by Tom Swan }

program TelName;

var
 i: Integer;
 TelNumber: String;
 TelDial: array[0 .. 9] of String[3];
 LetterSet: set of Char;

{ Return telephone digit that corresponds to a letter C. }
function DigitToLetter(C: Char): Char;
var
 i, j: Integer;
begin
 C := Upcase(C);
 for i := 0 to 9 do
 for j := 1 to 3 do
 if (C = TelDial[i][j]) then

 begin
 DigitToLetter := Chr(i + ord('0'));
 Exit
 end;
 DigitToLetter := C { Default }
end;

begin
 TelDial[0] := '   '; TelDial[1] := '   ';
 TelDial[2] := 'ABC'; TelDial[3] := 'DEF';
 TelDial[4] := 'GHI'; TelDial[5] := 'JKL';
 TelDial[6] := 'MNO'; TelDial[7] := 'PRS';
 TelDial[8] := 'TUV'; TelDial[9] := 'WXY';
 LetterSet := ['A' .. 'P', 'R' .. 'Y'];
 Write('Enter telephone name: ');
 Readln(TelNumber);
 for i := 1 to length(TelNumber) do
 Write(DigitToLetter(TelNumber[i]));
 Writeln
end.





[LISTING FIVE]

{ telnum.pas -- Algorithm #5: Telephonic Mnemonic by Tom Swan }

program TelNum;

var
 TelNumber: String;
 TelDial: array[0 .. 9] of String[3];
 inString: String;
 outString: String;

function ValueOfChar(c: Char): Integer;
begin
 ValueOfChar := Ord(c) - Ord('0')
end;

procedure Permute(n: Integer);
var
 i, digit: Integer;
begin
 digit := ValueOfChar(inString[n]);
 for i := 1 to 3 do
 begin
 outString[n] := TelDial[digit][i];
 if (n = Length(inString)) then
 Writeln(outString)
 else
 Permute(n + 1)
 end
end;

begin
 TelDial[0] := '   '; TelDial[1] := '   ';

 TelDial[2] := 'ABC'; TelDial[3] := 'DEF';
 TelDial[4] := 'GHI'; TelDial[5] := 'JKL';
 TelDial[6] := 'MNO'; TelDial[7] := 'PRS';
 TelDial[8] := 'TUV'; TelDial[9] := 'WXY';
 Write('Telephone number? ');
 Readln(inString);
 if (Length(inString) > 0) then
 begin
 outString := inString;
 Permute(1)
 end
end.
















































June, 1993
UNDOCUMENTED CORNER


Spying on WinHelp




Ron Burk


Ron is the editor of Windows/DOS Developer's Journal and is working on a book
entitled WinHelp for Programmers and Technical Writers. You can contact him
via CompuServe (70302,2566), BIX (rlburk), or Internet (ronb@rdpub.com).


Online help? What could possibly be undocumented about that? In Windows 3.1,
Microsoft's WinHelp is a potentially marvelous facility that lets you attach
macros, and any routine in a dynamic link library (DLL), to WinHelp menu
items, buttons, and hotspots. For example, when a user clicks on a portion of
a graphic in a Windows .HLP file, it can automatically invoke a function
you've placed in a DLL.
The documentation for WinHelp appears in the Windows 3.1 SDK Programmer's
Reference, Volume 4: Resources, Chapter 15 ("Windows Help Statements and
Macros"); the WinHelp() function is documented in Volume 2: Functions. This
documentation explains the basics of creating new macros from DLL functions
with RegisterRoutine(), then attaching macros to buttons with
ChangeButtonBinding(), attaching macros to menu items with
ChangeItemBinding(), and so on. Pretty cool stuff for what most of us think of
as just a simple online help facility. I've seen one company (FS Forth-Systeme
in Breisach am Rhein, Germany) that's put together entire applications using
just .HLP files and DLLs. No executable file!
Such power should make it possible to provide exciting help for Windows
applications. But anyone who's ever tried to put together such sophisticated
online documentation for a Windows program has probably run into some annoying
gaps in Microsoft's own documentation for WinHelp. For example, how do you let
a user print or copy multiple topics at the same time? How do you declare DLL
routines with return values? How does the WinHelp() function connect up with
WINHELP.EXE, anyway?
This month's "Undocumented Corner" comes to us from Ron Burk, editor of the
superb magazine, Windows/DOS Developer's Journal. If you've been reading
W/DDJ, you know that Ron has in recent issues been putting together several
big pieces of the WinHelp puzzle. The February 1993 W/DDJ has Ron's article
"Automating Help Topic Extraction," which included a brief sidebar entitled
"Undocumented WinHelp." The saga continued in the March 1993 W/DDJ, where Ron
discussed "Automatic Help Topic Printing" (how to create help files that can
print multiple topics: sounds simple, but it isn't). The source code from this
second article is on CompuServe, in HLPPRN.ZIP in Library 7 (R&D Publications)
of CLMFORUM. According to Ron, it is a fairly popular download item, since the
question of how to print multiple topics comes up about once a week in the
CompuServe WINSDK forum.
Dr. Dobb's readers are fortunate to have Ron contribute this month's
"Undocumented Corner," where he uncovers yet more WinHelp undocumentedness.
One important piece of information here is that Microsoft's multimedia
VIEWER.EXE provides essentially the same interface as WINHELP.EXE. Once you
know this one little fact, you can use the better (albeit difficult-to-find)
Viewer documentation to supplement the WinHelp documentation.
Ron also shows that WinHelp() communicates with WINHELP.EXE via a WM_WINHELP
message (VIEWER.EXE uses a WM_WINDOC message). You won't find these messages
listed in WINDOWS.H, or even in the chapter of Undocumented Windows devoted to
messages. Why? Because these are registered window messages. Windows provides
the capability of defining new system-wide messages: You pass
RegisterWindowMessage() a string (such as "WM_WINHELP"), and it returns a
message number. If someone has already registered that string, you get back
the same message number. Hmm, sounds like a hash table. In fact, it is a hash
table: Registered window messages are just atoms; you can use the ATOMWALK
program from Undocumented Windows to find the WM_WINHELP atom in the USER's
atom table. This means that registered window messages can be turned back into
their original strings using the GetAtomName() function. As noted later,
Borland's WinSight debugging tool takes advantage of this undocumented
"implementation detail."
At the end of his article, Ron notes the crying need for someone to figure out
the format of .HLP and .MVB files. We can only imagine the quality and
quantity of third-party help tools we would have today if Microsoft would
provide this information--or if some kind soul out there would
reverse-engineer it. If you've figured out this or any other interesting
aspect of Windows, DOS, NetWare, or whatever, please write to me via
CompuServe (76320,302) or Internet (andrew@pharlap.com).
--Andrew Schulman
Creating good online help for Windows applications is one of those things
that often seems to fall through the cracks. Studying Microsoft's help compiler
documentation is not a high priority for most programmers, and most technical
writers cannot create quality help systems without some programming support.
You can see the results on the Windows desktop--a handful of major
applications supply context-sensitive online help complete with conceptual
information, while many applications supply nothing more than a few lines of
help text for each menu item; some applications offer essentially no online
help at all. Some applications offer online help that sparkles with color and
special effects, while most online help is black and white with only a few
hypertext jumps to break up the monotony.
Software alone can't help you create quality online help, but software tools
can make the task of creating online help less tedious for both programmers
and writers. This article covers a bit of the history of the Windows help
system and then shows how the undocumented WM_WINHELP message makes it easy to
create a program to help debug an application's online help.


WinHelp History


Part of the apathy programmers exhibit toward online Windows help is no doubt
due to the help system itself. Before Windows 3.1, Microsoft's scheme for
online help was mechanical and allowed little room for programming creativity.
With Windows 3.1, however, the help system blossomed and offered a new degree
of programmability and extensibility. Windows 3.1 online help provides a
simple set of macros that let you connect built-in actions (such as printing
the current topic) to buttons, menu items, accelerator keys, and text or
graphical hot spots. More importantly, Windows 3.1 lets you declare external
DLL routines to augment the built-in macros, much as Visual Basic and
WinWord's Word Basic provide access to external DLL functions.
Once you start exploring these new features, however, you quickly run into
limitations of the help system and its documentation. For example, it would be
handy if your help file could make decisions at run time, based on the return
value of an external DLL function. Unfortunately, Microsoft's Windows 3.1 SDK
documentation does not reveal how to do this. In fact, the Windows 3.1 help
system has a great deal of functionality that is either only hinted at or not
mentioned at all in the documentation Microsoft supplied to SDK programmers.
Instead, Microsoft revealed these capabilities in, of all places, the
Multimedia Development Kit (MDK).
The MDK manual that documents the Multimedia Viewer (which is used in
Microsoft CD-ROM titles such as Cinemania) also documents, almost
inadvertently, a variety of new functions in WINHELP.EXE. I say
"inadvertently" because I found no place in the Viewer documentation that
mentions that these features work the same in WINHELP.EXE as they do in
VIEWER.EXE. However, it seems obvious that the Multimedia Viewer is
constructed as an extension to the Windows 3.1 help system. The Viewer
compiler adds some information not found in a normal help file (mainly, an
index of all the words in your help text) and Viewer itself is a compatible
replacement for WINHELP.EXE that provides an enhanced user interface. In fact,
you can take almost any .HLP file, rename it to a .MVB file (the file
extension that Viewer files use), and Viewer will display it with no problems!
The few programmers who purchased the MDK and had any interest in online help
discovered that you can have WINHELP.EXE notify your DLL of a variety of
useful events (such as when the user jumps to a different help topic or a
different help file). This documentation also revealed that you can obtain a
set of pointers to useful functions within WINHELP.EXE. The SDK documentation
hints that you can add arbitrary files to your Windows help file (by listing
them in the [BAGGAGE] section), but only the MDK's Viewer manual exposed the
gory details of how you can use WINHELP.EXE's internal functions to perform
I/O to the help file's internal file system and actually read the baggage
files you placed there when you compiled the help file. You can also create
custom "embedded windows" in your help topics, within which you can perform
animation, display 256-color bitmaps, or do anything else you can think of
that requires a custom window.
All in all, the Multimedia Viewer manual is a goldmine of programming
documentation for creating snazzy online help with WINHELP.EXE. The
undocumented key to tapping this goldmine is the fact that WINHELP.EXE
supports many of the same features as VIEWER.EXE.
At one time you really had to buy the MDK to uncover these Windows help-system
features: Microsoft Press published all the manuals in the MDK as standalone
books--except for the Multimedia Viewer manual. The street price of the MDK is
over $400.00, but you can now acquire this documentation less expensively. The
Microsoft Developer's Network (MSDN) CD-ROM has a Help Authoring Guide that
contains much of the information that first appeared in the Multimedia Viewer
Developer's Guide, with WinHelp examples substituted for Viewer examples.
(Incidentally, the MSDN CD-ROM itself is built using the Viewer.)
Alternatively, if you have access to CompuServe, you can download the Help
Authoring Guide (HAG.HLP) from library 6 ("Unsupported Tools") of the MSDNLIB
forum.


WinHelp()


The Windows API provides a single function, WinHelp(), for managing your
application's online help. WinHelp() starts up the help engine (WINHELP.EXE),
if it isn't already running, and uses it to display topics in one or more help
files. This function marks the dividing line between the programmer's job and
the technical writer's job. Everything on the calling side (the application
and the parameters it passes to WinHelp()) is the programmer's responsibility,
while everything on the callee side (principally, what help text gets
displayed) is the technical writer's responsibility. As the arbitrator between
programmers and writers, it's no wonder that WinHelp() can be the focus of
some exasperation during application development.
WinHelp() really performs several different functions, just as a window
call-back procedure does. In fact, WinHelp()'s parameter list looks a bit like
a window callback procedure. The first two arguments, a window handle and
help-file name, are basically for identification; they identify the calling
application and the help file to operate on. The third argument is an integer
"command" that specifies the action WinHelp() should take, and the final
argument is a long integer of "additional data" whose meaning depends on the
command selected. Many variations are possible, but most applications interact
with WinHelp() in a fairly simple way. Applications typically supply a menu
selection that lets the user see the help file's index. Selecting that menu
item usually results in a call like this:
 WinHelp(hWnd, "myhelp.hlp", HELP_CONTENTS, 0);
The user can then traverse the help file interactively without further help
from the application.
A good application will offer context-sensitive online help. That requires
that every menu item and dialog-box control have some associated help topic
that explains its function. So long as the programmer assigns each such item a
unique ID, this is fairly straightforward. In compiling the help file, the
technical writer typically includes the same file of #defines that the
programmer uses. The technical writer will also have to include a section that
tells the help compiler which help topic to display for each of the #defines
of interest. The application responds to a request for online help with a call
something like this:
 WinHelp(hwnd, "myhelp.hlp", HELP_CONTEXT, CtrlId);
You can also associate help topics with keywords, and applications can request
that the topic associated with a particular keyword be displayed. WINHELP.EXE
has other, more esoteric features, but most applications do not get beyond the
basics.
This is all well and good, but a good-sized application can have a great many
items that require context-sensitive help, and getting them all correct can be
tedious. Worse, WinHelp() provides little in the way of useful information
when things go wrong. For example, if the programmer adds a menu item that the
technical writer (and hence, the help file) doesn't know about, requesting
context-sensitive help for that menu item produces the error message "Topic
not found." Other kinds of mistakes can result in no feedback at all.
I decided that it would be nice to have a HelpSpy utility that displayed each
call to WinHelp(), including the details of the parameters passed. That would
give technical writers enough information when errors occurred that they could
solve many problems without waiting for a programmer. The utility would also
give programmers an easy way to verify that they were passing their parameters
to WinHelp() correctly--some of the more esoteric WinHelp() commands have
several fields of data to get right. A more peripheral benefit of a HelpSpy
utility would be to allow you to spy on help files you did not create.


Looking for WinHelp() Messages


Unfortunately, there's no documentation on how WinHelp() executes and
communicates with WINHELP.EXE. The most direct way to spy on WinHelp() would
be to take over the function itself. Although it's possible to intercept any
Windows function on-the-fly by patching its entry point, I decided to snoop
around first and see exactly how WinHelp() communicates with WINHELP.EXE.
WinHelp() could have been designed to communicate with WINHELP.EXE in a number
of ways, but I started by looking at window messages. I used WinSight, a tool
for spying on window messages that comes with Borland C++ 3.1. WinSight is
based on Michael Geary's original Spy, which appeared in the 1987 special
issue of BYTE on the IBM PC. WinSight displays the window-class names of the
help windows that WINHELP.EXE creates as well as any messages they receive.
Sure enough, WinSight showed that every time I issued a call to WinHelp(), the
main help window received a registered message called WM_WINHELP.

Registered messages are a method Microsoft recommends for interapplication
messaging. Each application that wants to communicate can call
RegisterWindowMessage() with the same symbolic name and receive a unique
message number that doesn't conflict with any existing window message numbers.
The authors of WinSight apparently did a little reverse-engineering of their
own to display the symbolic string associated with a registered window
message, since the SDK does not document how to discover the strings
associated with a given registered message number. [Editor's Note: As
explained in Undocumented Windows, pp. 140-1, registered window messages are
located in USER's atom table. Going backward from a registered message number
to a string is simply a matter of calling GetAtomName() with USER's DS.] This
feature of WinSight made my snooping easier--now I knew what string to pass to
RegisterWindowMessage() to obtain the number of the window message I needed to
intercept.
As WinSight showed the messages arriving, I watched the values for wParam and
lParam. I knew that reverse-engineering the data should be fairly easy, since
it had to contain all of the data I was passing to WinHelp(). Every call to
WinHelp() includes a window handle and a pointer to the pathname for the help
file, so I started by looking for these. It was easy to see that wParam was
simply the window handle passed to WinHelp().
That left lParam for transmitting all of the rest of the WinHelp() parameter's
data. At every call, the upper 16 bits of lParam seemed to be 0. About the
only way I could see that 16 bits of data could transmit an arbitrary amount
of data was if this was a handle to global memory. At that point, I put away
WinSight and started writing code.


Hooking Messages


Windows makes it fairly easy to intercept window messages, even in other
applications. I used SetWindowsHookEx(), passing it the WH_CALLWNDPROC option,
as you can see in the finished program in Listing One (HELPSPY.C), page 169.
The Microsoft Windows 3.1 SDK documentation claims that the hook function you
pass to SetWindowsHookEx() "must be in a dynamic-link library." This is not
true, but you do have to know what you are doing if you want to put the hook
function inside an application instead of in a DLL.
For small applications, I don't go to the trouble to create a separate DLL
just to contain a hook function. Also, I usually avoid using
MakeProcInstance() in my Windows programs, since the latest crop of Windows
compilers can nearly eliminate the need for it. Since the hook function will
get called in the context of another application (WINHELP.EXE, in this case),
I made sure the compiler generated code that did not assume the stack was
in the default data segment (in other words, that SS != DS). As with any exported
function, the compiler had to generate prologue code to obtain the correct
value of the local data segment (DS) for the hook function. I was not using
MakeProcInstance(), so the correct value would not be passed in the AX
register. This was a hook function, so the correct value could not be obtained
from the SS register either. The only option remaining was to load the DS from
DGROUP. Loading from DGROUP does prevent you from running more than one
instance of HelpSpy, but I could not see any use in running more than one
instance anyway.
Windows compiler options can be arcane. I try to make my code as vendor
independent as possible, so I've written a simple program, CC.EXE (see
"Availability," page 7), that provides a standard interface to the C/C++
compilers of Microsoft, Borland, and Symantec. Listing Two (page 169) shows
the makefile that uses CC.EXE to create HELPSPY.EXE for any of the three
vendors' compilers. CC.EXE handles creating a default module definition file
and running rc to append the resource file to the executable.


Dumping WM_WINHELP


After I wrote code to hook the desired message, I wanted to test my theory
that the lower 16 bits of lParam were a handle to global memory. Whenever my
hook function received the registered message WM_WINHELP, it did a global lock
on the lower 16 bits of lParam, and then unlocked it. This stub gave me a nice
place to set a breakpoint in a debugger to dump the contents of the global
memory. Figure 1 shows the hex dump I got from the following call to WinHelp():
Figure 1: Dump of WM_WINHELP global memory.

 WinHelp(Window, "C:\WIN3.1\APPS.HLP", HELP_CONTENTS, 0);

 0000: 23 00 03 00 00 00 00 00 00 00 00 00 10 00 00 00  #...............
 0010: 43 3a 5c 57 49 4e 33 2e 31 5c 41 50 50 53 2e 48  C:\WIN3.1\APPS.H
 0020: 4c 50 00 45 4c 50 2e 45 58 45 00 00 00 00 00 00  LP.ELP.EXE......
 0030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................


 int Size = 0023h
 int Message = 3 (HELP_CONTENTS in WINDOWS.H)
 long Context = 0
 long Unknown = 0
 int PathOffset = 10h
 int StringOffset = 0
 char Path[] = "C:\WIN3.1\APPS.HLP"

Staring at Figure 1 and similar dumps made the format obvious, and I
started creating the structure I call HelpParams to describe it. There seemed
to be some garbage at offset 23h, and the first word in the block was 23h, so
I decided that the first word specified the length of the data structure (a
common-enough practice for operating-system data structures) and I added an
integer Size field to HelpParams. The value of the constant HELP_CONTENTS is 3
(see WINDOWS.H), which was precisely the value of the second word in the data
structure, so I added an integer field called Message to HelpParams. The
string containing the name of the help file started at offset 10h, which was
also the value of the seventh word of the data structure, so I added a field
called PathOffset to HelpParams.
Now the process became one of elimination. I passed in each of the various
WinHelp() commands to see where the data was stored in the global memory
array. You can see the final version of HelpParams in Listing One. Most of the
messages are quite simple, but a few, such as HELP_SETWINPOS, are more complex.
In general, whenever a command required a variable-length structure or string,
an offset to that structure or string appeared immediately after PathOffset,
so I named that field StringOffset. Whenever a command required a help topic
context number, that number (a long) appeared right after the help command, so
I called that field Context.
That left a hole in my data structure, an integer that I named Unknown. I
never saw it take on a value other than 0, so it seems likely this field is
reserved for future use. The Context field was a little anomalous for commands
that did not require a help topic context number. For some commands
(HELP_COMMAND, HELP_PARTIALKEY, and HELP_SETWINPOS) it was always set to -1.
Otherwise, it was 0. The exception was HELP_MULTIKEY, for which the Context
field seemed to get set to more or less random values. In any case, I had
found all of the data passed to WinHelp(), so I was ready to finish writing
HelpSpy.
One way to quickly write small Windows applications is to use dialog boxes,
and that is what I did with HelpSpy. Listing One (HELPSPY.C) shows the
complete source, with the header file in Listing Three (HELPSPY.H), page 169,
and the dialog-box definition in Listing Four (HELPSPY.RC), page 169. The main
window is simply a dialog box whose entire client area is occupied by a list
box. At initialization time, I store a handle to that list box in a global
variable. When the hook routine receives a WM_WINHELP message, it produces a
formatted dump of the message parameters and adds the resulting string to the
list box. This lets you see the messages as they arrive and scroll back to
look at the history of what has happened. Using a dialog box gave me a simple
but useful user interface with very little coding. Figure 2 shows HelpSpy in
action.
Earlier, I mentioned the close relationship between WinHelp and the Multimedia
Viewer. In fact, you can use Viewer as the engine for your online help files.
The MDK supplies a DLL with a function called "MVAPI" that provides almost
exactly the same interface as WinHelp(). By also monitoring the registered
message that Viewer uses for communication ("WM_WINDOC"), HelpSpy could work
for Viewer as well, so I added a check for that message.


Conclusion


Most programmers know that users only read manuals as a last resort. That
being the case, we need to invest our programming talent in online help as
much as in any other part of the user interface. As we've seen with Windows,
this will sometimes require, surprisingly, discovering and depending upon
undocumented information.
Microsoft documented the file format of WinHelp's ancestor (Microsoft's DOS
help system) in the little-known manual, Microsoft Professional Advisor
Library Reference. The format of Windows help files bears only some
resemblance to DOS help files (both use a form of rich text format to describe
the help text), and Microsoft continues to keep the new .HLP and .MVB formats
proprietary despite regular requests from developers. The pressure to liberate
the .HLP file format can only grow as more programmers discover the creative
possibilities buried in WINHELP.EXE.

_UNDOCUMENTED CORNER_
edited by Andrew Schulman

"Spying on WinHelp"
by Ron Burk


[LISTING ONE]

/* helpspy.c - spy on WM_WINHELP messages, system-wide */


#include <stdarg.h>
#include <stdio.h>
#include <string.h>
#include "hookmsg.h"
#include "helpspy.h"

#define MAX_STRING (256)


LRESULT CALLBACK _export HookProc(int, WPARAM, LPARAM);
BOOL CALLBACK _export DlgProc(HWND, UINT, WPARAM, LPARAM);

UINT WM_WINHELP; /* a registered window message */
UINT WM_WINDOC; /* the one that Viewer uses */
HWND ListBox; /* listbox to display messages */
HICON MyIcon;

typedef struct HelpParams {
 int Size;
 int Message;
 long Context;
 long Unknown;
 int PathOffset;
 int StringOffset;
 char Path[1];
 } HelpParams;
static HHOOK MsgHook;
#ifdef __BORLANDC__
 #pragma argsused
#endif
int PASCAL WinMain(HINSTANCE Me, HINSTANCE Previous,
 LPSTR lpszCmdLine, int nCmdShow) {
 if(Previous) {
 MessageBox(NULL, "HelpSpy already running.", "HelpSpy", MB_OK);
 return 0;
 }
 MyIcon = LoadIcon(Me, "HelpSpyIcon");
 WM_WINHELP = RegisterWindowMessage("WM_WINHELP");
 WM_WINDOC = RegisterWindowMessage("WM_WINDOC");
 MsgHook = SetWindowsHookEx(WH_CALLWNDPROC, HookProc, Me, NULL);
 DialogBox(Me, "HelpSpyDialog", NULL, DlgProc);
 UnhookWindowsHookEx(MsgHook);
 return 0;
 }
/* HookProc - pass only WM_WINHELP messages to DumpWinHelp(). */
LRESULT CALLBACK _export HookProc(int Code, WPARAM Param1, LPARAM Param2) {
 void DumpWinHelp(UINT Message, LPARAM Param2);
 typedef struct HMSG {
 LPARAM lParam;
 WPARAM wParam;
 UINT message;
 HWND hwnd;
 } HMSG;
 HMSG *Message = (HMSG *)Param2;
 if(Message->message == WM_WINHELP || Message->message == WM_WINDOC)
 DumpWinHelp(Message->message, Message->lParam);
 if(Code < 0)
  CallNextHookEx(MsgHook, Code, Param1, Param2);
 return 0;
 }
/* DumpWinHelp() - dump a WM_WINHELP message. */
void DumpWinHelp(UINT Message, LPARAM Param2) {
 char String[MAX_STRING];
 char *Command = "HELP_UNKNOWN";
 HGLOBAL Handle = (HGLOBAL)Param2;
 HelpParams *Params = (HelpParams *)GlobalLock(Handle);
 switch(Params->Message) {
 case HELP_CONTEXT: Command = "HELP_CONTEXT"; break;
 case HELP_CONTEXTPOPUP: Command = "HELP_CONTEXTPOPUP"; break;
 case HELP_CONTENTS: Command = "HELP_CONTENTS"; break;
 case HELP_SETCONTENTS: Command = "HELP_SETCONTENTS"; break;
 case HELP_KEY: Command = "HELP_KEY"; break;
 case HELP_PARTIALKEY: Command = "HELP_PARTIALKEY"; break;
 case HELP_MULTIKEY: Command = "HELP_MULTIKEY"; break;
 case HELP_COMMAND: Command = "HELP_COMMAND"; break;
 case HELP_SETWINPOS: Command = "HELP_SETWINPOS"; break;
 case HELP_FORCEFILE: Command = "HELP_FORCEFILE"; break;
 case HELP_HELPONHELP: Command = "HELP_HELPONHELP"; break;
 case HELP_QUIT: Command = "HELP_QUIT"; break;
 }
 sprintf(String, "%s(%s", (Message == WM_WINHELP)
 ? "WinHelp" : "MVAPI", Command);
 switch(Params->Message) {
 case HELP_SETCONTENTS :
 case HELP_CONTEXT :
 case HELP_CONTEXTPOPUP :
 sprintf(String+strlen(String), ",%ld)", Params->Context);
 break;
 case HELP_KEY :
 case HELP_PARTIALKEY :
 case HELP_COMMAND :
 sprintf(String+strlen(String), ",'%s')",
 (char *)Params + Params->StringOffset);
 break;
 case HELP_CONTENTS :
 case HELP_FORCEFILE :
 case HELP_HELPONHELP :
 case HELP_QUIT :
 strcat(String, ")");
 break;
 case HELP_SETWINPOS : {
 HELPWININFO *Info = (HELPWININFO *)
 ((char *)Params + Params->StringOffset);
 sprintf(String+strlen(String),
 ",x=%d,y=%d,dx=%d,dy=%d,wMax=%d,\"%s\")",
 Info->x, Info->y, Info->dx, Info->dy,
 Info->wMax, Info->rgchMember);
 break;
 }
 case HELP_MULTIKEY : {

 MULTIKEYHELP *MultiKey = (MULTIKEYHELP *)
 ((char *)Params + Params->StringOffset);
 sprintf(String+strlen(String), ",'%c', '%.*s')", MultiKey->mkKeylist,
 MultiKey->mkSize-sizeof(UINT)-sizeof(BYTE), MultiKey->szKeyphrase);
 break;
 }

 default :
 strcat(String, ")");
 }
 GlobalUnlock(Handle);
 SendMessage(ListBox, LB_INSERTSTRING, 0, (LPARAM)(LPCSTR)String);
 }
/* DlgProc() - Store listbox handle in global variable. */
#ifdef __BORLANDC__
 #pragma argsused
#endif
BOOL CALLBACK _export DlgProc(HWND Dialog, UINT Message,
 WPARAM Param1, LPARAM Param2) {
 if(Message == WM_INITDIALOG) {
 ListBox = GetDlgItem(Dialog, ID_LISTBOX);
 return TRUE;
 }
 else if(Message == WM_PAINT && IsIconic(Dialog))
 {
 PAINTSTRUCT PaintInfo;
 BeginPaint(Dialog, &PaintInfo);
 DrawIcon(PaintInfo.hdc, 0, 0, MyIcon);
 EndPaint(Dialog, &PaintInfo);
 }
 else if(Message == WM_COMMAND)
 if(Param1 == IDOK || Param1 == IDCANCEL) {
 EndDialog(Dialog, 0);
 return TRUE;
 }
 return FALSE;
 }








[LISTING TWO]

APP =helpspy
OBJ =$(APP).obj
MODEL =-ml!
#DEBUG =-d
CCFLAGS =$(DEBUG) $(MODEL) -DSTRICT -wed

.rc.res :
 rc -r $*.rc

.c.obj :
 cc -c $(CCFLAGS) $*.c
$(APP).exe : $(OBJ) $(APP).res makefile
 cc $(DEBUG) $(MODEL) $(OBJ) $(APP).res
$(APP).res $(APP).obj : $(APP).h








[LISTING THREE]

#include <windows.h>

#define ID_LISTBOX 101
#define MENU_HELP 201





[LISTING FOUR]

#include "helpspy.h"

HelpSpyDialog DIALOG 15, 131, 306, 41
STYLE WS_POPUP | WS_CAPTION | WS_SYSMENU | WS_MINIMIZEBOX
CAPTION "HelpSpy"
BEGIN
 CONTROL "", ID_LISTBOX, "LISTBOX", LBS_NOTIFY | LBS_DISABLENOSCROLL |
 WS_CHILD | WS_VISIBLE | WS_VSCROLL, 2, 0, 304, 42

END
HelpSpyIcon ICON "helpspy.ico"



June, 1993
PROGRAMMER'S BOOKSHELF


Fuzzy Logic? Get Real!




Jonathan Erickson


Walking around the exhibit hall of the recent Second IEEE International
Conference on Fuzzy Systems, you're struck by the parallels between it and
DDJ's 1993 February issue. The cover of that issue, if you recall, proclaimed
"Cognitive Computing: Finding Its Way into the Mainstream." With the similar
slogan "Real tools for the real world," HNC (formerly Hecht-Nielsen
Neurocomputer) was typical of many of the exhibits at the IEEE conference.
Fuzzy logic still has to prove itself.
The problem fuzzy logic has in the U.S. is that it's the Rodney Dangerfield of
computing--it just can't get no respect, at least in the commercial end of the
business. Both Colin Johnson's article "What Is Cognitive Computing?" (DDJ,
February 1993) and Business Week's special report on AI techniques (November
1992) took strides to dispel the notion that neural nets, genetic algorithms,
and fuzzy systems are interesting from a research perspective, but otherwise
impractical. As both articles clearly demonstrate, fuzzy logic and its cousins
are finally making inroads into real-world, mainstream computing.
One question that comes to mind, therefore, is: If fuzzy logic is more than
just a glimmer in a research scientist's eye, how did it get stuck with the
"impractical" rap? To a large degree, that's the question Paul Freiberger and
Dan McNeill tackle in Fuzzy Logic: The Discovery of a Revolutionary Computer
Technology--and How it is Changing Our World. As in the classic Fire in the
Valley: The Making of the Personal Computer (a collaborative effort of
Freiberger and DDJ's Michael Swaine), we see technology as history, starting
in this case with Lotfi Zadeh's famous 1965 paper which introduced fuzzy set
theory. To be fair, Freiberger and McNeill carefully give credit where
credit's due, acknowledging among others Ludwig von Bertalanffy's 1951
discussion of "general systems theory," Max Black's 1937 paper "Vagueness: An
Exercise in Logical Analysis," and even Georg Cantor's 19th-century work in
set theory. Still, fuzzy logic is Zadeh's baby.
Not that, at the time, Zadeh got the credit he was due. What he did get was a
lot of criticism like that of Rudolph Kalman, who said, "Fuzzification is a
kind of scientific permissiveness; it tends to result in socially appealing
slogans unaccompanied by the discipline of hard scientific work and patient
observation." Apparently, the simplicity of fuzzy logic, perhaps its greatest
strength, didn't fit into Kalman's view of research and work ethic--if there's
no pain, there's no gain. Because of misunderstanding, academic infighting,
and a host of other reasons, fuzzy logic was never accepted as anything more
than an academic exercise, and perhaps a flawed one at that--except in Japan,
where the technology was embraced from the outset.
For their part, Japanese companies created a fuzzy bandwagon, then jumped on
it. Everywhere in Japan you'll find fuzzy-controlled washing machines,
microwaves, automobiles (for both fuel efficiency and safety), cameras,
elevators, traffic lights, robots, subway systems, and more. If fuzzy logic is
a research project, then all of Japan is the laboratory.
McNeill and Freiberger seek to answer the question of how the U.S. happened to
miss the fuzzy boat, or in their words "blind itself to a commercial jackpot."
Near the end of the book, they show how Johnny-come-lately U.S. companies are
finally taking the fuzzy plunge, although the Japanese clearly dominate the
field with their legions of trained engineers and, more importantly, their
commitment to the technology.
Fuzzy Logic isn't necessarily a technical book and certainly not a programming
one. Instead, it's a book that mixes science and history into a very good,
very readable tale.
If it's technical details you're looking for, however, try the Proceedings of
the 1993 IEEE International Conference on Fuzzy Systems, Volumes I and II
(IEEE Catalog #93CH3136-9; phone number 1-800-678-IEEE). This massive,
two-volume set presents details on every aspect of fuzzy logic, from reasoning
theory and knowledge representation to control, database, robotic, and
surgical applications. While there's not much source code for the
software-pure at heart, like any set of IEEE conference proceedings there are
enough algorithms to keep you busy through a long winter.
While papers such as "Fuzzy Logic-based Banknote Transfer Control" or
"Real-time Fuzzy Control of Mean Arterial Pressure in Postsurgical Patients"
are certainly interesting, "Fuzzy Database Language and Library: Fuzzy
Extensions to SQL" (by Nakajima, Sogoh, and Arao of Omron Corp.) is probably
of more interest for DDJ readers. In this paper, the authors describe a fuzzy
database language called Fuzzy SQL and a C library (called FDL2) that
implements a fuzzy SQL preprocessor. Their specific challenge is to extend the
fuzzy-database model so that it's more tightly integrated to object-oriented
database systems to facilitate the manipulation of multimedia (sound and
images). The problem with these data types is that they're often imprecise and
therefore difficult to process--exactly the type of problem fuzzy logic was
invented to solve. Figure 1 illustrates the fuzzy database described by the
authors.
To address these issues, the authors provided fuzzy extensions to SQL in terms
of a data definition language (DDL), column (table) definition, and fuzzy data
definitions. Example 1 provides an example of table definition (column
constraint) implemented in Fuzzy SQL-DDL. In this example, Fuzzy stores the
fuzzy data, Reldeg is the reliability degree value, Check is the search
condition, and With the limitation condition. When data is stored in a column
(defined by Check) using either Insert or Update, the conditions are applied.
In this case, only data whose grade value (as determined by the results of the
conditions) is greater than 0 are stored.
Example 1: Sample Fuzzy SQL table definition.

 CREATE TABLE People
 ( Name CHAR (8) NOT NULL,
 Age DEC (3) FUZZY RELDEG
 CHECK Age >= young
 WITH GRADE > 0,
 Hair_Color CHAR (12) FUZZY,
 Hobby CHAR (12),
 Height DEC (3) FUZZY,
 Weight DEC (3) FUZZY)

Fuzzy data used by the data manipulation language are defined in the data
definition language. Example 2 is a typical data definition where Fuznum is
the fuzzy number, Fuzlab is the fuzzy label, Hedge is a linguistic modifier
that modifies the shape of the membership function using a shifting function
(used for numeric data), and Param is a fuzzy-operator parameter relation
defined by the fuzzy predicate; available fuzzy predicates include =,<,>,Some,
All, and so on. Other relations are approximately equal, much greater than,
and much less than. Example 3 shows a typical database query using Fuzzy
SQL-DML.
Example 2: Sample fuzzy data definition.

 CREATE FDD young_people.age
 CREATE FUZNUM
 ( RATED, 10, 2.0)
 CREATE FUZLAB
 ( young NMF (0, 0, 20, 30),
 old NMF (40, 50, 150, 150))
 CREATE HEDGE
 ( very TIGHT 1, more_or_less WIDE 1 )
 CREATE PARAM
 ( APE 0.2, MGT 5, MLT 5)

Example 3: Sample use of Fuzzy SQL DML (data manipulation language).

 SELECT Name, Age,
 Hair_color, hobby FROM people
 WHERE age = young
 WITH GRADE > 0.5

Although the FDL2 fuzzy-database library can be used with pre-existing
relational or object-oriented databases, it's limited at this time to running
only on the Omron Luna88K workstation.
If you want more timely information on fuzzy systems than yearly conference
proceedings, you might start with a new quarterly magazine called the Journal
of Intelligent & Fuzzy Systems: Applications in Engineering and Technology
(John Wiley & Sons, 212-850-6645), edited by Mo Jamshidi and Timothy Ross. The
magazine (which has a noticeably academic flavor, perhaps because both editors
are on the faculty of the University of New Mexico) consists of many of the
same sort of articles as you'd expect to see in conference proceedings--except
that they're more finely edited and produced. Among the articles presented in
the inaugural issue are: "A Review of Probabilistic, Fuzzy, and Neural Models
for Pattern Recognition" by James Bezdek, "Generalized Fuzzy and Matrix
Associative Holographic Memories" by Ron Yager, "Silicon Implementation for a
Novel High-speed Fuzzy Inference Engine" by Miki, Matsumoto, Ohto, and
Yamakawa, and a short foreword by honorary editor Lotfi Zadeh. As you'd expect
of just about any technical journal with peer-review boards, the magazine
takes itself seriously, perhaps too much so for my taste. But then, I'm not
necessarily typical of its target audience. I enjoy conference proceedings on
obscure applications like "Recognition of Facial Expressions using Conceptual
Fuzzy Sets" or "A Neural-Fuzzy Model of Recall Based on Neuropathological
Findings in Alzheimer's Patients" or even "Fuzzy Logic Technology and the
Intelligent Highway System." You probably won't find too many articles like
these in the Journal, but if you're getting serious about fuzzy systems, the
magazine looks to be a good place to start.




June, 1993
OF INTEREST





A group of eight companies including Borland, Microsoft, IBM, Intel, SCO,
Watcom, MetaWare, and Lotus has formed a new consortium called the Tools
Interface Standards (TIS) committee and will publish the TIS 1.0
specification. The committee was formed to define a standard set of 32-bit
tool interfaces across multiple platforms that include UNIX, 32-bit Windows,
and OS/2 in order to improve portability and interoperability among
development tools.
The 1.0 specification is the first to standardize linkable, loadable, and debug
formats, an area considered by industry observers to be highly fragmented. The
formats, which are considered to be both portable and widely used, include the
relocatable object-module format (OMF), the executable linkable format (ELF),
and DWARF, a debug information format originally developed at AT&T. The TIS
committee has also agreed upon Microsoft's PE and Symbol and Type Information
(STI) for the Windows environment. Future directions for the TIS committee
include a look at the transition to 64-bit environments as well as
object-oriented interface standards.
Copies of the TIS specification are available from the Intel Literature Center
(order number 241597) or through Intel ACCESS on CompuServe. Reader service
no. 20.
Intel Corp. Literature Center 1-800-548-4725
Lucid recently announced that it is making the Lucid GNU Emacs editor
available to C and C++ programmers on cartridge tape. Lucid Emacs, which is
available in source and binary form for SunOS 4.x, adds a number of features
to GNU Emacs, including a GUI interface, support for multiple windows,
integration with Motif and the Xt toolkit, multiple fonts with variable width
and color, and support for active regions.
Lucid Emacs is one of the editors supported in Lucid's Energize Programming
System. Lucid Emacs is already available free of charge via anonymous ftp from
Internet at labrea.stanford.edu. Log in with the user "anonymous" and
"user-name@host" (that is, your e-mail address) as a password. Execute the
command "cd pub/gnu/lucid/". The files you will find are: lemacs-19.4.tar.Z,
the complete source distribution; lemacs-19.4-sun4.tar.Z, a ready-to-run set
of Sun4 executables and a DOC file; and xpm-3.2a.tar.Z, the XPM library. Be
sure to set binary mode when transferring these files. Unpack them with some
variation of the command "zcat lemacs-19.4.tar.Z | tar -vxf -". Lucid has
created two
mailing lists for discussing Emacs: bug-lucid-emacs@lucid.com, for reporting
all bugs in Lucid GNU Emacs; and help-lucid-emacs@lucid.com, for random
questions and conversations about using Lucid GNU Emacs. The tape
distribution, which includes a user manual, is available for $400.00. The
manual can be purchased separately for $150.00. Reader service no. 21.
Lucid Inc. 707 Laurel Street Menlo Park, CA 94025 415-329-8400
Orion Instruments is making available, free of charge, a 55-page booklet
entitled "Real-Time Debugging Techniques," authored by Orion founder Thomas R.
Blakeslee. The booklet covers numerous techniques that enable developers to
efficiently debug embedded systems. These techniques are particularly
important in a real-time environment where a breakpoint can have undesirable
effects. The booklet covers real-time monitoring, analyzer traces, and symptom
triggering. While most, if not all, of the techniques described in the booklet
are features in Orion's Emulator/Analyzer product family, the techniques are
general enough to be useful in other development environments that integrate
in-circuit emulation with (unintrusive) real-time tracing. Reader service no.
22.
Orion Instruments 180 Independence Drive Menlo Park, CA 94025 415-327-8800
SLR Systems has announced OPTLINK 4.0 for Windows, a drop-in replacement for
either Microsoft's Link or Borland's TLink. It supports any
compiler or assembler that generates standard .OBJ files as well as Phar Lap
and Rational's 286 DOS extenders. OPTLINK can generate 16-bit DOS, OS/2, and
Windows executables and DLLs. DOS-extended executables can also be targeted.
Version 4.0 features high-capacity linking, faster link times, and an extended
set of link directives to build smaller executables that can load faster.
OPTLINK also supports both Turbo Debugger and Codeview debug formats. OPTLINK
4.0 features a switch that supports "exepacking" for Windows executables.
Exepacking reduces both the load time and file size of the executable while
eliminating the linker's second RC pass. Other switches, packcode and
farcalltranslation, combine code segments and convert intrasegment far calls
to near calls.
In a related announcement, SLR has begun shipping their OPTLIB Superfast
Library Manager. OPTLIB, which specifically supports Visual C++, features
time-based module replacement, an enhanced cross-reference output that lists
all module references, as well as where public symbols are defined and
referenced. OPTLINK is available for $350.00 and OPTLIB is priced at $199.00.
Reader service no. 23.
SLR Systems 1622 North Main Street Butler, PA 16001 412-282-0864
Intel and Nestor have delivered to DARPA samples of the Ni1000, a jointly
developed, 1024-neuron, neural-network chip. With more than 3 million
transistors, the Ni1000 can perform 20 billion integer operations per second.
The companies claim that Ni1000-based character recognition can achieve
recognition rates of up to 10,000 characters per second, far greater than the
10 to 100 characters per second typical of most PC-based recognition software.
The chip uses a large block of Flash memory so that learned patterns can be
memorized and quickly recalled for real-time pattern-recognition applications.
Learning capability is implemented on-chip in the form of a 16-bit
microcontroller. Reader service no. 24.
Intel Corp. Neural Network Hotline 1-408-765-9235 or Nestor Inc. 1 Richmond
Square Providence, RI 02906 401-331-9640
OnTime Marketing recently announced the release of RTKernel 4.0, a real-time
multitasking kernel that runs under DOS. RTKernel, which supports most popular
Pascal and C compilers including Microsoft, Borland, and Stony Brook, is a
library that, when linked to a host application, allows it to run an
arbitrary number of tasks in parallel.
RTKernel also supports Borland Pascal 7.0's protected-mode capability, under
which tasks can use the full 16 Mbytes of memory. Additionally, more than 2000
tasks can run concurrently and have unrestricted access to DOS for things like
file I/O. RTKernel transparently resolves re-entrancy problems without
degrading system performance.
RTKernel features a priority- and event-driven scheduler. Task switches can be
initiated by the exchange of messages, interrupts triggered by hardware, or
optionally, by time-slicing. The task-switch time is independent of the number
of tasks running. Intertask communication is handled by using semaphores,
message-passing, and mailboxes, and tasks can be activated in, for example, a
fixed time frame by the scheduler.
RTKernel also comes with drivers for the timer, serial ports, printer,
keyboard, screen, and Novell LANs. All drivers come with complete source code.
Additionally, existing drivers supplied by the hardware vendor can be used, or
custom drivers can be implemented using the Kernel's open API. RTKernel-Pascal
is priced at $445.00 with source code available for an additional $375.00.
RTKernel-C is available for $495.00 (add $445.00 for source). Reader service
no. 25.
OnTime Marketing Karolinestrasse 32 2000 Hamburg 36 +49-40-437472
RUNOS2, recently released by Flashtek, lets programmers convert certain 32-bit
OS/2 2.0 applications into self-contained, dual-mode executables that operate
under either OS/2 2.0 or DOS. A special stub executable that contains a DOS
extender attached to the text-mode OS/2 EXE file makes this possible.
When executing under DOS, the stub executable supports a subset of
approximately 50 OS/2 function calls. It will support up to 2.5 gigabytes of
virtual memory; 80387 emulation is available for floating-point applications.
RUNOS2 also supports all DOS extended-memory allocation standards, including
DPMI, VCPI, XMS, and INT 15. In addition to supporting approximately 50 OS/2
functions, RUNOS2 supports all ANSI and POSIX 1003.1 functions in the runtime
libraries of supported compilers.
RUNOS2 is being incorporated into Flashtek's X32VM DOS Extender, which sells
for $250.00. Reader service no. 26.
FlashTek Inc. 121 Sweet Ave. Moscow, Idaho 83843 208-882-6893
CC-Rider, the source-code browser from Western Wares, now supports Windows by
linking with any Windows-based editor for source-code navigation during
editing and debugging sessions. CC-Rider 4.0 also includes a C++ analyzer,
class-hierarchy charts, function-call tree diagrams, database caching to
extended memory, and a fully programmable API library for accessing the
database from your own DOS or Windows programs.
All symbols in a program are stored in the database and cross-referenced
according to type of use. CC-Rider provides full support for all proposed ANSI
C++ features (nested classes, templates, and exception handling) as well as
support for C++ compiler implementations (Borland, Microsoft, and Zortech).
Reader service no. 27.
Western Wares P.O. Box C Norwood, CO 81423 303-327-4898
C++ Designer for Windows, a graphical design and analysis tool for C++ code
generation, has been announced by Meridian Software Systems. The tool, based
on the Rumbaugh Object Modeling Technique (OMT) design methodologies, extends
the integrated development environments (IDEs) of Microsoft's Visual C++ and
Borland's C++ with Application Framework by enabling programmers to view,
edit, and compile C++ code that corresponds to each class on a C++ Designer
object diagram.
C++ Designer for Windows sells for $295.00. Reader service no. 28.
Meridian Software Systems 10 Pasteur Street Irvine, CA 92718 714-727-0700
4GL programmers can pit their skills against other 4GL developers at the world
programming contest in Stockholm on September 20-21, 1993. Three-member
programming teams will be given a problem and allowed 24 hours in which to
solve it. Solutions will address problems such as scheduling, logistics, and
optimization.
The contest is jointly sponsored by Datateknik, a Swedish computer magazine,
and the Stockholm International Fair. For more information, contact Nils
Ohman, editor-in-chief, Datateknik. Reader service no. 29.
Datateknik 106 12 Stockholm Sweden +08-796-66-80
QNX Software (formerly Quantum Software) has announced that its TCP/IP for QNX
4.2 is now available. TCP/IP coexists with the QNX operating system's native,
lightweight-network protocol, thereby allowing both TCP/IP and QNX to share
the same network cable. QNX 4.2 also includes a client/server implementation
of NFS.
The QNX operating system is a UNIX-like, microkernel-based, real-time,
distributed OS (8 microseconds per context switch on a 66-MHz 486). The QNX
development environment includes Watcom C and Rundos, the QNX DOS emulator
that allows Microsoft Windows apps to run in standard mode.
At the same time, QNX announced that version 4.2 of the OS supports Intel's
Pentium processor. QNX claims that the Pentium-aware OS can provide as much as
a 30-percent performance boost on the Pentium, yet remain backward compatible
with the 486. Reader service no. 30.
QNX Software Systems 175 Terrence Matthews Crescent Kanata, Ontario Canada K2M
1W8 613-591-0931
Exceptions for C, a C library that provides Ada-like exception handling, has
been released by Koyn Software. The library lets programmers provide a
separate exception handler for a block of code and raise exceptions from
within the block or from functions called from the block. When an exception is
raised, control passes to the exception handler of the most-recently entered
block providing one. Exceptions may be passed up the function call stack with
a reraise operation. The library is compatible with any
ANSI-standard-compliant C compiler. Single copies of the library sell for
$29.95; site licenses are available for an additional $5.95. Reader service
no. 31.
Koyn Software 1754 Sprucedale St. Louis, MO 63146 314-878-9125
Venue's object-oriented programming environment, Medley, is now available for
DOS, so Medley UNIX applications will run on DOS platforms without
recompiling. Medley is written in Lisp and includes a virtual workspace
manager with interfaces for end-user applications; Common Lisp and Interlisp-D
interpreters and compilers; LOOPS, a set of object-oriented extensions; and
debugging and development tools. Medley for DOS costs $795.00. Reader service
no. 32.
Venue 1549 Industrial Road San Carlos, CA 94070 415-508-9672
DiskLokd has released DiskLokd, a software-only copy-protection library for
Turbo C/C++, QuickBasic, PDS, and VB-DOS. The library is implemented by adding
two lines to your existing code, which adds only 296 bytes to your EXE. The
protected program will then run only on the computer you choose.
Also available is NetLokd, which offers the same features for up to 50
machines and is network independent. DiskLokd costs $250.00 and NetLokd is
$400.00. Reader service no. 33.
DiskLokd P.O. Box 1345 Fernandina Beach, FL 32035-1345 904-261-5828










June, 1993
SWAINE'S FLAMES


Pay No Attention to the Man Behind the Curtain, Dorothy




Michael Swaine


The debate on digital-telephony legislation in the March 1993 issue of
Communications of the ACM was a fraud and a disappointment.
At issue was proposed Federal legislation that would require all
digital-telephony suppliers--from AT&T down to owners of computer bulletin
boards--to design their systems in such a way as to facilitate FBI taps. The
legislation's FBI sponsors say they need it in order to keep doing their job
in the face of changes in technology. Critics, including the Electronic
Frontier Foundation, say that it is unnecessary and that it undermines civil
liberties. The stakes are high.
I won't try to debate the bill here. I recommend that anyone interested read
that issue of CACM. It presented a wide range of viewpoints and raised many
important issues. If you're not an ACM member, you can find it in any
university library.
But I say it was a fraud because it was not, as advertised, a debate.
In a debate, discussants are given equal time to make their points, but in the
CACM issue, one discussant got eleven pages and the other eight discussants
together got eight.
In a debate, discussants are given equal chances to respond to each other, but
in the CACM issue, one discussant got the first and last words.
In a debate, the sponsor doesn't take sides, but in the CACM issue, the
issue's editorial is hard not to read as favoring one of the discussants over the
others: She gets more ink than they do; she is referred to by name while they
are not; she "presents the case," "points out," and "explains," while they
just "argue."
Rather than a debate, this was a guest editorial by Dorothy Denning, professor
and computer-science chair, Georgetown University, with brief opposing views
from others, followed by a long rejoinder by Denning. It would have been more
honest to have advertised it as such.
It was a disappointment because Denning's argument, while well reasoned and
clearly stated, was ultimately undermined by her extraordinary credulousness.
In their responses, three of the discussants explicitly pointed this out to
Denning: She "accepts the DOJ's [Department of Justice's] and FBI's arguments
uncritically," her "recitation of the FBI's assertions adds little to our
understanding of the technical issues ... or the reasons for the proposal,"
and her "conclusions are based on the claims of interested parties rather than
on independent research." Although she had three pages in which to respond,
she made no response--in fact no reference--to this damning criticism.
Personally, I think that anyone who would accept the assurances of the FBI on
the issue of wiretapping doesn't know much about the history of the FBI, but
that's not the point. Any government agency urging legislation that impacts
our civil rights should expect to have its proposal scrutinized critically.
And readers of any professional journal should be able to expect its articles
to adhere to professional standards of intellectual discourse.
Denning disappoints both expectations.






































July, 1993
July, 1993
EDITORIAL


Clipping the Wings of Privacy


On occasion, even the most hardened criminals commit acts of mercy. You've
seen it in the movies: Tough guys like Aldo Rey snatching a terrified child
from the path of a runaway bus or Humphrey Bogart sacrificing his freedom to
save the life of a nun.
The '90s twist on this story line brings a new kind of high-tech antihero. Of
course, if everyone considered Kevin Poulsen a hero, he wouldn't be passing
time at the Federal Correctional Institution in Dublin, California. Still, you
could argue that some of the crimes he's accused of do have a socially
beneficial side to them.
In the late '80s, Poulsen, a legendary California cracker and former SRI
employee, supposedly hacked his way into Pentagon, California Department of
Motor Vehicles (DMV), and Pacific Bell computers where, it's claimed, he
surreptitiously gathered information about FBI wiretap/sting operations and
IRS criminal investigations. In the process, Poulsen also collected a bushel
basket full of indictments ranging from telecommunications and
computer-related fraud to espionage.
While Dublin is a pastoral setting that lends itself to retrospection, Poulsen
didn't relish the prospect of several years in the slammer. Instead of going
to trial, he opted for life on the owl-hoot trail.
Nearly two years later, Poulsen was picked up, leading the Feds to wonder how
he supported himself while underground. Authorities now claim that Poulsen
seized control of incoming telephone lines of Southern California radio
stations sponsoring call-in contests, allowing him to rig the games by
blocking all incoming calls but his own. (So much for "It's not whether you
win or lose, but how you play the game.") As a "random" winner, he supposedly
absconded with a Hawaiian vacation, a couple of Porsches, and thousands of
dollars in cash. To claim and unload the prizes, Poulsen allegedly created
aliases and phony IDs, leading to additional charges for computer fraud, money
laundering, and interception of wire or electronic communications.
There's little question in my mind that pulling the plug on most radio call-in
programs--if only temporarily and with admittedly questionable motives--serves
the public good. If you don't think so, just listen to a few of them while
stuck in traffic. Instead of being put in the pokey, Poulsen probably ought to
be sentenced to FCC-sponsored community service. After summarily dealing with
radio call-in programs, maybe he could turn his attention to TV talk shows.
This isn't to say such crimes aren't serious--they are. I'd be unhappy if
someone grabbed--and nefariously used--confidential information about me
that's stored on a DMV or Pac Bell computer. But you also have to wonder
whether or not "Clipper," the government's most recent attempt to grapple with
computer security, is as insidious in its own way as anything Poulsen's
charged with.
The Feds seem most concerned about citizens keeping secrets from the
government--exercising our right to have private conversations with each
other. In particular, the FBI is worried that criminals will begin using
encryption to scramble their communications, thereby thwarting wiretaps.
The government's solution is Clipper, the first in a family of NSA-designed
VLSI chips with the classified Skipjack encryption algorithm hardwired into
them. (Skipjack is supposedly 16 million times more secure than DES, the
current standard.) To the Feds' way of thinking, every communication
device--phone, modem, and the like--would have a Skipjack-based chip designed
into the circuit. All communication between, say, two modems, each with
built-in Clipper chips, would then adhere to Skipjack protocol.
Individual chips would have a pair of unique numeric keys which would be
handed over to the government by the vendor. To eavesdrop, law enforcement
would get a warrant to tap the phone, record the communication (which
automatically includes the individual chip identifier), retrieve the key from
the government database, and decrypt the message.
The government sees itself as the sole shepherd of Skipjack and chip keys. Of
course, for this scheme to work, Skipjack/Clipper would have to be the only
encryption game in town--and you can bet there have been discussions about
outlawing encryption techniques other than Skipjack. But millions of dollars
have already been invested in existing schemes (such as RSA) by Microsoft,
Lotus, Novell, Apple, and their millions of users; mandating change would meet
stiff resistance.
And considering the global nature of businesses such as banking and finance,
where encryption is critical, it's unlikely that other countries (which have
generally adopted ISO 9796 encryption) would allow Skipjack/Clipper to be
imported. Or putting the sneaker on the other foot, would the U.S. even now
adopt a Russian encryption scheme over which we had no control?
Nor is there a free lunch involved. Programmed Clipper chips cost about $25.00
each, which means the price of modems, phones, and the like will increase
accordingly.
Money and export concerns aside, the real issues remain those of privacy and
the government's attitude toward its citizens. What we're witnessing is a
fundamental shift from what we've considered to be our Constitutional right to
privacy to a view that the government is privy to our most private
conversations. This alone is enough to make Kevin Poulsen look like nothing
more than an angel with a dirty face.
Jonathan Erickson
editor-in-chief






































July, 1993
LETTERS


Sometimes the Best Defense Isn't a Good Offense




Dear DDJ,


Defensive programming is something many of us have consciously or
unconsciously adopted over the years, usually from bitter experience of the
kinds of things that can happen without it. Simple things like using possibly
redundant parentheses to ensure that you get the precedence you want from a
sequence of C operators, for example, or automatically using separate source
files as a means of data hiding unless there are overwhelming reasons not to.
OOP was originally touted as eliminating or greatly reducing the need for such
strategies, and after a couple of years' C++ use, I would agree that this is
generally true. However, beware! The new languages and techniques can
themselves call for new defensive ploys.
For instance, take the increasingly common style being used for C++ class
declarations. Everywhere I look--in textbooks, compiler manuals, help files,
source code for professional class libraries, and, yes, even DDJ listings--I
see code along the lines of Example 1. This is perfectly legal C++ and will
give you exactly the data protection you expect--as long as you get the code
right. But sooner or later, Murphy's law will assert itself and that "private"
statement will get left out. The code will still be legal; it will compile
okay and your test programs will very likely run perfectly. The data
protection is gone, however. Worse, if your formal documentation is done by
someone else on the team or by some smart new documentation software, the
ultimate users of your brilliant class library will be under the impression
that it's perfectly okay to tinker with the very class members you wanted to
hide from them. This is a surefire recipe for Interesting Times.
The defensive programming strategy against Murphy and his law is
simple--always put the "private" section first. If you leave out the "private"
statement, the section members will stay private by default. And if Murphy
trashes your "public" statement, you will find out all about it the minute you
try to compile code testing the class "public" interface.
Robert Sproat
London, England


LUC Redux




Dear DDJ,


My May 1993 letter to DDJ regarding Peter Smith's article, "LUC Public-key
Encryption" (January 1993) had an important misprint (perhaps because of a
poor fax transmission). The correct value for r should read:

 r = lcm(p_1^(e_1 - 1) * (p_1^2 - 2), ..., (p_t^2 - 2))

Willi More
Klagenfurt, Austria


mapdev() for Fortran 




Dear DDJ,


Ken Hamilton forgot one Fortran compiler in his article, "Direct Memory Access
from PC Fortrans" (DDJ, May 1993)--Microway's NDP Fortran. Not only was NDP
Fortran the first 32-bit Fortran for protected-mode DOS, but it was also the
first to employ a concept called "map device," which allows Fortran
programmers to map physical or virtual memory and/or devices into their
Fortran applications through the seamless interface. These mappings include
video, BIOS, dongles, disk drives, the Weitek coprocessor, and so on, all of
which can be driven by NDP Fortran.
I've included a program that demonstrates the expressive power behind
Microway's mapdev() function. See Example 2 for details. This program
accomplishes what the other protected-mode Fortrans mentioned in Ken's article
cannot.
Mark J. Barrenechea
Microway
Kingston, Massachusetts


Fortran Fan


Dear DDJ,



I wanted to say how pleased I was to see an article employing the Fortran
language in the May issue of Dr. Dobb's Journal. I very much enjoyed Kenneth
Hamilton's article on "Direct Memory Access from PC Fortrans" and will be able
to put the material to immediate use. Please consider giving greater
prominence in the future to numerical scientific applications and the Fortran
language in which they are commonly programmed.
Michael L. Berbaum
Research Social Scientist
Tuscaloosa, Alabama
DDJ responds: Glad you enjoyed the Fortran coverage, Michael. Watch for our
September 1993 issue, which will examine numerics and numerical programming.


C-like Assembler for DSP--Not!




Dear DDJ,


I recently learned that I will be writing code for an AT&T DSP32C, so I was
especially interested in Mac Cody's article, "A Wavelet Analyzer" (DDJ, April
1993), in which the implementation is based on that processor. This was the
first programmer-oriented information I had found about any of AT&T's DSPs
since starting an informal search involving the Internet and trips to several
technical bookstores. Are there any logical reasons for this surprising lack
of publicly available information?
When I looked at Mac's assembly-language source code and subsequently read
that one of the reasons that he chose AT&T's DSP32 and DSP32C was because of
their "C-like assembly languages," I almost screamed. I strongly feel that
this is a great liability rather than a bonus. I've done a great deal of
assembly-language programming (mainly in the field of real-time computer
graphics), and there's a vastly different mindset needed to write "good"
assembly language than that required to write "good" C. The different look of
most assembly languages helps you get into that mindset. I greatly prefer the
look-and-feel of Motorola 680x0 syntax (my favorite assembly language) to that
of AT&T DSP32 and DSP32C.
I also feel that "normal" assembly language is much more straightforward than
this "C-like" assembly language. Ask yourself, for example, which instruction
in Example 3 is easier to understand and deal with: the slightly modified
Motorola 5600x MAC (Signed Multiply-Accumulate) or the example from the DECOMP
routine?
Both instructions do the same thing, but with the former you need to remember
more rules about the construction of such expressions. The latter just
requires you to choose the correct instruction for the task. Using the same
symbol for more than one thing can also add more chances for errors to occur
(for example, + for addition vs. ++ for postincrement and * for multiply vs. *
for indirect referencing).
Occasionally, I accidentally do something like leave off an operand. The
assembler tells me right away what's wrong and I feel a bit stupid while I fix
it. With C-like syntax, an instruction with a missing operand could very
easily turn out to be a valid different instruction. Bugs like that are much
harder to find. As an added bonus, the latter takes less time to type.
Maybe my intense dislike for the AT&T DSP32 and DSP32C assembly languages is
rooted in the fact that I believe in highly optimized (speed and size) code,
even on fast processors, and am the type of person who says things like, "Gee,
that's an interesting instruction. Now how can I use it to my advantage?"
Jesse Michael
Portland, Oregon


So How Was Your Date?




Dear DDJ,


I greatly enjoyed Peter Meyer's "Julian and Gregorian Calendars" article (DDJ,
May 1993). With many systems still relying on date formats that have only two
digits for the year, there will be much more interest in date routines as the
year 2000 approaches.
His brief mention of Easter in the article brought to mind a ten-step
algorithm that, according to one source I've seen, is over 100 years old. This
routine is said to incorporate the paschal full-moon determinations and seems
to work for "modern" years; see Table 1, where Easter is the nth month and the
(P+1)st day.
Karl Hoppe
Orange, California


More Genetic Algorithms


Dear DDJ,


I was very excited to see the topic of your February '93 issue, "Cognitive
Computing." I enjoyed all four articles. Despite this, I was disappointed to
see that your list of software included so little of the available
genetic-algorithm software. My company produces a C++ package for the
Macintosh and Windows called MicroGA. There are also a number of public-domain
packages available. The easiest way to locate these is probably through the
Internet discussion group of GAs. If you wish to join this group, send your
name and e-mail address to GA-List-Request@AIC.NRL.NAVY.MIL. I hope the staff
at DDJ keeps up the good work, and keeps covering leading-edge technologies
such as these.
Stephen D. Wilson
Emergent Behavior
Palo Alto, California
Example 1

class SomeClass {
public:
  int this_;
  char that;
  void TheOther();
  ...
private:
  long vulnerable;
  float risky;
  int Disastrous();
};


Example 2
c This program uses mapdev to map the video buffer into
c data space and poke values into it, which will be displayed.

c grex.fh contains declarations for GREX (GRaphic EXtensions)
c function get_bios_mode. os.fh contains declarations for mapdev,
c peekb, and peekw
      include 'grex.fh'
      include 'os.fh'

      integer screen,addr,count
      integer bios_mode,lines,cols

      addr = z'b8000'
      if (get_bios_mode() .eq. 7) addr = z'b0000'

      screen = mapdev (addr,8192)
      write(*,*) 'screen pointer = ',screen
c Get size of video page
      lines = 1 + peekb (z'484')
      cols = peekw (z'44A')
      call pauseb
c write values to video buffer, to be displayed on screen
      call try_it (%VAL(screen),2*cols,lines)
      call pauseb
      end

      subroutine try_it (a,m,n)
      integer m,n
      character a(m,n)
      do j = 1,n
        k = 0
        do i = 1,m,2
          k = k + 1
          a(i,j) = char(k)
          a(i+1,j) = char(j)
        enddo
      enddo
      return
      end


Example 3

a0 = a0 + *r3++ * *r1++; /* AT&T DSP32 and DSP32C syntax */

mac (r3)+,(r1)+,a0 ; Modified Motorola 5600x syntax



Table 1

==============================================================================
                             Result              Values for 1993
 Step                  Quotient  Remainder     Quotient  Remainder
==============================================================================
 Year/19                  --        A             --        17
 Year/100                  B        C             19        93
 B/4                       D        E              4         3
 (B+8)/25                  F        --             1        --
 (B-F+1)/3                 G        --             6        --
 (19A+B-D-G+15)/30        --        H             --        17
 C/4                       I        K             23         1
 (2E-H+2I-K+32)/7         --        L             --         3
 (A+11H+22L)/451           M        --             0        --
 (H+L-7M+114)/31           N        P              4        10
==============================================================================






















































July, 1993
Morphing in 2-D and 3-D


Where image processing meets computer graphics




Valerie Hall


Valerie is a PhD student in computer science at Curtin University of
Technology in Western Australia. Her main area of research is computer
graphics, and her PhD topic is speech-driven facial animation. She can be
reached at val@marsh.cs.curtin.edu.au.


Special effects in the movies have always captured our imagination, but
morphing seems to have attracted special attention. The average moviegoer is
familiar with shape-changing sequences in films such as Willow, Terminator II,
The Abyss, and Lawnmower Man; MTV fans regularly see this technique in music
videos such as Michael Jackson's "Black or White;" and couch potatoes view it
daily in everything from commercials pitching Chrysler vans and Schick razor
blades to prime-time TV shows like "Deep Space 9." Even PC graphics
packages--Autodesk's 3D Studio comes to mind--now offer morphing as part of
their feature set.
It's often difficult to get precise information about the methods various
commercial animators use because of the proprietary nature of the software in
high-end animation shops. Still, morphing source code is publicly available
for workstations; Mark Hall's morphine program for X Windows workstations and
Tim Heidmann's demo program for the Silicon Graphics environment are good
examples of this. For PC users, shareware morphing apps such as Richard
Goedeken's Rmorf (see the accompanying textbox entitled "Rmorf: A Shareware
Morphing Program for MS-DOS") are available on CompuServe and similar sources.
In this article, I'll provide a technical overview of this fascinating
technique. But first, let's start with definitions and a bit of history.


What is Morphing?


The term morphing comes from the Greek word morphe, which means form or
shape. The study of shapes is known as "morphology," and morphing has come to
mean shape-changing via digital techniques. Here, I broadly use the definition
to cover both two- and three-dimensional techniques.
The term originated in the late '80s at George Lucas's Industrial Light and
Magic (ILM). Douglas Smythe and others at ILM developed Morf, a program for
interpolating sequences between two images. Morf was originally written to
handle the transformation scenes in the movie Willow and has since been used
on other projects. (See the accompanying textbox, "How Do They Do It?".)
Although the term "morphing" may be recently coined, many of the basic
algorithms have been around for a decade in the fields of computer graphics
and digital image processing.
While it's difficult to single out individuals, certain milestones are evident
as you look back in time. In 1990, George Wolberg wrote Digital Image Warping
(IEEE Computer Society Press, 1990), the definitive compendium of 2-D
image-warping techniques. (For more details on Wolberg's approach, see the
accompanying textbox entitled, "The Canonical Implementation in C," which
implements in C the algorithm used by ILM.) I've already mentioned Doug
Smythe's work at ILM in the late '80s. In the early '80s, Tom Brigham and Paul
Heckbert developed some interesting image-transformation sequences at NYIT's
Computer Graphics Lab. In 1980, Ed Catmull and Alvy Ray Smith published a
seminal paper on computing efficient geometric transformations. In the late
'70s Julian Gomez developed a "tweening" program at Ohio State (and later at
NYIT). Prior to that, in 1976, Jim Blinn and Martin Newell published a key
paper on texture mapping.
The earliest work in geometric transformation of digital images came from the
field of remote sensing, which gained wide attention in the mid-'60s, when
NASA undertook projects to observe Earth from space. Photographic instruments
on Landsat and Skylab produced multiple overlapping views of the same
terrestrial region. To align these images with each other, it's necessary to
compensate for distortions such as lens aberrations and differences in viewing
angles. Similar image-correction techniques have been used in medical imaging
and digital radiology.
But even before digital processors and computer graphics, Hollywood
moviemakers accomplished rudimentary morphing by cross-dissolving images. One
memorable metamorphosis sequence appeared in the 1941 horror film The Wolf
Man. Cross-dissolving is still used as a special effect in conjunction with
the two kinds of digital morphing.


The Two Faces of Morphing


The morphing process varies, depending on whether the morph is 2-D or 3-D. 2-D
morphing gives the visual effect of a 3-D change of shape by warping a 2-D
image from an initial shape to a final shape. Using digital image-warping
algorithms, an initial image is stretched and deformed to conform to the shape
of the target. At the same time, the textures for each image are gradually
blended from the initial texture to the final one. For greater control, source
and target images are broken up into small regions that map onto each other.
In 3-D morphing, a 3-D geometric model of the object is transformed from one
shape into another. At each stage in the metamorphosis, the 3-D model is
rendered and texture-mapped to produce a 2-D screen representation. The
rendering and texture-mapping techniques are standard in 3-D computer
graphics. For 3-D morphing, the animator must set up a correspondence between
different components of the initial and final shapes. Difficulties arise when
trying to morph 3-D objects that are structurally different. For example,
morphing a torus (doughnut shape) to a rectangle poses problems with the
doughnut's hole and the rectangle's edges. But before getting into this, let's
first consider 2-D morphing in more detail.


2-D Morphing and Texture Mapping


The techniques for 2-D morphing come from digital image warping and texture
mapping. Texture mapping is widely used in computer graphics. I distinguish
between computer graphics (the generation of synthetic images) and image
processing (the manipulation of captured-image data). Although texture mapping
is used in the visualization of synthetic images, its counterpart in the
domain of image processing is digital image warping. Both texture mapping and
digital image warping rely on geometric transformations to redefine the
spatial relationship between sets of points in an image.
For texture mapping, the basic technique is a two-step process of first
mapping a 2-D texture plane onto a 3-D surface and then projecting the surface
onto the 2-D screen display. An oft-used analogy to explain texture mapping is
that of clothing: You start with an object (body) and then wrap an image
(article of clothing) around it. Texture mapping serves to create the
appearance of complexity by applying elaborate image detail to relatively
simple surfaces. In computer graphics, textures can be used to perturb surface
normals and thus allow simulation of bumps and wrinkles without the effort of
modeling intricate 3-D geometries.
Digital image warping leaves out the intermediate step of mapping to 3-D
object space, and instead maps directly from one 2-D space (the input image)
to another (the output image). Both digital image warping and texture mapping
need an efficient way to accomplish a spatial transformation--a mapping that
establishes a spatial correspondence between the 2-D coordinate spaces of the
input and output images. Although digital image warping predates texture
mapping, it is in the field of computer graphics that efficient two-pass
algorithms have been developed to compute arbitrary geometric transformations,
beginning with Catmull and Smith's 1980 paper.


Geometric Mappings


There are two ways to calculate the geometric correspondence between points in
each of these two spaces: from input to output image (known as "forward
mapping"), or from output back to input (known as "inverse mapping" or
"screen-order traversal"). Screen order is the more common. Your program
traverses the output image scanline by scanline, pixel by pixel; for each
pixel in the output grid, the corresponding value in the input image is
derived. This method is most useful when your program is required to write to
the screen sequentially, when the mapping is readily invertible, and when the
texture allows random access.
Various kinds of geometric transformations are possible between two coordinate
systems, including affine, projective, bilinear, and polynomial
transformations. The equations for calculating basic 2-D transforms use 3x3
matrix multiplications based on homogeneous coordinates. A full discussion of
this subject is beyond the scope of this article, but you can find a good
explanation in George Wolberg's book, as well as in standard computer-graphics
texts such as Foley and van Dam's Computer Graphics (Addison-Wesley, 1990).
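The 3x3 homogeneous-coordinate form mentioned above can be shown in a minimal sketch. The row-vector matrix layout and the function name `xform` are conventions chosen here; Wolberg's and Foley and van Dam's texts use equivalent formulations.

```c
/* xform: apply a 3x3 homogeneous-coordinate transform to the 2-D point
 * (x,y). For affine maps the bottom row is (0 0 1), so w stays 1.0; a
 * projective map puts nontrivial values there, and the divide by w
 * performs the perspective division. */
void xform(const double m[3][3], double x, double y, double *ox, double *oy)
{
    double w = m[2][0] * x + m[2][1] * y + m[2][2];
    *ox = (m[0][0] * x + m[0][1] * y + m[0][2]) / w;
    *oy = (m[1][0] * x + m[1][1] * y + m[1][2]) / w;
}
```

For example, the matrix {{1,0,5},{0,1,3},{0,0,1}} translates a point by (5,3); composing transforms is just multiplying their matrices.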
Regardless of the particular geometric transformation, a problem arises when
mapping from one grid to another: "Holes" and overlaps can occur. This problem
occurs whether you are mapping from output image to input, or vice versa. It
results from the fact that you are working with discrete, rather than
continuous, images. Consequently, you'll end up with pixels in one space that
don't have an exact correspondent in the alternate space. Deriving the
appropriate value in the alternate space requires interpolation and sampling.
Interpolation is a process for arriving at a continuous surface that passes
through the discrete points in your image data. This continuous surface can
then be sampled at arbitrary positions, not just the ones on the coarse
coordinate grid.
There are a host of interpolation functions possible: cubic convolution,
bilinear, cubic spline, and sinc-function convolution. The easiest approach is
"nearest neighbor"--taking the pixel closest to the one wanted. Regardless of
approach, the basic issue with image interpolation and resampling is still how
to arrive at the most representative value for a given pixel. Arriving at the
"wrong" value results in aliasing, in which unexpected pixel values can make
the image look chunky or motley. You can bring various techniques into play to
reduce these aliasing artifacts. For more on these techniques, see Wolberg's
book or a signal-processing text.
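A short sketch of bilinear interpolation, the simplest approach beyond nearest neighbor, may make the sampling step concrete. The function name and image layout are assumptions for illustration:

```c
/* bilerp: sample a row-major gray-scale image at the fractional position
 * (u,v) by blending the four surrounding pixels. Assumes u,v >= 0. */
double bilerp(const unsigned char *img, int w, int h, double u, double v)
{
    int x0 = (int)u, y0 = (int)v;
    int x1 = x0 + 1 < w ? x0 + 1 : x0;   /* clamp at the right/bottom edge */
    int y1 = y0 + 1 < h ? y0 + 1 : y0;
    double fx = u - x0, fy = v - y0;     /* fractional offsets in x and y  */
    double top = img[y0*w + x0] * (1-fx) + img[y0*w + x1] * fx;
    double bot = img[y1*w + x0] * (1-fx) + img[y1*w + x1] * fx;
    return top * (1-fy) + bot * fy;      /* blend the two row samples      */
}
```

Nearest neighbor would simply round u and v; bilinear costs a few multiplies more per pixel but noticeably reduces the "chunky" aliasing artifacts described above.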
Many of these techniques are computationally intensive. Catmull and Smith's
contribution in their 1980 paper is a method for decomposing a spatial
transform into a sequence of computationally cheaper mapping operations.
Specifically, they show how a 2-D resampling problem can be replaced with two
orthogonal 1-D resampling stages. That is, your program first transforms the
x-coordinates, then the y-coordinates. Although the basic idea is simple,
there are many subtleties beyond the scope of this discussion.
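The two-pass idea can be illustrated with the simplest possible case, a separable 2x scale. This sketch is only an illustration of the x-then-y decomposition, not Catmull and Smith's general algorithm; a real implementation filters each 1-D stage rather than using nearest-neighbor lookup.

```c
#include <stdlib.h>

/* scale2x_twopass: enlarge a w-by-h gray-scale image to 2w-by-2h in two
 * orthogonal 1-D stages. Pass 1 resamples every row (x only) into an
 * intermediate image; pass 2 resamples every column (y only) of that
 * intermediate into the final output. */
void scale2x_twopass(const unsigned char *in, int w, int h, unsigned char *out)
{
    unsigned char *mid = malloc((size_t)(2 * w) * h);  /* intermediate image */
    for (int y = 0; y < h; y++)                /* pass 1: rows, x-coords     */
        for (int x = 0; x < 2 * w; x++)
            mid[y * 2 * w + x] = in[y * w + x / 2];
    for (int x = 0; x < 2 * w; x++)            /* pass 2: columns, y-coords  */
        for (int y = 0; y < 2 * h; y++)
            out[y * 2 * w + x] = mid[(y / 2) * 2 * w + x];
    free(mid);
}
```

The payoff is that each pass is a cheap 1-D resampling over contiguous data; the 2-D problem never has to be solved directly.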


Subdividing an Image



To specify a morph, the animator defines a correspondence between the initial
and final images. This is commonly done using points, triangles, or meshes.
(Figure 1 shows the meshing process the Rmorf program uses.) Once the
relationship has been defined, the textures in the images are blended from the
initial image to the final image to produce the morph sequence.
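The texture-blending step is a per-pixel cross-dissolve. A minimal sketch, assuming both images have already been warped into alignment for the current frame (the function name and layout are inventions here):

```c
/* dissolve: blend two aligned gray-scale images pixel by pixel. t runs
 * from 0.0 (all source image) to 1.0 (all target image) across the
 * morph sequence; the +0.5 rounds to the nearest integer value. */
void dissolve(const unsigned char *a, const unsigned char *b,
              unsigned char *out, int npix, double t)
{
    for (int i = 0; i < npix; i++)
        out[i] = (unsigned char)((1.0 - t) * a[i] + t * b[i] + 0.5);
}
```

In a full morph, the warp and the dissolve advance together: at frame k of n, the meshes are interpolated by t = k/n and the colors are mixed by the same t.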
The method of subdividing the images varies between different morphing
programs. Triangulations can be specified by hand or generated by the morphing
program. For example, in the morphing programs written by Mark Hall, Michael
Kaufman, and Jay Windley, the user begins by specifying the triangles to be
morphed in each image. Hall's program warps the triangular mesh of the input
image to match the mesh of the output image. As the animation goes from the
initial to the final image, the colors change from that of the first image to
that of the second. Likewise, the user of Kaufman's program begins by defining
common points in two images. For a human face, these points include the eyes,
eyebrows, nose, and mouth. The program builds triangles from these points to
create corresponding areas on the images. Windley's program does a linear
blend of the contents of the triangles using barycentric geometry. One of the
most automatic triangulation algorithms is the Delaunay triangulation, the
geometric dual of the Voronoi tessellation. (See "Spatial Data and the Voronoi
Tessellation," by Hrvoje Lukatela and John Russell, DDJ, December 1992.)
Morphing isn't always done with triangular patches. Douglas Smythe's program
fits a Catmull-Rom spline through the x-coordinates of the control points (in
the first pass) and the y-coordinates (second pass) to realize a piecewise
continuous mapping function. Likewise, the Montage program for the Amiga
(developed by Thomas Krehbiel and Kermit Woodall at Nova Design) uses cubic
splines in a deformable mesh to define the input and output images. Cubic
splines are better for defining curves in the image. Triangular patches have
to be small and well chosen to give a similarly smooth result, but on the
positive side, they are simpler to work with.
2-D morphing is a very manual process. It relies heavily on the experience and
know-how of the animator. The choice of subject matter can make all the
difference. Care should also be taken to use images of the subject captured
from similar angles. Similarly, the position of parts of the subject should be
closely matched.


3-D Morphing


In 3-D morphing, the transformation is from one 3-D object model to another.
The objects can be similar in type (people's faces, for example) or they can
be as geometrically different as a cube and a sphere. When the topology is
similar, the morphing will be a point-to-point mapping of some type of mesh.
When the topology differs, finding corresponding points is more difficult.
The methods used for 3-D morphing depend on the type of objects the animator
is working with. Convex shapes are the easiest, because they have no inward
curves on their surface. The simplest 3-D morphs are when the initial and
final shapes have similar topologies. For example, regions of one facial model
can be mapped onto the same region on another. In their paper on human
prototyping, Magnenat-Thalmann et al. use morphing to create new synthetic
actors. They reorganize the input faces by creating a new structure of facets.
These facets need to line up with reference points on each face. Corresponding
regions are defined on each face, and then the insides of the regions are
divided up using an automated algorithm. Once the correspondence between the
faces is defined, a morph can be done using linear interpolation.
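Once the correspondence is established, the linear interpolation itself is simple. A sketch, with a hypothetical vertex structure (not taken from any of the systems described):

```c
/* vec3: one 3-D vertex. */
typedef struct { double x, y, z; } vec3;

/* morph_verts: linearly interpolate n corresponding vertices between the
 * source and destination models. t = 0.0 gives the source shape, t = 1.0
 * the destination; intermediate t values yield the in-between frames. */
void morph_verts(const vec3 *src, const vec3 *dst, vec3 *out, int n, double t)
{
    for (int i = 0; i < n; i++) {
        out[i].x = src[i].x + t * (dst[i].x - src[i].x);
        out[i].y = src[i].y + t * (dst[i].y - src[i].y);
        out[i].z = src[i].z + t * (dst[i].z - src[i].z);
    }
}
```

All of the difficulty in 3-D morphing lies in producing the matched vertex lists; once they exist, each in-between model is one pass of this loop followed by ordinary rendering.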
In a 1989 article on shape distortion, Wes Bethel and Sam Uselton outlined
their 3-D morphing system, which can perform a warp between two 3-D objects.
Although the initial version required objects to be polyhedrons, the
principles allow the system to work with B-spline surfaces as well. The first
step is to construct a B* tree for each object. The nodes of the tree
represent faces, and its branches represent adjacency relationships between
faces. The user then selects one face and one vertex from that face, for each
object, to set up the correspondence between the objects.
Once the trees and the correspondences have been defined for the objects, a
union of their topologies has to be created. A new B* tree is created to hold
the information about the union. The goal is to preserve the adjacency
relationships among the faces. If one of the objects has more faces than the
other, then an extra face is added to the union tree. Missing faces and
vertices are added as needed. If one of the objects has a hole in it, then the
union will also have a hole. The result is a master object that has all the
topological characteristics of each of the input objects.
The most difficult part of the processing is assigning new coordinate values
to the master object. These coordinates need to be defined for the initial and
final keyframes. Convex and star-shaped objects are fairly straightforward,
but objects with holes are more difficult. If there's a hole in the initial
frame but not in the final, then the hole shouldn't be visible in the initial
frame. The hole is really there, but it has been geometrically collapsed to
hide it. The resultant system can semi-automatically match the components of
two objects. Using a simple interpolation scheme, the user is able to do
keyframe animation of 3-D scenes of shape-changing objects.
In their paper on topological merging, Kent, Parent, and Carlson present a
technique for computing smooth transformations between two polyhedral models.
By taking topological and geometric information into account, their system
gives results that maintain the connectivity of the polyhedrons at
intermediate steps of the transformation, a desirable quality in this type of
system. Their system displays far less distortion than that obtained by
previous shape-transformation methods.
The algorithm, when given two 3-D models, generates two new models that have
the same shapes as the original ones. The new models share the same topology,
which is a merger of the original objects' topologies. The merger allows the
transformations from one model to the other to be easily computed. Kent's
system is currently restricted to "star-shaped" models (those in which every
surface point is visible from some interior point) without holes. This
restriction is only temporary, as the concepts
involved are applicable for arbitrary polyhedral models.
There are several issues to address when developing a shape-transforming
system. Kent evaluates previous transformation systems by asking:
Is face connectivity maintained for all intermediate shapes?
Are the intermediate shapes overly distorted?
What restrictions exist on the original models?
Few systems pass this evaluation. Kent also outlines two problems that may
arise during interpolation. The first is that faces that have more than three
sides may not stay planar during transformation. Depending on the rendering
system, this may or may not be a problem. If it is, then the objects should be
triangulated before transformation. The second problem is that the object may
intersect itself during the interpolation. This usually happens if the shapes
are extremely concave. No solution to this second problem has been found.
Kent's system establishes a correspondence between the objects in three
stages. First, each model is projected onto the surface of a unit sphere. Then
the union of the two models' projections is taken using a modified version of
the Weiler polygon-clipping algorithm. The merged topology is then projected
back onto the original models. The results of Kent's approach are highly
encouraging when working with convex and star-shaped objects. Future efforts
will look at developing projections for general concave polyhedrons and
polyhedrons with holes. They are also looking at giving the animator more
control over the transformations. By using both topological and geometric
information in their system, Kent maintains the integrity of the objects (in
the Eulerian sense) while the output is intuitive in appearance. There's much
less distortion of the shapes in this method.
Dave Bock, a visualization programmer at the North Carolina Supercomputing
Center, takes a different approach to morphing by transforming objects using
their mathematical descriptions. Bock generates volume data files that contain
varying percentages of the two objects' data sets. The initial and final
frames have 100 percent of one object and 0 percent of the other. He
interpolates both volume data sets to define their respective objects at a
common data threshold to use with an isosurface generator. Then a series of
data files is created to hold the combined percentages of the initial data
files at different stages of the morph. The number of data files depends on
the number of in-between frames the user specifies. For each frame, volume
data files are created containing increasing percentages of the final image
and decreasing percentages of the initial object. Then the isosurface
generator is used to create the geometric object for each frame, based on the
volume data in the files. Bock is enhancing his system to accept geometrically
defined objects as well as mathematically defined ones.
In all these 3-D morphing systems, the emphasis has been on getting the 3-D
models defined--rendering and texture mapping is done later. This is one of
the main differences between 2-D and 3-D morphing. Another difference is that
3-D morphing systems tend to be more automated than their 2-D counterparts.


How to Improve Results


Morphing software won't necessarily produce compelling results on its own. The
animator must work hard to make the morph smooth and believable. One way of
improving results is to define a larger number of points. A finer mesh gives
the morph higher resolution, so it can stand close scrutiny.
This is particularly important if the sequence is to be shown on a large
screen.
Key areas on the images need to be separated to stop them from dissolving
during morphing. Animators need to pay particular attention to features such
as the eyes and mouth on a face. Therefore, a good strategy is to choose
images that are fairly closely matched on key features. If the features don't
match up, you can warp the images before use so that reference points line up.
Even with something as familiar as a human face, observers will rarely notice
a well-executed warp. Humans remember the shapes of certain features rather
than an exact image of the whole face.
A good idea for a more-deceptive morphing sequence is to stagger the morphs as
the image changes; that is, to morph different parts of the image at different
times. A good example of this is Michael Jackson's "Black or White" film clip.
In part of the dance sequence, as the faces changed to the different dancers,
each frame would show some of the features from the initial image while other
parts were from the final image. This mix distracts the viewer's eye so that
the viewer can't predict where the next feature change will appear. In some
parts of "Black or White," there were up to seven planes of morphing going on
independently.
Finally, as in all animation, there's no law against doing touch-ups after the
computer animation. Cleaning up rough edges to give an improved finish is very
common. The need for touching up can be minimized by careful planning before
animation begins. Unquestionably, patience on the part of the animator is a
necessity.


References


Bethel, E.W. and S.P. Uselton. "Shape Distortion in Computer-Assisted Keyframe
Animation." State-of-the-Art in Computer Animation (Proc. of Computer
Animation '89), Magnenat-Thalmann & Thalmann, eds. New York, NY:
Springer-Verlag, 1989.
Blinn, J.F. and M.E. Newell. "Texture and reflection in computer generated
images." Communications of the ACM (October, 1976).
Heckbert, P.S. "Digital Image Warping: A Review." IEEE Computer Graphics and
Applications (January, 1991).
Heckbert, P.S. "Survey of texture mapping." IEEE Computer Graphics and
Applications (November, 1986).
Kent, J., R. Parent, and W. Carlson. "Establishing Correspondences by
Topological Merging: A New Approach to 3D Shape Transformation." Proceedings
of Graphics Interface '91.
Magnenat-Thalmann N., H.T. Minh, M. de Angelis, and D. Thalmann. "Human
Prototyping." New Trends in Computer Graphics: Proceedings of CG International
'88.
Sorensen, P. "Morphing Magic." Computer Graphics World (January, 1992).
Wolberg, G. Digital Image Warping. Los Alamitos, CA: IEEE Computer Society
Press, 1990.


How Do They Do It?


There are numerous examples of morphing in recent TV commercials and movies.
Here are a few. Much of this information comes from Peter Sorensen's article
"Morphing Magic" (Computer Graphics World, January 1992). The rest is from
various Internet discussions and e-mail conversations with morphing
practitioners.
Willow (Industrial Light and Magic, 1987). In a number of scenes, the
character Willow tries to transform a sorceress back to her true form, using a
wand with which he is not very familiar. In one scene, the sorceress starts as
a goat, later an ostrich, a peacock, a turtle, a tiger, and finally her true
human form. For this scene, ILM used deformable puppets of each animal so that
they could stretch them into the correct shape for each change. The actual
changeover between the puppets was done using 2-D morphing.
Indiana Jones and the Last Crusade (Industrial Light and Magic, 1988). The
villain believes he has found the grail and drinks from the cup. He has chosen
the wrong cup and proceeds to age rapidly until he dies, shriveling up like a
mummy. ILM used 2-D morphing to blend a series of three increasingly grotesque
masks depicting the villain's face as he died.
The Abyss (Industrial Light and Magic, 1989). In the movie, the
pseudopod--made out of water and shaped like a worm--comes into the human
habitat and explores. ILM animated the body of the pseudopod using traditional
computer-animation techniques. The face was morphed in 3-D using data from
complete 3-D digitizations of the required facial expressions. Doug Smythe's
Morf program was used to do the interpolations between facial expressions.
Terminator II (Industrial Light and Magic, 1991). Nearly every scene with the
T-1000 character included morphing, as he reconstituted himself in response to
various attacks. For scenes with relatively little movement and fairly close
initial and final images, 2-D morphing was used. The scene in which Sarah
changes to the T-1000 is an example of the 2-D morphing. To distract the
viewer and make the scene look more realistic, different parts of the actors
were morphed at different times. In other scenes, 3-D morphing was used--for
example, when the T-1000 comes up out of the floor, and when he slips into the
helicopter.
Star Trek 6: The Undiscovered Country (Industrial Light and Magic, 1991). The
shape-changer, played by Iman, was morphed in 2-D to change into many
different forms, including an adult female, a young girl, a monster (her true form),
and Captain Kirk. The actors didn't move much in their morphing scenes, making
life easier for the animators. The difference in sizes between the actors
increased the difficulty of the morphing, requiring extreme warps between the
images.
Plymouth Voyager Commercial (Pacific Data Images, 1991). In this
advertisement, PDI used 2-D morphing to transform the 1990 Plymouth Voyager
van into a 1991 model. Different parts of the car were morphed at different
times. To reduce distortion of the background during the morph, parts of the
vehicle were photographed separately for easy manipulation. For other
elements, painting software was used to separate the elements and generate
mattes.
Exxon Commercial (Pacific Data Images, 1991). In this advertisement, a moving
car turns into a tiger. (Editor's Note: Refer to the images on the cover of
this issue.) The tiger was filmed on a stage while the car was filmed on a
mountain. Both shots were filmed using motion-control systems to allow exact
retakes. The 2-D morph was done with flair; the car ripples as it morphs, the
ripples becoming the stripes of the tiger.

Michael Jackson's "Black or White" Music Video (Pacific Data Images, 1991). A
series of dancers are morphed into each other. Jackson turns into and out of a
black panther while walking. Both were 2-D morphs. The dancers were edited in
many orders to create the smoothest result possible. Then the end of each
dancer's part was morphed onto the beginning of the next. The morphs were
staggered so that different parts of each dancer changed at different times.
--V.H.


Rmorf: A Shareware Morphing Program for MS-DOS


While Valerie Hall's article and George Wolberg's source code provide a solid
overview of morphing techniques, it's often necessary to work hands-on with an
actual program to get a visceral understanding of otherwise abstract concepts.
To this end, we're providing Rmorf, a shareware program for MS-DOS written by
Richard Goedeken (see "Availability," page 5). Figure 1 shows the Rmorf user
interface, which brings up two images, side by side, and allows you to specify
corresponding mesh points in each image. The program then produces a series of
in-between images, as many as the frame count you specify. You'll need a way
to display the resulting image sequence, however. One possibility is to use
Dave Mason's DTA or Autodesk's shareware TGAFLIC.EXE program, both available
on CompuServe, to chain together a sequence of Targa files into a Flic file
that can then be displayed with a player such as the READFLIC program Jim Kent
included with his article, "The Flic File Format" (DDJ, March 1993). In
addition to Rmorf and the sample image files, we are providing TGAFLIC.EXE and
READFLIC.EXE electronically (see "Availability," page 5).
Although industrial-strength morphing programs rely on interpolated splines
and sophisticated antialiasing, Richard opts for a straightforward
implementation and raw speed. Here's his description of the genesis of Rmorf:
Recently, I was browsing through CompuServe Magazine and noticed a small item
about a shareware morphing program, so I logged on and downloaded it. The
sparse documentation described the basics of image morphing, of which I was
previously unaware. In running the program, it took me several attempts to
produce a moderately decent morph.
After spending several hours with the software, I got to know its problems: It
crashed frequently and was painfully slow.
I've done some work in computer graphics, so, over the following week, I
considered the possibility of writing a better morphing program. I sketched
out, in my mind and on paper, how two images could be morphed together. It
wasn't as difficult as I had first imagined.
All I had to do was make a transfer mesh for each frame, which would be
somewhere in between the first two meshes (depending upon the frame number),
warp each image to the transfer mesh, and mix the colors in a given ratio
(again depending upon the frame number). After approximately nine days of
working in my spare time, I completed a rough copy of Rmorf. Two days later
the finishing touches were done, and I released the program. The end result of
these 11 days of intense work is about 3700 lines of source code--2400 in
assembly language and the remaining 1300 in C.
Because the morphing portion is in assembler--and because it uses integer
rather than floating-point math--Rmorf can calculate a frame in 7.2 seconds on
my 33-MHz 386, instead of the 30--50 minutes required by other programs.
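The integer arithmetic Richard mentions can be sketched in C. This is a hypothetical illustration of fixed-point color mixing in the same spirit, not Rmorf's actual code; the function name and the 0..256 scaling are inventions here.

```c
/* mix_fixed: blend two gray-scale buffers with fixed-point arithmetic.
 * t256 is the blend factor scaled to the range 0..256, so the division
 * by 256 becomes a cheap right shift -- no floating point needed. */
void mix_fixed(const unsigned char *a, const unsigned char *b,
               unsigned char *out, int npix, int t256)
{
    for (int i = 0; i < npix; i++)
        out[i] = (unsigned char)((a[i] * (256 - t256) + b[i] * t256) >> 8);
}
```

On a 386-class machine of the era, replacing a floating-point multiply per pixel with an integer multiply and a shift made exactly this kind of difference in frame times.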
Rmorf currently supports only Targa files, but upcoming versions will include
.FLI and .GIF support. The unregistered version of Rmorf supports 320x200
resolution; the registered version handles 1024x768.
Richard, a junior in high school, has written a number of other programs (a
product-inventory system, a personal accounting package, and an SVGA game)
which are available from his software company. He can be reached on CompuServe
at 70304,1065.
--editors
Figure 1: Running the RMORF program. Note the mesh lines on the images, which
define a correspondence between one image and another.


The Canonical Implementation in C


The two-pass mesh warping algorithm implemented by Douglas Smythe at
Industrial Light and Magic in 1989 has become the canonical implementation of
morphing. In his classic book on digital image warping (Digital Image Warping,
IEEE Computer Society Press, 1990), George Wolberg provides a lucid and
thorough explanation of how the algorithm works, as well as presenting his own
version implemented in C. For this issue of DDJ, however, Wolberg has shared
with us a new implementation that's more self-contained and optimized than the
one in his book; see Listing One (page 92). For a complete description, refer
to Wolberg's book and the documentation that accompanies the electronic
version of this article; see "Availability," page 5. Here's a quick summary of
Wolberg's description.
The algorithm relies on the fact that an arbitrary, one-pass, spatial
transformation can be decomposed into a computationally cheaper two-pass
operation. This approach stems from the seminal 1980 paper by Ed Catmull and
Alvy Ray Smith, which is generally applicable to affine and perspective
transformations on planar and nonplanar surfaces. Here, a 2-D resampling
problem is replaced by two orthogonal 1-D resampling stages.
The user must specify two sets of control points. Both input and output images
are thereby partitioned into a mesh of patches. Each patch delimits an image
region over which a continuous mapping function applies. Mapping between both
images now becomes a matter of transforming each patch onto its counterpart in
the second image--known as mesh warping. Since the mapping function is defined
only at these discrete points, it's necessary to determine the mapping
function over all points to perform the warp. The patches can be fitted with a
bivariate function to realize a piecewise continuous mapping function. Wolberg
adds:
The benefit of using a mesh derives from the simplicity in interpolating the
new positions of intermediate points (between the mesh points). A bilinear or
bicubic function can be used. We use a Catmull-Rom cubic spline to implement
bicubic interpolation here.
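The Catmull-Rom spline Wolberg mentions can be evaluated pointwise with the standard cubic basis. A minimal sketch (the function name is an invention here; this is the textbook formula, not Wolberg's listing):

```c
/* catmull_rom: evaluate a Catmull-Rom spline segment at parameter t in
 * [0,1]. The curve interpolates between p1 (t=0) and p2 (t=1), with the
 * neighboring control points p0 and p3 setting the tangents so that
 * adjacent segments join with continuous first derivatives. */
double catmull_rom(double p0, double p1, double p2, double p3, double t)
{
    return 0.5 * ((2.0 * p1) +
                  (-p0 + p2) * t +
                  (2.0*p0 - 5.0*p1 + 4.0*p2 - p3) * t * t +
                  (-p0 + 3.0*p1 - 3.0*p2 + p3) * t * t * t);
}
```

Because the spline passes through its control points, the mesh coordinates the user specifies are hit exactly, while intermediate positions are interpolated smoothly; this is why Catmull-Rom is a natural fit for the bicubic mesh interpolation described above.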
The algorithm requires that all four edges of each mesh be frozen. This means
that the first and last rows and columns all remain intact throughout the
warp. Wolberg continues:
The input includes a source image I1 and two meshes, M1 and M2. Mesh M1 is
used to select landmark positions in I1, and M2 identifies their corresponding
positions in the output image. In this manner, arbitrary points in I1 can be
"pulled" to new positions. Although the use of a parametric mesh might seem to
place unnecessary constraints on the positions of these points, a large class
of useful transformations is possible. It is important, though, that the mesh
not self-intersect, to keep the image from folding upon itself.
The algorithm's first pass is responsible for resampling each row
independently. It maps all (u,v) points in the source image I1 to their (x,v)
coordinates in the intermediate image, thereby positioning each input point
into its proper output column. The intermediate image is the one whose
x-coordinates are the same as those in I2 and whose y-coordinates are taken
from I1. The second pass then resamples each column in the intermediate image,
mapping every (x,v) to its final (x,y) position. Each point now lies in its
proper row as well as its column.
The implementation presented here is for gray-scale images only. It is
straightforward to extend the program to handle three-channel images (like RGB
color images), by handling each channel separately.
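As a rough sketch of that extension (not part of Wolberg's listing; the names and the function-pointer type are illustrative), the three planes of an RGB image can simply be run through the gray-scale warp one at a time, since the mesh-driven geometry is identical for all channels:

```c
/* Hypothetical single-channel warp signature, in the spirit of
 * meshWarp(): src and dst are w*h planes of 8-bit pixels. */
typedef void (*warp1chFn)(unsigned char *src, unsigned char *dst,
                          int w, int h);

/* Warp a three-channel image by applying the gray-scale warp once
 * per plane (R, G, and B). */
void warpRGB(unsigned char *src[3], unsigned char *dst[3],
             int w, int h, warp1chFn warp)
{
    int c;
    for (c = 0; c < 3; c++)
        warp(src[c], dst[c], w, h);
}

/* Trivial example "warp": the identity mapping (copies src to dst). */
void identityWarp(unsigned char *src, unsigned char *dst, int w, int h)
{
    int i;
    for (i = 0; i < w * h; i++)
        dst[i] = src[i];
}
```

The same per-plane loop would serve for any number of channels, at the cost of repeating the spline and resampling work once per plane.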
Wolberg's implementation is in the classic UNIX style of a rudimentary
command-line user interface. Wolberg explains:
The code is missing a program to help the user create and edit meshes
interactively. A good mesh editor is a critical component of any mesh warping
program. Such code falls outside of the scope of this presentation. A more
full-featured implementation would allow the user to control the
cross-dissolve schedule at each mesh point, as well as its position. This
permits the intensities in different regions of the image to interpolate at
different rates.
--editors
[LISTING ONE]
_MORPHING IN 2D AND 3D_
by Valerie Hall
Source code written by George Wolberg and accompanies the
sidebar entitled "The Canonical Implementation in C"



/* ==========================================================================
 * meshwarp.h -- Header file for meshwarp.c. (C) 1993 by George Wolberg.
 * =========================================================================*/

#include <stdio.h>
#include <stdlib.h> /* malloc(), free(), atoi() */
#include <string.h> /* strcpy() */

#define BW 0
#define MESH 1
#define MAX(A,B) ((A) > (B) ? (A) : (B))
#define MIN(A,B) ((A) < (B) ? (A) : (B))

typedef unsigned char uchar;

typedef struct { /* image data structure */
 int width; /* image width (# cols) */
 int height; /* image height (# rows) */

 void *ch[2]; /* pointers to channels */
} imageS, *imageP;

extern void meshWarp(); /* extern decls for funcs in meshwarp.c */
extern void resample();
extern imageP readImage(); /* extern decls for funcs in util.c */
extern int saveImage();
extern imageP allocImage();
extern void freeImage();

/* ==========================================================================
 * morph.c - Generate a metamorphosis sequence. (C) 1993 by George Wolberg.
 * =========================================================================*/

#include "meshwarp.h"

/*------- main: Collect user parameters and pass them to morph() ------- */
main(argc, argv)
int argc; char **argv;
{ int nframes;
 char name[64]; /* output basename */
 imageP I1, I2;
 imageP M1, M2;

 if(argc != 7) /* make sure user invokes this program properly */
 { fprintf(stderr,
 "Usage: morph src.bw dst.bw src.XY dst.XY frames name\n");
 exit(1);
 }
 /*----------- read input image and meshes --------------*/
 I1 = readImage(argv[1], BW); /* source image */
 I2 = readImage(argv[2], BW); /* target image */
 M1 = readImage(argv[3], MESH); /* source mesh */
 M2 = readImage(argv[4], MESH); /* target mesh */
 nframes = atoi(argv[5]); /* # frames */
 strcpy(name, argv[6]); /* out basename */
 /*----------- call morph -------------------------------*/
 morph(I1, I2, M1, M2, nframes, name);
}
/* -------------------------------------------------------------------------
 * morph: Generate a morph sequence of frames between images I1 and I2.
 * Correspondence points among I1 and I2 are given in meshes M1 and M2.
 * nframes frames are generated (including I1 and I2). The output is stored
 * in files "basename_xxx.bw" where xxx are sequential 3-digit frame numbers.
 *--------------------------------------------------------------------------*/
void morph(I1, I2, M1, M2, nframes, basename)
imageP I1, I2, M1, M2; int nframes; char * basename;
{ int i, j, totalI, totalM;
 double w1, w2;
 char name[80]; /* basename + "_nnn.bw" frame suffix */
 uchar *p1, *p2, *p3;
 float *x1, *y1, *x2, *y2, *x3, *y3;
 imageP I3, Iw1, Iw2, M3;

 /* allocate space for tmp images and mesh */
 M3 = allocImage(M1->width, M1->height, MESH);
 I3 = allocImage(I1->width, I1->height, BW);
 Iw1 = allocImage(I1->width, I1->height, BW);
 Iw2 = allocImage(I1->width, I1->height, BW);


 /* eval total number of points in mesh (totalM) and image (totalI) */
 totalM = M1->width * M1->height;
 totalI = I1->width * I1->height;
 /* copy 1st frame to basename_000.bw*/
 sprintf(name, "%s_000.bw", basename);
 saveImage(I1, name, BW);
 printf("Finished Frame 0\n");

 for(i=1; i<nframes-1; i++)
 { /* M3 <- linearly interpolate between M1 and M2 */
 w2 = (double) i / (nframes-1);
 w1 = 1. - w2;

 /* linearly interpolate M3 grid */
 x1 = (float *) M1->ch[0]; y1 = (float *) M1->ch[1];
 x2 = (float *) M2->ch[0]; y2 = (float *) M2->ch[1];
 x3 = (float *) M3->ch[0]; y3 = (float *) M3->ch[1];
 for(j=0; j<totalM; j++)
 {
 x3[j] = x1[j]*w1 + x2[j]*w2;
 y3[j] = y1[j]*w1 + y2[j]*w2;
 }
 /* warp I1 and I2 according to grid M3 */
 meshWarp(I1, M1, M3, Iw1);
 meshWarp(I2, M2, M3, Iw2);

 /* cross-dissolve warped images Iw1 and Iw2 */
 p1 = (uchar *) Iw1->ch[0];
 p2 = (uchar *) Iw2->ch[0];
 p3 = (uchar *) I3->ch[0];
 for(j=0; j<totalI; j++)
 p3[j] = p1[j]*w1 + p2[j]*w2;

 /* save frame into file */
 sprintf(name, "%s_%03d.bw", basename, i);
 saveImage(I3, name, BW);
 printf("Finished Frame %d\n", i);
 }
 /* copy last frame to basename_xxx.bw */
 sprintf(name, "%s_%03d.bw", basename, i);
 saveImage(I2, name, BW);
 printf("Finished Frame %d\n", i);
}

/* ===========================================================================
 * catmullRom.c - Catmull-Rom interpolating spline. (C) 1993 by George Wolberg
 * =========================================================================*/

#include "meshwarp.h"

/* --------------------------------------------------------------------------
 * catmullRom: Compute a Catmull-Rom spline passing thru the len1 points in
 * arrays x1, y1, where y1 = f(x1). len2 positions on the spline are to be
 * computed. Their positions are given in x2. The spline values are stored in y2.
 *--------------------------------------------------------------------------*/
void catmullRom(x1, y1, len1, x2, y2, len2)
float *x1, *y1, *x2, *y2;
int len1, len2;

{ int i, j, dir, j1, j2;
 double x, dx1, dx2;
 double dx, dy, yd1, yd2, p1, p2, p3;
 double a0y, a1y, a2y, a3y;

 /* find direction of monotonic x1; skip ends */
 if(x1[0] < x1[1]) /* increasing */
 { if(x2[0]<x1[0] || x2[len2-1]>x1[len1-1]) dir=0;
 else dir = 1;
 }
 else /* decreasing */
 { if(x2[0]>x1[0] || x2[len2-1]<x1[len1-1]) dir=0;
 else dir = -1;
 }
 if(dir == 0) /* error */
 { printf("catmullRom: Output x-coord out of range of input\n");
 return;
 }
 /* p1 is first endpoint of interval
 * p2 is resampling position
 * p3 is second endpoint of interval
 * j is input index for current interval
 */
 if(dir==1) p3 = x2[0] - 1; /* force coefficient initialization */
 else p3 = x2[0] + 1;

 for(i=0; i<len2; i++)
 { p2 = x2[i]; /* check if in new interval */
 if( (dir== 1 && p2>p3 ) ||
 (dir== -1 && p2<p3 ))
 { if(dir == 1) /* find the interval which contains p2 */
 { for(j=0; j<len1 && p2>x1[j]; j++) ;
 if(p2 < x1[j]) j--;
 }
 else
 { for(j=0; j<len1 && p2<x1[j]; j++) ;
 if(p2 > x1[j]) j--;
 }
 p1 = x1[j]; /* update 1st endpt */
 p3 = x1[j+1]; /* update 2nd endpt */

 /* clamp indices for endpoint interpolation */
 j1 = MAX(j-1, 0);
 j2 = MIN(j+2, len1-1);

 /* compute spline coefficients */
 dx = 1.0 / (p3 - p1);
 dx1 = 1.0 / (p3 - x1[j1]);
 dx2 = 1.0 / (x1[j2] - p1);
 dy = (y1[j+1] - y1[ j ]) * dx;
 yd1 = (y1[j+1] - y1[ j1]) * dx1;
 yd2 = (y1[j2 ] - y1[ j ]) * dx2;
 a0y = y1[j];
 a1y = yd1;
 a2y = dx * ( 3*dy - 2*yd1 - yd2);
 a3y = dx*dx*(-2*dy + yd1 + yd2);
 }
 /* use Horner's rule to calculate cubic polynomial */
 x = p2 - p1;

 y2[i] = ((a3y*x + a2y)*x + a1y)*x + a0y;
 }
}

/* ==========================================================================
 * meshwarp.c -- Mesh warping program. Copyright (C) 1993 by George Wolberg.
 * =========================================================================*/

#include "meshwarp.h"

/* --------------------------------------------------------------------------
 * meshWarp: Warp I1 with correspondence points given in meshes M1 and M2.
 * Result goes in I2. Based on Douglas Smythe's algorithm at ILM.
 *--------------------------------------------------------------------------*/
void meshWarp(I1, M1, M2, I2)
imageP I1, I2, M1, M2;
{ int I_w, I_h, M_w, M_h;
 int x, y, u, v, n;
 float *x1, *y1, *x2, *y2;
 float *xrow, *yrow, *xcol, *ycol, *coll, *indx, *map;
 uchar *src, *dst;
 imageP Mx, My, I3;

 I_w = I1->width; I_h = I1->height;
 M_w = M1->width; M_h = M1->height;

 /* alloc enough memory for a scanline along the longest dimension */
 n = MAX(I_w, I_h);
 indx = (float *) malloc(n * sizeof(float)); /* should check if err */
 xrow = (float *) malloc(n * sizeof(float));
 yrow = (float *) malloc(n * sizeof(float));
 map = (float *) malloc(n * sizeof(float));

 /* create table of x-intercepts for source mesh's vert splines */
 Mx = allocImage(M_w, I_h, MESH);
 for(y=0; y < I_h; y++) indx[y] = y;
 for(u=0; u < M_w; u++) /* visit each vert spline */
 { /* store col as row for spline function */
 xcol = (float *) M1->ch[0] + u;
 ycol = (float *) M1->ch[1] + u;
 coll = (float *) Mx->ch[0] + u;
 /* scan-convert vert splines */
 for(v=0; v < M_h; v++, xcol+=M_w) xrow[v] = *xcol;
 for(v=0; v < M_h; v++, ycol+=M_w) yrow[v] = *ycol;
 catmullRom(yrow, xrow, M_h, indx, map, I_h);

 /* store resampled row back into column */
 for(y=0; y < I_h; y++, coll+=M_w) *coll = map[y];
 }
 /* create table of x-intercepts for dst mesh's vert splines */
 for(u=0; u < M_w; u++) /* visit each vert spline */
 { /* store column as row for spline fct */
 xcol = (float *) M2->ch[0] + u;
 ycol = (float *) M2->ch[1] + u;
 coll = (float *) Mx->ch[1] + u;

 /* scan-convert vert splines */
 for(v=0; v < M_h; v++, xcol+=M_w) xrow[v] = *xcol;
 for(v=0; v < M_h; v++, ycol+=M_w) yrow[v] = *ycol;

 catmullRom(yrow, xrow, M_h, indx, map, I_h);

 /* store resampled row back into column */
 for(y=0; y < I_h; y++, coll+=M_w) *coll = map[y];
 }
 /*------------ first pass: warp x using tables in Mx --------*/
 I3 = allocImage(I_w, I_h, BW);
 x1 = (float *) Mx->ch[0];
 x2 = (float *) Mx->ch[1];
 src = (uchar *) I1->ch[0];
 dst = (uchar *) I3->ch[0];
 for(x=0; x < I_w; x++) indx[x] = x;
 for(y=0; y < I_h; y++)
 { /* fit spline to x-intercepts; resample over all cols */
 catmullRom(x1, x2, M_w, indx, map, I_w);

 /* resample source row based on map */
 resample(src, I_w, 1, map, dst);

 /* advance pointers to next row */
 src += I_w;
 dst += I_w;
 x1 += M_w;
 x2 += M_w;
 }
 freeImage(Mx);

 /* create table of y-intercepts for intermediate mesh's hor splines */
 My = allocImage(I_w, M_h, MESH);
 x1 = (float *) M2->ch[0];
 y1 = (float *) M1->ch[1];
 y2 = (float *) My->ch[0];
 for(x=0; x < I_w; x++) indx[x] = x;
 for(v=0; v < M_h; v++) /* visit each horz spline */
 { /* scan-convert horz splines */
 catmullRom(x1, y1, M_w, indx, y2, I_w);
 x1 += M_w; /* advance pointers to next row */
 y1 += M_w;
 y2 += I_w;
 }
 /* create table of y-intercepts for dst mesh's horz splines */
 x1 = (float *) M2->ch[0];
 y1 = (float *) M2->ch[1];
 y2 = (float *) My->ch[1];
 for(v=0; v < M_h; v++) /* visit each horz spline */
 { /* scan-convert horz splines */
 catmullRom(x1, y1, M_w, indx, y2, I_w);
 x1 += M_w; /* advance pointers to next row */
 y1 += M_w;
 y2 += I_w;
 }
 /*----------------------- second pass: warp y ------------------*/
 src = (uchar *) I3->ch[0];
 dst = (uchar *) I2->ch[0];
 for(y=0; y < I_h; y++) indx[y] = y;
 for(x=0; x < I_w; x++)
 { /* store column as row for spline fct */
 xcol = (float *) My->ch[0] + x;
 ycol = (float *) My->ch[1] + x;

 for(v=0; v < M_h; v++, xcol+=I_w) xrow[v] = *xcol;
 for(v=0; v < M_h; v++, ycol+=I_w) yrow[v] = *ycol;

 /* fit spline to y-intercepts; resample over all rows */
 catmullRom(xrow, yrow, M_h, indx, map, I_h);

 /* resample source column based on map */
 resample(src, I_h, I_w, map, dst);

 /* advance pointers to next column */
 src++; dst++;
 }
 freeImage(My); freeImage(I3); free((char *) indx);
 free((char *) xrow); free((char *) yrow); free((char *) map);
}
/* --------------------------------------------------------------------------
 * resample: Resample the len elements of src (with stride offst) into dst
 * according to the spatial mapping given in xmap. Perform linear interpola-
 * tion for magnification and box filtering (unweighted averaging) for
 * minification. Based on Fant's algorithm (IEEE Comp. Graphics & Appl. 1/86)
 *--------------------------------------------------------------------------*/
void resample(src, len, offst, xmap, dst)
uchar *src, *dst; float *xmap; int len, offst;
{ int u, x, v0, v1;
 double val, sizfac, inseg, outseg, acc, inpos[1024];

 /* precompute input index for each output pixel */
 for(u=x=0; x<len; x++)
 { while(xmap[u+1]<x) u++;
 inpos[x] = u + (double) (x-xmap[u]) / (xmap[u+1]-xmap[u]);
 }
 inseg = 1.0;
 outseg = inpos[1];
 sizfac = outseg;
 acc = 0.;
 v0 = *src; src += offst;
 v1 = *src; src += offst;
 for(u=1; u<len; )
 { val = inseg*v0 + (1-inseg)*v1;
 if(inseg < outseg)
 { acc += (val * inseg);
 outseg -= inseg;
 inseg = 1.0;
 v0 = v1;
 v1 = *src;
 src += offst;
 }
 else
 { acc += (val * outseg);
 acc /= sizfac;
 *dst = (int) MIN(acc, 0xff);
 dst += offst;
 acc = 0.;
 inseg -= outseg;
 outseg = inpos[u+1] - inpos[u];
 sizfac = outseg;
 u++;
 }
 }

}

July, 1993
VGA Palette Mapping Using BSP Trees


A multidimensional cousin of the binary tree




Mark Betz


Mark is a software engineer and C++ consultant for Semaphore Training in North
Andover, Massachusetts. His interests include graphics, computer-game design
and programming, and object-oriented technology. He can be reached at
76605,2346 on CompuServe, where he is a founding member of the Game Design SIG
in the GAMERS forum, and can also be found inhabiting the C++ Study Group in
DDJFORUM.


With the exception of some 24-bit graphics formats, every image on a PC screen
must use colors from a constrained palette, usually of 256 or fewer shades.
This works fine for a single image, but if there are multiple images on the
screen, you'll need to somehow use different sets of colors at the same time.
For example, a rich image of an Arizona sunset requires many shades of red and
orange, while that of a Caribbean island uses blues and greens. Displaying the
two images one after the other is no problem: Simply switch VGA video
palettes. But to display them both at the same time, you must find some way to
make both images use a common palette.
This requires a process I call "best-fit color matching." Some color-matching
algorithms preprocess the palette data into an ordered structure, such as an
octree, while others use hash tables. The method I present here involves an
interesting data structure known as a "binary space partitioning" (BSP) tree.
As you'll see, it performs quite well in terms of speed, accuracy, and memory
requirements. Later in this article, I present a C++ program that builds a BSP
tree and uses this data structure to remap the colors in PCX files.


VGA and RGB Review


Color, as you know, is an analog phenomenon in nature that is only
approximated by computer representations. The most popular digital
representation uses the RGB color model, which specifies colors as a
combination of three components: red, green, and blue. The classic RGB model
uses the three axes of the Cartesian coordinate system to define a cube-shaped
color space that contains all representable colors. Any given shade is located
in this space via its RGB value, which serves as its coordinate, specifying a
particular point location within the color cube. Figure 1 illustrates this
space as inhabited by two sample points.
The VGA hardware on a PC directly makes use of an RGB representation. It uses
6 bits per R, G, and B component, forming a triplet of 18 bits total. A VGA
palette consists of 256 18-bit registers. The bit width means each component
has a range of 0 to 63, and thus determines a 3-D space of 262,144 (64^3)
possible colors, of which any 256 are contained in a given palette.
Because the RGB model maps color values to a 3-D geometric space, we can speak
of the "distance" between any two colors. This distance is calculated using
the classic Euclidean formula, substituting RGB notation in place of the
traditional XYZ values, as in Example 1. If the formula yields a result of 0,
then two shades are the same. Otherwise, the smaller the result, the closer
the two colors.
Mapping a given image's palette over to that of another image is
straightforward: Step through each color in the source image and find the
closest entry in the target image's palette by calculating the distance from
each source element to every element in the target palette. This strategy
works fine, but may prove computationally expensive for some applications. An
easy first step in improving performance is to always use the squared values
for comparison, rather than deriving the square roots. Still, more improvement
is needed. If the set of target values is somehow ordered, then you can
quickly reject points that are obviously not candidates, and avoid numerous
distance calculations. Working with ordered sets takes us into the realm of
range-searching algorithms.
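The brute-force baseline described above, with the squared-distance shortcut, can be sketched as follows (the function names are illustrative, not from the article's listings):

```c
/* Squared RGB distance -- the square root is omitted, since only
 * the ordering of distances matters when finding the closest match. */
long rgbDist2(int r1, int g1, int b1, int r2, int g2, int b2)
{
    long dr = r1 - r2, dg = g1 - g2, db = b1 - b2;
    return dr*dr + dg*dg + db*db;
}

/* Brute-force match: return the index of the palette entry closest
 * to (r,g,b). pal is an n x 3 table of 6-bit VGA components. */
int bruteMatch(unsigned char pal[][3], int n, int r, int g, int b)
{
    int i, best = 0;
    long d, bestd = rgbDist2(r, g, b, pal[0][0], pal[0][1], pal[0][2]);
    for (i = 1; i < n; i++) {
        d = rgbDist2(r, g, b, pal[i][0], pal[i][1], pal[i][2]);
        if (d < bestd) { bestd = d; best = i; }
    }
    return best;
}
```

For a 256-entry palette this is 256 distance calculations per lookup; the BSP tree's job is to reject most of those candidates without computing their distances at all.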


Range Searching in Various Dimensions


Range searching is the task of finding all the data points within a given
interval. An example of a one-dimensional range search is to find all the
employees of company Z that make $35,000 to $40,000 per year. In this case,
all data points are on a line from $0 to the maximum, and you examine those
within a specified interval. For this task, simple binary partitioning of an
ordered set works well.
Extending this concept to two dimensions requires thinking of the set of data
points as existing on a plane, rather than a line, with the search range
represented as a rectangular subset of that plane. Our search now requires
locating points that fall within two intervals: one along the x axis, the
other along the y. If either half of the problem is viewed separately from the
other, it is equivalent to the one-dimensional problem described earlier.
Indeed, one approach is to apply a one-dimensional method to each axis in
turn. All the points falling within the x interval, for example, can be used
as the input set for a check against the y interval. Even so, this method
proves inefficient if there are relatively few points within the search range.
This is likely to be the case with color matching, since the VGA palette
involves a subset of 256 out of a possible 262,144 colors. For a more
efficient approach, we use a BSP tree.
A BSP tree is a binary tree in which the child nodes are sorted to the left
and right of the parent based on their spatial relationship with the parent.
If we apply a simple binary tree to the one-dimensional problem outlined
above, then the result is a structure that should be familiar to anyone who
has worked with binary trees in the past. The tree is constructed from the set
by picking a middle value for the root node, and inserting the others to left
and right based on a value comparison. When traversing a node in the tree,
values lower than the current node's value go left, while those greater than
or equal to it go right. Figure 2 shows the tree resulting from a hypothetical
data set.
Locating the points in the tree that lie within the search interval requires a
simple recursive traversal, recording nodes which fall within the interval,
and discarding those that don't. At each node, if its value lies outside the
search interval to the left, we travel down the tree to the right. Likewise,
if it is outside the interval to the right, we move left. If the point lies
within the interval we record it, and search both subtrees (since points on
either side may lie within the interval as well). In general, a
one-dimensional search using a structure of this type requires O(N logN) steps
for preprocessing, and O(R+logN) for range searching, with R representing the
number of data points actually falling within the range.
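The one-dimensional traversal just described can be sketched as a short recursive routine (a pointer-based sketch for clarity; the article's actual implementation uses an array):

```c
/* One-dimensional range search on an ordered binary tree: visit
 * only the subtrees that can intersect [lo,hi], recording hits into
 * out[] in sorted order. Returns the running count n. */
typedef struct node { int val; struct node *left, *right; } node;

int rangeSearch(node *t, int lo, int hi, int *out, int n)
{
    if (t == 0) return n;
    if (t->val > lo)                   /* left subtree may hold values >= lo */
        n = rangeSearch(t->left, lo, hi, out, n);
    if (t->val >= lo && t->val <= hi)  /* node itself falls in the interval */
        out[n++] = t->val;
    if (t->val <= hi)                  /* right subtree may hold values <= hi */
        n = rangeSearch(t->right, lo, hi, out, n);
    return n;
}
```

Subtrees that lie wholly outside the interval are never entered, which is where the logN term in the search cost comes from.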
We can extend this approach for two-dimensional searches. As in the
one-dimensional case, you insert all the points in the set into a tree, but
this time you must maintain spatial relationships on two axes--simply by
alternating the keys in strict sequence. For example, at level 0 of the tree
(the level of the root node) compare points on the x axis, traversing the tree
to the left for points to the right of the x axis range, and to the right for
points to the left of it, exactly as before. At level 1 of the tree, follow
the same rules, but compare points on the y axis, and so on, alternating the
compare key with each level of the tree. Figure 3 shows what a typical 2-D BSP
tree might look like after construction.
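Insertion with an alternating compare key can be sketched like this for the 2-D case (a pointer-based sketch; the article's tree lives in an array, and the names here are illustrative):

```c
/* 2-D BSP-tree insertion: compare on x at even depths and y at odd
 * depths, alternating the key at each level of the tree. */
typedef struct kdnode { int p[2]; struct kdnode *left, *right; } kdnode;

kdnode *kdInsert(kdnode *t, kdnode *n, int depth)
{
    int axis = depth % 2;              /* 0 selects x, 1 selects y */
    if (t == 0) { n->left = n->right = 0; return n; }
    if (n->p[axis] < t->p[axis])
        t->left  = kdInsert(t->left,  n, depth + 1);
    else                               /* ties go right, as in the text */
        t->right = kdInsert(t->right, n, depth + 1);
    return t;
}
```

Extending this to three dimensions is just a matter of `depth % 3` and a three-element coordinate array, which is exactly the arrangement used for the RGB cube below.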


BSP Trees and the Color Cube


Alternating the compare keys orders the set of points in two dimensions. You
can visualize this arrangement by thinking of each node as a point at which
either a horizontal or vertical line is drawn, partitioning the planar space.
This ordering is readily extensible to n dimensions. For example, for three
dimensions, simply alternate on three keys. Each compare operation partitions
the space along one dimension. This lets us use BSP trees for matching points
in the 3-D RGB color cube.
The problem can be stated as a range search: Given a desired color point P,
what points in the set of available colors fall within a cube, T units on a
side, centered on P? To find the answer, sort the available colors into a
three-dimensional BSP tree, and then search it on the R, G, and B ranges in
alternating sequence, using the value T as the interval against which
comparisons are made. The results of this search form a much smaller candidate
set of points near the desired color. The point-distance
calculation can then be performed on this set in order to determine which of
these is actually the closest match.
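The cube test itself is just a per-axis interval check; a minimal sketch (assuming the half-width is derived from the article's edge size T):

```c
/* Does candidate color q fall inside the search cube centered on p?
 * half is the cube's half-width (T/2 for an edge of T units). */
int inCube(const int p[3], const int q[3], int half)
{
    int k;
    for (k = 0; k < 3; k++)
        if (q[k] < p[k] - half || q[k] > p[k] + half)
            return 0;
    return 1;
}
```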
The size of the result set (that is, the number of points that lie within the
search cube) depends upon both the nature of the palette data and the size of
the cube. For a VGA palette, with only 256 of 262,144 colors represented, the
RGB color space is primarily empty. The points of color actually represented
in the palette are like stars in the void. If the palette contains many ranges
of colors which vary only slightly, then it will have "clusters" of stars at
various points in the cube, and searching near one of these will yield a
larger result set. Likewise, expanding or contracting the size of the search
cube (the value T in the previous paragraph) expands or contracts the volume
of the color space searched and affects the size of the result set, as well as
the number of nodes visited during a search.
Although your application does not usually control the nature of the palette
data, it can vary the size of the search cube. Choosing an appropriate value
for T is left as an exercise for the reader; the programs that accompany this
article accept the T value as a command-line parameter. My color-mapping
utility is generally successful on complex images with a search-cube edge size
of 10 to 15 units. Naturally, you can guarantee that you will always find the
best match by using a very large value for T: a high value examines a large
volume of the color space and increases the chance of an accurate match, but it
can also be slower than even a sequential search. A value of 0, on the
other hand, ensures that only exact matches for the desired color will be
returned.


The Implementation


Listing One (page 94) presents the declaration of RGBBinTree, a class written
in Borland C++ that builds and searches a three-dimensional BSP tree of RGB
colors. In addition to its two constructors, the public interface of the class
consists of three member functions. The first is the RGBBinTree::rgbMatch()
function. Its arguments are the R, G, and B primaries of the color we're
searching for, and an integer threshold value T which specifies the range in
which we're going to look. The function returns the palette index of the best
match found.
The second public-member function is the RGBBinTree::build() function. This
function takes a set of RGB palette values as a parameter, and builds them
into the BSP tree. It may be called at any time to create a new tree in the
structure, but must be called at least once prior to performing a search. It
is called by one of the constructors for this class.
The third public-member function is RGBBinTree::nodesVisited(), provided for
analytical purposes. It returns the total number of nodes searched during the
last lookup.
The complete code for this article (along with programmer notes) is not
printed here but is available in electronic form; see "Availability," page 5.


Data Structures



Internally, the BSP tree is built using an array structure. An array can be
more efficient than pointers for representing a tree, especially when the
maximum size of the structure is known at compile time. The BSP tree is
represented by an array of pnode structs, each
of which contains a 3-byte array (containing the RGB components), an integer
element (for the palette index which originally contained the color), and two
integers, which are array indexes to child nodes.
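A plausible declaration matching that description (the field names here are illustrative, and the 9-byte figure assumes 16-bit ints as on the Borland C++ compilers of the day; a modern compiler may pad the struct differently):

```c
/* One node of the array-based BSP tree: 3 color bytes, the palette
 * index the color came from, and two array indexes to child nodes
 * (3 + 2 + 2 + 2 = 9 bytes with 16-bit integers). */
struct pnode {
    unsigned char rgb[3];  /* R, G, B components (6-bit VGA values) */
    short pal;             /* original palette register index */
    short left, right;     /* array indexes of the child nodes */
};
```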
Other data structures include n_stack[] and n_sp, an array and integer,
respectively; they represent a stack structure and stack pointer used during
traversals of the BSP tree. Every node that falls within the search range will
be saved on the stack, so that it can be revisited when the current subtree
terminates in a leaf node. Each "push" saves the node index and a flag (which
we'll talk about later), for a total of two integers per push. The stack size
is large enough for the worst-case scenario of pushing every node in the tree.
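The stack discipline described above can be sketched as a pair of routines (the names mirror the article's n_stack[] and n_sp, but the code here is illustrative, not Betz's listing):

```c
/* Traversal stack: two ints per push (node index plus colorf state),
 * sized for the worst case of pushing every node in a 256-entry
 * tree plus the HEAD and TAIL anchors. */
#define MAXNODES 258
static int n_stack[MAXNODES * 2];
static int n_sp = 0;

void pushNode(int idx, int colorf)
{
    n_stack[n_sp++] = idx;
    n_stack[n_sp++] = colorf;
}

int popNode(int *colorf)
{
    *colorf = n_stack[--n_sp];
    return n_stack[--n_sp];
}

int stackEmpty(void)
{
    return n_sp == 0;
}
```

Each pop restores both the node to revisit and the compare-key state that was in effect when that node was first encountered.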


The Algorithm


Now for the fun part: the implementation of the BSP tree algorithm. Listing
Two (page 94) presents the file RGBTREE.CPP, which contains the
member-function definitions for the RGBBinTree class. The first member
function is the RGBBinTree::build() function, called by the application to
construct a tree for a given instance. It first initializes the HEAD and TAIL
nodes, which anchor the tree in the array. After the tree is built, HEAD's
right child will point to the root node, and all leaf nodes will point to
TAIL. RGBBinTree::build() then calls RGBBinTree::getMid() to find the palette
register closest to the center of the color space (in this case, it looks for
a palette register with RGB values of 31, 31, and 31). The midpoint color gets
inserted into the tree as the root node, ensuring that subsequent node
insertions will branch out as nicely as possible--a bushier, better balanced
structure.
The function then inserts the rest of the palette colors into the tree. This
is done in two For loops, which call RGBBinTree::insertNode() to link the node
into the tree. The insertNode() function is passed a reference to a new pnode
to be inserted, and an array index specifying an available slot in the array.
As in any binary tree, insertion into a BSP tree is a matter of first
conducting an unsuccessful search for the node to be inserted. Such a search
terminates at a leaf node, and it is here that the new node is placed. In some
cases nodes will match exactly, and for those instances I've adhered to the
rule that exact matches go to the right child. The search for the insert
location is performed in the single While loop in insertNode(), which
continues until the current node being examined is TAIL, at which point it
falls through to the actual linking in of the new node in the next block.
Inside the While loop is the sequential key compare that is at the heart of
the structure.
To control the alternating sequence of comparison keys in both insertNode()
and rgbMatch(), I use a flag called colorf. This flag is incremented whenever
the algorithm moves down a level in the tree, wrapping around to 0 when it
exceeds 2. The value of this flag is used to index into the array of RGB color
values in each node (a 3-byte array), so that it determines at each level
which two colors will be compared. On entry into either function, the flag is
set to -1, so that the first increment results in the starting state of 0.
At each node, the insertion loop compares the two colors selected from the RGB
arrays in the nodes by the colorf flag, and applies the rules previously
discussed: If the value of the node to be inserted is greater than or equal to
the current node's value, then we traverse the tree to the right; otherwise,
we traverse the tree to the left. In order to "remember" from which direction
it arrived at the current node, the algorithm uses another flag called
dirflag. This flag is 0 if the traversal was to the right, and 1 if it was to
the left. After encountering the TAIL node (meaning the search was
unsuccessful) the While loop falls through to the block which handles
inserting the new node into the tree.
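One step of that alternating-key sequence amounts to the following (a minimal sketch of the wrap-around just described):

```c
/* Advance the compare key one tree level: colorf cycles 0 -> 1 ->
 * 2 -> 0 (R, G, B); starting it at -1 makes the first increment
 * select the R component. */
int nextKey(int colorf)
{
    if (++colorf > 2)
        colorf = 0;
    return colorf;
}
```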
The next member function, RGBBinTree::rgbMatch(), is the most important of the
class-interface functions. It is passed the RGB value to be searched for and
an integer value specifying the search cube size, and returns the palette
index of the best match for that color. The basic strategy is simple, even if
the implementation appears complex. The algorithm performs a ranged search of
the tree, exactly as described previously for one- and two-dimensional
problems. Whenever a node is encountered which falls within the comparison
interval, two things happen: First, the node index and the state of the colorf
flag are pushed onto n_stack[], so that this point in the tree can be
revisited later; second, the distance calculation is performed to determine
the point-to-point distance of this RGB value from the desired one. If the
results of this calculation are better (smaller) than the current best match,
then the old match is replaced with the new.
The function first sets up an array of upper and lower boundaries for the
compare interval. This array is arranged such that the colorf flag can select
the appropriate range value for comparison automatically. Next, the function
prepares for the tree traversal by pushing the starting node and colorf flag
state onto n_stack[]. In this case the node pushed is HEAD, and the colorf
value is -1.
We now arrive at the two nested While loops, which are the heart of the search
algorithm. Again, the general idea is fairly simple: The outer While loop
positions the algorithm at the top of a subtree (in the case of the first
iteration, the subtree is the whole BSP tree) and runs until the stack is
empty, indicating that all subtrees have been searched. The inner While loop
traverses each subtree in turn and exits when a leaf node (TAIL) is
encountered. Within the inner loop, the alternating key sequence is compared
and distances are calculated for nodes encountered that fall within the search
interval. All pushes of nodes onto the stack take place in the inner loop; all
pops from the stack take place in the outer. Pushing a node is the inner
loop's way of letting the outer loop know that there's an interesting subtree
to be searched later.


How Well Does This Work?


To assess the performance, I wrote several programs which are included with
the electronic listings. The first, PMTEST.EXE, compares the speed and
accuracy of the BSP approach to that of a brute-force, sequential search by
matching all the colors in one palette to the colors in another. Both speed
and accuracy vary, naturally, according to the size of the search range and
the nature of the palette data. In tests on my 33-MHz 486DX, the BSP method is
5 to 10 times faster than the sequential search. PMTEST also reports on
accuracy by tracking the maximum disparity between any two colors in the
remapped set.
The second program is called REMAP.EXE, and its task is to remap all of the
colors in a 256-color PCX graphics file to use the colors in another PCX file,
or, alternatively, the colors from a raw palette file. This utility is implemented for
VGA mode 13h (320x200x256 color) files only. It uses the Fastgraph graphics
library from Ted Gruber Software (Las Vegas, Nevada) for display and
palette-grabbing functions. The shareware version of Fastgraph,
Fastgraph/Light, is included with the electronic listings so that you can
experiment with the remapper, perhaps by extending it to accommodate other
256-color formats. REMAP demonstrates the BSP-tree approach to color matching
by letting you compare before and after effects of remapping a particular
image.
In terms of memory requirements, the BSP approach compares very well to both
octrees and hash tables. Since the pnode structure occupies 9 bytes, the total
space occupied by the BSP tree for a 256-entry VGA palette is 2322 bytes
((256+2)*9). The n_stack array occupies another 512 bytes, and the total size
of the code in RGBTREE.C is 2027 bytes, for a total of 4861 bytes of code and
data. In contrast, one method I know uses a 40K hash table, and my first
approximation of the octree approach shows that it requires at least 4K just
for the node links, not counting any other code or data. I don't have actual
performance data for either of these methods; please contact me if you have
more detailed information.
How well does the BSP approach generalize to other application domains? It's a
bit hard to tell, since the algorithm is highly dependent on the nature of the
data. All binary trees benefit from a higher degree of randomness in the data
being sorted, and the BSP tree, with its multidimensional nature, is more
sensitive than most. In particular, generalizing these methods to problems of
more than three dimensions is problematic, because the higher the number of
dimensions, the less likely that a given set of data points will be random
across all of them. However, color data itself is not particularly random, so
the BSP approach may perform even better in other application areas, such as
geographic databases.
 Figure 1: The RGB color space, showing two points and the distance between
them.
 Example 1: The Euclidean distance formula in RGB space. D is the distance
between points; R1 and R0 are R coordinates, G1 and G0 are G coordinates, and
B1 and B0 are B coordinates of the two points, respectively.
 Figure 2: Insertion of a one-dimensional set into a binary tree. Solid
squares are null pointers.
 Figure 3: Insertion of a two-dimensional set into a BSP tree. Color lines on
the plane show the spatial partitioning resulting from alternating the
comparison key. Symbols next to nodes in the tree diagram show which
comparison key was used at that level.
[LISTING ONE]

/**********************************************************************
* RGBTREE.H -- header file for rgb palette sort and search functions
* Copyright (c) 1993, Mark Betz, Derry NH. (603) 898-8214.
\**********************************************************************/

const PAL_SIZE = 768; // number of bytes in palette
const DAC_SIZE = PAL_SIZE/3; // size of DAC assuming 3 bytes/reg
const HEAD = 0; // index of the head node in tree
const TAIL = (PAL_SIZE/3) + 1; // index of the tail node in tree
const ZIDX = -1; // invalid index

const R = 1; // These consts control the order of
const G = 0; // comparisons in the tree build and sort routines.
const B = 2; // Default sequence is G, R, then B.

typedef unsigned char DACTBL[PAL_SIZE]; // generic array for palette values

class RGBBinTree // the BSP tree class for sorting RGB values
{
 friend class ViewTree; // ViewTree is a browser-like utility function
public:
 RGBBinTree();
 RGBBinTree( const DACTBL dacs ) { build( dacs ); }
 int rgbMatch( unsigned char r, unsigned char g, unsigned char b,int thold);
 void build( const DACTBL dacs );
 int nodesVisited() { return visitCnt; }
protected:
 struct pnode
 { unsigned char c[3];

 int index, left, right;
 };
 RGBBinTree( const RGBBinTree& ) {};
 RGBBinTree& operator =(const RGBBinTree& ) { return *this; }
 int getMid( const DACTBL dacs ) const;
 void insertNode( const struct pnode& newn, int elem );
private:
 pnode pn[ DAC_SIZE+2 ];

 static int n_stack[ DAC_SIZE*2 ];
 static int n_sp;
 int treeBuilt;
 int visitCnt;
};

[LISTING TWO]

/**********************************************************************\
* RGBTREE.CPP -- implementation of RGBBinTree, a BSP tree for RGB space
* Copyright (c) 1993, Mark Betz, Derry NH. (603) 898-8214.
\**********************************************************************/

#include <stdlib.h>
#include <mem.h>
#include "rgbtree.h"

int RGBBinTree::n_stack[ DAC_SIZE*2 ]; // definition of static members
int RGBBinTree::n_sp = 0;

// the default constructor does nothing but initialize member data
RGBBinTree::RGBBinTree() : treeBuilt(0), visitCnt(0) {}

//----------------------------------------------------------------------
void RGBBinTree::build( const DACTBL dacs )
{ struct pnode newn; // local node for insertions
 int i, j, di; // color index, generic counters
 int node = 1; // counts tree array elements

 pn[HEAD].index = ZIDX; // initialize the head and tail...
 pn[HEAD].left = TAIL; // nodes in the array
 pn[HEAD].right = TAIL; // indices = -1, child nodes set...
 pn[TAIL].index = ZIDX; // to point to TAIL
 pn[TAIL].left = pn[TAIL].right = TAIL;
 newn.left = 0; // don't use these, so zero them
 newn.right = 0;

 i = getMid(dacs); // find the color nearest the center
 j = i+1; // point j to the next color
 for (;i > ZIDX; i--) // insert the lower half of the dacs
 { di = i*3; // color index = dac index * 3
 newn.c[R] = dacs[di]; // load the primaries into the node
 newn.c[G] = dacs[di+1];
 newn.c[B] = dacs[di+2];
 newn.index = i; // load the dac index into the node

 insertNode( newn, node ); // insert it into the tree
 node++; // increment to next node
 }
 for (;j < DAC_SIZE; j++) // insert the upper half of the dacs

 { di = j*3; // same procedure as above
 newn.c[R] = dacs[di];
 newn.c[G] = dacs[di+1];
 newn.c[B] = dacs[di+2];
 newn.index = j;
 insertNode( newn, node );
 node++;
 }
 treeBuilt = 1; // set treeBuilt to true
}
//----------------------------------------------------------------------
void RGBBinTree::insertNode( const struct pnode& newn, int elem )
{ int par = HEAD; // HEAD is the first parent node
 int cur = pn[HEAD].right; // current node is HEAD's right chi.
 int colorf = -1; // color flag will be 0 on entry
 int dirflag = 1; // tracks direction of traversal

 pn[elem].c[R] = newn.c[R]; // load the data from the new node...
 pn[elem].c[G] = newn.c[G]; // into the array at elem
 pn[elem].c[B] = newn.c[B];
 pn[elem].index = newn.index;
 pn[elem].left = pn[elem].right = TAIL;

 while (cur != TAIL) // looking for first leaf node
 {
 par = cur; // the current node is new parent
 colorf == 2 ? colorf = 0 : colorf++; // roll the color flag
 dirflag = 0; // clear the direction flag
 if (newn.c[colorf] >= pn[cur].c[colorf]) // if search color greater...
 { // or equal...
 cur = pn[cur].right; // go right, and set the dirflag
 dirflag++;
 } else cur = pn[cur].left; // else go left, leave dirflag = 0
 }
 if (!dirflag) // based on dirflag, insert the...
 pn[par].left = elem; // new node as either the left...
 else // or right child of the parent
 pn[par].right = elem;
 return;
}
//----------------------------------------------------------------------
int RGBBinTree::rgbMatch( unsigned char r, unsigned char g,
 unsigned char b, int thold )
{ int cur; // current node being visited
 register int dx = 3*63*63; // best distance between colors
 int best = TAIL; // saves the current best match
 int t, rx, gx, bx; // used in distance calculation
 int colorf = -1; // color key compare flag
 int rng[3][2]; // search range for each primary
 int key; // used in range comparisons
 if (!treeBuilt) return -1; // bail if no tree built

 visitCnt = 0; // track nodes visited
 n_sp = 0; // reset stack; an early exit can leave entries on it
 rng[0][0] = g-thold; // set up the comparison windows...
 rng[0][1] = g+thold; // for the three primaries
 rng[1][0] = r-thold;
 rng[1][1] = r+thold;
 rng[2][0] = b-thold;
 rng[2][1] = b+thold;


 n_stack[n_sp++] = colorf; // push the starting node and the...
 n_stack[n_sp++] = HEAD; // color compare flag

 while ((n_sp) && (dx)) // subtree loop runs until stack...
 { // empty or distance (dx) = 0
 cur = pn[n_stack[--n_sp]].right; // pop the next node and compare flag
 colorf = n_stack[--n_sp];

 while ((cur != TAIL) && (dx)) // nodewalk loop runs till TAIL is...
 { // hit or distance (dx) = 0
 visitCnt++; // increment nodes visited counter
 colorf == 2 ? colorf = 0 : colorf++; // roll the compare flag
 key = pn[cur].c[colorf]; // get the current compare color
 if (key > rng[colorf][0]) // test against lower range
 {
 if (key <= rng[colorf][1]) // now test against upper range
 {
 n_stack[n_sp++] = colorf; // it's in the window, so push the...
 n_stack[n_sp++] = cur; // node and color flag

 rx = pn[cur].c[R]; // get the primaries from the node
 gx = pn[cur].c[G];
 bx = pn[cur].c[B]; // and calculate the point distance
 t = (rx-r)*(rx-r) + (gx-g)*(gx-g) + (bx-b)*(bx-b);
 if (t < dx)
 { // if it's smaller than the...
 dx = t; // current best distance, save it.
 best = cur;
 }
 }
 cur = pn[cur].left; // node outside range to right...
 } // or inside range
 else cur = pn[cur].right; // node outside range to left
 }
 }
 return pn[best].index; // return the best match
}
//----------------------------------------------------------------------
int RGBBinTree::getMid( const DACTBL dacs ) const
{ int i, rx, gx, bx; // counter, distance calc vars
 int t, best = 0, dx = 3*63*63+1; // counter, best match, distance

 for (i = 0; i < PAL_SIZE; i+=3) // look through the dac palette
 {
 rx = dacs[i]; // get the current set of primaries
 gx = dacs[i+1]; // and see how far they are from ctr.
 bx = dacs[i+2];

 t = (rx-31)*(rx-31) + (gx-31)*(gx-31) + (bx-31)*(bx-31);
 if (t < dx)
 { // if less than the current best...
 dx = t; // distance, save the distance...
 best = i/3; // and color index
 }
 }
 return best; // return the best match
}
End Listings


July, 1993
Color Models


RGB isn't the only game in town




Bruce Schneier


Bruce is the author of Applied Cryptography: Protocols, Algorithms, and Source
Code in C (John Wiley & Sons, 1993) and can be contacted at 730 Fair Oaks
Ave., Oak Park, IL 60302.


What's so great about red, green, and blue? Why not orange, yellow, and
violet? Or, for that matter, why not turquoise, ochre, and periwinkle? Why not
two colors? Or four?
The answers to these questions involve the human eye. Our perception of color
begins when light passes through visual pigments in the retinal cones. There
are three pigments--red, green, and blue--with peak sensitivities at
wavelengths of approximately 580, 545, and 440 nanometers, respectively.
In 1931, the Commission Internationale de l'Eclairage, or CIE (also known as
the International Commission on Illumination) established a color standard,
deciding that all colors should be defined in terms of three principal colors:
red, green, and blue (RGB). The standard was first used in color television,
and eventually in computers. Not every color can be reproduced with this
red/green/blue color model (or any other color model for that matter) but it
suffices. More colors could be added, but the payoff would not be worth the
extra bits. RGB is here to stay.
Still, RGB isn't the only game in town. There are several other color models,
each with its strengths and weaknesses. Among those discussed in this article
are CMY, HSV, HLS, and YIQ. Additionally, there's the HVC model, described in
the text box "Putting Colors in Order." But let's start by reviewing RGB.


RGB


Think of a three-dimensional unit cube, with red along the x axis, green along
the y axis, and blue along the z axis; see Figure 1. All colors are defined
within this cube according to the degree of red, green, and blue in them. The
origin is (0,0,0) and is black. The opposite vertex is (1,1,1) and is white. Red is
(1,0,0), blue is (0,0,1), and green is (0,1,0). Yellow, which is red plus
green, is (1,1,0). And so on. The important thing is that the color model is
additive. Red and green are added together to produce yellow. Red, green, and
blue are added together to produce white.


CMY


Computer monitors are additive, but color printers are generally subtractive.
Instead of combining light from phosphors, printers coat paper with colored
pigments. The eye sees colors by reflected light, which is a subtractive
process.
Cyan, magenta, and yellow (CMY) are the complements of red, green, and blue.
For example, cyan is everything that isn't red. When white light is reflected
off cyan-colored ink, the red light is absorbed, or subtracted, and the
reflected light has no red component. To get the color red, you have to
deposit magenta (which absorbs green) and yellow (which absorbs blue), leaving
only red light to be reflected into the human eyeball.
The CMY cube is the opposite of the RGB cube. White is (0,0,0) and black is
(1,1,1). Cyan is (1,0,0), magenta is (0,1,0), green is (1,0,1), and so on.
Converting from RGB to CMY is easy:
C = 1 - R
M = 1 - G
Y = 1 - B
Many printers use a four-color process by adding black because in real life
the combination of CMY pigments produces something more akin to gray than
black. The CMYK color model, for cyan-magenta-yellow-black, is defined as:
K = minimum(C, M, Y)
C = C - K
M = M - K
Y = Y - K
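Chained together, the two conversions fit in a few lines of C++ (a sketch; the struct and function names are mine):

```cpp
#include <algorithm>

struct CMYK { double c, m, y, k; };

// RGB components in [0,1] -> CMY by complement, then pull the common
// black component out into K.
CMYK rgb_to_cmyk(double r, double g, double b)
{
    double c = 1.0 - r;                       // C = 1 - R
    double m = 1.0 - g;                       // M = 1 - G
    double y = 1.0 - b;                       // Y = 1 - B
    double k = std::min(c, std::min(m, y));   // K = minimum(C, M, Y)
    CMYK out = { c - k, m - k, y - k, k };
    return out;
}
```

Pure red, for instance, comes out as (0, 1, 1, 0): no cyan, full magenta and yellow, no black.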


HSV


Red, green, and blue may be how the human eye perceives light, but it isn't
the way the human mind perceives color. Instead, we see the color, or hue,
which is the frequency of the light. We see the luminance, or brightness,
which is a measure of the intensity of the light. The higher the intensity of
the light, the brighter it appears. We also see the saturation, or purity, of
the light. Bold primaries are very pure. Pastels are not.
Figure 2 shows a hexcone, a graphical representation of the hue, saturation,
and value (HSV) model (sometimes called hue, saturation, and brightness, or
HSB). H is represented as an angle about the vertical axis, from 0 to 360
degrees. S and V vary from 0 to 1. Pure red, which is (1,0,0) in the RGB
model, is (0 degrees,1,1) in HSV; pure green is (120 degrees,1,1) in HSV.
Listing One (page 96) provides routines in Pascal that convert from RGB to HSV
and back.
The HSV color model is a more intuitive way to choose colors. Starting with a
pure hue, you can add white and black by changing the S and V parameters,
until the color is perfect. If you have Microsoft Windows, pull up the
color-palette dialog and play with the HSV color model.
HSV is also good for reducing the amount of memory required to store color
information. With 24-bit color, 8 bits are used to store each of the R, G, and
B parameters. With HSV, things are easier. The human eye can only distinguish
about 128 different hues and 130 different saturation levels. At the yellow
end of the spectrum, the human eye can only distinguish about 23 different
shades; at the blue end it's even less. That means that humans can only really
distinguish 128x130x23=382,720 different colors. A program could allocate 14
bits to color information: 7 for hue, 3 for saturation, and 4 for brightness.
Add a lookup table for quick conversion to RGB for display on a monitor, and
you're done.
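One possible packing, using the field widths suggested above (the exact bit layout is my own illustration, not a standard):

```cpp
#include <cstdint>

// 14-bit packed HSV: 7 bits of hue (0..127), 3 of saturation (0..7),
// 4 of value (0..15). Inputs are assumed already quantized.
std::uint16_t pack_hsv(unsigned hue, unsigned sat, unsigned val)
{
    return static_cast<std::uint16_t>((hue << 7) | (sat << 4) | val);
}

void unpack_hsv(std::uint16_t p, unsigned& hue, unsigned& sat, unsigned& val)
{
    hue = (p >> 7) & 0x7Fu;
    sat = (p >> 4) & 0x07u;
    val = p & 0x0Fu;
}
```

The packed value would index the RGB lookup table mentioned above.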


HLS


The hue, lightness, and saturation (HLS) color model is quite similar to the
HSV model, and neither has any marked advantage over the other. H is the same
in both models. L ranges from 0 (black) to 1 (white). S still specifies the
relative purity of the color, but the purity is slightly different in this
case. Pure red is (0 degrees,0.5,1) and pure green is (120 degrees,0.5,1).
Conversion routines are also provided in Listing Two (page 96).



YIQ


YIQ is the color model used for U.S. commercial television. It was designed to
be backwards-compatible with black-and-white TV sets, and thus may seem a bit
odd. Y stands for luminance, or brightness--the only variable that
black-and-white television sets display. (Note that this Y is slightly
different from the V in the HSV color model and the L in the HLS model.) I and
Q contain color information, and are called the "chromaticity." You can
convert from RGB to YIQ using the following formulas:
Y = 0.30*R + 0.59*G + 0.11*B
I = 0.60*R - 0.28*G - 0.32*B
Q = 0.21*R - 0.53*G + 0.31*B
These equations assume the standard RGB NTSC phosphor, whose CIE numbers are
exactly defined. If you're finicky enough about the particular characteristics
of your monitor for this to make a difference, consult Computer Graphics:
Principles and Practice, second edition by Foley, van Dam, et al.
(Addison-Wesley, 1992) or another good graphics text.
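Dropped into C++, the conversion is a direct transcription of these formulas (a sketch; the struct and function names are mine):

```cpp
struct YIQ { double y, i, q; };

// RGB -> YIQ with the coefficients quoted in the text.
YIQ rgb_to_yiq(double r, double g, double b)
{
    YIQ out;
    out.y = 0.30 * r + 0.59 * g + 0.11 * b;   // luminance
    out.i = 0.60 * r - 0.28 * g - 0.32 * b;   // chromaticity
    out.q = 0.21 * r - 0.53 * g + 0.31 * b;
    return out;
}
```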
It is interesting to note that according to the first equation of the
RGB-to-YIQ set, green is much more important in determining luminosity than
blue. In fact, the human eye is far less sensitive to blue light than it is to
red or green light. That's why in 8-bit color RGB models, three bits each are
used to store the red and green values, while only two bits are used to store
the blue value.


Putting Colors in Order




Harry J. Smith




Harry is an engineer specializing in satellite-telemetry data processing. He
can be reached at 19628 Via Monte Drive, Saratoga, CA 95070.


In any software or hardware interface to a color display, there must be a way
of specifying the various colors that can be displayed. However, specifying
many colors (say 64, 256, or 256,000) in an orderly, understandable fashion
can become a significant problem. This article discusses a method of
converting an RGB (red, green, and blue) specification to another color
specification--HVC (hue, value, and chroma). HVC has an advantage over RGB, in
that colors can be ordered by dominant frequency (hue), luminance (value), and
the amount of color (chroma).
This method of converting RGB to HVC is based on a paradigm I developed that
consists of a geometric model for visualizing a color specification. The model
is illustrated in the two-dimensional graph in Figure 3. On this graph, each
color is represented as three points, equally spaced around the circumference
of a circle, with vertical coordinates between 0.0 and 1.0. These coordinates
are considered to be the relative intensity of the three phosphor colors: red,
green, and blue. The vertical coordinate of the center of the circle
constitutes the color's Value parameter. The angle between the radius from the
center to the green point and a vertical base line is considered the Hue of
the color, and the circle's radius is the Chroma.
Given an RGB triplet or vector, it's not easy to figure out how to plot its
three points so that they lie on a circle, so I'll develop the equations here.
To derive the model, I'll first start with an HVC triplet and derive equations
for computing RGB. Then I'll find the inverse of this transformation.
In Figure 3, notice that the angle from the vertical base to R is H + 120
degrees, the angle to G is H, and the angle to B is H - 120 degrees. It's easy
to see that R, G, and B can be described by the equations in Example 1(a).
This set of three equations can be solved for HVC, as shown in Example 1(b),
where atan2 is the arc-tangent function of two arguments (y,x), which is equal
to the arc-tangent of y/x but resolves the result to be in the proper
quadrant. This set of equations can be simplified for computations if V is
computed first, as in Example 1(c).
Looking at Figure 3, you can think of a color as a bubble floating in
two-dimensional space. If the bubble rotates through 360 degrees without
changing its position or size, it will travel through all color hues without
changing its amount of luminance or color. If it rises without rotating or
changing its size, it will be the same color hue and have the same amount of
color, but will have more white and be brighter. If it expands in size without
rotating and keeps its center fixed, it will be the same color hue and the
same luminance but will have more color and less grayness.
Once you master this way of looking at color as a two-dimensional bubble
floating above a horizontal line--rotating, rising, falling, expanding, and
contracting--it becomes relatively easy to travel through color space in an
orderly manner.
The accompanying source code implements these concepts. The implementation
language is Turbo Pascal 5.0. Listing Three (page 96) shows a procedure for
converting RGB to HVC. Listing Four (page 96) contains a procedure for
converting HVC to RGB.
 Figure 1: The three-dimensional RGB cube.
 Figure 2: Single-hexcone HSV color model.
 Figure 3: The model is a two-dimensional color bubble.
Example 1: Color-conversion equations.

 (a) R = V - C cos(H + 120 degrees)
 G = V - C cos(H)
 B = V - C cos(H - 120 degrees)

 (b) H = atan2( R - B, (sqrt(3) / 3)(R - 2G + B))
 V = (R + G + B) / 3
 C = (R - 2G + B) / (3 cos(H))
 or
 C = (R - B) / (sqrt(3) sin(H))

 (c) V = (R + G + B) / 3
 H = atan2(R - B, sqrt(3)(V - G))
 C = (V - G) / cos(H) if |cos(H)| > 0.2
 C = (R - B) / (sqrt(3) sin(H)) if |cos(H)| <= 0.2

 (d) V = (1.0 + 0.25 + 0.75) / 3 = 0.6667
 H = atan2(0.25, sqrt(3)(0.4167)) = 19.1066 degrees
 C = 0.4167 / cos(19.1066 degrees) = 0.4410



[LISTING ONE] (Text begins on page 38.)

{---- RGB -> HSV by Bruce Schneier ----}
procedure RGB_to_HSV (r,g,b : real; var h,s,v : real);
 var min, delta : real;
 begin
 v := maximum(r,g,b);
 min := minimum(r,g,b);
 if v <> 0 then s := (v - min)/v else s := 0;
 if s = 0
 then h := UNDEFINED
 else
 begin
 delta := v - min;
 if r = v then h := (g - b)/delta

 else if g = v then h := (b - r)/delta
 else if b = v then h := (r - g)/delta;
 h := h*60;
 if h<0 then h := h + 360
 end
 end;
procedure HSV_to_RGB (h,s,v : real; var r,g,b : real);
 var i : integer; f, j, k, l : real;
 begin
 if h = 360 then h := 0;
 h := h/60;
 i := trunc(h); {h >= 0 here, so trunc gives the greatest integer <= h}
 f := h - i; {f is the fractional part of h}
 j := v * (1 - s);
 k := v * (1 - (s * f));
 l := v * (1 - (s * (1 - f)));
 case i of
 0 : begin r := v; g := l; b := j end;
 1 : begin r := k; g := v; b := j end;
 2 : begin r := j; g := v; b := l end;
 3 : begin r := j; g := k; b := v end;
 4 : begin r := l; g := j; b := v end;
 5 : begin r := v; g := j; b := k end
 end
 end;

[LISTING TWO]

{---- RGB -> HLS by Bruce Schneier ----}
procedure RGB_to_HLS (r,g,b : real; var h,l,s : real);
 var min, max, delta : real;
 begin
 max := maximum(r,g,b);
 min := minimum(r,g,b);
 l := (max + min)/2;
 if max = min
 then begin s:= 0; h := UNDEFINED end
 else
 begin
 if l <= 0.5
 then s := (max - min)/(max + min)

 else s := (max - min)/(2 - max - min);

 delta := max - min;
 if r = max then h := (g - b)/delta
 else if g = max then h := 2 + (b - r)/delta
 else if b = max then h := 4 + (r - g)/delta;
 h := h*60;
 if h<0 then h := h + 360
 end
 end;
procedure HLS_to_RGB (h,l,s : real; var r,g,b : real);
 var m,n : real;
 function value (a, b, hue : real) : real;
 begin
 if hue > 360 then hue := hue - 360
 else if hue < 0 then hue := hue + 360;
 if hue < 60 then value := a + (b-a) * hue/60
 else if hue < 180 then value := b
 else if hue < 240 then value := a + (b - a)*(240-hue)/60
 else value := a
 end;
 begin
 if l <= 0.5
 then n := l * (1 + s)
 else n := l + s - l * s;
 m := 2 * l - n;
 if s = 0
 then begin r := l; g := l; b := l end
 else
 begin
 r := value (m,n,h+120);
 g := value (m,n,h);
 b := value (m,n,h-120)
 end
 end;

[LISTING THREE]

{---- HVC -> RGB by Harry Smith ----}

procedure Hvc2Rgb( H, V, C : Double; var R, G, B : Double);
 { Converts H-hue, V-value, C-chroma to R-red, G-green, B-blue }
const
 Pi = 3.14159265358979324;
 Pi2o3 = 2.0 * Pi / 3.0;
 DpR = 180.0 / Pi; { Degrees per Radian }
var
 HRad : Double;
begin
 HRad:= H / DpR;
 R:= V - C * Cos( HRad + Pi2o3);
 G:= V - C * Cos( HRad);
 B:= V - C * Cos( HRad - Pi2o3);
end; { Hvc2Rgb }

[LISTING FOUR]

{---- RGB -> HVC by Harry Smith ----}
function ATan2( Y, X : Double) : Double;
 { Arc Tangent of Y over X }
 { -Pi < ATan2 <= Pi }

const
 Pi = 3.14159265358979324;
 Pio2 = Pi / 2.0;
var
 Temp : Double;
begin
 if X = 0 then begin
 if Y > 0 then ATan2:= Pio2
 else if Y < 0 then ATan2:= -Pio2
 else ATan2:= 0;
 end
 else begin
 Temp:= ArcTan( Y / X);
 if X > 0 then ATan2:= Temp
 else if Y >= 0 then ATan2:= Temp + Pi
 else ATan2:= Temp - Pi;
 end;
end; { ATan2 }
{--------------------------------------}
procedure Rgb2Hvc( R, G, B : Double; var H, V, C : Double);
 { Converts R-red, G-green B-blue, to H-hue, V-value, C-chroma }
const
 Pi = 3.14159265358979324;
 Pi2 = 2.0 * Pi;
 DpR = 180.0 / Pi; { Degrees per Radian }
 SqRt3 = 1.73205080756887719; { Square root of 3.0 }
var
 CosHue : Double;
 HRad : Double;
begin
 V:= (R + G + B) / 3.0;
 HRad:= ATan2( R - B, SqRt3 * (V - G));
 CosHue:= Cos( HRad);
 if abs( CosHue) > 0.2 then
 C:= (V - G) / CosHue
 else
 C:= (R - B) / (SqRt3 * Sin( HRad));
 H:= HRad * DpR;
 if H < 0.0 then H:= H + 360.0;
end; { Rgb2Hvc }
End Listings



July, 1993
Image Processing Using Quadtrees


An efficient method for the compression and manipulation of raster images




Raj Kumar Dash


Raj is studying in the master's program in computer science at the University
of Guelph in Ontario. He is a main designer for a new GIS package, and is the
assistant editor for id Magazine. You can contact him at
raj@snowhite.cis.uoguelph.ca.


There are a variety of methods for processing raster images, including the use
of numeric quadcodes to represent a data structure known as the "quadtree."
While fractal compression and its cousins provide greater space savings,
quadtrees (a term I'll use in the general sense) provide reasonable savings
and retain an image's hierarchical information without loss of detail. This
means you can perform image-processing operations on a quadtree and transfer
the results of those operations when converting the quadtree back into a
raster image. These characteristics are particularly useful when you're
processing several images too large for storage in main memory. Still, there
is one limitation: Quadtrees require that source images have a size of 2^n x
2^n pixels (where n = 0, 1, 2, ...). This limitation isn't serious, and I'll
present a
workaround.
Additionally, large sparse matrices can be compressed using quadtrees since
they are effectively the same as a bitmap. Take, for example, a square, sparse
matrix that represents the node connections of a large computer network. If
the network has many nodes, and if the connections tend to cluster mainly in
local groups, then a quadtree is an ideal structure for storing compressed
link information for the network. A link between two computers is indicated by
a black node. The lack of such a link is indicated by a white node.


Quadtrees


A quadtree can be represented as either a tree data structure or a linked
list. To produce a quadtree, you recursively split an image of 2^n x 2^n pixels
into quadrants and subquadrants until all the pixels in a subquadrant are of
the same color, or the subquadrant is 1 pixel x 1 pixel. (If the size
condition of 2^n x 2^n doesn't hold, you can't partition the image into
successive quadrants and subquadrants.)
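As a sketch, here is one way that recursive split might look for a monochrome bitmap (the structure and names are mine; the later discussion replaces this pointer-style tree with quadcodes):

```cpp
#include <vector>

// A node is a leaf with a color, or an internal node with four children
// in NW, NE, SW, SE order.
struct QNode {
    int color;                  // 0 or 1 for a leaf; -1 for an internal node
    std::vector<QNode> kids;    // empty for a leaf, exactly four otherwise
};

// Recursively split a size x size block (size a power of two) of a
// monochrome bitmap until each piece is homogeneous or a single pixel.
QNode build(const std::vector<std::vector<int> >& img,
            int row, int col, int size)
{
    int c0 = img[row][col];
    bool uniform = true;
    for (int r = row; r < row + size && uniform; ++r)
        for (int c = col; c < col + size; ++c)
            if (img[r][c] != c0) { uniform = false; break; }

    QNode n;
    if (uniform || size == 1) { n.color = c0; return n; }

    int h = size / 2;           // split into the four subquadrants
    n.color = -1;
    n.kids.push_back(build(img, row,     col,     h));  // NW
    n.kids.push_back(build(img, row,     col + h, h));  // NE
    n.kids.push_back(build(img, row + h, col,     h));  // SW
    n.kids.push_back(build(img, row + h, col + h, h));  // SE
    return n;
}
```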
Figure 1 shows how to derive a quadtree from an image, illustrating that
quadcoding is particularly useful for compressing images containing large,
homogeneous blocks of color. In Figure 1(a), the image has both single black
and larger black quadrants. Figure 1(b) is the same, but the thickness of the
grid lines indicate the level of the corresponding quadrant. In Figure 1(c),
NW, NE, SW, and SE represent the northwest, northeast, southwest, and
southeast quadrants, respectively. Each black circle represents either a
single black pixel or a subquadrant of black pixels (and likewise for the
white circles). A white square indicates that its component subquadrants are
not all the same color. (Li and Loew use the term "elementary squares" to
describe any subimage of size 2^m x 2^m, where the full image is 2^n x 2^n and
n >= m >= 0. I'll use the term "quadrant" to mean the same thing, but where
m >= 1.)
Images that have a large number of individual pixels (m=0) take more space in
quadtree format than an image of equal size with large quadrants; see Figure
2. This is because of the increased number of quad partitions. Apart from this
limitation, quadtrees produce compression ratios of 20--90 percent, depending
on the contents of an image. The image in Figure 2 has no homogeneous
subquadrant larger than 1x1 and takes more space to store as a quadtree than
the image in Figure 1(a). Both images have four levels of nodes in their
quadtrees; however, all quadrants of the quadtree of the image in Figure 2
will be a single pixel. Contrast this with Figure 1(c), where some quadrants
are of size 2x2 pixels. The quadtree of the image in Figure 2 will be full,
with a total of 4^0 + 4^1 + 4^2 + 4^3 = 1 + 4 + 16 + 64 = 85 nodes.


Quadtrees for Monochrome Images


Monochrome images encoded as quadtrees offer the best space savings. Since
there are only two "colors," you need only store information for one color
explicitly and can imply information for the second. Suppose the images you're
processing have large quadrants of color 0 (say, white). To save the most
space, store information for only color 1 (say, black). Figure 3, for
instance, shows the quadtree for Figure 1(a), the same quadtree as in Figure
1(c), with the white circles (subquadrants) removed. Instead of allocating
space in the quadtree for missing nodes, set the corresponding child pointers
in their parents to NULL.
Notice there are 11 leaf nodes in this tree, compared to the 28 leaves in
Figure 1(c), a savings of more than 60 percent.


Pointerless Quadtrees for Monochrome Images


A "quadcode" is a mathematical notation (for monochrome images) that
eliminates quadtree pointers. The savings can be enormous; at least 66 percent
over that of regular quadtrees, according to Gargantini.
You can represent an entire quadtree with a sequence of numeric strings
(quadcodes) by designating a subquadrant at any level by a quadcode from 0 to
3, preceded by the quadcode for its parent quadrant. (NW=0, NE=1, SW=2, and
SE=3; thus, the SW child quadrant of NW has a quadcode of 02.) Figure 4
shows how pointerless quadtrees are generated using quadcodes. Each quadrant
is assigned a
numeric code from 0 to 3, preceded by its parent quadrant's code. If the image
is 2nx2n, the longest possible quadcode is n digits long, and it represents a
quadrant of one pixel. (Parent codes are shorter than their child codes.)
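A single pixel's quadcode can be read straight off the bits of its row and column, one bit of each per digit (a sketch; the function name is mine):

```cpp
#include <string>

// Quadcode of the single pixel at (row, col) in a 2^n x 2^n image.
// Each digit combines one bit of row and column: NW=0, NE=1, SW=2, SE=3,
// so digit = 2*(row bit) + (column bit), most significant bits first.
std::string pixel_quadcode(unsigned row, unsigned col, unsigned n)
{
    std::string code;
    for (int i = static_cast<int>(n) - 1; i >= 0; --i) {
        unsigned rbit = (row >> i) & 1u;
        unsigned cbit = (col >> i) & 1u;
        code += static_cast<char>('0' + 2u * rbit + cbit);
    }
    return code;
}
```

Truncating the result gives the quadcode of any enclosing quadrant, which is exactly the parent-prefix property described above.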


Multicolor and Gray-scale Images


Color and gray-scale images can be represented by either quadtrees or
pointerless quadtrees (quadcodes), but quadtrees for color images require more
space than monochrome images of the same size. Still, their quadtrees give
favorable space savings, provided there are large quadrants of color. As with
monochrome images, multicolor or gray-scale quadtrees imply information for
one color--say, white. In Figure 5, you can eliminate all the white circle
nodes to save even more space.
To create pointerless quadtrees for multicolor and gray-scale images, store a
pair of values for each region of the image. The first value is the quadcode,
the second, the color of the quadrant. For example, the image in Figure 5 is
quadcoded as listed in Table 1. Note that you don't code color 0 (white)
regions. For images with up to 256 colors, represent each color by a single
byte value from 0 to 255.


Quadtrees for Rectangular Images


Up to now, I've discussed only square images, although most computer screens
have a rectangular bitmap (768x1024, 480x640, and so on). The easiest way to
produce a quadtree for rectangular images is to use the smallest square grid
that fully contains the image. For example, assume you have an image that's
480x640 pixels. The smallest square grid that fully contains this image is
1024x1024. Why? If you use n=9, you have a grid of size 512x512. That's enough
to accommodate the 480-pixel dimension, but not the 640-pixel dimension. The
only choice, then, is n=10, or a grid of 1024x1024. Treat the excess pixels as
"don't-care" colors, chosen to minimize the number of subquadrants needed to
partition the image. Figure 6 shows a quadtree for a 7x6 pixel image. The 7x6
image in Figure 6(a) needs to be contained in an 8x8 grid before we can create
its quadtree. Notice in the augmented image in Figure 6(b) that the new row
has pixels of both colors, whereas the two new columns have white pixels. This
particular ordering ensures a minimal quadtree.
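A small helper makes the grid-size argument concrete. This is a sketch, not part of the article's code:

```c
/* Smallest 2^n x 2^n grid that fully contains a w x h image (sketch):
   double the side until both dimensions fit. */
int smallest_grid(int w, int h) {
    int side = 1;
    while (side < w || side < h) side *= 2;
    return side;
}
```

For the examples in the text, smallest_grid(640, 480) yields 1024 (512 is too small for the 640-pixel dimension) and smallest_grid(7, 6) yields 8.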


General Quadtree Operations


One advantage of quadtrees is that image manipulation becomes faster since
operations apply to whole regions of pixels simultaneously. This is especially
valuable when you can't read an entire image into memory. Another advantage is
that all operations performed on a quadtree transfer back to the image during
conversion.
It's no problem if you have an augmented rectangular image and alter some of
the excess pixels. Why? Because you ignore excess pixels when converting the
quadtree back into a raster image, so it does not matter what operations you
perform on "don't-care" pixels.

One quadtree operation is adjusting contrast by changing pixel colors. Suppose you have
a gray-scale image that has large blocks of very light-gray pixels that you
want to change to provide more contrast. By changing the color of the
subquadrants containing the target pixels, you also change the component
pixels. (This becomes obvious after the quadtree is converted back to its
associated image.) See Figure 7.
Masking is an operation that "cuts" shapes in an image. Say you have a square
monochrome image and you want to stamp it with the word "hi." Create the
"mask" image first, then apply the mask by XORing it with the test image;
refer to Figure 8.
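A minimal sketch of the XOR masking step, assuming a monochrome image packed eight pixels per byte:

```c
#include <stdint.h>

/* Masking sketch: XOR a mask bitmap into an image, one byte (8 pixels)
   at a time. Applying the same mask twice restores the original image. */
void apply_mask(uint8_t *image, const uint8_t *mask, int nbytes) {
    for (int i = 0; i < nbytes; i++)
        image[i] ^= mask[i];
}
```

XOR is a natural choice for stamping because it is its own inverse: a second application with the same mask undoes the first.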


Spatial Operations


A number of quadtree operations deal with the spatial attributes of an image,
including those that determine the color of a particular pixel and calculate
the area covered by a given color.
Attributes of a Given Pixel. Given either the quadcode or the row/column
position of a pixel, you can easily determine its color. (Note that row/column
values range from 0 to 2^n - 1.) You first convert the row/column pair to the
equivalent quadcode by converting both the row and column values into
equivalent binary (base 2) strings. You then pad the strings on the left with
0s so that each string is n digits long. Then, for each pair of bits from left
to right, use Table 2 to determine the equivalent quadcode digit.
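As a sketch (not the article's downloadable code), the Table 2 conversion reduces to a one-liner per digit: each base-4 quadcode digit is 2*(row bit) + (column bit), taken most-significant bit first:

```c
/* Sketch of the Table 2 conversion: pair up the n row bits and n column
   bits (MSB first); each (row bit, column bit) pair is one base-4 digit. */
void rowcol_to_quadcode(int row, int col, int n, char *qcode) {
    for (int i = 0; i < n; i++) {
        int rbit = (row >> (n - 1 - i)) & 1;
        int cbit = (col >> (n - 1 - i)) & 1;
        qcode[i] = (char)('0' + 2 * rbit + cbit);
    }
    qcode[n] = '\0';
}
```

With row=2, column=2, and n=3 (as in the example below), this produces the quadcode "030".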
Now search the quadtree for either the calculated quadcode or an ancestor's
code. If you can't find either, the pixel must be white (since this is the
color left out of the quadtree). For example, suppose you use the pixel at
row=2, column=2 (both with origin 0), as in Figure 5(a). Both the row and column
values convert to the binary string 010. Since the first pair of bits consists
of 0 and 0, the corresponding qcode digit is 0. The next qcode digit is 3,
followed by 0: qcode=030. You now search the list of quadcodes for 030 or an
ancestor code, such as 03 or 0. Since the parent quadrant (qcode=03, color=1)
is present in the quadtree, the pixel's color is 1.
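A sketch of this ancestor search over a quadcode list (the qnode type is my illustration, not from the article's code):

```c
#include <string.h>

/* Ancestor-search sketch: scan a (quadcode, color) list for the pixel's
   quadcode or any prefix of it (an ancestor). In a well-formed quadtree,
   leaves never nest, so at most one entry matches. A miss means the
   implicit color (white, coded 0 here). */
struct qnode { const char *code; int color; };

int pixel_color(const struct qnode *list, int n, const char *qcode) {
    for (int i = 0; i < n; i++) {
        size_t len = strlen(list[i].code);
        if (strncmp(list[i].code, qcode, len) == 0)
            return list[i].color;   /* exact match or ancestor */
    }
    return 0;                       /* the color left out of the tree */
}
```

Given the list entry (03, color 1), looking up quadcode 030 returns 1, exactly as in the worked example above.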
Area of a Given Color. The area covered by a particular color, c, is the sum
of the areas of its component quadrants: A(color c) = sum of A(q_i) over all
leaf quadrants q_i of color c.
A single quadrant whose quadcode has length m has an area equal to
2^(n-m) x 2^(n-m), where n >= m >= 0. For example, the quadcode 03 in Figure
5(a) has color 1. Its area is 2^(3-2) x 2^(3-2) = 2x2 = 4. (To determine the
total area of color 1, search
the quadtree for all leaf nodes having color 1, calculate each area as above,
and then sum the individual areas.)
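The per-quadrant area formula can be sketched directly (my helper, not the article's code):

```c
/* Area sketch: a leaf quadrant whose quadcode has m digits, inside a
   2^n x 2^n image, covers 2^(n-m) x 2^(n-m) pixels (n >= m >= 0). */
long quadrant_area(int n, int m) {
    long edge = 1L << (n - m);  /* side of the quadrant in pixels */
    return edge * edge;
}
```

For the example above, quadrant_area(3, 2) is 4; an m-digit code equal to n gives a single pixel, and m=0 gives the whole image.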


Manipulation of Two or More Quadtrees


Quadtrees can be used to manipulate several images simultaneously. In
particular, you can perform image overlays and intersections on quadtrees when
the associated images do not fit into memory. For example, take two images of
size 1024x1024 pixels that have 512 nodes apiece in their respective quadtree
structures. An image overlay then takes 512 operations instead of 1,048,576
(1024x1024).
Image Overlays. Monochrome-image overlay operations include AND, OR, and XOR;
see Figure 9.
Image Intersection. For monochrome images, you perform image intersections
with the AND operator. For color and gray-scale images, the method is somewhat
different, although the principle is the same. In this case, you test two
corresponding pixels for equality of color, instead of True (color white) and
False (color black) values. (Choose the color of the first image's pixel if
the two pixels are not the same.)


Implementing Quadtrees


On a UNIX system, you can read into memory the entire bitmap of fairly large
images, something you can't always do under MS-DOS. (Some of the newer C
compilers allow for large data blocks, but we'll assume that such a compiler
is not available.)
You need to devise a method that will produce a quadtree while processing only
two rows of an image at a time. (You read two rows at a time to produce both
1x1-pixel and 2x2-pixel quadrants simultaneously.) Space constraints don't
allow me to go into detail about the code implementation. However, C source
and sample input files that implement quadtrees are available electronically;
see "Availability," page 5. Note that the C code assumes that an input image
file is in the form of a 2^n x 2^n matrix of integers--one matrix row per file
record. Each quadtree is implemented as a pointer-based data structure: each
node consists of a triplet (quadcode, color, childcount) and has four pointers
to child nodes.
I tested the code with Microsoft QuickC 2.0. With a few small modifications,
the code should work with other C compilers. Since much of the code is
recursive, remember to increase stack size before compiling. The code is
presented as a library of functions, along with a sample main program. (I'm
also developing a more comprehensive set of analysis tools that should be
implemented later in the year. Please contact me by e-mail for information.)


Summary


Quadtrees are particularly useful for spatial analysis in desktop mapping,
geographical information systems, pattern recognition, and CAD applications.
If you're interested in learning more about them, I highly recommend Hanan
Samet's detailed paper, "The Quadtree and Related Hierarchical Data
Structures." Articles by Gargantini and Li and Loew give good accounts of
pointerless quadtrees and quadcode operations. Manohar et al. describe a
variation called "template quadtrees" that theoretically reduces storage even
further.


References


Gargantini, Irene. "An Effective Way to Represent Quadtrees." Graphics and
Image Processing, edited by James Foley. Communications of the ACM (December
1982).
Li, Shu-Xiang, and Murray H. Loew. "Adjacency Detection Using Quadcodes."
Image Processing and Computer Vision, edited by Robert Haralick.
Communications of the ACM (July 1987).
Li, Shu-Xiang, and Murray H. Loew. "The Quadcode and its Arithmetic." Image
Processing and Computer Vision, edited by Robert Haralick. Communications of
the ACM (July 1987).
Samet, Hanan. "The Quadtree and Related Hierarchical Data Structures."
Computing Surveys (June 1984).
 Figure 1: (a) Sample image; (b) same image but with level of corresponding
quadrant indicated; (c) the quadtree for (a).
 Figure 2: Sample image which has no homogeneous subquadrant larger than 1x1.
 Figure 3: Same quadtree as for Figure 1(c) but with the white circles
(subquadrants) removed, resulting in savings greater than 60 percent.
 Figure 4: Generating pointerless quadtrees using quadcodes.
 Figure 5: (a) Multicolor image; (b) multicolor quadtree.
Table 1: The image in Figure 5 is quadcoded like this.
NW: (003,1), (012,1), (013,1), (021,1), (023,1), (03,1)
NE: (103,2), (112,2), (113,2), (121,1), (123,2), (13,2)
SW: (202,2), (203,2), (22,2), (23,1)
SE: (3,3)
 Figure 6: (a) Sample 7x6 image that needs to be contained in an 8x8 grid
before we can create its quadtree; (b) augmented image.
 Figure 7: Contrasting: Change all colors < 4 to color 1 and the colors > 3 to
color 7. The result is a high-contrast image.
 Figure 8: Filtering: (a) image; (b) mask; (c) masked (XORed result).
Table 2: Determining equivalent quadcode digits.
 Row bit   Column bit   qcode digit (base 4)
 0         0            0
 0         1            1
 1         0            2
 1         1            3
 Figure 9: (a) Image A; (b) image B; (c) C=A AND B; (d) C=A OR B; (e) C=A XOR
B.

























































July, 1993
Debugging Real-time Systems


Informative breakpoints, in-core event traces, and timer dividing get the job
done




Joseph M. Newcomer


Dr. Joseph M. Newcomer received his PhD in the area of compiler optimization
from Carnegie Mellon University in 1975. He has done real-time programming for
28 years and can be contacted at 610 Kirtland St., Pittsburgh, PA 15208.


A number of analysis and programming techniques help guarantee correct performance
when building complex real-time systems. Many such techniques, however, don't
"scale down" to relatively trivial systems. For instance, techniques such as
"rate monotonic analysis" can tell you how to schedule and prioritize tasks in
a multithreaded, real-time control system (a nuclear-reactor control system or
an avionics system, for example), but they don't help much when you're writing
a single-threaded device driver.
Moreover, a lot of real-time code does not have a debugger interface. On early
PCs, Debug, Symdeb, and later CodeView were unusable for real-time debugging
because they rely on DOS for I/O. Consequently, if a breakpoint set in a
real-time interrupt handler was taken while your program was in DOS, these
debuggers crashed the system or even corrupted the disk because of DOS's
non-reentrancy. I finally ended up using a program called the IBM
Professional Debug Facility, which went to the bare metal for I/O. I later
switched to Nu-Mega's Soft-Ice, even though the first version didn't support
symbolic debugging.
This article presents key techniques for debugging real-time, embedded, or
device-driver code: informative breakpoints, in-core event traces, and timer
dividing. You'll also find these techniques useful when using sophisticated
debuggers such as Soft-Ice.


The Breakpoint


An obvious starting point is the simple breakpoint. The first thing a
breakpoint can tell you is whether or not your program got to a specific piece
of code. For example, if you set a breakpoint at the basic interrupt handler
for a device, and the interrupt isn't taken when there's input pending, you
know you've failed to set up some basic property, such as enabling the
interrupt, or even that you have another board interfering with the interrupt
line.
With debuggers, breakpoints replace the instruction at the desired address
with an instruction that transfers control to the debugger, such as the INT 3
instruction on 80x86 machines. When the breakpoint is taken, the original
contents of the instruction are restored, the instruction is executed with the
trace-trap mode bit enabled so a trap is taken immediately after the
instruction is executed, and the breakpoint instruction is restored for the
next time. Many simple monitor programs (like those on a number of "bare
boards") aren't sophisticated enough to handle the full resume-from-breakpoint
protocol, and execute the instruction replaced by a breakpoint. You can
usually examine and deposit memory, and many will even report that a trap has
occurred and continue at the instruction following the trap, but they can't
actually do real breakpoint handling. Consequently, I've developed a technique
for setting breakpoints in spite of these limitations.
I introduce a NOP instruction at the place I want the breakpoint, then
reassemble, relink, and (on embedded systems) download the new code to the
target. Working with the monitor program, I deposit into the appropriate NOPs
the trap instruction to the monitor. There is no need to restore the original
opcode of a (possibly) multibyte instruction and single-step across it.
Without a symbolic debugger, however, you don't know where that instruction is
until you convert the NOP address based on the relocation of the code and the
offset of the instruction, which also means you need a full assembly listing.
This isn't so bad until you're doing something like writing a DOS device
driver. In this case your only (machine-readable) assembly listing is on the
machine you are debugging on, so you can't examine it when the interrupt
handler is running. Printing the listing in its entirety every time would
consume a lot of time and paper. My approach is the equivalent (in whatever
system I'm using) of the MASM construct in Example 1(a). To determine the
breakpoint's location, I print out the link map. If the link map is large, I
only print the segment of the link map containing all names starting with
B_P_, which is certainly smaller than the entire listing or map.
This solves half of the problem--the part that tells you the location of the
NOP address that you're going to change to a breakpoint instruction--but this
isn't sufficient. What I really need to know when the breakpoint is taken, is
exactly which breakpoint I hit. This means I have to write down the (absolute)
breakpoint address each time I set it, which for a segmented architecture can
be a real pain.
I modified the code sequence in Example 1(a) to save a register, load the
breakpoint number into that register, do the NOP, and finally restore the
register. This does introduce some overhead; if the breakpoint is never taken,
we still have to execute the instructions. Example 1(b) shows the new code,
which can be made into a macro. When the breakpoint is taken, I only have to
examine the AX register to see the breakpoint number. This means that once
I've set the breakpoints, the forebrain is no longer involved in the
recognition process. Since the breakpoint number remains the same on each
compilation, I don't need to remember a large, complex, and different number
each time; once I've remembered that breakpoint 17H is the input-interrupt
entry breakpoint, I can retain that single fact.
Now suppose I want to look at some variable or set of variables at the time a
breakpoint is set. Rather than continuing to decode the variable addresses
from the link map, I can just push a few more registers and load either the
values into the registers, or pointers to the values into the registers.
Typically, on an 80x86 machine these pointers are DS:relative, so I only need
the 16-bit offset for a pointer. For certain breakpoints in the queue manager
of the device driver I was working on, for example, I loaded the BX register
with the base of the active input or output queue (circular character
buffers). Thus, by displaying the contents of memory found at DS:BX, I could
see what was in the queues.


Moving Toward Real-time


The breakpoint technique works only when you are trying to debug the basic
logic flow; that each breakpoint may result in an interruption of the flow
drastically changes the real-time behavior of the program. Eventually, you're
convinced that you now have correct logic flow, but when running without the
debugger, the program still crashes in some way. What now?
Anyone who's suffered through an intermediate-level college software course
knows the terms, "invariant" and "output condition." An invariant is, in the
formal-proof sense, a condition that's true before and after a segment of code
executes. An output condition is a condition that must be true when a piece of
code completes. Both are useful when applied correctly--even to something as
grubby as an interrupt handler.
My problem was that the interrupt routine I was working on had several exit
points; I was sure that one of them was not restoring the interrupts, but I
couldn't see how this was failing. So I added an assembler routine that simply
checked the status of the interrupt-enable flag; if it was set for interrupts
disabled, I enabled interrupts and took a breakpoint. I then called this from
the main-event loop in the C code. Sure enough, about five seconds into the
processing, I trapped with AX=0xEEEE. After careful study, I found the
serial-line interrupt routine had been "optimized" for performance; if, just
before returning to the user code, it detected another character had arrived,
the routine called itself recursively. I'd seen this, including the comment,
"Hope the stack doesn't overflow," but because of the bursty nature of the
incoming data, I suspected this warning did not indicate the real cause. It
turns out that the operative term here is "called." It did a far-call of the
entry point. It even set a variable to tell if it should do a RETF or an IRET.
(Why it didn't fake an INT or do a PUSHF first is a mystery known only to the
original author.) The bug became obvious; the return-by-RETF flag wasn't
cleared on the return, so the eventual return to the main code was via RETF,
and the flags were not restored! Of course, this meant the stack was out of
sync, and the next return done at the higher level would crash.
This crash did not occur, however, because the higher-level code was testing
to see if there was a key struck, if a Ctrl-Break had occurred, or if an
entire input packet had been received from the interface. Because interrupts
were disabled, a keypress would never be seen, and a packet could never be
fully received, so it effectively locked up. (Actually, if this race condition
had occurred on exactly the last byte of a packet, the return would have
occurred and the C code would have crashed, but the condition always seemed to
occur in the middle of a packet.)
On a PC, some of this state is easily monitored. For example, you can put in
conditional code that writes directly to the display memory. No DOS or BIOS
interactions at all; for example, writing a status character to location
B800:9E and a display-characteristics byte to B800:9F will display that
character in the top-right corner of your VGA/EGA screen. This works fine for
character-mode displays; you have to do more work if your screen is graphical.
Thus, if you set up the assembly-code routine called from C to display a blue
background E for interrupts enabled and a red background D for interrupts
disabled, you can instantly see your last state when the system hangs. If you
see a D, you've violated the interrupts-enabled invariant and know where to
start looking.


Event Traces


One of the most useful debugging techniques is an event trace, sometimes known
as the "debug print" statements scattered throughout our code. But how do you
do an event trace when you can't print? The reasons for not printing are many:
DOS is not re-entrant, you may not have a printer, using the printer is not
possible in real time, and so on.
I keep the event trace in a circular buffer in memory. I first fill the buffer
with 0s, then guarantee that the last entry is always followed by a 0 entry
(making it easy to locate the end of the buffer). I've found that 256 events
are more than enough for most problems. This technique is akin to using a
logic analyzer, but cheaper and easier to analyze.
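A C analogue of this scheme (the article's assembly version appears in Listing One) might look like the following sketch:

```c
#include <string.h>

/* In-core event trace sketch, after the scheme above: a fixed circular
   buffer, pre-zeroed, with the slot after the newest entry held at 0 so
   the end of the trace is easy to spot in a memory dump. */
#define EVLEN 256

unsigned char events[EVLEN];
int event_ptr = 0;

void trace_init(void) {
    memset(events, 0, sizeof events);
    event_ptr = 0;
}

void trace_event(unsigned char ev) {
    event_ptr = (event_ptr + 1) % EVLEN;     /* advance, wrapping around */
    events[event_ptr] = ev;                  /* record the event code */
    events[(event_ptr + 1) % EVLEN] = 0;     /* zero sentinel marks the end */
}
```

Like the assembly version, this stores the first event at index 1 and always keeps a zero just past the newest entry.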
To put this in context, I was interfacing a new real-time data collection and
an external device-controller board to an existing data-analysis/control
package. I had to find all the places where things were done to the old card
and add code to handle the new card. The cards were radically different; the
old card had a packet buffer and interrupted every packet, while the new card
interrupted every byte, and I had to handle the packet decode myself. So I had
to retrofit this to what was already a very convoluted driver, consisting of
about ten thousand lines of assembler.
It took three weeks to figure out how to insert the new board driver into the
old code. We should have written a real DEVICE= style driver for each of the
boards, or at least a TSR driver, but the company insisted that, for business
and product-compatibility reasons, the device driver(s) be embedded in the
application program and be able to determine which of the two boards were
installed in the machine.
When I finally tested the integrated old-and-new code, it worked fine for the
old board but failed for the new one. After painstaking debugging using the
settable-breakpoint technique, I knew the basic control flow was correct. I
needed to figure out what was going wrong under full-speed operating
conditions.
I eventually added event logging for the following events: input-interrupt
routine entry; input-interrupt routine exit; output-interrupt routine entry;
output-interrupt routine exit; input-interrupt routine (places character in
input queue); output-interrupt routine (removes character from output queue);
set interrupt enable (all places STI instructions were executed); set
interrupt disable (all places where CLI instructions were executed);
device-input interrupt enable (set the IE bit in the device register);
device-input interrupt disable (clear the IE bit in the device register);
device-output interrupt enable (set the OE bit in the device register);
device-output interrupt disable (clear the OE bit in the device register);
timer-interrupt routine entry; timer-interrupt routine exit; timer-interrupt
enable; timer-interrupt disable; load timer; output-queue entry insertion
(called from the C code); output-queue entry removal (called from the
interrupt routine); input-queue entry insertion (called from the interrupt
routine); and input-queue entry removal (called from the C code).
Each event had 32 bits of data: an 8-bit event code, an 8-bit status value,
and a 16-bit value. The event code was just a unique integer representing each
of the aforementioned events. The 8-bit status was dependent on the event, but
in the case of the card interrupts was one of the device registers. The 16-bit
value was also dependent on the type of the event; for example, for queue
management it was the 8-bit count of items in the queue, and the 8-bit
character value either inserted or removed. For the load-timer operation, it
was the 16-bit value loaded into the timer, and the status value was the
timer-control register bits.
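The 32-bit record layout described above can be sketched as follows (the field positions and byte order are my assumption; the article doesn't specify them):

```c
#include <stdint.h>

/* Packing sketch: each trace record is 32 bits -- an 8-bit event code,
   an 8-bit status byte, and a 16-bit event-specific value. */
uint32_t pack_event(uint8_t code, uint8_t status, uint16_t value) {
    return ((uint32_t)code << 24) | ((uint32_t)status << 16) | value;
}

uint8_t  event_code(uint32_t rec)   { return (uint8_t)(rec >> 24); }
uint8_t  event_status(uint32_t rec) { return (uint8_t)(rec >> 16); }
uint16_t event_value(uint32_t rec)  { return (uint16_t)rec; }
```

Keeping the record a fixed 32 bits makes the circular buffer trivially indexable from a debugger memory dump.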
With this technique, I could detect that I was failing to enable output
interrupts when a certain control command went out. It was a classic
oversight, but I might have looked at the code for days without finding it.
Instead, the trace showed it immediately, and I spent only four hours writing
and inserting the trace-event code. A version of the newevent macro that
records only an 8-bit event code is in Listing One (page 98). Note the use of
conditional compilation. Having inserted these probe points, a simple setting
of the conditional flag will either generate the probes or not. Thus the
debugging tool is always available when you need it.
All names in Listing One are declared PUBLIC using C naming conventions. I did
this because I got tired of decoding the bytes "by hand" using the debugger.
It turns out I needed to use this technique many times as we enhanced the
functionality of the interrupt handler to deal with more sophisticated control
and more complex input packets. I finally spent an hour and modified the
top-level event loop in the C code to use a piece of the screen real estate
and display the last ten events in symbolic form. Once I did this, debugging
went even faster, because the screen update was interruptible and didn't
interfere with the real-time performance at all.


Timers



I once hit a nasty situation due to a bug in the IBM Professional Debug
Facility (PDF). It seems that the PDF restores the flags (and effectively does
an STI) sometime before actually doing the IRET to return to the breakpoint. It
was not re-entrant. When I started debugging with the timer enabled, I
discovered that if I took any breakpoint, but had a breakpoint set in the
timer routine, the system would crash when I said "proceed," because the timer
interrupt was pending and taken as soon as the interrupts were enabled. The
breakpoint in the timer routine caused the debugger to be entered before it
had fully exited, and it was not prepared to deal with this. It took some
careful analysis to determine the cause, but I finally isolated it to the
situation just described. (Unfortunately, I got no vendor support for this
bug; there didn't even seem to be a mechanism for reporting it!)
The problem caused by this debugger bug was that if I took a breakpoint once
the timer was running, I was doomed. What to do? The answer was simple: Turn
the timer off before the breakpoint is taken and restore it afterwards. So I
modified the macro in Example 1(b) to that in Example 1(c).
The routines DBT_OFF and DBT_ON simply stopped the timer and restarted it if
it was on. Again notice that this actually takes executed instructions whether
the breakpoint is taken or not. For super-time-critical applications, this may
be unacceptable. For many applications, the stop/start of the timer may be
unacceptable unless the breakpoint is actually taken. These complicate life
somewhat.
The principle, "Don't design dynamically stable systems; your system should
function at any clock speed, including 0 Hz," is harder to accomplish in
hardware in an era of dynamic RAMs, but we can apply it to software. If we can
safely ignore the impact of slow performance on the external device, we should
be able to build into our "real-time" system the ability to run the clock at
any speed. I've done this in a number of systems. For one thing, it slows down
the interactions to where you can see them; instead of characters coming out
at full 9600 baud, I might emit one character every five seconds, giving me
enough time to actually see the effect. By slowing the clock speed, you can
often take time to do screen displays of what is going on for analysis.
The technique of slowing the clock down is quite simple, and is shown in
Listing Two (page 98). The variable _cdiv can be set from a high-level
debugger, a command to the program, a command-line argument, and so on. For
example, in one debugging scenario, I loaded the address of the divisor into
the D0 register before taking the breakpoint, and saved the contents of the D0
register after the breakpoint. So to set the clock divisor, all I had to do
from the monitor was change the value in the D0 register!
Set to 0 or 1, the clock interrupts are handled at full speed; set to any
larger value, the clock is effectively divided by that amount. Of course, you
could accomplish the same effect by changing the clock-time value.
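In C, the divisor logic of Listing Two amounts to something like this sketch:

```c
/* C analogue of the Listing Two divisor logic (a sketch): the timer
   handler body runs only once per cdiv ticks; 0 or 1 means full speed. */
int cdiv = 0;   /* divisor, settable from a debugger or the command line */
int cval = 0;   /* countdown to the next handled tick */

int clock_tick(void) {          /* returns 1 if the handler body should run */
    if (--cval > 0) return 0;   /* divided-out tick: just return */
    cval = cdiv;                /* reload the countdown for next time */
    return 1;                   /* run the real handler body */
}
```

With cdiv set to 3, exactly one tick in three reaches the handler body; with cdiv at 0 or 1, every tick does.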
In practice, though, I usually make an error in doing this and don't get the
expected result; and if I want a 100:1 slowdown, I can't fit a large enough
value into the 16-bit clock register (such as I found on 80x86 boxes). Slowing
the clock down by the divisor technique is simple and lets me get a 65535:1
slowdown if I need it. Debugging techniques should not be so complicated that
they themselves may contain bugs.
As a further consideration, the program itself should be impervious to timing.
For example, in doing certain RS-232 packet protocols, time-out becomes an
important consideration for robustness. But if you're debugging the sender,
the receiver will almost certainly time out. In this case, you need to have an
option that allows you to disable time-out at the receiver, either by
disabling it entirely, or slowing it down enough to allow for debugging of the
transmitter. In addition, you need an option that will force a time-out
condition at the receiver, even if it is a lie. Why? So you can check that the
time-out recovery actually works!


Using High-level Language Interrupt Handlers and ROM Code


With a number of extensions to C compilers (Microsoft and Borland C, for
example), you can declare a C procedure to be of type interrupt and dispense
with any assembly-code interface at all. Such a system means that all the code
is compiler generated, and the NOP trick can't be done.
Assume that we have a low-level debugger that can run on the bare metal, such
as the IBM Professional Debug Facility, and that this debugger will intercept
the interrupts. (Obviously you want to use Soft-Ice if you have a 386 or
higher, but for an 8088/186/286 you need the help; alternatively, you may have
an embedded system with all of the code in C but no real debugger.)
What I did in this case was write a procedure very similar to that in Listing
Three (page 98), which is written in Microsoft C. I set a value in the bpts
array to determine which of the 50 breakpoints I want to enable, and if it is
selected, I simply set its index in the AX register and take a debugger
breakpoint. For example, to take the breakpoint identified as breakpoint
number 7, the call is simply bpt(7).
Note the use of #pragma to suppress the generation of stack-overflow tests.
When you're writing interrupt-level code, you don't want the program calling
DOS to exit because of stack overflow! You might find yourself trying to
re-enter DOS, with the expected serious consequences. (No, you don't want to
know how I found this out. It did involve a full reformatting of the disk and
a tape restore.)
For ROM code, the same technique applies. However, it may be necessary to
explicitly zero out the breakpoint vector kept in RAM, or to use an
initialization clause with the declaration, to guarantee that no unwanted
breakpoints are taken because of random trash left in memory from the last
download.
To be able to quickly find the base of the bpts vector, I use the same
technique of taking a breakpoint with a desired value in some register. When
the program starts, I call show_bpts and write down the values for DS:AX,
which gives me the base of the table. Thus, when I'm at interrupt level with
no symbols available, I can readily compute the address of the
breakpoint-enable flag. By using symbolic breakpoint names equated to hex
index values, I can easily compute the correct address of a
breakpoint-enabling flag. At no time do I need an assembler listing or even a
segment of link map.


Conclusion


While nothing substitutes for careful design and coding, the complexity of our
systems--and in particular the complexity of either retrofitting to an
existing system or interfacing to a complex (and poorly documented) interface
protocol--is a trap for even the most careful. When the full power of tools
like source-level debuggers is unavailable, the techniques described here have
significantly improved productivity.

Example 1: (a) Determining the location of the NOP address that you're going
to change to a breakpoint instruction; (b) this modified version of Example
1(a) saves a register, loads into that register the breakpoint number, does
the NOP, and restores the register; (c) this modified version of Example 1(b)
turns the timer off before the breakpoint is taken and then restores it.
(a)

 PUBLIC B_P_n
B_P_n: NOP


(b)

 PUBLIC B_P_n
 PUSH AX
 MOV AX,n
B_P_n: NOP
 POP AX


(c)

 PUBLIC B_P_n
 PUSH AX
 MOV AX,n
 CALL DBT_OFF
B_P_n: NOP
 CALL DBT_ON
 POP AX

[LISTING ONE]

if debug_events
evlen=256
 public _event_len
 public _event_ptr

 public _event_last
 public _event_events
_event_len dw evlen
_event_ptr dw 0
_event_last db 0
_event_events db evlen dup(0)
endif

newevent macro ev
 local L1
if debug_events
 inc _event_ptr
 cmp _event_ptr,evlen
 jl L1
 mov _event_ptr,0
L1: push bx
 mov bx,_event_ptr
 mov _event_events[bx],ev
 mov _event_events[bx+1],0
 mov _event_last,ev
 pop bx
endif
 endm


[LISTING TWO]

 PUBLIC _cdiv
_cdiv dw 0
cval dw 0

_clock proc far ; interrupt entry point
 ... ; interrupt prolog and setup
 dec cval ; decrement divisor
 jle ok ; take interrupt
 jmp cexit ; clock exit
ok: mov ax,_cdiv ; set up for next time
 mov cval,ax ; ...
 ... ; body of clock interrupt handler
cexit: ... ; epilog code
 iret ; return from interrupt
_clock endp


[LISTING THREE]

#pragma check_stack(off)
char bpts[50];
void bpt(int n)
 {
 if(bpts[n])
 _asm{
 mov ax,n;
 int 3;
 }

 }

[LISTING FOUR]


void show_bpts()
 {
 register char * where = bpts;
 _asm {
 mov ax,where;
 int 3;
 }
 }
End Listings





July, 1993
Distributed Computing Now: Development Environments


Middleware takes the pain out of distributed computing application development




Lowell S. Schneider


Lowell, a cofounder of Ellery Systems, has 25 years of software-development
experience, most of which have been spent in distributed database and
computing systems. He can be contacted at lss@esi.com.


In the first (June 1993) installment of this article, we discussed both the
NASA Astrophysics Data System (ADS) and the Earth Data System (EDS),
applications typical of emerging distributed computing systems. This month,
we'll look under the hood and examine the type of tools and techniques
required to build large-scale distributed systems like the ADS.
The environment in which the ADS and EDS are written is the Ellery Open
Systems (EOS), an interpreted runtime middleware that runs on top of the OSF
standard Distributed Computing Environment (DCE), allowing access to existing
programs as DCE servers. EOS was designed to provide programmers with scant
knowledge of the DCE API a way to begin using DCE at minimum cost. Using a
language called C-Lite (C-Like Interpreted Teleprocessing Environment), EOS
gives DCE what Basic gave the PC over a decade ago: the ability to type a few
lines of code, execute, and keep doing it until you get it right. The
difference is that instead of controlling an 8088, you're in control of an
entire wide-area network of high-speed workstations. EOS also provides a
complete application-development environment (including a Motif user-interface
server) and tools for source debugging, performance profiling, and
test-coverage analysis.
The original EOS was based on a European DCE-like product called ANSA
(Advanced Network System Architecture) designed by APM (Cambridge, U.K.). That
version is still supported because: 1. Not everyone has a workstation that can
support DCE (the DCE runtime requires at least 32 Mbytes of real memory and
100 Mbytes of swap space) and the ANSA version of EOS will run on an old
8-Mbyte SPARCstation 1; and 2. not all vendors are yet ready to ship DCE. The
architecture described in this article is the DCE version which, although very
different from the ANSA version, allows applications written for EOS/ANSA to
run on EOS/DCE without change.
With almost 500 functions, the DCE API can be daunting. That's not meant to
knock DCE: All 500 functions are important. But most of the DCE API solves
problems you don't have yet unless you're already into distributed computing.
Furthermore, what if you start a project to develop a native DCE application
from scratch and you find it's going to cost three times what you budgeted, or
that DCE doesn't provide everything you need? Are you left with anything that
you can use? Probably not. If you started from scratch, you wrote all your
"manager code" (the DCE naming convention for that part of the server that
implements the application) as pthreads created by the server stub, and it's
not going to port back to straight C very easily.
The alternative is to write to a middleware environment like EOS which,
depending on the scenario, lets you build a distributed DCE application with
little startup and no throw-away costs.


EOS Server Architecture


The EOS server architecture is a runtime veneer on top of DCE that's both
compatible with, and complements the functionality of, DCE. The principal EOS
server runtime is the remote process invocation (RPI) daemon. It's a highly
replicated, highly available process that fields client requests for server
bindings, forwarding them as needed to other RPIs. An RPI determines whether
it can satisfy a request based on a generalized property specification and
constraint language. If an RPI satisfies a client's request, it forks the
other EOS server runtime, the session-management server (SMS), to manage the
binding for as long as it endures. Underneath the SMS are the servers
themselves, which are merely ordinary UNIX filters that read stdin and
write/flush stdout. Bindings to these servers can have up to six orthogonal
attributes, all of which are dealt with by the EOS runtime transparently to
the server:
Local or remote.
Modal or modeless.
Session or sessionless.
Protected or unprotected.
Authenticated or unauthenticated.
Authorized or unauthorized.
Binding attributes and property-management capabilities are what EOS adds to
the value of DCE in its present form.
With RPI, servers run when needed by clients, as opposed to being started and
administered as separate servers that run continuously. This is important in
two regards. First, we've learned from experience that if the reason for
distributed computing is to integrate already distributed applications, then
there are (very quickly) a large number of servers available, most of which
will be used only occasionally. This is in sharp contrast to the situation
that arises when you build a specific application (such as an accounting
system) in which there are one or a few servers being used continually.
Second, we've learned that DCE servers are huge processes. The executables may
be deceivingly small if they're linked with shared libraries, but in fact, a
null DCE server (one that does nothing at all) will have a 2.5-Mbyte text
image on an HP720. Consequently, if you're going to offer many different
services and you don't have something like the EOS RPI, you're going to need a
lot of machines to run them.
The next problem that arises when you have a large number of servers is
finding the one you want. The name-service interface (NSI) API in DCE is well
suited to the accounting system alluded to earlier, for multiple instances of
the same service (such as print servers), or any similar situation in which
you have a small number of interfaces; see Figure 1. This is because the
cell-directory service (CDS) allows the client to search for servers based on
interface type; see Figure 2. But it provides only minimal functionality for
searching for servers based on attributes. That is, if you integrate a bunch
of disparate databases through a common interface and you want to query one of
the tables, knowing the interface type isn't enough. You have to have some way
of asking whether the instance of the server offering that interface also has
access to the table you want. With CDS, you can do this with an object uuid,
but it's really difficult and is essentially just a place-holder for the DCE
Common Object Request Broker Architecture (CORBA), which isn't part of DCE
yet. Until it is, EOS adds some of this functionality by providing a
context-free property specification and constraint language based on ANSA.
(The property specification and constraint language was developed by Dr. J.
Sventek of Hewlett-Packard as part of the Advanced Network Systems
Architecture [ANSA] while he was seconded to APM Ltd., Cambridge, U.K.)
When an RPI is started, it looks for a .proplist file, an ordinary text file
that might look something like Example 1(a). The properties (TABLE, for
example) are any arbitrary name and can have singular string, string-list, or
numeric values. The RPI reads this file when it starts, and adds to it this
list of its runtime properties; see Example 1(b). It then searches for a
number of other files to create two more built-in lists, the sblist (server
body) list, and the sbdescrip (server-body description) list.
The first file it searches for is .rpirc, which lists the servers it's
supposed to advertise. Unlike the example I'll be illustrating, where I start
RPI with a flag that tells it to advertise all executables in its directory,
it's more typically the case that: 1. You have an environment variable
(EOSSERVERS) that's a path variable to all the different directories in which
servers live; and 2. not all the executables in those directories are servers
(for example, some are still in development). Then, for each server listed in
.rpirc, it looks for a file of the same name, prefaced with a dot, that
contains a brief text description of what the server does and/or how to get
the equivalent of man pages on how to use the server.
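The .proplist format can be handled with very little code. The following is a hypothetical sketch of how one line of such a file might be split into a property name and a raw value; the format is inferred from Example 1(a), and this is not the actual RPI parser:

```c
#include <assert.h>
#include <string.h>

/* Split one .proplist-style line ("MIPS 144" or "TABLE {...}")
   into a property name and a raw value string.  Returns 0 if the
   line carries no value.  Callers must supply buffers big enough
   for the line; quoting and list parsing are left out. */
static int parse_prop(const char *line, char *name, char *value)
{
    const char *sp = strchr(line, ' ');
    if (!sp)
        return 0;
    memcpy(name, line, (size_t)(sp - line));
    name[sp - line] = '\0';
    strcpy(value, sp + 1);
    return 1;
}
```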
On the client side, the PROPV[] component of the SERVER structure is
initialized with a series of constraints, the logical AND of which will be the
client's expression of what it wants; see Example 2. At this point, the client
has two choices. It can merely broadcast this binding request by calling the
C-Lite function rpi(); if a match is found, the RPI that matched will return
its binding information, which the client will use on the subsequent
INITIALIZE. When the INITIALIZE entry point is called, the RPI will fork/exec
the session manager and return the binding information for the session manager
to the client. This is the binding the client uses for the duration of that
connection. If the client chooses this approach, almost everything that
happens is behind the scenes, as viewed from C-Lite. The other choice is to
call the C-Lite function servers(), optionally including a constraint
expression, as in Listing One (page 100). This establishes the bounds of the
search on the first call, and on all subsequent calls until a null binding is
returned, to get the complete property list of every matching rpi. The client
can programmatically (or interactively with a user) examine the other
properties and choose a particular RPI that best meets its requirements. It
then calls rpi(), with the only PROPV[] being the binding property that
uniquely matches only one RPI. One possible reason for using this approach is
to achieve load balancing.


Building an EOS-based Application


The example scenario I'll use here assumes you want to integrate two existing
applications: The first is a financial simulation that takes the value of your
stock portfolio, the name of a stock, and some statistics about the stock's
recent performance and returns the number of shares you should buy or sell (or
0 if you sit tight). It runs on a high-speed server on your network and the
name of the executable is "finsim."
The second application is a broker application that takes the name of a stock
and the number of shares to buy (or sell if the number is negative) and
electronically delivers the buy/sell orders to your broker. In the new
application, the client will read a table in which each row represents one
stock in your portfolio, call the simulation model, and if it returns other
than 0, will place a buy or sell order via the broker server. It runs on many
different hosts and the executable is "sbroker." (For the time being, I won't
address user-interface issues.)


The Servers


Building the servers is straightforward under EOS, particularly if the servers
are already written in the UNIX paradigm--that is, they read a line from stdin
and write a line to stdout. If the servers aren't UNIX-like but are written in
a language that supports stdin/stdout (C, for instance) and you have access to
the source, then you only need to change how the server reads its input
parameters so that it uses gets() and how it writes its output parameters so
it uses puts() (or readln()/writeln() in Pascal, or READ(5,*)/WRITE(6,*) in
Fortran). If the server writes multiple lines of output per input line, you
need to marshal these into a single output line, typically as a C initializer
string such as {2,{{"line0","line1"}}} that C-Lite can ingest. If you
don't have access to the source, or the language doesn't support stdin/stdout,
then you'll need to write a simple C filter that fork()/exec()s the
application and deals with user-interface peculiarities. An example would be a
prompting interface that takes one line of input, produces multiple output
lines, and then puts the prompt up without a newline, so you need to check the
beginning of each line for the prompt and discard it before you gets() so you
don't block. This isn't as difficult as it sounds. We've written hundreds of
such wrapper programs, and the only hard part is figuring out the trick you
need to play on the application to get it to do what you want. Once you know
that, the whole wrapper (excluding the trick) is less than a hundred lines of
C. To illustrate, assume that
the sbroker server is a program for which you don't have any source, and it
implements a command-line prompting interface and a minimal command language,
like Example 3, which is typical of legacy code.
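The marshaling step described above--folding several output lines into one C-initializer string--is mechanical. Here is a hedged sketch; the exact escaping rules C-Lite expects are not spelled out in the text, so this handles only the simple case with no embedded quotes:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Marshal n output lines into a single initializer-style string
   of the form {2,{{"line0","line1"}}}.  Buffer sizing and quote
   escaping are the caller's problem; this is an illustration. */
static void marshal_lines(char *out, const char **lines, int n)
{
    int i;
    sprintf(out, "{%d,{{", n);
    for (i = 0; i < n; i++)
        sprintf(out + strlen(out), "\"%s\"%s",
                lines[i], i < n - 1 ? "," : "");
    strcat(out, "}}}");
}
```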
Listing Two (page 100) is a C wrapper program that behaves like a filter; it
reads commands from stdin and writes a file of orders. It also makes the input
file one line per order instead of three, allowing us to call it as a server
just once per order, so you can exec it from a shell with:
sbroker < orders >& /dev/null. Debugging the servers is straightforward.
Assuming you already have programs that read stdin and write stdout, you can
debug them as you would any filter--from the shell, using your debugger.


The Client



To build an EOS client, you use C-Lite. Becoming familiar with C-Lite isn't
difficult, particularly if you're familiar with C syntax. C-Lite semantics, on
the other hand, are Basic-like, and use a string as the basic unit instead of
a character. Consequently, everything is passed by value unless you explicitly
pass pointers, and memory is managed automatically.
The skeletal client code (see Listing Three, page 100) imports the model
server and the broker server, then opens a table. For each row, the client
calls the model server with an argc/argv structure representing the contents
of the row, and if the model server returns anything other than 0, the client
calls the broker with the name of the stock and the number of shares to trade.
When it's done, it closes the table and returns. To keep it simple, there's no
error checking.
To recap: I first declared some local variables. The next variable is in a new
kind of storage class called "stable" that has a global scope; the value of
this variable is maintained across sessions. I then declared two SERVER
structures. When I initialized the components of the model SERVER structure, I
set the TYPE=112000, which means that the server binding is remote, modal,
sessionless, and that there are no protection, authentication, or
authorization requirements. Then I set CONTEXT (which is like the UNIX PATH
variable) to /.:/hosts/ferrari /.:/ which means try to bind first on the host
called "ferrari" (the fast one), and failing that, bind to any host in the
cell. Finally, I set PROPV[0] to 'finsim' in sblist, meaning the only property
required is that there be a finsim server there.
Next, the *model=rpi(*model) statement broadcasts this request, which is
fielded by the first RPI daemon that satisfies our needs; that is, it can run
a finsim server. When this call succeeds, the returned structure has a number
of other components in the structure filled in with function pointers to the
various entry points in the interface to the server, one of which is
INITIALIZE. The next statement, (model->INITIALIZE)(), does three things: 1.
It declares the function symbol "model" that will be bound to the server; 2.
it starts an SMS to handle all the RPC and session semantics; and 3. it tells
the SMS that the server's shutdown command is quit. The SMS, in turn, execs
finsim hooked to pipes. The initialization of the broker structure is
essentially the same, except that we give no host preference. We then change
to the directory with the portfolio table and attach the table to a data
window. When we do this, all the columns of the table become implicitly
accessible using a table.column.row syntax. In this syntax, a missing
component denotes the current component, in this case the row, which is made
current by the nextrow() function. A * in the column component means all
columns are marshaled as an argc/argv[] structure. If model() returns 0, we
continue the loop, otherwise we call the broker with the value of the Name
column in the table, and the number of shares model() returned. The last two
statements instruct the session manager to shut down the server, and to shut
itself down if it's not managing any other servers. And that's it: About 30
lines of C-Lite is interpreted into about 30,000 lines of C and DCE API.


Debugging the Client/Server Interface


Remote debugging is hard. With all the technology that leads up to and is now
offered in the DCE, no more than trivial logging and tracing facilities are
offered in this regard. While EOS provides a Motif mouse-driven source
debugger (modeled on HP Softbench) for C-Lite, it leaves open the issue of how
you debug the client/server interface and, instead, makes it a nonproblem. In
the above example, I said the TYPE component specified a remote, modal,
sessionless connection. If you change foo->TYPE to 212000, it means a local,
modal, sessionless binding, so the same server code you write can run remotely
or locally without modification. When it's run as a local server, the session
manager is fork()ed directly out of the client using ordinary IPC. So, you can
run a distributed application as a completely local application. This means
you can use whatever local debugger you would use ordinarily in concert with
the C-Lite debugger to get everything working right, all on your own
workstation.
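The TYPE encoding can be illustrated with a few lines of C. The digit positions below are inferred from the 112000 and 212000 examples in the text, so treat this as a sketch rather than a definitive EOS encoding:

```c
#include <assert.h>

/* Decode a six-digit TYPE value into its six binding attributes.
   Digit order (location first, authorization last) is an
   assumption based on the examples in the article. */
struct binding_attrs {
    int location;   /* 1 = remote, 2 = local           */
    int modality;   /* 1 = modal                       */
    int session;    /* 2 = sessionless (per the text)  */
    int prot, auth, authz;
};

static struct binding_attrs decode_type(long type)
{
    struct binding_attrs a;
    a.authz    = (int)(type % 10); type /= 10;
    a.auth     = (int)(type % 10); type /= 10;
    a.prot     = (int)(type % 10); type /= 10;
    a.session  = (int)(type % 10); type /= 10;
    a.modality = (int)(type % 10); type /= 10;
    a.location = (int)(type % 10);
    return a;
}
```

Under this reading, flipping a binding from remote to local is literally a one-digit change, which is why the same server code runs either way.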


Installing the Distributed Application


Assuming you debugged locally and you're not already running the RPI daemon on
the server host, installing the debugged version as a distributed application
requires three things:
1. Change the first digit of the TYPE component back to 1 in the client
C-Lite.
2. ftp your server executable and the RPI and SMS executables to the same
directory on the server host.
3. rlogin to the server host, cd to that directory, make sure that the files
have execute permissions, and then start RPI like this: rpi_srv -a >&
/dev/null &.
It can get much more involved, but if you start the RPI daemon this way, the
-a flag tells it that any executable in this directory is to be offered as a
server, and the >& /dev/null says you're not interested in seeing or saving
any log entries (these are written to stderr). If you now rerun clientCB(), it
will bind to the remote server, and everything will work exactly as it did
with the local server.


Conclusion: What You Risked vs. What You Gained


If you already had the server code, you've still got it and, at worst, you had
to write a C wrapper that turned it into a standard UNIX filter--a valuable
thing to have done, anyway. If you had to write the server code, you wrote it
as a standard UNIX filter and it's still a perfectly good program that you can
run from the shell. The 30 lines of C-Lite in clientCB() are probably the only
throw-away cost you've got. The bottom line is that you've spent a few days
learning an already familiar language and maybe a week building a distributed
application. If you like it, that's a huge return on investment. If you don't,
most of what you wrote is still usable code. In either case, you have actually
done some distributed computing instead of just thinking or talking about it.
And you'll see both the advantages and pitfalls of distributed computing in a
very different light once you've gotten your hands dirty.
 Figure 1: Wide Area Network binding via the Name Service Interface, CDS.
 Figure 2: Using asynchronous entry points with a session semantics.
Example 1: (a) Sample .proplist file; (b) RPI-generated additions to the
.proplist file.
(a)

TABLE {'student' 'enroll' 'professors' 'course' 'dept'}
STUDENT {'stu_id' 'lastname' 'major' 'stu_desc'}
SERVERS {'sql_app' 'filexfer' 'syb2eos' 'eos2syb'}
MIPS 144
DISK 1000000000
AUTH {'Pete' 'Lowell' 'Geoff' 'Andrew' 'Kyle' 'Clark'}


(b)

HOST 'ferrari'
hostname 'ferrari.lri.com'
binding '187.95.103.78[1234]'
SECPAC '211'



Example 2: Client initialization.
foo->PROPV[ 0 ] = "(student in TABLES) or (enroll in TABLES)";
foo->PROPV[ 1 ] = "sql_app in sblist";
foo->PROPV[ 2 ] = "MIPS > 50";
foo->PROPC = 3;


Example 3: Command-line prompt.
% sbroker
broker=> set stock IBM


stock set to IBM
broker=> set shares +20
order set to buy 20
broker=> execute
placing order to buy 20 shares IBM
broker=> quit

[LISTING ONE] (Text begins on page 64.)

typedef struct {
 binding, /* the binding property */
 princ_name, /* the rpi's principal name */
 secpac, /* the rpi's security package */
 ns_auth, /* the rpi's privileges in the NSI */
 struct {
 name, /* property name */
 argc, /* number of values */
 argv[16] /* array of values */
 } prop[24], /* property list */
 propc, /* number of properties */
} INQUIRE;
auto INQUIRE inq[ 24 ];
auto cexp = "(('student' in TABLES) or ('enroll' in TABLES))", i = 0;
*inq[ 0 ] = servers(
 "", /* subnet - "" means local subnet */
 "/.:/", /* context - in this case every host */
 cexp, /* constraint expression */
 000, /* secpac - in this case none */
 "" /* CDS group -- "" means default
 which is /.:/eos */
);
while( inq[ i ]->binding )
 *inq[ ++i ] = servers();

[LISTING TWO]

/* Implements a C wrapper around an executable we have no source for that
turns it into a server. */
#include <stdio.h>
#define MAXBUF 65536
char stdinb[ MAXBUF ];
int prlen = 10;
char prompt[] = "\nbroker=> ";
char *buf[ 3 ] = { "set stock ", "set shares ", "execute\n" };
char *arg[ 3 ] = { NULL, NULL, NULL };
char null[] = "";
main( argc, argv )
int argc;
char *argv[];
{
 char *c, *d, tmp[ 255 ];
 int i, pid, j;
 FILE *fp[ 2 ];
 int fdin[ 2 ], fdout[ 2 ];

 /* create output pipe to server body */
 if ( pipe( fdout ) == -1 ) exit( 1 );
 if (!( fp[ 1 ] = fdopen( fdout[ 1 ], "w" ))) exit( 2 );
 /* create input pipe from server body */
 if ( pipe( fdin ) == -1 ) exit( 3 );

 if (!( fp[ 0 ] = fdopen( fdin[ 0 ], "r" ))) exit ( 4 );
 /* name of server body to exec */
 strcpy( tmp, "/usr/bin/sbroker" );
 argv[0] = tmp;
 /* start a new process, which runs on its own. */
 if ((pid = fork()) == -1) exit( 5 );
 /* if in the server body */
 if( pid == 0 ) {
 /* dup our output pipe to servers stdin */
 close( 0 );
 dup( fdout[ 0 ] );
 /* dup our input pipe to servers stdout */
 close( 1 );
 dup( fdin[ 1 ] );
 /* execute sbroker program. */
 execlp( tmp, tmp, ( char * )NULL );
 exit( -1 );

 }
 /* swallow the first prompt */
 for( i = 0; i < prlen; i++ )
 getc( fp[ 0 ] );
 /* forever or until we're told to quit */
 while( 1 ) {
 /* wait for sms to call us */
 if ( !fgets( stdinb, MAXBUF, stdin ))
 break;
 /* if caller wants us to quit */
 if ( !strcmp( stdinb, "quit\n" ) ) {
 fputs( stdinb, fp[ 1 ] );
 fflush( fp[ 1 ] );
 break;
 }
 /* parse arguments: should be stock, shares */
 arg[ 0 ] = ( char * )strtok( stdinb, "," );
 arg[ 1 ] = ( char * ) strtok( NULL, "," );
 arg[ 2 ] = null;
 /* begin the "trick" code which turns our single input */
 /* line into three outputs to the server after each of */
 /* which we discard the prompt */
 for ( i = 0; i < 3; i++ ) {
 fprintf( fp[ 1 ], "%s %s\n", buf[ i ], arg[ i ] );
 fflush( fp[ 1 ] );
 c = tmp; d = c; /* reuse tmp as a scratch buffer */
 for( j = 0; j < prlen; j++ ) {
 *c = getc( fp[ 0 ] );
 if ( *c != prompt[ j ] ) {
 c++;
 break;
 }
 if ( *c == '\n' ) *c = '\0';
 else
 c++;
 }
 *c = '\0';
 if ( !strcmp( d, ( char *) prompt + 1 ))
 continue;

 fgets( stdinb, MAXBUF, fp[ 0 ] );

 }
 /* this server is implemented as a void function */
 fputs( "\n", stdout );
 fflush( stdout );

 }
 exit( 0 );
}

[LISTING THREE]

function "clientCB"
{
 auto dir = "/users/me/stock", table = "portfolio", shares, row;
 stable current_portfolio_value;
 static SERVER model;
 static SERVER broker;

 /* import the model server */
 model->TYPE = 112000;
 model->CONTEXT = "/.:/hosts/ferrari/ /.:/";
 model->PROPC = 1;
 model->PROPV[ 0 ] = "'finsim' in sblist";
 *model = rpi( *model );
 (model->INITIALIZE)( "model", "finsim", "quit" );
 /* import the broker server */
 broker->TYPE = 112000;
 broker->CONTEXT = "/.:/";
 broker->PROPC = 1;
 broker->PROPV[ 0 ] = "'sbroker' in sblist";
 *broker = rpi( *broker );
 (broker->INITIALIZE)( "trade", "sbroker", "quit" );
 /* open the portfolio table */
 chdir( dir );
 wattach( table );
 for ( row = 1 ;( row ); row = nextrow() ) {
 shares = model( current_portfolio_value, table.* );
 if ( !shares ) continue;
 trade( sprintf( "%s,%s", table.stock, shares ));

 }
 (model->DISCARD)();
 (broker->DISCARD)();
 wdetach();
 return();
}
End Listings




July, 1993
Extending Standards for CD-ROM


Rock Ridge extends ISO-9660 for POSIX filesystems




Lynne Greer Jolitz


Lynne Jolitz is co-author of 386BSD, a PC implementation of the Berkeley UNIX
operating system. She can be contacted on CompuServe at 76704,4266.


Most programmers are aware of the ISO-9660 standard and its significance in
sharing CD-ROM data between different platforms. In our article "Inside the
ISO-9660 Filesystem Format" (DDJ, December 1992), we examined how this
standard has encouraged the use of CD-ROM technology and how a modern ISO-9660
CD-ROM is structured. ISO-compliant CD-ROMs are interchangeable and can be
used on any type of system and architecture. However, the minimalism that
helped make the ISO-9660 standard successful may sometimes be too minimal for
specific applications (such as distributing POSIX-based, bootable CD-ROMs).
Because ISO-9660 does not adequately support the POSIX filesystem, the Rock
Ridge Group was formed to develop ISO-9660:1988 extensions, which take
advantage of the system-use area of the directory record (provided for in
ISO-9660) to store complete POSIX filesystem information.
Extensions to ISO-9660 can make a CD-ROM appear like a given target operating
system (such as a POSIX-compliant filesystem). By encoding these extensions
(using the system-use sharing protocol), you can allow for separate sets of
attributes for the same filesystem. This lets you organize extended
information for different systems (such as VMS, DOS, and UNIX) in a
nonconflicting way. Also, any system that only understands ISO-9660 without
any extensions can still gain access to the files and obtain exactly the
same file data; you aren't precluding any use of the CD-ROM by using
extensions, you're simply extending the scope of use of the
information. You get the best of both worlds: ISO compatibility and
interoperability, and POSIX operating-system transparency and functionality.


ISO-9660 Filesystem


A CD-ROM can be mastered with any kind of information on it. Unlike UNIX and
DOS (and many other) filesystems which have blocks allocated from a separate
list of nonsequential disk blocks, ISO-9660 specifies a dense, sequential
arrangement of file information to minimize nonsequential access and,
consequently, latency time.
Figure 1 illustrates an ISO-9660 CD-ROM. A reserved field at the beginning of
the disk is often used in booting the CD-ROM. Immediately afterwards, a series
of volume descriptors detail the contents and kind of information contained on
the disk, somewhat akin to the DOS partition table (see the text box entitled,
"Structure of the CD-ROM Volume Descriptor"). It's possible to have many
different kinds of filesystems and information arrangements on a single
CD-ROM. There are also many kinds of descriptors which can be used to
optionally record non-ISO defined information contents.
ISO-9660:88 specifies a minimal set of file attributes (directory or ordinary
file and time of recording) and name attributes (name, extension, and version)
in the directory-entry fields, and has specific restrictions on naming
conventions (see the text box entitled "Directory Entries and Filenames").
There are also attributes that make the file appear invisible even though it's
still present (existence); allow some systems to store information adjacent to
visible files without letting the user note their presence (association);
determine whether the file is protected (protection); and determine whether
the file is implemented as a group of directory entries to define a more
complex multiple-data area (multi-extent).
These attributes allow for a simple hierarchical filesystem, but not for all
of the file attributes in modern systems. To that end, ISO-9660 allowed for an
optional extended attributes record (XAR) stored at the beginning of the
file's extent which can contain additional file-attribute information. The
file-attribute information contained in XAR comes at a cost--you must have an
exact semantic definition of all fields in the context of each operating
system that might use them. Because XAR semantics are designed by committee,
they don't exactly match any system. Rather, they attempt to match all systems
slightly. Compounding the problem, XAR information is stored adjacent to file
contents on the disk rather than in the directory entry itself. Thus, an
additional seek may be required to pick up the file attributes to decide if
the file can even be opened in the first place. Even with the XAR present, you
still can't record all of the file attributes of any one operating system in
its entirety. This failure in the standard is precisely why extensions like
Rock Ridge are necessary.
Since the limited set of ISO-9660 directory entries and attributes does not
provide sufficient information to transparently reconstruct file attributes
for POSIX, Macintosh, Windows NT, and other filesystems, a location for
optional extensions called the system-use area was added for each file; see
Figure 2. The system-use area is the space between successive ISO-9660
directory entries, denoted by the end of the current entry (fixed, optionally
padded offset) and the start of the next entry. (ISO-9660 directory entries
have separate name and entry lengths.)
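Given the layout just described, locating the system-use area in a raw directory record takes only a few lines of C. This sketch follows the standard ISO-9660 record offsets (record length in byte 0, identifier length in byte 32, identifier starting at byte 33, plus a pad byte when the identifier length is even):

```c
#include <assert.h>

/* Locate the system-use area inside a raw ISO-9660 directory
   record: it runs from the (padded) end of the file identifier
   to the end of the record.  Returns 0 when no area exists. */
static int system_use_area(const unsigned char *rec,
                           int *offset, int *length)
{
    int len_dr = rec[0];      /* total record length    */
    int len_fi = rec[32];     /* file-identifier length */
    int off = 33 + len_fi;
    if ((len_fi & 1) == 0)    /* pad byte keeps fields  */
        off++;                /* on even boundaries     */
    if (off >= len_dr)
        return 0;             /* no system-use area     */
    *offset = off;
    *length = len_dr - off;
    return 1;
}
```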
One of the great strengths of ISO-9660 compliance lies in a developer's
ability to support many types of filesystems on a single CD-ROM, by
selectively using extensions stored in the system-use area. If an
ISO-compliant driver or program is not familiar with certain extensions or
with a particular type of extension, it will skip the embedded information in
the directory entry and continue processing. Thus, many different types of
extension attributes can be encoded on a single CD-ROM, although only the
extensions pertinent to a particular system need be read. Also, since the
information is embedded in the directory entry itself, there will be no
additional seeks or data transfers that would slow the file-open process. In
effect, ISO-9660 allows for variable-length attributes of an unspecified kind.


Extension Attributes


File attributes can specify anything about that file: graphics, color, a
specific model of computer, and so forth. Extension attributes are simply a
way to extend the attributes of files. Since attributes vary according to the
user, almost everyone has a different opinion on what a file attribute should
specify. Hence, different extensions have evolved to meet these differing and
potentially conflicting standards. In fact, the standards battle in the
extension arena--contention for the limited space allocated to extensions and
how they are structured--continues. Since attributes and extensions change by
their very nature, the standard was actually separated into two parts: the
system-use sharing protocol (SUSP) and the actual Rock Ridge extensions.
SUSP dictates the structure of each file extension, much in the same way a
directory entry is structured; see Figure 3(a). The system-use area is divided
into a number of these extension fields; see Figure 3(b). This approach simply
addresses the void in ISO-9660 by providing a mechanism for standards to
coexist in the extensions area. In other words, SUSP defines standard
"containers" in which the extensions live. Each system-use field is identified
by a "system-use field signature word." The signature word field specifies
what kind of extension it is; see Table 1(a). This is a universal category, in
that you can not only process known types such as SUSP and Rock Ridge
extensions, but also specify a proprietary extension, such as one specific to
a target piece of hardware and known by the target hardware CD-ROM driver
(such as a laser printer). If the driver understands more than one extension
entry, then it can use the union of these sets of attributes. The length field
of the extension is recorded near the beginning, allowing the system to skip to
the next extension entry once the current one is processed, or whenever the
entry itself is not understood by the CD-ROM driver.
One of the most critical of the SUSP extension mechanisms is the continuation
entry, or CE, which allows arbitrary, logical extension of the system-use area
into a linked-list chain of sectors. This provides additional room for extension
information. Thus, SUSP files can have arbitrary numbers of extensions without
fear of overflowing a fixed field size. Basically, SUSP is a filesystem within
a filesystem for attributes.


Rock Ridge Extensions


Because ISO-9660 provides insufficient information to reconstruct a UNIX
filesystem, most CD-ROM UNIX filesystem implementations have had to fake
reasonable file attributes. One method is to inherit attributes, such as
directory ownership and permissions, directly from the directory currently
mounted (as in SVR4). Another method uses preset values and some type of
filesystem translation to obtain file attributes (as in 386BSD Release 0.1 and
SunOS 4.x). However, the Rock Ridge Interchange Protocol (RRIP) provides a way
to construct a POSIX filesystem by compacting specific semantics and encoding
them in an architecture-independent manner. The semantics of RRIP exactly
match those of POSIX. As such, a CD-ROM that understands these extensions will
look exactly like a POSIX filesystem, with userid and other expected file
attributes present, while still being an ISO-9660 compliant disk. (Non-POSIX
systems could still access the files using ordinary ISO-9660 names and
attributes.) 386BSD Release 0.2 is such a system.
RRIP is described as a type of SUSP information present within the system-use
area of the directory entry, with new signature words specific to RRIP; see
Table 1(b). The PX type, which contains the key file information needed for
reconstruction, is required in every entry that creates a POSIX-style
directory or file. All time stamps related to a file are recorded in the TF
entry. Symbolic links are recorded in SL signature words, and the CL, PL, and
RE signature words are used to reconstruct and track deep directories.
(ISO-9660 is limited to a fixed eight levels of directory hierarchy due to the
limitations of the pathname tables.)


Rock Ridge in a Non-POSIX Environment


While POSIX file attributes and names were the primary motivation for the Rock
Ridge extensions, the standard's designers wisely chose not to limit these
extensions strictly to POSIX. Thus, file-naming records have no "UNIXisms" forced
upon them. Since there are no restrictions on filename contents and no
embedded filename syntax, you could, for example, use JIS or Unicode in the
name octets. Rock Ridge can also be used in a networked situation, since a
single CD-ROM can be exported to a variety of different operating systems
viewing the same files, while appearing to be in the local system's native
file-structure format. In sum, Rock Ridge is heading in the same "universal"
direction as other filesystems, such as NFS.


cdromcat Revisited


In our previous article, we examined a program (cdromcat) that viewed CD-ROM
files on the 386/486. This program has been modified to allow for CD-ROMs with
Rock Ridge extensions. It will function as before for ISO-only CD-ROMs. If
Rock Ridge extensions are present, a UNIX-like directory listing will appear
instead. The program now consists of four major components: a main
user-application section (Listing One, page 101), which allows the user to
interact with the other modules; the filesystem primitives for the CD-ROM
filesystem (cdromfs.c); the SUSP and RRIP implementations (susp.c and rrip.c);
and the output routines that format and print the results (printcdromfs.c).
Due to space limitations, the entire source code is
available electronically; see "Availability," page 5.
Listing One is a minimal applications program that uses the other facilities
in this program to interpret multiple CD-ROM filesystems with extensions. The
program accepts a pathname, prints the file, and either returns the contents
of the file (if the file is ordinary) or provides a directory listing (if it
is a directory). The iscdromfs() function is used to decipher the CD-ROM
filesystem format and, if successful, returns the root directory entry of
the CD-ROM's hierarchical filesystem. For convenience, the program then
displays the contents of the root directory as a directory list using
printdirents(). Filenames are passed to opencdf() to see if they can be
opened; if so, they can be displayed with either printdirents() or
extractdirents().
cdromfs.c contains the filesystem primitives to interpret ISO-9660 (and the
now obsolete High Sierra) filesystem format, as well as Rock Ridge extensions.
It checks for the presence of a known CD-ROM filesystem, probes for known
extensions within the filesystem, performs recursive filename lookup in the
name hierarchy, and obtains data from CD-ROM files.
The iscdromfs() function attempts to locate a valid primary descriptor on the
CD-ROM by trying logical sector sizes from 2048 to 65536 octets. If it
finds the descriptor, it probes the root directory's "." entry to see if the
SUSP is present. One oddity of ISO-9660 is that the root-directory entry is in
two places: the primary descriptor (with no extensions, it's used to find
where the root directory begins) and the first entry of the directory it
points to (in this case, with extensions, as any other directory entry). If
the SUSP is present, the function continues to check for the presence of the
RRIP by looking for a valid SUSP signature record of the correct kind, and
sets the USES_RRIP bit. One oddity of Rock Ridge standardization is that the
probe for SUSP does not just check the first octets of the extensions in the
root directory's "dot" entry; it checks for the record throughout the file
extensions, for compatibility with CD-ROM XA and any other standard that might
insist on "first" billing.

The functions opencdf(), lookup(), searchdirent(), and namematch() implement
the filesystem's name-to-file translation mechanism. opencdf() is a wrapper
function that implements the operating system's interface semantics and passes
off the details to lookup(). lookup() decomposes the syntax of a file pathname
and uses searchdirent() to find components of the path on the actual CD-ROM.
lookup() recurses if the current component is found and others still remain.
While recursion is used in this example, an iterative version can also be used
(for example, when inside an operating system with limited stack depth).
searchdirent() obtains the contents of a directory (via getblockdirent()) and
puzzles apart the directory records so that it can attempt to match them with
namematch(). In namematch(), ISO file extensions, if present, are used to
obtain the alternative filename and characteristics ahead of matching the
actual ISO name in this directory entry. namematch() is where the susp()
function is used to scan for Rock Ridge name records (NM).
susp.c contains a single function that scans the ISO system-usage field for
valid SUSP entries. rrip.c contains both support code used to obtain POSIX
file-status information out of other RRIP records found with the file and a
function to translate ISO/Rock Ridge times into POSIX times.
The program includes three header files: cdromfs.h, rrip.h, and susp.h. cdromfs.h
contains the data structures and definitions of the CD-ROM itself. The susp.h
and rrip.h header files contain the fields, constants, and interface
definitions for the SUSP implementation and for the RRIP extensions to provide
POSIX file attributes to an ISO-9660 CD-ROM, respectively. susp.h also
contains an implementation-specific data structure to hold continuation state,
so one can iterate through a sequence of SUSP records in a given directory
entry. Finally, printcdromfs.c contains the subroutines to produce formatted
output of CD-ROM filesystem files. It prints the directory header, all the
entries in the directory, the CD-ROM file modes, and (if available) the
creation time of the file.


386BSD Rock Ridge Extensions


A completely new version of the ISO-9660 filesystem format and Rock Ridge
extensions is available on the Tiny 386BSD 0.2 installation floppy as part of
the DDJ Careware program. This updated floppy can be used to qualify,
partition, download, extract, and install the complete 386BSD 0.1 binary
distribution released last year (30 Mbytes uncompressed). If you have an
extant 386BSD 0.1 system on hard disk, it will automatically update your
system without your having to patch and recompile. (Do not use any
"unofficial" patchkits; use only the Tiny 386BSD 0.2 and later updates from DDJ.)
To obtain the latest Tiny 386BSD floppy, send a formatted, high-density floppy
and a SASE mailer to: Tiny 386BSD, Dr. Dobb's Journal, 411 Borel Avenue, San
Mateo, CA 94402. There's no charge for the service, but if you would like,
slip in a dollar for the Children's Support League of the East Bay.


Acknowledgments


We'd like to thank Andrew Young, founder and president of Young Minds, a key
architect of both SUSP and RRIP, for his assistance while preparing this
article and the example programs. You can obtain a copy of the Rock Ridge
Technical Specifications through the CD-ROM forum on CompuServe (GO CDROM).
 Figure 1: ISO-9660 CD-ROM
 Figure 2: Directory entries with and without extensions.
Table 1: (a) SUSP Standard Signature Word Fields; (b) RRIP Signature Word
Fields.
==============================================================================
 Field Description Comments
==============================================================================
(a) SP SUSP Indicator SUSP in use
 ST SUSP Terminator SUSP use terminated
 CE Continuation Area Extend SUSP field
 ER Extension Reference System-Specific Extension
 PD Padding Field (optional) Ignore Field Information
(b) PX POSIX file attributes POSIX definition
 PN POSIX device modes Character or block
 NM Alternate name POSIX or non-ISO-9660 name
 SL Symbolic link Contents of symbolic link
 CL Child link Location of relocated directory
 PL Parent link Original location of parent directory
 RE Relocated directory This directory is a relocated directory
 TF Time stamp(s) for a file
 RR Flags indicating which
 fields are recorded (optional)
==============================================================================
 Figure 3: (a) System-use field description; (b) SUSP extensions in detail.


Directory Entries and Filenames


A directory entry is a data structure that describes the characteristics of a
file or directory, beginning with a length octet describing the size of the
entire entry. Entries themselves are of variable length, up to 255 octets in
size. File attributes for the file described by the directory entry are stored
in the directory entry and, optionally, in the extended attribute record. The
name-length field specifies how long the name is and is limited to 31 octets
(characters). The choice of characters is also limited in scope. A CD-ROM
filename may include any combination of numbers, uppercase letters, and
underscore characters, optionally followed by a period (.) and another set of
numbers and uppercase letters. A version number (optional in the High Sierra
format) is delimited by a semicolon. (For example, a legal filename would be
FOO_BAR2.BLECH;1.)
This picture of filenames is complicated by a few more restrictions.
Directories cannot have extensions or version numbers. The two separator
characters, period and semicolon, are not strictly part of the filename
character set; they occupy known positions (or values) in the character set.
So when a filename match is done, they are treated independently of the
character set that the CD-ROM is using. (ISO-9660 allows other character
sets for filenames--the choice remains relative to the files.)
Some gray-zone implementations are also of concern and occur on many of the
name-brand ISO-9660 CD-ROMs we've sampled. The period separator can terminate
a filename (for instance, a file with no extension), and the mandatory version
number on ordinary files is usually missing on "compliant" discs. Worse, null
(or blank) padding in names can be present, in clear violation of ISO rules.
It is wise for the designer to accommodate these minor technical violations of
the standard in a filesystem reader, as long as they don't compromise the
standard semantics. However, creators of CD-ROMs that play fast and loose with
ISO-9660 semantics should be warned that they are living on borrowed time.
Another problem is that many ISO-compatible systems are not fully compatible
with these naming conventions. For example, some MS-DOS CD-ROM programs limit
filenames to the maximum length of an MS-DOS filename, because the software
cannot differentiate between an MS-DOS filename and a CD-ROM filename. If the
name field ends on an odd boundary, a reserved field of one octet is added,
because some microprocessors (like the 68000) raise an alignment fault if they
fetch a word from an odd address. If the name field ends on an even boundary,
then this field is not used. The maximum size of a directory entry with no
extensions is 58
octets; 197 octets are reserved for attribute extensions.
--L.G.J.


Inside the ISO-9660 Filesystem Format--Some Clarifications




William Frederick Jolitz



We've received a great deal of mail regarding our article discussing the
basics of the ISO-9660 CD-ROM standard and sample applications, including some
from the people behind the early CD-ROM standards effort, and we're grateful
for this thoughtfully provided feedback.
A standard such as ISO-9660 spends as much time discussing areas "defined to
be undefined" as it does with areas "defined to be defined" in a certain
manner. The system area (at the beginning of the disk) and any areas not
assigned to the storage of CD-ROM descriptors, directory entries, or other
data structures may be used in any fashion. Thus the system area need not be
used for booting (although that is its common use).
You could, in fact, use the system area to create a CD-ROM that could be
mounted as a 386BSD UFS filesystem, as well as an ISO-9660 one, by arranging
the filesystem structures of each to be nonconflicting with the other! All
file contents would be at the exact same locations on the CD-ROM, just found
via the different methods of each filesystem.
ISO-9660 descriptors have the sole function of describing ISO-9660 file
systems, and they reserve all field contents for possible future use. Another
reserved mechanism makes the standard work with certain physical oddities of
the CD-ROM, such as 2336-octet, mode-2 sectors, which increase data capacity
at the expense of error-detection/correction reliability, instead of the
more standard 2048-octet sectors. (Replicated primary descriptors and
terminators then become necessary: a redundant copy of the information keeps
an unreadable sector from rendering the disk useless.)
File placement, interleaving, multiple extents, and discontiguous "allocation"
are possible with ISO-9660, and desirable for certain kinds of applications.
But the rules for using these to advantage are subtle and weighted toward
the application developer's and CD-ROM publisher's use. ISO-9660 and its
predecessor, High Sierra, provide guidance on structure, not use. Thus, the
application of techniques to minimize latency has been inconsistent. This is
completely unlike other filesystems, such as UFS, where storage policy is
fixed within the filesystem design, and invisible in use.
Using the fine-grain features of ISO-9660 requires considerable "vertical"
knowledge--from top (application) to bottom (physical layout), without which a
full understanding of CD-ROM standards (even those with only hypothetical
impact) is impossible. Fortunately, this type of specialized knowledge is
generally not necessary.
Please note that in our December 1992 article the parenthetical comment near
the end of the third paragraph at the end of page 82 starting "a variable
length..." should read "for example, the volume descriptors constitute a
variable-length table...". In the page break between the text on page 82 and
page 83, a "0x11," was omitted (that is, the byte sequence should be [0x44,
0x33, 0x22, 0x11, 0x11, 0x22, 0x33, 0x44]). Figures 2 and 1 are reversed in
the text. To be precise, ISO refers to a file version number instead of a
revision number, and while optional in High Sierra, it is mandatory in ISO
(see the text box entitled "Directory Entries and Filenames").
High Sierra is a deprecated standard, valuable for reading non-ISO CD-ROMs, so
its mention is strictly for backward compatibility. All multibyte integer
fields in ISO-9660 are recorded in both byte orders, even if an application
will only use one of them. Filenames have slightly more complicated rules than
previously mentioned (see the text box "Directory Entries and Filenames").
The ISO-9660 implementation in 386BSD Release 0.1 and other UNIX-like systems
implemented filename translation, since it was inconvenient to use the
uppercase filenames and semicolon (;) delimiters in filenames. (The command
interpreter, or "shell," uses the semicolon to separate commands, so such
filenames must be quoted to avoid misinterpretation.) This technically limits
the portability of programs that would depend on the version number being
present across different operating systems. In newer versions of 386BSD, an
additional flag is added to the mount command to allow strict ISO conformance
when desired.
No article can quite replace a copy of the standard itself (only available in
printed form via existing sources of ISO standards documents). Finally, thanks
to Howard Kaikow and Jim Harper for providing critical feedback.


Structure of the CD-ROM Volume Descriptor


A volume descriptor describes the characteristics of the filesystem
information present on a given CD-ROM, or volume. It's divided into two parts:
the type of volume descriptor, and the characteristics of the descriptor. The
volume descriptor is constructed in this manner so that if a program reading
the disk doesn't understand a particular descriptor, it can just skip over it
until it finds one it recognizes, thus allowing the use of many different
types of information on one CD-ROM. Also, if an error were to render a
descriptor unreadable, a subsequent redundant copy of a descriptor could then
allow for fault recovery. (Each descriptor is conveniently contained in a
single logical sector by itself.) The minimum requirement is that the volume
have a primary descriptor describing the ISO-9660 filesystem and an ending
descriptor (the volume descriptors together constitute a variable-length
table, terminated by the ending descriptor).
The ISO-9660 primary volume descriptor acts much like the superblock of the
UNIX filesystem, providing details on the ISO-9660-compliant portions of the
disk. Contained within the primary volume descriptor is the root-directory
record describing the location of the contiguous root directory. (As in UNIX,
directories appear as files for the operating system's special use.) Directory
entries are successively stored within this region. Evaluation of the ISO-9660
filenames is begun at this location. The root directory is stored as an
extent, or sequential series of sectors, that contains each of the directory
entries appearing in the root. In addition, since ISO-9660 works by segmenting
the CD-ROM into logical blocks, the size of these blocks is found in the
primary volume descriptor as well.
While we can have many different types of filesystems on a single ISO-9660
CD-ROM, there's always one ISO-9660 primary volume descriptor.
--L.G.J.
[LISTING ONE] (Text begins on page 74.)

/* Copyright (c) 1992, 1993 William F. Jolitz, TeleMuse
 * All rights reserved.
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met: 1. Redistributions of source code must retain the above copyright
 * notice, this list of conditions and the following disclaimer.
 * 2. Redistributions in binary form must reproduce the above copyright
 * notice, this list of conditions and the following disclaimer in the
 * documentation and/or other materials provided with the distribution.
 * 3. All advertising materials mentioning features or use of this software
 * must display the following acknowledgement:
 * This software is a component of "386BSD" developed by
 * William F. Jolitz, TeleMuse.
 * 4. Neither the name of the developer nor the name "386BSD"
 * may be used to endorse or promote products derived from this software
 * without specific prior written permission.
 * THIS SOFTWARE IS A COMPONENT OF 386BSD DEVELOPED BY WILLIAM F. JOLITZ
 * AND IS INTENDED FOR RESEARCH AND EDUCATIONAL PURPOSES ONLY. THIS
 * SOFTWARE SHOULD NOT BE CONSIDERED TO BE A COMMERCIAL PRODUCT.
 * THE DEVELOPER URGES THAT USERS WHO REQUIRE A COMMERCIAL PRODUCT
 * NOT MAKE USE OF THIS WORK.
 * THIS SOFTWARE IS PROVIDED BY THE DEVELOPER ``AS IS'' AND
 * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED. IN NO EVENT SHALL THE DEVELOPER BE LIABLE
 * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
 * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
 * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
 * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 * SUCH DAMAGE.
 * cdrom:
 * A simple program to interpret the CDROM filesystem, and return
 * the contents of the file. (Directories are formatted and printed,
 * and files are returned untranslated).
 * Allows for the use of Rock Ridge extensions.
 * main.c: application program to peruse cdrom filesystems.

 * Written in the Oakland CA. district of Rockridge.
 */

#include <sys/types.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <sys/stat.h>
#include "cdromfs.h"
#include "susp.h"
#include "rrip.h"

/* per filesystem information */
struct fs fsd;

/* filesystem directory entry */
struct directent rootent, currentent;

/* user "application" program */
int
main(int argc, char *argv[])
{
 struct directent *dp;
 char pathname[80];

 /* open the CDROM device */
 if ((fsd.fd = open("/dev/ras2d", 0)) < 0) {
 perror("cdromcat");
 exit(1);
 }

 /* is there a filesystem we can understand here? */
 if (iscdromfs(&rootent, &fsd) == 0) {
 fprintf(stderr, "cdromcat: %s\n", fsd.name);
 exit(1);
 }
 currentent = rootent;
 /* print the contents of the root directory to give user a start */
 printf("Root Directory Listing:\n");
 printdirentheader("/", &fsd);
 printdirents(&rootent, &fsd);
 /* print files on demand from user */
 for(;;){
 /* prompt user for name to locate */
 printf("Pathname to open? : ");
 fflush(stdout);
 /* obtain, if none, exit, else trim newline off */
 if (fgets(pathname, sizeof(pathname), stdin) == NULL)
 exit(0);
 pathname[strlen(pathname) - 1] = '\0';
 if (strlen(pathname) == 0)
 exit(0);
 /* lookup filename on CDROM */
 if ((dp = opencdf(pathname, &fsd)) != NULL) {
 /* if a directory, format and list it */
 if (FDV(dp, flags, fsd.type) & CD_DIRECTORY) {
 printdirentheader(pathname, &fsd);
 printdirents(dp, &fsd);

 }
 /* if a file, print it on standard output */
 else
 extractdirent(dp, &fsd);
 } else
 printf("Not found.\n");
 }
 /* NOTREACHED */
}
/* Extract the entire contents of a directory entry and write this on standard
 * output. (XXX needs to be turned into a readcdf() function call. -wfj ). */
void
extractdirent(struct directent *dp, struct fs *fs) {
 long datalen = ISO_WD(&FDV(dp, datalen, fs->type)),
 lbn = 0, cnt;
 char *buffer = (char *) malloc(fs->lbs);
 /* iterate over all contents of the directory entry */
 while (getblkdirent(dp, buffer, lbn, fs)) {
 /* write out the valid portion of this logical block */
 cnt = datalen > fs->lbs ? fs->lbs : datalen;
 (void) write (1, buffer, cnt);
 /* next one? */
 lbn++;
 datalen -= cnt;
 if (datalen == 0) break;
 }
 free(buffer);
}
End Listing




July, 1993
A Multimedia Class Library for Windows


Encapsulating the Media Control Interface




John Musser


John is a senior developer at Personal Media International, a startup
specializing in multimedia entertainment software. He can be reached via
CompuServe at 70621,1460.


Microsoft Windows' Media Control Interface (MCI) is a standard command
language for communicating with multimedia devices--CD, Waveform and MIDI
audio, AVI, videodiscs, video overlay devices, audio mixers, and the like.
However, even though both Microsoft and Borland provide large, comprehensive
class libraries with their compilers--the Microsoft Foundation Class Library
and the ObjectWindows Library, respectively--neither provides object support
for multimedia or the Windows API multimedia extensions.
This article addresses this gap by showing how to design and implement a
comprehensive C++ class library that enhances the MCI interface to multimedia
devices. This hierarchy employs encapsulation, inheritance, and polymorphism
to create a flexible and extensible framework for controlling multimedia
devices under Windows. The result is a set of objects that make programming
multimedia easier, more robust, concise, and maintainable. A simple client
program, MciMan, demonstrates the use of the Waveform audio and AVI video
classes. The class library and client program are compiler independent and can
be used with any Windows 3.1 compatible C++ compiler, including Borland's
C/C++ and Microsoft's Visual C++. The AVI portions of this code require the
digitalv.h #include file, Video for Windows drivers, and runtime DLLs that
come with the Video for Windows package and its SDK. They're available free of
charge from the Windows Extensions forum on CompuServe (GO WINEXT). Vfwrun.zip
contains the runtime DLLs and vfwdk.zip is the Video for Windows Development
Kit.


MCI Overview


The MCI specification, released in 1991 by Microsoft and IBM, defines a set of
base commands that can be applied to any general device, and extended commands
for specific device types. The specification is designed to be extensible so
that other devices may be added. For example, the AVI specification was added
in 1992. Another example is MediaVision, which supplies an MCI driver that
provides audio-mixing capabilities using extended commands specific to this
device type. The specification as documented in the Windows SDK identifies 11
device types. Some drivers are supplied with either Windows or the SDK, while
others are provided by the device supplier. Table 1 describes most of the
currently available MCI drivers. As of this writing, however, not all the
device types listed actually had MCI drivers available.
MCI classifies every multimedia device as one of two device types: simple or
compound. The basic difference is that simple devices do not use data files,
whereas compound devices usually do. CD audio and videodisc players are simple
devices; waveform audio, MIDI sequencers, and AVI are all compound devices.
MCI provides two basic programming interfaces: a command-string interface that
allows the use of ordinary text strings such as play cdaudio to control
multimedia devices (this very open approach is well suited to scripting and
authoring applications); and a command-message interface that uses C-language
structures and a Windows-style message-passing model for device control. The
MCI classes I describe here use the command-message interface because it's
more efficient and better suited to general programming.
A single function, mciSendCommand(), is used along with a set of "polymorphic"
arguments to access the command-message interface. The mciSendCommand()
function takes:
A WORD device identifier (analogous to a handle).
A message-type constant prefixed with MCI_ (such as MCI_PLAY).
A DWORD value set to one or more flags usually specifying which elements of a
given structure contain valid values.
A far pointer to a data structure containing values to be sent or returned.
(The structure is often specific to each message type.)
Nearly all the member functions in this library make at least one call to
mciSendCommand(). But don't be fooled by its simplicity: It has many options,
constants, flags, messages, structures, and return values (and our goal is to
hide these).


MCI Class Library


The MCI class hierarchy is designed to encapsulate and enhance the MCI
interface and to get the most benefit from the least code. It is part of a C++
library that will form part of the basis for the commercial multimedia
products we are currently developing. As such, it was designed for a specific
immediate purpose, but also had to be capable of handling unknown future
requirements. This influenced both the design and the implementation. The
design had to be flexible, extensible, and (most of all) practical. Because
our needs are for specific devices only, not all MCI devices are directly
implemented, although adding support for additional devices is straightforward.
Figure 1 diagrams the MCI class hierarchy, giving an overview of this
single-inheritance tree.
All MCI commands are classified into one of four categories: system, required,
basic, and extended. Commands specified for the first two types (such as open,
close, and status) require support by all device types and are good candidates
for member-function definitions in the base class(es) of the MCI class
hierarchy. The basic commands, including play, stop, and seek, are supported
by most device types and can also be placed at or near the top of the tree.
Extended commands that support specific devices can be implemented farther
down the hierarchy.
The root node for the MCI class hierarchy, MCIDevice, serves as the base class
for both the CompoundDevice class and all simple device classes. It provides
the basic open/close/play type commands, status commands for querying a
device's state, set commands to configure devices, and a number of other
miscellaneous functions. The data items needed (all private) are:
An integer device id.
An error-status value.
A string pointer to the device name.
An optional handle to a parent window.
The constructor (see Listing One, page 102) does not automatically open the
device--doing so would mean that the device would be opened and potentially
unavailable to other applications from the time this object is constructed
until it's destroyed. This is also true for all the derived classes: A device
is not opened until explicitly told to do so with an Open() call. It is closed
either through a call to the Close() member function or when the object is
destroyed and the destructor is invoked. (Constructing a CompoundDevice with a
filename is treated as if it were an Open() call and that file is immediately
opened.) Keeping devices open only as long as necessary is generally polite
behavior in a multitasking environment such as Windows.
The constructor for MCIDevice and any derived simple device (such as CDAudio)
takes a single argument, a string value specifying the appropriate MCI device
name. A default value is provided for each of the simple device types so that
an application will rarely have to explicitly provide this value. The only
exception might be on a system using nonstandard or additional device names
that do not match the usual MCI device-name strings defined in the [mci]
section of the SYSTEM.INI file. MCIDevice is not an abstract class and can
therefore be constructed and used directly. It is designed, however, to serve
as the basis for derived classes supporting unique device types.
Most of the member functions have a direct mapping to a corresponding MCI
command. These functions almost always require fewer arguments than either the
command-message or the command-string interfaces because some of the necessary
information has been encapsulated in the object. Each member function first
initializes any necessary data structures, sets the appropriate flags and
arguments, and calls mciSendCommand(), occasionally more than once.
The function's return code is passed back to the object's client and is also
saved in order to maintain an error state for later reference. The
ErrorMessage() member function calls mciGetErrorString() to get the error
description, and displays a standard Windows error-message dialog box to the
user. This class and all derived classes highlight one of the benefits of
using C++: the ability to use destructors to perform automatic cleanup. It is
important that applications properly close all MCI devices and files they have
opened. By utilizing the C++ destructor mechanism we can automatically trigger
the closing of any open devices before an application exits and thereby
improve reliability and robustness (not to mention avoiding any extra
general-protection-fault-type messages).
A sampling of functions built upon the MCI set, status, and capability
commands are implemented to show how these access operations can be
implemented. MCIDevice provides the virtual functions Set(), Status(), and
GetDevCaps() for this purpose, but makes these protected because they are not
intended to be directly called. Making these public forces the user to be
familiar with the numerous MCI_SET and MCI_STATUS constant values needed for
each specific option. Instead, these are used as the mechanism within publicly
defined inline functions defined for each of the set and status options. For
example, the Length() function, which returns the length of the current
device, uses the Status() function with the MCI_STATUS_LENGTH option to get
its value. (It should be pointed out that a type of simple runtime type
identification can be achieved by using information functions such as
DeviceType() and CompoundDevice(). The DeviceType() function returns a unique
constant identifier for each device type, and therefore each class.
CompoundDevice() returns a Boolean value of True if the given object is a
compound device.)
One other notable method, SetParent(), is used to assign the handle of a
window designated as the parent of this MCI object. Play() checks to see if a
parent window has been assigned to it and if so, stores this handle in the
parameter block given to mciSendCommand() and sets the MCI_NOTIFY bit of the
corresponding flags value. This causes MCI to post the MM_MCINOTIFY message to
the specified window when the operation is completed. When this occurs, the
window procedure's lParam is set to the device id that initiated the callback,
and the wParam gives the status of the operation, which can be success,
failure, superseded, or aborted. MCI allows the Notify option to be used with
all commands, and this library uses it specifically as an option to the Play()
function. If needed, this option can be added to any or all of the other
functions for these classes.


A Simple MCIDevice


CDAudio is a simple class for this "simple" device. In fact, all the basic
functionality for simple devices is provided in the MCIDevice base class.
Therefore, CDAudio doesn't need to provide additional functions of its own. It
can instead rely on the operations inherited from its parent. The MCI command
set for cdaudio defines a few additional options for some of the base
commands, three of which are implemented here. These are Eject() to eject the
disc, and SetTimeFormatMSF() and SetTimeFormatTMSF() for setting the time
format (T=tracks, M=minutes, S=seconds, F=frames). Each is implemented as an
inline function that passes the appropriate flags to the Set() function. By
allowing just the time format to be set through these functions, we can avoid
inadvertently trying to set a device to a time format that it will not accept.
The programmer can thus see which formats a device accepts simply by looking
at its class definition; with the C interface, this cannot be determined
without consulting the documentation.
This CDAudio device can be opened, closed, played, stopped, queried for its
status, and so on. All common operations are inherited from the MCIDevice base
class. Other simple devices such as Videodisc and VCR can be implemented
similarly by deriving from MCIDevice, specifying the appropriate device-type
string for the constructor and adding any device-specific operations.



CompoundDevice


A compound device is an MCI device that uses files. Therefore, the
CompoundDevice class and its descendants must be able to handle a filename on
open. Keep in mind that the device id is more like an element id, one of which
is allocated for each open compound-device file. (A filename for a compound
device is known as the "device element.") The constructor for CompoundDevice
takes two optional arguments: the name of the file to open, and the name of
the device type to be opened. The second argument is immediately passed to the
base-class constructor. The first argument, if given, is stored internally and
then used to open the file. This approach follows the iostream model, in which
a filename passed to the constructor automatically opens the file. The
implementation could easily be changed to defer opening the file until a
subsequent call to Open(). Because the filename argument is optional, the
object can also be constructed first and the filename supplied later in the
call to Open().
The Open() and Close() virtual functions are redefined in order to deal with
filenames. Open() saves the filename as part of the object's data and then
passes this along in the lpstrElementName field of the mciOpenParms structure
given to mciSendCommand(). The Close() command uses MCIDevice::Close to close
the device and then frees any memory associated with the filename.
An example of a straightforward derivation from CompoundDevice is the Wave
class. Windows supports digital audio through waveform audio files that
typically use the .WAV file extension. The MCI support for waveform audio is
fairly complete, allowing recording and playback of formats ranging from 8-bit
mono at 11 KHz to 16-bit stereo at 44 KHz, depending on the audio card
installed. The Wave class does not need to override most of the virtual
functions inherited from CompoundDevice. Functions such as Play, Stop, Pause,
Resume, and Seek can all be used as is. Three additional status functions have
been added to allow the user to query the format of the current file:
Channels(), which returns 1 if the file is mono and 2 if it is stereo;
BitsPerSample(), which returns either 8 or 16; and SamplesPerSecond(), which
returns 11025, 22050, or 44100 to reflect the sampling rate. These functions
use extended MCI commands added as part of the waveform audio-command set.


Video as a Data Type


Audio Video Interleaved (AVI) provides scalable, software-only (with optional
hardware support), full-motion digital video. The video sequences can contain
images, audio, and palettes. The audio can be synchronized with the video
within 1/30th of a second. Currently, application support for AVI playback is
available only through an MCI interface; a low-level playback interface is
provided in the Video for Windows Development Kit. Microsoft provides an MCI
driver,
MCIAVI.DRV, which implements a subset of the MCI digital-video command set for
AVI.
To support this new (and large) command set, our AVI class implements a number
of new functions. A Window() function was added, which takes a handle to a
parent window, allowing playback as the child of a specific window rather than
the default behavior, which is to create its own frame window. PutDestination()
was added to allow the playback window to be positioned at a specific
rectangle within the parent window. Other added functions include: Step(),
which moves forward or backward a specified amount; Seek(), which moves to a
given position; Update(), which displays the current frame; and Signal(),
which notifies the parent when a particular position has been reached. This
implementation could be further enhanced to redefine some of the base-class
virtual functions for extensions that cannot be handled by the inherited
versions, such as having a Play() that handles options to play back in a
window or full screen. (Alternatively, these can be set using the Configure()
function, which pops up a dialog box to set these and other options.)
This class could be modified to use three different window handles: one for
the recipient of the MM_MCINOTIFY notify message after Play(), one to receive
the MM_MCISIGNAL message when a Signal() position is reached, and another to
be used as the parent window for display purposes using the Window() function.
The present design uses up to two concurrent window handles: One is used for
the Window() parent--this value is not stored as class data; the other, the
hWndParent data member, is used for receiving both types of notification
messages.


MciMan


I've included a sample client application, MciMan, which is available
electronically (see "Availability," page 5). MciMan shows how an application
might use the MCI class library. Specifically, MciMan uses the AVI and Wave
classes to allow the user to work with either type of device. A simple menu
interface allows you to select, play back, pause, resume, and stop the device
type. A Notify toggle under the Command menu causes the user to be notified by
a modal message box when the next Play command is completed. (This uses the
MCI_NOTIFY option.) One file of each type can be open simultaneously.
MciMan works by creating one instance each of the Wave and AVI classes as
global objects. When the user selects which type to make currently active from
the MCI Device menu, a pointer to that global object is assigned to a
CompoundDevice* variable. This pointer is then used for all subsequent
operations and relies on the virtual-function mechanism to invoke the
appropriate class function based upon the current selected type. Therefore,
when the Play menu item is selected, the Play() function is invoked using this
pointer and either the Wave::Play() or the AVI::Play() function will be
executed. When the program is exited, the destructors for these two objects
are invoked, and all devices and files are automatically closed--no special
cleanup is needed in the WM_DESTROY or other similar code block.


Future Directions


Although the current implementation of the MCI class hierarchy is fairly
comprehensive, it does not cover all devices nor does it implement all of the
commands and their variants. In addition to MCI's large command set, the
number of MCI devices is also increasing. You can, in fact, write your own MCI
device drivers with the Microsoft Device Driver Kit (DDK). And as with any
software library, there's always room for one more feature.
A good example of an additional device is the MCI driver that Apple is
expected to supply for playing QuickTime for Windows video clips. Depending on
the command set this driver supports, it may make sense to create a
DigitalVideo class which could serve as the parent to both AVI and QuickTime
subclasses. This class could be used to provide a set of functions common to
these two specific devices, thus sharing code and giving a standard API for
both types. (Client programs could then use these interchangeably.)
In any case, you now have a class library that uses C++ and the MCI command
set to build a solid framework of multimedia objects for Windows. The MCI
class library hides the arcane syntax of the MCI command-message interface
(the messages, ids, structures, double casts, and flags) and instead provides
a set of objects that give applications a cleaner, more flexible, and robust
interface for controlling a variety of multimedia devices.
Table 1: Several Windows MCI devices.
==============================================================================
Device Type Description Driver Files Driver Source
==============================================================================
animation Plays Autodesk Animator MCIAAP.DRV Autodesk
 (.flc/.fli) files
 Plays MacroMind Director MCIMMP.DRV MDK, Visual Basic
 (.mmm) files

cdaudio Controls compact disc audio MCICDA.DRV Windows 3.1 SDK, MDK

dat Controls Digital Audio Tape
 deck

digitalvideo Plays AVI video files MCIAVI.DRV Video for Windows

other An undefined MCI device

overlay Controls a video overlay MCIVBLST.DRV Creative Labs
 device--analog video in a
 window

scanner Controls an image scanner

sequencer Plays MIDI audio files MCISEQ.DRV Retail Windows

vcr Controls a video cassette
 recorder

videodisc Controls the Pioneer MCIPIONR.DRV Windows 3.1 SDK, MDK
 LD-V4200 videodisc player

waveaudio Plays and records waveform MCIWAVE.DRV Retail Windows
 audio files
==============================================================================
 Figure 1: MCI class hierarchy.
Products Mentioned
Video For Windows
Microsoft Corp.
One Microsoft Way
Redmond, WA 98052-6399
$199.00; Runtime DLL and SDK available free of charge from CompuServe
(GO WINEXT)
System Requirements:
Playback: 386 SX, Windows 3.1, sound card, mouse, VGA
Capture: 50 Mbytes free hard-disk space, video-capture board

[LISTING ONE]
/*---------------------------------------------------------------------------*\
 * Mci.cpp: C++ class hierarchy for Windows multimedia objects using MCI.
 * $Author: John Musser $
 * $Date: 24 Feb 1993 19:37:02 $
 * $Copyright: Personal Media International Inc., 1993 $
 * Description: A set of classes to encapsulate the command-message interface
 * to MCI. Currently implemented: CDAudio, Waveform audio, AVI.
 * Two base classes MCIDevice and CompoundDevice provide most of
 * the basic functionality needed for derived MCI types.
\*---------------------------------------------------------------------------*/

#include "MCI.h"
#include <memory.h>
#include <string.h>

#define MCI_BUFSIZE 128 // used for char array sizing

/*------ * MCIDevice functions * ------*/
MCIDevice::MCIDevice(LPSTR lpszDevice)
{
 wDeviceID = NULL;
 hWndParent = NULL;
 dwErrState = NULL;
 lpszDeviceType = NULL;
 if (lpszDevice)
 SetDeviceType(lpszDevice);
}
MCIDevice::~MCIDevice()
{
 if (wDeviceID)
 Close();
 if (lpszDeviceType)
 delete [] lpszDeviceType;
}
LPSTR
MCIDevice::Info(DWORD dwFlags)
{
 MCI_INFO_PARMS mciInfoParms;
 static char cBuf[MCI_BUFSIZE];
 if (!wDeviceID)
 return (LPSTR)NULL;
 mciInfoParms.lpstrReturn = cBuf;

 mciInfoParms.dwRetSize = MCI_BUFSIZE;
 dwErrState = mciSendCommand(wDeviceID, MCI_INFO, dwFlags,
 (DWORD)(LPMCI_INFO_PARMS) &mciInfoParms);
 return mciInfoParms.lpstrReturn;
}
DWORD
MCIDevice::Set(DWORD dwFlags, DWORD dwExtra)
{
 MCI_SET_PARMS mciSetParms;
 if (!wDeviceID)
 return MCI_ERR_NOT_OPEN;
 if (dwFlags & MCI_SET_TIME_FORMAT)
 mciSetParms.dwTimeFormat = dwExtra; // dwExtra is format
 dwErrState = mciSendCommand(wDeviceID, MCI_SET, dwFlags,
 (DWORD)(LPMCI_SET_PARMS) &mciSetParms);
 return dwErrState;
}
DWORD
MCIDevice::Status(DWORD dwFlags, DWORD dwItem, DWORD dwExtra)
{
 MCI_STATUS_PARMS mciStatusParms;
 if (!wDeviceID)
 return MCI_ERR_NOT_OPEN;
 mciStatusParms.dwItem = dwItem;
 // Determine which extra struct fields to set based on flags.
 if (dwFlags & MCI_TRACK)
 mciStatusParms.dwTrack = dwExtra;
 dwErrState = mciSendCommand(wDeviceID, MCI_STATUS, dwFlags,
 (DWORD)(LPMCI_STATUS_PARMS) &mciStatusParms);
 return mciStatusParms.dwReturn;
}
DWORD
MCIDevice::GetDevCaps(DWORD dwItem)
{
 MCI_GETDEVCAPS_PARMS mciGetDevCapsParms;
 if (!wDeviceID)
 return MCI_ERR_NOT_OPEN;
 mciGetDevCapsParms.dwItem = dwItem;
 dwErrState = mciSendCommand(wDeviceID, MCI_GETDEVCAPS, MCI_GETDEVCAPS_ITEM,
 (DWORD)(LPMCI_GETDEVCAPS_PARMS) &mciGetDevCapsParms);
 return mciGetDevCapsParms.dwReturn;
}
void
MCIDevice::SetDeviceType(LPSTR lpszDevice)
{
 if (lpszDeviceType)
 delete [] lpszDeviceType;
 lpszDeviceType = new __far char[lstrlen(lpszDevice)+1];
 lstrcpy( lpszDeviceType, lpszDevice);
}
void
MCIDevice::ErrorMessageBox()
{
 char szErrorBuf[MAXERRORLENGTH];
 if (mciGetErrorString(dwErrState, (LPSTR)szErrorBuf, MAXERRORLENGTH))
 MessageBox(hWndParent, szErrorBuf, "MCI Error", MB_ICONEXCLAMATION);
 else
 MessageBox(hWndParent, "Unknown MCI Error", "MCI Error",
 MB_ICONEXCLAMATION);

}
DWORD
MCIDevice::Open(LPSTR lpszDevice /* = NULL */)
{
 MCI_OPEN_PARMS mciOpenParms;
 if (wDeviceID) // Already open don't do it again
 return MCI_WARN_ALREADY_OPEN;
 if (lpszDevice) // Type given, save it in private data
 SetDeviceType(lpszDevice);
 else
 if (!lpszDeviceType) // Make sure we have a type to open,
 return MCI_ERR_NO_DEVICENAME; // has to be given here or in ctor.
 mciOpenParms.lpstrDeviceType = lpszDeviceType;
 dwErrState = mciSendCommand(NULL, MCI_OPEN, MCI_OPEN_TYPE,
 (DWORD)(LPMCI_OPEN_PARMS) &mciOpenParms);
 if (!dwErrState)
 wDeviceID = mciOpenParms.wDeviceID;
 return dwErrState;
}
DWORD
MCIDevice::Close()
{
 MCI_GENERIC_PARMS mciGenericParms;
 if (!wDeviceID)
 return MCI_ERR_NOT_OPEN;
 if (!(dwErrState = mciSendCommand(wDeviceID, MCI_CLOSE, MCI_WAIT,
 (DWORD)(LPMCI_GENERIC_PARMS) &mciGenericParms)))
 wDeviceID = NULL;
 return dwErrState;
}
DWORD
MCIDevice::Stop()
{
 MCI_GENERIC_PARMS mciGenericParms;
 if (!wDeviceID)
 return MCI_ERR_NOT_OPEN;
 return dwErrState = mciSendCommand(wDeviceID, MCI_STOP, MCI_WAIT,
 (DWORD)(LPMCI_GENERIC_PARMS) &mciGenericParms);
}
DWORD
MCIDevice::Pause()
{
 MCI_GENERIC_PARMS mciGenericParms;
 if (!wDeviceID)
 return MCI_ERR_NOT_OPEN;
 return dwErrState = mciSendCommand(wDeviceID, MCI_PAUSE, MCI_WAIT,
 (DWORD)(LPMCI_GENERIC_PARMS) &mciGenericParms);
}
DWORD
MCIDevice::Resume()
{
 MCI_GENERIC_PARMS mciGenericParms;
 if (!wDeviceID)
 return MCI_ERR_NOT_OPEN;
 return dwErrState = mciSendCommand(wDeviceID, MCI_RESUME, MCI_WAIT,
 (DWORD)(LPMCI_GENERIC_PARMS) &mciGenericParms);
}
DWORD
MCIDevice::Play(LONG lFrom, LONG lTo)

{
 MCI_PLAY_PARMS mciPlayParms;
 DWORD dwFlags = 0L;
 if (!wDeviceID)
 return MCI_ERR_NOT_OPEN;
 if (hWndParent) {
 mciPlayParms.dwCallback = (DWORD)(LPVOID) hWndParent;
 dwFlags |= MCI_NOTIFY;
 }
 if (lFrom != current) {
 mciPlayParms.dwFrom = lFrom;
 dwFlags |= MCI_FROM;
 }
 if (lTo != end) {
 mciPlayParms.dwTo = lTo;
 dwFlags |= MCI_TO;
 }
 dwErrState = mciSendCommand(wDeviceID, MCI_PLAY, dwFlags,
 (DWORD)(LPMCI_PLAY_PARMS) &mciPlayParms);
 return(dwErrState);
}
DWORD
MCIDevice::Seek(LONG lTo)
{
 MCI_SEEK_PARMS mciSeekParms;
 DWORD dwFlags = 0L;
 if (!wDeviceID)
 return MCI_ERR_NOT_OPEN;
 switch(lTo) {
 case start:
 dwFlags = MCI_SEEK_TO_START;
 break;
 case end:
 dwFlags = MCI_SEEK_TO_END;
 break;
 default:
 mciSeekParms.dwTo = (DWORD)lTo;
 dwFlags = MCI_TO;
 }
 dwErrState = mciSendCommand(wDeviceID, MCI_SEEK, dwFlags,
 (LONG)(LPMCI_SEEK_PARMS) &mciSeekParms);
 return dwErrState;
}
DWORD
MCIDevice::StopAll()
{
 return mciSendCommand(MCI_ALL_DEVICE_ID, MCI_STOP, 0, NULL);
}
/*------- * CompoundDevice functions * -------*/
CompoundDevice::CompoundDevice(LPSTR lpszFile, LPSTR lpszDevice)
 : MCIDevice(lpszDevice)
{
 lpszFileName = NULL;
 if (lpszFile)
 Open(lpszFile);
}
CompoundDevice::~CompoundDevice()
{
 if (wDeviceID)
 Close();
 if (lpszFileName)
 delete [] lpszFileName;
}
DWORD
CompoundDevice::Open(LPSTR lpszFile)
{
 MCI_OPEN_PARMS mciOpenParms;
 DWORD dwFlags = 0L;
 if (wDeviceID) // If we're already open don't do it again
 return MCI_WARN_ALREADY_OPEN;
 lpszFileName = new __far char[lstrlen(lpszFile)+1];
 lstrcpy( lpszFileName, lpszFile);
 mciOpenParms.lpstrElementName = lpszFileName;
 mciOpenParms.lpstrDeviceType = lpszDeviceType;
 dwFlags = MCI_OPEN_ELEMENT | MCI_OPEN_TYPE;
 dwErrState = mciSendCommand(NULL, MCI_OPEN, dwFlags,
 (DWORD)(LPMCI_OPEN_PARMS) &mciOpenParms);
 if (!dwErrState)
 wDeviceID = mciOpenParms.wDeviceID;
 return dwErrState;
}
DWORD
CompoundDevice::Close()
{
 MCIDevice::Close();
 if (lpszFileName) {
 delete [] lpszFileName;
 lpszFileName = NULL;
 }
 return dwErrState;
}
/*------ * AVI functions * ------*/
DWORD
AVI::Update(HDC hdc)
{
 MCI_DGV_UPDATE_PARMS mciUpdateParms;
 DWORD dwFlags;
 if (!wDeviceID)
 return MCI_ERR_NOT_OPEN;
 mciUpdateParms.hDC = hdc;
 dwFlags = MCI_DGV_UPDATE_HDC | MCI_DGV_UPDATE_PAINT;
 dwErrState = mciSendCommand(wDeviceID, MCI_UPDATE, dwFlags,
 (LONG)(LPMCI_DGV_UPDATE_PARMS) &mciUpdateParms);
 return dwErrState;
}
DWORD
AVI::PutDestination(RECT &rect)
{
 MCI_DGV_PUT_PARMS mciPutParms;
 DWORD dwFlags;
 if (!wDeviceID)
 return MCI_ERR_NOT_OPEN;
 mciPutParms.rc = rect;
 dwFlags = MCI_DGV_PUT_DESTINATION | MCI_DGV_RECT;
 dwErrState = mciSendCommand(wDeviceID, MCI_PUT, dwFlags,
 (LONG)(LPMCI_DGV_PUT_PARMS) &mciPutParms);
 return dwErrState;

}
DWORD
AVI::Signal(DWORD dwPosition)
{
 MCI_DGV_SIGNAL_PARMS mciSignalParms;
 DWORD dwFlags;
 if (!wDeviceID)
 return MCI_ERR_NOT_OPEN;
 if (!hWndParent) // we need a parent to get the message
 return MCI_ERR_NO_PARENT;
 mciSignalParms.dwCallback =(DWORD)(LPVOID)hWndParent; // MCI_SIGNAL recipient
 mciSignalParms.dwPosition =dwPosition; // when to send
 mciSignalParms.dwUserParm =(DWORD)(LPVOID)this; // self will be lParam
 dwFlags = MCI_DGV_SIGNAL_AT | MCI_DGV_SIGNAL_USERVAL;
 dwErrState = mciSendCommand(wDeviceID, MCI_SIGNAL, dwFlags,
 (LONG)(LPMCI_DGV_SIGNAL_PARMS) &mciSignalParms);
 return dwErrState;
}
DWORD
AVI::Configure()
{
 MCI_GENERIC_PARMS mciGenericParms;
 if (!wDeviceID)
 return MCI_ERR_NOT_OPEN;
 return dwErrState = mciSendCommand(wDeviceID, MCI_CONFIGURE, 0,
 (DWORD)(LPMCI_GENERIC_PARMS) &mciGenericParms);
}
DWORD
AVI::Cue(DWORD dwTo)
{
 MCI_DGV_CUE_PARMS mciCueParms;
 DWORD dwFlags;
 if (!wDeviceID)
 return MCI_ERR_NOT_OPEN;
 mciCueParms.dwTo = dwTo;
 dwFlags = MCI_DGV_CUE_OUTPUT | MCI_TO;
 dwErrState = mciSendCommand(wDeviceID, MCI_CUE, dwFlags,
 (LONG)(LPMCI_DGV_CUE_PARMS) &mciCueParms);
 return dwErrState;
}
DWORD
AVI::Step(DWORD dwFrames, BOOL bReverse)
{
 MCI_DGV_STEP_PARMS mciStepParms;
 DWORD dwFlags;
 if (!wDeviceID)
 return MCI_ERR_NOT_OPEN;
 mciStepParms.dwFrames = dwFrames;
 dwFlags = MCI_DGV_STEP_FRAMES;
 if (bReverse)
 dwFlags |= MCI_DGV_STEP_REVERSE;
 dwErrState = mciSendCommand(wDeviceID, MCI_STEP, dwFlags,
 (LONG)(LPMCI_DGV_STEP_PARMS) &mciStepParms);
 return dwErrState;
}
DWORD
AVI::SetAudioVolume(DWORD dwVolume)
{
 MCI_DGV_SETAUDIO_PARMS mciSetAudioParms;

 DWORD dwFlags;
 if (!wDeviceID)
 return MCI_ERR_NOT_OPEN;
 mciSetAudioParms.dwValue = dwVolume;
 mciSetAudioParms.dwItem = MCI_DGV_SETAUDIO_VOLUME;
 dwFlags = MCI_DGV_SETAUDIO_ITEM | MCI_DGV_SETAUDIO_VALUE;
 dwErrState = mciSendCommand(wDeviceID, MCI_SETAUDIO, dwFlags,
 (LONG)(LPMCI_DGV_SETAUDIO_PARMS) &mciSetAudioParms);
 return dwErrState;
}
DWORD
AVI::SetSpeed(DWORD dwSpeed)
{
 MCI_DGV_SET_PARMS mciSetParms;
 DWORD dwFlags;
 if (!wDeviceID)
 return MCI_ERR_NOT_OPEN;
 mciSetParms.dwSpeed = dwSpeed;
 dwFlags = MCI_DGV_SET_SPEED;
 dwErrState = mciSendCommand(wDeviceID, MCI_SET, dwFlags,
 (LONG)(LPMCI_DGV_SET_PARMS) &mciSetParms);
 return dwErrState;
}
DWORD
AVI::Window(HWND hWnd)
{
 MCI_DGV_WINDOW_PARMS mciWindowParms;
 DWORD dwFlags;
 if (!wDeviceID)
 return MCI_ERR_NOT_OPEN;
 mciWindowParms.hWnd = hWnd;
 dwFlags = MCI_DGV_WINDOW_HWND;
 dwErrState = mciSendCommand(wDeviceID, MCI_WINDOW, dwFlags,
 (LONG)(LPMCI_DGV_WINDOW_PARMS) &mciWindowParms);
 return dwErrState;
}
End Listing



July, 1993
PROGRAMMING PARADIGMS


Fuzzy Logic and Prejudice




Michael Swaine


This column is about fuzzy logic: its struggle for acceptance, the arguments
of its critics and supporters, and its record of success. If you're already
pretty familiar with this paradigm and want some sample code and
implementation advice, you may as well skip to the references at the end.
Because this column is about fuzzy logic as an emerging paradigm. And along
the way, it takes a detour to look at another paradigm, one that has struggled
against resistance very much as fuzzy logic has, and which now finds itself in
an ironic position.


Fuzzy Logic Has a Bad Name


Fuzzy logic is an approach to logic that has applications to
microprocessor-control logic and expert-systems design. It was invented in
1964 by Lotfi Zadeh, who also named it.
Although it has been around for almost 30 years, fuzzy logic has been slow to
catch on. One reason for this, as Dan McNeill and Paul Freiberger catalog in
their book, Fuzzy Logic, is that it has been subject to prejudice and
discrimination. Papers submitted to journals have been rejected because they
dealt with fuzzy logic; engineers have been told not to employ fuzzy
techniques; grant requests have been rejected because of the word "fuzzy" in
their titles.
True, these journal editors, managers, and grant approvers may have been onto
something. But McNeill and Freiberger make a convincing case that fuzzy logic
has been the victim of unreasoning bias. At least some of this bias against
fuzzy logic seems to involve its name. Perhaps it's not surprising that NASA
engineers are reluctant to put themselves in a position where they have to
tell their bosses that they're using fuzzy techniques. And their bosses
certainly don't want to have to explain to the taxpayers that the logic behind
space-shuttle navigation is fuzzy.
Zadeh deliberately gave fuzzy logic a provocative name. That decision may have
been unwise.
Fuzzy logic is controversial enough without the oxymoronic name. It rejects
the eternal verities of Aristotelian logic: That a thing is either A or not-A;
that nothing is both true and false; and that nothing is neither true nor
false. Fuzzy logic tolerates the true-and-not-true statement, and the
neither-true-nor-false. It takes the view that, while absolute truths may
exist, we never seem to see them in our data. We live in a world of imprecise,
uncertain, subjective information, and to pretend otherwise is to ignore an
important fact, to throw away data. The fact that the data are imprecise is
itself data, data that excessive, obsessive precision throws away.
Fuzzy logic can answer questions classical logic can't. Consider the following
examples of logical deduction:
Most people who blink rapidly are lying.
Bob Dole often blinks rapidly.
Therefore, Bob Dole is often lying.
My cousin Corbett looks a lot like Rush Limbaugh.
Rush Limbaugh is fat.
Therefore, my cousin Corbett is more or less a porker.
Classical logic fails to follow these deductions. It insists that fuzzy
language like "most" and "a lot like" be quantified precisely. Natural logic
follows these deductions informally, not worrying about precision. Fuzzy logic
follows them formally and naturally.
But fuzzy logic is not fundamentally about logic.
It's about set inclusion. To how great a degree is a tomato a vegetable?
Classical, Aristotelian, binary-valued logic says it either is or it isn't.
Other logics permit an indeterminate third value, a Neither between True and
False. But even such logics are crisp-edged: They insist on unambiguous set
membership. They just have more sets than Aristotle would have approved of.
Fuzzy logic is different.
Everything we know about biology and the evolution of species says that the
boundaries are in fact vague. Moreover, our minds seem bent on treating them
that way: A tomato is sort of a vegetable, and sort of a fruit. Psychological
researchers have found that our mental categories are generally fuzzy. We
think of exemplars of a category as being more or less good examples, more or
less members of the set. Rather than thinking in terms of the set theory
taught in school, we seem to think in terms of prototypes, and rough,
subjective degrees of matching to these prototypes. When forced to categorize
a penguin, we throw it unambiguously into the set called birds, but what we
really think is that it's not much of a bird.
Zadeh's fuzzy sets exactly capture this imprecision. Fuzzy sets are a way of
talking about imprecision rather than masking it by rounding it off.
Say you're interested in finding the young, effective salespeople in your
organization. The terms "young" and "effective" are fuzzy. Even if we agree
that "effective" means a high total sales, these terms are imprecise. And the
point is, this imprecision represents information. If we were only interested
in total sales, we could perhaps set some precise cutoff, but suppose we're
interested in the two variables of age and sales volume in combination. We'll
look at a very young salesperson with a merely good sales total, or a somewhat
older salesperson with a phenomenal sales record. We'll trade one variable off
against the other. No precise cutoff for the sets of "young" or "effective"
salespeople will let us handle this trade-off. Fuzzy sets do.
In the mathematics of fuzzy sets, an item has a membership value in each set,
and this value is not restricted to 1 or 0: it may be any value between them.
An item's membership in the intersection of two sets is the smaller of its two
membership values; its membership in their union is the larger.
From a rigorous theory of sets one can derive a rigorous logic, and Lotfi
Zadeh has done so with fuzzy sets, deriving the fuzzy equivalents to the
propositional calculus and first-order predicate logic.


Why are the Victims of Discrimination the First to Discriminate?


But fuzzy doesn't have a lock on uncertainty.
Fuzzy logic is not the first attempt to reason rigorously about imprecise,
uncertain, subjective information. That's what statistics is all about.
One type of statistics has been employed effectively in the very areas where
fuzzy logic is being used: Bayesian statistics. Like fuzzy logic, the Bayesian
approach has been used in expert systems. Like fuzzy logic, it has been
discriminated against.
Bayesian statisticians refer to classical statisticians as "relative
frequentists," alluding to the fundamental difference in their interpretations
of probability. The relative-frequency interpretation has held sway throughout
most of the history of statistics; only recently have Bayesians gained some
ground.
To a relative frequentist, a probability is the limit of a sequence. The true
probability of heads in flipping a coin is the limit of the relative frequency
of heads in n flips of the coin as n goes to infinity.
If you start with this definition of probability, it determines what
interpretation you can put on the outcome of an experiment. In particular, it
doesn't allow you to draw conclusions about the probability that a theory or
hypothesis is true; it either is true or it isn't. You can, however, draw
conclusions about the probability that you would have observed the results you
did if a given hypothesis is true. In other words, you can't talk about the
probability of the hypothesis given the data, but you can talk about the
probability of the data given the hypothesis. Since you can observe the data
and you can only speculate about the hypothesis (that's why they're called
hypotheses), this seems exactly backward.
Let's see how this works in practice. Let's assume that you're a relative
frequentist and that you want to know if the coin I'm flipping is a legitimate
coin. I've flipped it 20 times, and every time it's come up heads. Your
suspicions are aroused. What can you conclude about the hypothesis that this
is a fair coin?
The answer is, nothing. You can compute the probability of getting 20 heads
out of 20 flips under this hypothesis, and it's very low. But this says
nothing about the hypothesis itself. Classical statistics doesn't allow you to
do much of anything with a single hypothesis. What it does allow you to do is
to compare two hypotheses.
In our example, the hypothesis that I'm flipping a fair coin makes the
observed outcome extremely unlikely, while the hypothesis that I've got a
two-headed coin makes the observed outcome certain. On this basis, you are
justified in rejecting the fair-coin hypothesis categorically, in favor of the
two-headed-coin hypothesis. But you are not making a judgement about the
probability of either hypothesis being true: It either is or is not true, and
in the classical approach probabilities do not apply. You've merely used the
probability of an observed event occurring under the two hypotheses to make an
educated guess.
If the logic of this seems contorted, it may be because I have not presented
the classical approach fairly. Or it may seem contorted because it is
contorted. That's what Bayesians think.
Bayesian statistics takes its name from Thomas Bayes, who published a simple
formula more than two centuries ago. The formula itself is uncontroversial; it's just
a statement about how to compute conditional probabilities. It looks like
this: P(H|D) = P(D|H)*P(H)/P(D). That is, the conditional probability of H given D
is equal to the conditional probability of D given H, times the ratio of the
unconditional probabilities of H and D. It's a cute trick for reversing the
direction of conditionality. If we know how likely D is given H, we can use it
to find out how likely H is given D. Assuming, that is, that we know how
likely D and H are unconditionally. To make it clearer where these values come
from when they use Bayes's rule, Bayesians usually write it this way:
P(H|D) = P(D|H)*P(H)/(P(D|H)*P(H) + P(D|H')*P(H')), where H' means the complement
of H, and P(D) has just been expanded into a weighted sum of probabilities.
The controversy comes in when we attach meanings to the symbols. H means
hypothesis and D means data. Again, no one has ever questioned the validity of
the equation. But notice what it does: It lets you compute the probability of
the hypothesis given the data. In the classical interpretation of probability,
this is meaningless. Hypotheses do not have relative frequencies. Observations
can be repeated, so probabilities apply to them. Hypotheses can't; they are
either true or false. So what is this equation saying?
As I understand it, in the classical view, the equation is mathematically
valid and semantically vacuous. It follows from the mathematical definition of
probability, all right, but it doesn't mean anything.
In the Bayesian view, it means exactly what it says. It is the key to Bayesian
statistics. That and one other thing: a different interpretation of
probability.
To a Bayesian, probability is a degree of belief.

Bayes's rule lets you revise your belief on the basis of new data. Your
initial degree of belief, or prior probability, is P(H); your revised opinion,
or posterior probability, is P(H|D); and the probability of observing this
outcome if the hypothesis is true, which is derivable from the mathematical
specification of the hypothesis and is the one probability here that a
classical statistician would accept as meaningful, is P(D|H). Bayes's rule lets
a Bayesian turn a prior probability into a posterior probability on the basis
of the data observed.
There is a whole body of statistics that flows from this rule, but it doesn't
necessarily lead to different conclusions than the ones reached by classical
statisticians. Bayesian results map precisely onto classical results, given
reasonable assumptions. Bayesian statistics caught on in business schools, but
was sneered at by mathematicians and scientists because of P(H): In order to
get the Bayesian engine cranking, you've got to prime it with a prior
probability, a prior opinion. Every Bayesian conclusion starts with the
unsupported (and therefore unscientific, subjective) opinion of the
investigator. The fact that this mirrors reality did not initially impress the
scientific and mathematical communities. Eventually, though, the success of
the Bayesian approach led to its acceptance. (Those B-school statisticians
were more concerned with the bottom line than with academic respectability.)
Not only is the Bayesian approach more intuitively satisfying than the
classical, it also lends itself naturally to the design of robotic controllers
and expert systems, both of which involve the updating of an initial guess on
the basis of new data.
Like fuzzians, Bayesians were discriminated against. Like fuzzians, they
championed an approach that was, at once, more natural and at odds with the
fundamental assumptions of the official view. Having won some respectability,
Bayesians were the first to denounce fuzzy logic as bogus. Naturally.
But fuzzy logic is a genuinely new paradigm. It's not about probability. It's
about possibility. So say the fuzzians. To this the Bayesians have a
rejoinder: Anything fuzzy logic can do can also be done with probability
models.
Bayesians have claimed this repeatedly. This is the argument behind some of
the rejections of fuzzy articles by journals. Fuzzy logic is nothing new, the
argument goes, just a terminological affectation.
The argument is, of course, mathematical. It ignores the fact that fuzzy
expert systems generally have about one-tenth the number of rules that
probabilistic expert systems have, and are easier to implement. A similar
argument says object-oriented programming can't do anything that
spaghetti-code Basic can't do. Possibly so, mathematically, but in practice
there's a world of difference.
But one fuzzy researcher, Bart Kosko, has tried to resolve the mathematical
argument. Kosko has come up with a not-entirely uncontroversial mathematical
characterization of the domain of fuzzy logic, from which it is possible to
derive Bayes's rule. This would seem to imply that anything the Bayesians can
do, the fuzzians can do, but that there may be more to fuzzy than mere
probabilities.


People Don't Count (But They Do Classify)


There is a taxonomy of uncertainty.
George Klir, another fuzzian, claims that there are, in fact, four kinds of
uncertainty: nonspecificity, fuzziness, dissonance, and confusion. Both fuzzy
logic and probability theory--Bayesian or otherwise--deal with all four kinds
of uncertainty, but probability does so unconvincingly at times. Proponents
contend that fuzzy is more representative of reality, even if it is
mathematically equivalent to probability.
There's another sense in which the probabilistic approach rings false, and
here I have personal experience. In graduate school, studying cognitive
psychology and how people handle and recompute probabilities within a
Bayesian paradigm, I chanced upon the work of two Israeli psychologists. They
concluded that people in research settings like mine are really judging how
closely events match an exemplar, how much they fit into a category, not
updating probabilities at all. What Kahneman and Tversky were describing,
although they didn't use the term and I had not heard it then, was "fuzzy-set
membership." In any case, it wasn't Bayesian-probability revision, and a few
months later, I transferred to the computer-science department.


Fuzzy Logic Needs a Killer App


Fuzzy logic has recently been gaining acceptance in the United States and in
Europe. But the process is slow, resistance remains, and meanwhile the
Japanese are making excellent use of fuzzy logic in processors and expert
systems. Fuzzy is actually a fad in Japan, but that shouldn't overshadow the
real successes: The Sendai subway, for example, which American engineering has
not been able to match for smoothness of ride, employs fuzzy logic for its
control.
What American fuzzy logic needs to break through, I think, is the
quintessential American application. This needs to be an appropriate
application for fuzzy logic: one in which the input is subjective, imprecise,
and uncertain. And it should be a decision-making task. It should also be
something that'll catch on at universities, where it will be burned into the
brains of the next generation of executives and programmers. And it needs to
be something warm and nonthreatening, to overcome the resistance to the name
and the novelty of the technology; preferably a familiar application that has
already been used to break down resistance to computers.
I can think of only one such application. A computer dating service. Your
dream date through warm and fuzzy logic. How can it fail?


Further Reading


Kosko, Bart. Neural Networks and Fuzzy Systems. Englewood Cliffs, NJ: Prentice
Hall, 1991.
McNeill, Daniel and Paul Freiberger. Fuzzy Logic. New York, NY: Simon &
Schuster, 1993.
Zadeh, L.A. "The Calculus of Fuzzy If/Then Rules." AI Expert (March, 1992).



July, 1993
C PROGRAMMING


C++ Templates and Filling in some Historical Gaps




Al Stevens


This month's column looks at C++ templates. The template implements what
Stroustrup calls "parameterized types" in a paper in the Winter 1989 Journal
of the USENIX Association. His proposal for the addition of class templates
and function templates was ultimately implemented and released in the AT&T 3.0
C++ system, adopted by the ANSI C++ technical committee, and is now available
in several commercial compilers.
Let's consider what templates do and where they are useful. A class template
is a generic class that takes on meaning when it is compiled to support
objects of some other concrete class.
Consider data types that are maintained in collections of objects. An
application has many different classes and potentially many different ways to
organize them. Depending on how the application manages the objects, it might
use a number of different container organizations, such as trees, queues,
lists, and stacks. Each container type exists only to manage objects. You
could have a stack of pointers, a queue of integers, a balanced tree of
employee objects, a linked list of window handles, and so on. What is more,
you could have a stack of date objects, a queue of date objects, and a tree of
date objects, each container supporting different storage and retrieval
requirements, but all of them containing objects of the same type. Each data
type has its behavior, and each container type has its behavior, and the
behaviors of the two are unrelated. If I want to organize my objects of type
SolidCube into a stack in one part of the program and into a list in another,
that decision has nothing to do with the behavior of the SolidCube class.
Making that distinction is essential to understanding templates.
The template is a mechanism with which you define a class that you instantiate
only in conjunction with another type. Given a class template for type
LinkedList<T>, for example, you may declare an object of that type only by
associating it with another type, the type that the linked list manages. The
LinkedList<T> class template is the "parameterized" type. The other type,
represented by the <T>, is the parameter. (There can be more than one.)
Therefore, if you instantiate the LinkedList class with, for example, the Date
class, you have declared an object that is a linked list of Date objects, or,
more precisely, of type LinkedList<Date>.
Before looking at templates, let's consider ways to implement such things with
traditional C++. I'll discuss three alternatives for managing objects in a
linked list without using templates and give examples of two of them. Then
I'll build a class template to perform the same operation.


The Linked List


First, a brief explanation of the linked-list data structure. A list is a
collection of like objects, not necessarily in an array. They could be on the
heap, on the stack, declared as global or static variables, or any combination
of these. A linked list associates the objects with one another in an
incidental sequence unrelated to any particular collating sequence. The first
object points to the second object, which points to the third object, and so
on. If the list is bidirectional, the third object also points to the second,
which points to the first. So, each object in the list has one or two
pointers: a pointer to the next object and a pointer to the previous object.
To navigate the list, the program uses a listhead, which contains a pointer to
the first object in the list, and, if the list is bidirectional, a pointer to
the last object. If the list is bidirectional, you can insert objects into,
and delete objects from, the middle of the list.
There are two ways to look at such a container. The container class can either
be a repository that makes a copy of the object and puts it in the container,
or it can "containerize" the user's copy of the object.


First Alternative: Do It Yourself


A C++ programmer has at least three options without templates. The simplest
solution adds the next and previous pointers to the classes that need
linked-list management and builds a listhead that points specifically to
objects of that class. There are disadvantages to this approach. First, you
have to modify the class to add the behavior of a linked list, which is
unrelated to its original purpose. You would probably do that with
inheritance. Now, besides having your concrete SolidCube class, you have
something like a LinkedListSolidCube class, which is one more class than you
need, and not a particularly good application of inheritance. The other
disadvantage is that you have to write linked-list code for every such derived
class instead of having it be generic behavior. Duplicating that code in
multiple classes creates maintenance problems.


Second Alternative: An Embedded LinkedList Class


Rather than duplicate the code for every class, you could build a generic
LinkedList class and embed an object of it in the target class. This approach
shares one disadvantage with the first alternative in that it requires
modifications to the class. It does, however, eliminate the duplication of the
linked-list code.
Listing One, page 138, is emblist.h, the header file that defines the
LinkedList class to embed in a target class. Two classes are defined: the
LinkedList class, which is the listhead; and the ListEntry class, which
encapsulates the linked list behavior. That behavior includes the nextentry
and preventry pointers to other ListEntry objects.
ListEntry also includes the listhead pointer to an object of type LinkedList
to associate the object with a particular list.
Since a ListEntry object is embedded in the target object, and since an
embedded object cannot determine its owner's address, the ListEntry class
includes a void pointer named thisentry, which points to the outer object. It
must be void because the generic class does not know about the type that it
supports. That's where the "parameterized" part of templates is going to help.
The LinkedList class contains pointers to the first and last objects in the
list. These point to embedded ListEntry objects. The address of the listed
object is dereferenced through the thisentry pointer mentioned earlier.
Listing Two, page 138, is emblist.cpp, which contains member functions for the
LinkedList and ListEntry classes. It includes the constructors and destructors
and ListEntry member functions to append and remove objects from the list.
Observe that the ListEntry constructor's parameter is a void pointer to the
object of the outer class. This pointer initializes the ListEntry's thisentry
pointer. This technique is aesthetically displeasing. It offends me to know
that an object stores its own address, even in a data member of an embedded
object. Furthermore, it compromises the type safety of the linked list. The
compiler can't prevent me from passing any address as an argument to that
parameter.
The ListEntry::AppendEntry function appends the outer object to the linked
list by modifying the pointers in the LinkedList object and by setting its own
nextentry and preventry pointers. The ListEntry::RemoveEntry function does the
opposite, removing the object from the list, patching any hole opened by its
departure, and repairing the LinkedList object's firstentry and lastentry
pointers if the departing object is the first or last object in the list.
The destructor for the ListEntry class calls its own ListEntry::RemoveEntry
function, and the destructor for the LinkedList class calls the
ListEntry::RemoveEntry function for every object in the list. That is because
this implementation of a linked list does not make copies of the objects. It
adds linked list behavior to the user's copies. When an object goes out of
scope, its destructor is called. The linked list has to expel the object to
preserve the list's integrity. If the LinkedList object goes out of scope
while objects are still in the list, the objects will seem to be in a list
that no longer exists.
Listing Three, page 139, is testemb.cpp, the program that uses the LinkedList
and ListEntry classes. It defines a Date class whose objects are kept in a
linked list. The class has
the usual day, month, and year data members, and it overloads the << insertion
operator to display itself. There are two additions to the class to support
the linked list. First is the ListEntry le data member. Second is the call to
the ListEntry constructor from the constructor for the Date object. This call
initializes the thisentry pointer with the Date object's this pointer, which
is where type safety breaks down. You could pass any address, including a NULL
address, and the compiler wouldn't complain.
The program declares a LinkedList object and then gets dates from the user and
appends them to the linked list by calling the ListEntry le object's
AppendEntry function.
After the last date entry, the program iterates through the list and displays
the dates on the console. The calls to LinkedList::FirstEntry and
ListEntry::NextEntry return void pointers, so they are cast to Date pointers.
I didn't bother deleting all of the dates from the heap in these examples, and
a more complete program would do that. Also, a more complete ListEntry class
would include member functions to retrieve the last and previous entries as
well as to insert entries into specified places in the list.


Third Alternative: Inherit the Linked-list Behavior


We can eliminate some of our objections to the LinkedList and ListEntry
classes. Instead of embedding the ListEntry object, the Date object is derived
from the base ListEntry class, inheriting the linked-list behavior. This
approach improves the code's notation, and it improves type safety by
eliminating void pointers, but it introduces a new objection. Inheritance is
typically used to model the is a relationship between classes. By deriving
Date from ListEntry, we are saying that a date is a list entry. Well, yes it
is, but only in the programmer's view. It is only incidental to the way the
program manages objects that a date is a list entry, and the model does not
reflect common-sense object-oriented design. Nonetheless, this approach is our
final alternative to templates.
Listing Four (page 139), inhlist.h, and Listing Five (page 139), inhlist.cpp,
modify the LinkedList class to be a base class. The thisentry void pointer is
gone, and the other void pointers and void pointer functions are now pointers
to type ListEntry. Since the target object is derived from, rather than host
to, the ListEntry class, the this pointer serves as the address of the object.
Listing Six, page 139, is testinh.cpp, the test program modified to use the
base class. The only concession that the Date class makes to being in a linked
list is that it is derived from the ListEntry class. AppendEntry is not called
through an embedded object but directly through the Date object itself. The
calls to LinkedList::FirstEntry and ListEntry::NextEntry still need casts,
however. They return pointers to the base ListEntry class which must be cast
to pointers to the Date class.


A LinkedList Class Template



We can eliminate all of our objections to the approaches just discussed by
using templates. Realize first, however, that when you declare an object of a
template class, the compiler builds source code that associates the template
with its parameter class. If you use the same template for a different type,
you get another copy of the source code, customized for the other type. If the
algorithm is big and your program needs many versions of it, the template
solution, while easier to code, might produce a bigger executable program than
you want.
Listing Seven, page 139, is linklist.h, the LinkedList class template. The
header file contains the class definition and the member-function templates.
The compiler uses these member-function templates to build source code when
your program declares an object of a class-template type. The template is a
form of macro, and all of its code must be visible wherever you declare a
parameterized type. This version of the linked list contains everything. It
includes functions to append, insert, remove, and find objects on the list.
This implementation makes copies of the objects that go into the list, which
means that your program can let the objects go out of scope after you put them
in the list, but also means that your classes must have valid copy
constructors.
Only the LinkedList<T> class can declare objects of the ListEntry<T> class
because ListEntry<T> has no public members and the LinkedList<T> class is a
friend. The list-navigation member functions are in the LinkedList<T> class
instead of the ListEntry<T> class. The user declares an object of the
LinkedList<T> class template with a parameter and adds to, deletes from, and
navigates that list through the LinkedList<T> object.
Listing Eight, page 140, is testtmpl.cpp, a program that tests the
LinkedList<T> class template. Its Date class is unaware that its objects are
in a linked list. This is the strength of the class template. You don't have
to monkey with the target class to get it into a container, and you don't have
to give up type checking.
The program declares an object of type LinkedList<Date>, then gets dates from
the user and puts them into the list. The program does not retain copies of
the objects. This is another strength of the class template: It knows its own
size and can instantiate objects of itself. A base class cannot do that and
include the derived class's members in the instantiated object. An embedded
class cannot do that and include the outer class. Templates solve that
problem. In this example, the ListEntry<T> class includes a copy of the
parameterized type. The LinkedList<T>::AppendEntry and
LinkedList<T>::InsertEntry functions build an object of type ListEntry<T> on
the heap, initializing it with the object being added to the list.


An Unabridged History of MSC/C++


There are two books out about Bill Gates and Microsoft. I reviewed one of
them, Hard Drive, by James Wallace and Jim Erickson (John Wiley, 1992) in the
May 1993 "Programmer's Bookshelf." The other is Gates, by Stephen Manes and
Paul Andrews (Doubleday, 1993). They both tell much the same story, although
Gates is a better book with more information.
I was disappointed that neither book gave any attention to Microsoft C and C++
compilers. Maybe those products are more important from my perspective than
they are from that of the typical book consumer. Maybe the authors don't
understand these languages or the significance of their history with respect
to Microsoft's position in the languages market. Hard Drive says that Cobol is
"difficult to master," which shows what they know, and does not mention C. I
doubt that the authors, both newspaper reporters and not programmers, could
relate the importance of C to the story they were telling. Manes and Andrews
are seasoned journalists in the computer world, and they should know better,
but their mention of C and C++ is only coincidental to other points that they
make.
So, to fill the void, here is the unauthorized history of Microsoft C and C++,
drawn from my own flawed memory, recalled without the benefit of any research
whatsoever, and rife with opinion.
The first Microsoft C was Lattice C in a Microsoft binder. Lattice C was an
early and successful C compiler for the IBM PC. I remember it from about 1983.
Apparently Microsoft licensed the software to get into the C marketplace
before their compiler was ready. Neither book mentions Lattice. Unlike the two
books, I cannot offer any insight into the deals that were made and broken,
the careers that were crushed, or how Bill's dandruff swirled around his head
in an ephemeral cloud when he ran around the room and yelled during
negotiations.
About 1984, Microsoft dropped Lattice C and came out with the first of their
own C compilers, which they dubbed "Version 3.0." It was a typical K&R
compiler, and it worked well. Its successor, Version 4.0, was notable mainly
for Codeview, a source-level debugger. I had been using other compilers,
debugging with the venerable printf function, and became an immediate convert
to Microsoft C. Codeview made the difference.
In spring of 1987, Borland stirred up the market with Turbo C 1.0. It had no
debugger, but it was faster than Microsoft C by several factors, and it had a
new feature called the integrated development environment (IDE), which
integrates the editor, compiler, and linker into one program. Turbo C was hot
stuff, and Microsoft announced QuickC in the fall of the same year, finally
delivering in the winter. QuickC was not ready for prime time. It wasn't all
that quick, its IDE supported only the medium memory model, and the compiler
generated huge executables. However, QuickC included an integrated
Codeview-style debugger, online help, and a graphics library, features that
Turbo C lacked. QuickC had more bugs than Bill Clinton has encouraging words,
but a lot of people bought it.
Borland and Microsoft chased one another over the years with upgrade after
upgrade, each one playing catch-up and then upstaging the other with new,
unheard of features. Support for Windows programming became a big deal.
Naturally, Microsoft had their pricy SDK, and for a while it was the only game
in town. Then someone realized that the complete API is resident in a DLL in
every Windows installation. All you need is a way to call DLL functions, a
header file full of prototypes and message mnemonics, and documentation. When
Microsoft Press started selling the SDK documentation in a three-volume set
out of retail bookstores, every C compiler in the business offered support for
Windows programming almost overnight. Some of them even bought the Microsoft
books in volume and packaged them with their compilers. Talk about competing
with yourself.
Then Borland got the jump with a C++ compiler fully a year ahead of Microsoft.
Since they had C++ and Microsoft didn't, Borland and Philippe Kahn hit the
conference circuit, carrying the good news to the natives, promoting
object-oriented design as the only true way to do anything, while Gates and
Microsoft grumbled about OOPS and dismissed it as just another trendy fad.
Microsoft C 5.1 was a rugged, reliable package. Version 6.0 was buggy and had
sparse documentation. Programmers howled. Version 7.0 corrected those
deficiencies and added--what else?--C++ to the package. Now it was
technologically correct to be an OOPS programmer. The fad of yesteryear
finally found validation at the Redmond altar and became a sanctified
paradigm. Next came the Microsoft Foundation Classes (MFC), a C++ class
library that encapsulates some of the Windows API. That product was a reaction
to Borland's ObjectWindows Library (OWL), which does the same thing. Not
to be left at the gate, Borland added templates to their C++ compiler.
Microsoft C++ does not have templates yet, although when packaged with Visual
C++, the compiler's version number got a boost to 8.0. It reminds me of movie
sequels.
Adding features and bumping version numbers is how a software developer tries
to keep market share. Those features add size and sink resources, too. I used
to run the Borland and Microsoft command-line C compilers on an 8088 laptop
with a 720-Kbyte diskette drive. Now you need 60 Mbytes of disk, several
Mbytes of RAM, and a DPMI driver. I believe in free enterprise and oppose
monopolies, but I wonder how beneficial all of this competition is. I guess
without it we'd still be stuck in the '70s.
Well, this historical account is surely full of holes and probably has some
inaccuracies, but we seem to be the only biographers who care, so it's all
we're going to get. I bought Manes and Andrews' Gates book in an airport
bookstore and found, to my surprise, that it was a first edition, autographed
by the authors. There was no sign advertising it as such, and I wondered if it
was excess stock from some promotional campaign. Doesn't matter, but if either
of those guys goes nuts and assassinates somebody important like J.D.
Hildebrand or John Dvorak, I'll have a valuable collectible on my bookshelf.
Anybody want to buy an option?
[LISTING ONE]
// ------------ emblist.h
// a linked list class embedded in the listed class

#ifndef EMBLIST_H
#define EMBLIST_H

class LinkedList;

// --- the linked list entry
class ListEntry {
 void *thisentry;
 ListEntry *nextentry;
 ListEntry *preventry;
 LinkedList *listhead;
 friend class LinkedList;
public:
 ListEntry(void *entry, LinkedList *lh = 0);
 ~ListEntry() { RemoveEntry(); }
 void AppendEntry(LinkedList *lh = 0);
 void RemoveEntry();
 void *NextEntry()
 { return nextentry ? nextentry->thisentry : 0; }
 void *PrevEntry()
 { return preventry ? preventry->thisentry : 0; }
};
// ---- the linked list
class LinkedList {
 // --- the listhead
 ListEntry *firstentry;
 ListEntry *lastentry;
 friend class ListEntry;
public:
 LinkedList();
 ~LinkedList();
 void *FirstEntry()
 { return firstentry ? firstentry->thisentry : 0; }
 void *LastEntry()
 { return lastentry ? lastentry->thisentry : 0; }
};
#endif


[LISTING TWO]

// ------------ emblist.cpp
// linked list class

#include "emblist.h"

// ---- construct a linked list
LinkedList::LinkedList()
{
 firstentry = 0;
 lastentry = 0;
}
// ---- destroy a linked list
LinkedList::~LinkedList()
{
 while (firstentry)
 firstentry->RemoveEntry();
}
// ---- construct a linked list entry
ListEntry::ListEntry(void *entry, LinkedList *lh)
{
 thisentry = entry;
 listhead = lh;
 nextentry = 0;
 preventry = 0;
}
// ---- append an entry to the linked list
void ListEntry::AppendEntry(LinkedList *lh)
{
 if (lh)
 listhead = lh;
 if (listhead != 0) {
 preventry = listhead->lastentry;
 if (listhead->lastentry)
 listhead->lastentry->nextentry = this;
 if (listhead->firstentry == 0)
 listhead->firstentry = this;
 listhead->lastentry = this;
 }
}
// ---- remove an entry from the linked list
void ListEntry::RemoveEntry()
{
 // ---- repair any break made by this removal
 if (nextentry)
 nextentry->preventry = preventry;
 if (preventry)
 preventry->nextentry = nextentry;
 if (listhead) {
 // --- maintain listhead if this is last and/or first
 if (this == listhead->lastentry)
 listhead->lastentry = preventry;
 if (this == listhead->firstentry)

 listhead->firstentry = nextentry;
 }
 preventry = 0;
 nextentry = 0;
}

[LISTING THREE]

// -------- testemb.cpp

#include <iostream.h>
#include "emblist.h"

class Date {
 int mo, da, yr;
public:
 ListEntry le;
 Date(int m=0, int d=0, int y=0) : le(this)
 { mo = m; da = d; yr = y; }
 friend ostream& operator << (ostream& os, Date& dt)
 { os << dt.da << '/' << dt.mo << '/' << dt.yr; return os; }
};
void main()
{
 LinkedList dtlist;
 int d = 0, m, y;
 Date *dt;
 while (d != 99) {
 cout << "Enter dd mm yy (99 .. .. when done): ";
 cout << flush;
 cin >> d >> m >> y;
 if (d != 99) {
 dt = new Date(m,d,y);
 dt->le.AppendEntry(&dtlist);
 }
 }
 dt = (Date *) dtlist.FirstEntry();
 while (dt != 0) {
 cout << '\n' << *dt;
 dt = (Date *) dt->le.NextEntry();
 }
}

[LISTING FOUR]

// ------------ inhlist.h
// a linked list base class
#ifndef INHLIST_H
#define INHLIST_H

class LinkedList;

// --- the linked list entry
class ListEntry {
 ListEntry *nextentry;
 ListEntry *preventry;
 LinkedList *listhead;
 friend class LinkedList;
protected:

 ListEntry(LinkedList *lh = 0);
 virtual ~ListEntry() { RemoveEntry(); }
public:
 void AppendEntry(LinkedList *lh = 0);
 void RemoveEntry();
 ListEntry *NextEntry() { return nextentry; }
 ListEntry *PrevEntry() { return preventry; }
};
// ---- the linked list
class LinkedList {
 // --- the listhead
 ListEntry *firstentry;
 ListEntry *lastentry;
 friend class ListEntry;
public:
 LinkedList();
 ~LinkedList();
 ListEntry *FirstEntry() { return firstentry; }
 ListEntry *LastEntry() { return lastentry; }
};

#endif

[LISTING FIVE]

// ------------ inhlist.cpp
// linked list base class
#include "inhlist.h"

// ---- construct a linked list
LinkedList::LinkedList()
{
 firstentry = 0;
 lastentry = 0;
}
// ---- destroy a linked list
LinkedList::~LinkedList()
{
 while (firstentry)
 firstentry->RemoveEntry();
}
// ---- construct a linked list entry
ListEntry::ListEntry(LinkedList *lh)
{
 listhead = lh;
 nextentry = 0;
 preventry = 0;
}
// ---- append an entry to the linked list
void ListEntry::AppendEntry(LinkedList *lh)
{
 if (lh)
 listhead = lh;
 if (listhead != 0) {
 preventry = listhead->lastentry;
 if (listhead->lastentry)
 listhead->lastentry->nextentry = this;
 if (listhead->firstentry == 0)
 listhead->firstentry = this;
 listhead->lastentry = this;
 }
}
// ---- remove an entry from the linked list
void ListEntry::RemoveEntry()
{
 // ---- repair any break made by this removal
 if (nextentry)
 nextentry->preventry = preventry;
 if (preventry)
 preventry->nextentry = nextentry;
 if (listhead) {
 // --- maintain listhead if this is last and/or first
 if (this == listhead->lastentry)
 listhead->lastentry = preventry;
 if (this == listhead->firstentry)
 listhead->firstentry = nextentry;
 }
 preventry = 0;
 nextentry = 0;
}

[LISTING SIX]

// -------- testinh.cpp
#include <iostream.h>
#include "inhlist.h"

class Date : public ListEntry {
 int mo, da, yr;
public:
 Date(int m=0, int d=0, int y=0)
 { mo = m; da = d; yr = y; }
 friend ostream& operator << (ostream& os, Date& dt)
 { os << dt.da << '/' << dt.mo << '/' << dt.yr; return os; }
};
void main()
{
 LinkedList dtlist;
 int d = 0, m, y;
 Date *dt;
 while (d != 99) {
 cout << "Enter dd mm yy (99 .. .. when done): ";
 cout << flush;
 cin >> d >> m >> y;
 if (d != 99) {
 dt = new Date(m,d,y);
 dt->AppendEntry(&dtlist);
 }
 }
 dt = (Date *) dtlist.FirstEntry();
 while (dt != 0) {
 cout << '\n' << *dt;
 dt = (Date *) dt->NextEntry();
 }
}

[LISTING SEVEN]


// ------------ linklist.h
// a template for a linked list

#ifndef LINKLIST_H
#define LINKLIST_H
template <class T>
// --- the linked list entry
class ListEntry {
 T thisentry;
 ListEntry<T> *nextentry;
 ListEntry<T> *preventry;
 ListEntry(T& entry);
 friend class LinkedList<T>;
};
template <class T>
// ---- construct a linked list entry
ListEntry<T>::ListEntry(T &entry)
{
 thisentry = entry;
 nextentry = 0;
 preventry = 0;
}
template <class T>
// ---- the linked list
class LinkedList {
 // --- the listhead
 ListEntry<T> *firstentry;
 ListEntry<T> *lastentry;
 ListEntry<T> *iterator;
 void RemoveEntry(ListEntry<T> *lentry);
 void InsertEntry(T& entry, ListEntry<T> *lentry);
public:
 LinkedList();
 ~LinkedList();
 void AppendEntry(T& entry);
 void RemoveEntry(int pos = -1);
 void InsertEntry(T&entry, int pos = -1);
 T *FindEntry(int pos);
 T *CurrentEntry();
 T *FirstEntry();
 T *LastEntry();
 T *NextEntry();
 T *PrevEntry();
};
template <class T>
// ---- construct a linked list
LinkedList<T>::LinkedList()
{
 iterator = 0;
 firstentry = 0;
 lastentry = 0;
}
template <class T>
// ---- destroy a linked list
LinkedList<T>::~LinkedList()
{
 while (firstentry)
 RemoveEntry(firstentry);
}

template <class T>
// ---- append an entry to the linked list
void LinkedList<T>::AppendEntry(T& entry)
{
 ListEntry<T> *newentry = new ListEntry<T>(entry);
 newentry->preventry = lastentry;
 if (lastentry)
 lastentry->nextentry = newentry;
 if (firstentry == 0)
 firstentry = newentry;
 lastentry = newentry;
}
template <class T>
// ---- remove an entry from the linked list
void LinkedList<T>::RemoveEntry(ListEntry<T> *lentry)
{
 if (lentry == 0)
 return;
 if (lentry == iterator)
 iterator = lentry->preventry;
 // ---- repair any break made by this removal
 if (lentry->nextentry)
 lentry->nextentry->preventry = lentry->preventry;
 if (lentry->preventry)
 lentry->preventry->nextentry = lentry->nextentry;
 // --- maintain listhead if this is last and/or first
 if (lentry == lastentry)
 lastentry = lentry->preventry;
 if (lentry == firstentry)
 firstentry = lentry->nextentry;
 delete lentry;
}
template <class T>
// ---- insert an entry into the linked list
void LinkedList<T>::InsertEntry(T& entry, ListEntry<T> *lentry)
{
 ListEntry<T> *newentry = new ListEntry<T>(entry);
 newentry->nextentry = lentry;
 if (lentry) {
 newentry->preventry = lentry->preventry;
 lentry->preventry = newentry;
 }
 if (newentry->preventry)
 newentry->preventry->nextentry = newentry;
 if (lentry == firstentry)
 firstentry = newentry;
}
template <class T>
// ---- remove an entry from the linked list
void LinkedList<T>::RemoveEntry(int pos)
{
 FindEntry(pos);
 RemoveEntry(iterator);
}
template <class T>
// ---- insert an entry into the linked list
void LinkedList<T>::InsertEntry(T& entry, int pos)
{
 FindEntry(pos);
 InsertEntry(entry, iterator);
}
template <class T>
// ---- return the current linked list entry
T *LinkedList<T>::CurrentEntry()
{
 return iterator ? &(iterator->thisentry) : 0;
}
template <class T>
// ---- return a specific linked list entry
T *LinkedList<T>::FindEntry(int pos)
{
 if (pos != -1) {
 iterator = firstentry;
 if (iterator) {
 while (pos--)
 iterator = iterator->nextentry;
 }
 }
 return CurrentEntry();
}
template <class T>
// ---- return the first entry in the linked list
T *LinkedList<T>::FirstEntry()
{
 iterator = firstentry;
 return CurrentEntry();
}
template <class T>
// ---- return the last entry in the linked list
T *LinkedList<T>::LastEntry()
{
 iterator = lastentry;
 return CurrentEntry();
}
template <class T>
// ---- return the next entry in the linked list
T *LinkedList<T>::NextEntry()
{
 if (iterator == 0)
 iterator = firstentry;
 else
 iterator = iterator->nextentry;
 return CurrentEntry();
}
template <class T>
// ---- return the previous entry in the linked list
T *LinkedList<T>::PrevEntry()
{
 if (iterator == 0)
 iterator = lastentry;
 else
 iterator = iterator->preventry;
 return CurrentEntry();
}
#endif

[LISTING EIGHT]


// -------- testtmpl.cpp
#include <iostream.h>
#include "linklist.h"

class Date {
 int mo, da, yr;
public:
 Date(int m=0, int d=0, int y=0)
 { mo = m; da = d; yr = y; }
 friend ostream& operator << (ostream& os, Date& dt)
 { os << dt.da << '/' << dt.mo << '/' << dt.yr; return os; }
};
void main()
{
 LinkedList<Date> dtlist;
 int d = 0, m, y;
 while (d != 99) {
 cout << "Enter dd mm yy (99 .. .. when done): ";
 cout << flush;
 cin >> d >> m >> y;
 if (d != 99)
 dtlist.AppendEntry(Date(m,d,y));
 }
 Date *dt = dtlist.FirstEntry();
 while (dt != 0) {
 cout << '\n' << *dt;
 dt = dtlist.NextEntry();
 }
}
End Listings

July, 1993
ALGORITHM ALLEY


Alien Text-File Compression




Tom Swan


File-compression algorithms are among the most fascinating in computer
science. The very idea that a certain sequence of bytes can be represented by
another, shorter sequence makes me wonder if, someday, a programmer
will discover the ultimate compression method that can reduce any data set to
a single integer. Imagine how much disk space you would gain with a 99.999
percent file-compression ratio.
That may seem to be an impossible dream until you consider a story told by
Martin Gardner in his book, Gotcha (W.H. Freeman, 1982). As Gardner tells the
tale, on visiting Earth, an inquisitive alien wishes to collect all of human
knowledge. Having no room for the Encyclopedia Britannica aboard an
already-cramped spaceship (and lacking a CD-ROM player), the alien proposes a
clever method for compressing the Encyclopedia's volumes. Assuming there are
fewer than 1000 unique letters, digits, and other symbols in the text, our
resourceful visitor assigns each symbol a three-digit code from 000 to 999,
including leading 0s. The word "Snow," for example, might be encoded as
083110111119. Translating the entire Encyclopedia this way produces a giant
integer that, with an imaginary decimal point to the left, is equivalent to
the decimal fraction, A/B. To complete the data compression, the visitor
places a mark on a small rod of otherworldly material that divides the bar
into two lengths, A and B. After returning to Zenon (or wherever), the alien
precisely measures the marked rod, obtains lengths A and B, and divides A/B to
yield the original integer, which a computer decodes to print out the
Encyclopedia.
Is this a valid data-compression technique? Yes. Is it possible to implement?
Hardly. To precisely mark the rod, explains Gardner, would require measuring
distances many times smaller than an electron. If, however, you could make an
appropriately fine measurement, the method would work. So, theoretically
speaking, does this imply that all data is infinitely compressible? Maybe not,
but imagine a "rod drive" of the future that moves a quark-size mark up and
down to compress gigabytes of data on a staff no bigger than a popsicle stick.
What a great backup system that would be! All of this is science fiction, of
course, although the story makes me wonder what sorts of exotic
data-compression techniques--not to mention aliens--are out there waiting to
be discovered.


Compressing Text Files


Popular text-compression algorithms such as variable-width Huffman codes,
which have been covered extensively in DDJ, produce impressive compression
ratios. Though highly efficient, Huffman codes must be processed one bit at a
time, causing compression programs to run slowly.
When choosing a compression algorithm, however, efficiency isn't always the
primary concern. Speed and ease of use might be more important in some
applications, and in this column, I'll focus on algorithms that, unlike
Huffman codes, run quickly while still appreciably reducing text-file sizes.
The methods gain much of their speed simply by processing text as characters
rather than as individual bits. I'll start with the most basic of these
methods: embedded tabs. Though simple, the algorithms for inserting and
deleting tabs belong in every programmer's toolbox. I'll also explore some
other alien--that is, less common--text-compression algorithms that you might
want to try.


Embedded Tab Compression


Example 1, Algorithm #6, lists the pseudocode for inserting tabs into a line
of text L at fixed columns of size tabWidth. Example 2, Algorithm #7, removes
tabWidth tabs from a line of text. Both algorithms operate on a single string,
represented as an array of characters, indexed as S[1] to S[n], where n is the
string length.
To insert tabs, Example 1 calls a subfunction, NextChar, which peeks ahead to
the next character in line L. If that character equals a unique sentinel
appended to the end of the line, the algorithm sets the flag Eol to True. A While
loop appends tabs to a temporary string T. A second While loop appends blanks
to T in order to fill out partial columns that can't be aligned with a tab.
Removing tabs is the simpler of the two related algorithms; see Example 2. In
this case, the procedure merely examines line L one character at a time,
outputting blanks at every tab character encountered.
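As a modern illustration, both algorithms can be rendered compactly in C++. The function names and the use of std::string here are my own; the Pascal listings remain the reference versions.

```cpp
#include <cstddef>
#include <string>

// Sketch of Algorithm #7: expand each tab to blanks up to the next
// tabWidth column boundary.
std::string RemoveTabs(const std::string &L, std::size_t tabWidth = 8)
{
    std::string T;
    for (char c : L) {
        if (c == '\t')
            do T += ' '; while (T.length() % tabWidth != 0);
        else
            T += c;
    }
    return T;
}

// Sketch of Algorithm #6: replace runs of blanks that reach a column
// boundary with tabs; blanks filling only part of a column are kept.
std::string InsertTabs(const std::string &L, std::size_t tabWidth = 8)
{
    std::string T;
    std::size_t i = 0, k = 0;          // input index, output column count
    while (i < L.length()) {
        std::size_t j = k;
        while (i < L.length() && L[i] == ' ') {
            ++i;
            ++j;
            if (j % tabWidth == 0) { T += '\t'; k = j; }
        }
        while (k < j) { T += ' '; ++k; }   // partial column stays blanks
        if (i < L.length()) { T += L[i++]; ++k; }
    }
    return T;
}
```

Note that removing the tabs from a tabbed line reproduces the original, so the transformation loses no information.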
INSTAB.PAS (Listing One, page 142) implements Algorithm #6. REMTAB.PAS
(Listing Two, page 142) implements Algorithm #7. To alter the column width,
change tabWidth to any value from 2 up. A width of 4, instead of the standard
8, usually produces better results on program source files. For simplicity,
and to keep the listings short, I designed the programs to write their output
to a new file, TXT.OUT, which is overwritten in the current directory without
warning. The original file is never changed.
While experimenting with these programs, I noticed an odd effect that tabs can
have on other compression software such as the popular PKZIP and LHA. Table 1
shows the results of one test in which I inserted 4-column tabs into a Pascal
source file of 16,273 bytes, reducing the file to 15,052 bytes. PKZIP further
reduced the tabbed file to 5614 bytes, but to my surprise, this file was 277
bytes larger than the compressed, tab-less text. LHA did a better job than
PKZIP in compressing this file, but here again, the compressed file with tabs
was 224 bytes larger than the compressed, tab-less original.
These numbers suggest some interesting points about PKZIP and LHA, and
possibly other data compressors. The amount of savings gained from compressing
a file can be highly dependent on the nature of the information in that file.
Also, compressing text with tabs apparently interferes with algorithms that
compact runs of identical values (not only blanks) into a count N and a value
V, a method commonly called "run-length encoding." It therefore makes good
sense to strip tabs from text files before compressing them. Perhaps a smart
text-compression engine could recognize this fact and detab files before
compressing. During decompression, the tabs could easily be added back. I've
never seen a data compressor that does this, but the idea might be worth
giving a whirl.
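For readers unfamiliar with run-length encoding, the count-N, value-V idea can be sketched in a few lines. This is a toy illustration only, not how PKZIP or LHA actually work, but it shows why long runs of identical bytes compress well, and why tabs that break up runs of blanks can hurt.

```cpp
#include <string>
#include <utility>
#include <vector>

// Toy run-length encoder: each run of identical bytes becomes a
// (count, value) pair.
std::vector<std::pair<int, char>> RunLengthEncode(const std::string &s)
{
    std::vector<std::pair<int, char>> runs;
    for (char c : s) {
        if (!runs.empty() && runs.back().second == c)
            ++runs.back().first;        // extend the current run
        else
            runs.push_back({1, c});     // start a new run
    }
    return runs;
}
```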


Indentation Compression


UCSD Pascal uses a relatively obscure, but effective, text-compression
technique that takes advantage of indentation in a typical Pascal, C, or C++
text file. The algorithms are straightforward, so I'll leave them as a project
for you to implement. Simply replace each line's leading blanks with a unique
escape code and a value that represents the number of replaced blanks--a
relatively crude form of run-length encoding, but one that can produce better
results than embedded tabs. To expand a line, if it begins with an escape
code, read the next value and output that many spaces. Lines that don't begin
with escape codes are displayed normally.
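One possible C++ take on this project follows. The escape byte and the two-byte header are my own choices, not part of the UCSD Pascal format; any byte guaranteed absent from the text would serve as the escape code.

```cpp
#include <cstddef>
#include <string>

// Sketch of indentation compression: leading blanks become an escape
// byte followed by a count byte. ESC is an arbitrary choice here.
const char ESC = '\x1B';

std::string CompressIndent(const std::string &line)
{
    std::size_t n = 0;
    while (n < line.length() && line[n] == ' ')
        ++n;                             // count the leading blanks
    if (n < 2)                           // too short to be worth encoding
        return line;
    return std::string{ESC, char(n)} + line.substr(n);
}

std::string ExpandIndent(const std::string &line)
{
    if (line.length() >= 2 && line[0] == ESC)
        return std::string(std::size_t(line[1]), ' ') + line.substr(2);
    return line;
}
```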
One drawback with indentation compression is the need to reserve a unique
escape code, but a side benefit is that compressed files tend to display
faster--the exact opposite of the extra runtime overhead usually added by
compression algorithms. To display a line of compressed text, assuming the
display row is clear, just move the cursor to the position of the first
nonblank character, saving the time it would take to display that many spaces
one at a time. It's amazing how much display speed you can gain from this
simple method. A programmer's editor could use the concept internally to
increase output speed on relatively slow terminals or with sluggish GUIs such
as Microsoft Windows. The compressed text would also conserve memory.


Differential Compression


One of my favorite compression algorithms takes advantage of the redundancy in
sorted files where groups of adjacent words begin with the same letters.
Storing the differences between these words saves space by eliminating
duplicate prefixes. For example, the words Aardvark, Aardwolf, and Aaronic can
be encoded as Aardvark, 4wolf, and 3onic, where 4 and 3 represent the number
of letters to be copied from the preceding, uncompressed words. This technique
is especially good at squeezing lengthy dictionaries, but it can also be used
with sorted name-and-address databases or with other alphabetically sorted
files. Difference compression can also pack bitmaps, especially those of
large-size color values, say, 24 bits per pixel. Such files can often be
greatly reduced by storing the difference in color between one pixel and the
next.
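The same idea can be sketched in C++ as follows. Names are mine, and this version stores the shared-prefix length as a binary count byte rather than a digit character; Algorithms #8 and #9 are the reference pseudocode.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Differential (front) compression: store the length of the prefix
// shared with the previous word, then only the differing tail.
std::string DiffCompress(const std::string &word, const std::string &prev)
{
    std::size_t n = 0;
    std::size_t limit = std::min(word.length(), prev.length());
    while (n < limit && word[n] == prev[n])
        ++n;                               // count the shared prefix
    return std::string(1, char(n)) + word.substr(n);
}

std::string DiffExpand(const std::string &coded, const std::string &prev)
{
    std::size_t n = std::size_t(coded[0]);
    return prev.substr(0, n) + coded.substr(1);
}
```

Run against the Aardvark example above, "Aardwolf" after "Aardvark" compresses to a count of 4 plus "wolf", and "Aaronic" after "Aardwolf" to a count of 3 plus "onic".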
Example 3, Algorithm #8, lists the pseudocode for differentially compressing a
word W given a preceding word P. Example 4, Algorithm #9, expands a compressed
word W, again given an uncompressed preceding word P. To use these algorithms,
initialize P to a null string, then pass a list of words W to the appropriate
procedure.
DIFF.PAS (Listing Three, page 142) and UNDIFF.PAS (Listing Four, page 142)
implement Algorithms #8 and #9, respectively. To test the programs, I
compressed a sorted list of 21,400 proper names. Compressing the file with
DIFF reduced its size from 174,510 to 112,935 bytes, a savings of about 35
percent. That's pretty good as compression ratios go, but consider what
happened when I again compressed each of the two files using LHA. The
original, plain text reduced to 57,295 bytes. The differentially compressed
text slimmed down to only 42,977 bytes, a total reduction of greater than 75
percent. Now, that's a hot compression ratio!
The message here is that, despite advances in data-compression techniques,
it's often the combination of selected algorithms along with knowledge of a
data file's contents that can produce optimal results. Removing tabs from text
files, taking advantage of indentation, or packing sorted files
differentially (to name just a few possibilities), and then compressing those
precompressed files using other techniques, might reduce file sizes more than
any single algorithm on its own.


Your Turn


Next time, I'll explore data-compression algorithms used for packing Microsoft
Windows bitmaps. Until then, if you know of any unusual text-file or other
compression techniques, I'd like to hear from you. Feel free to upload files
packed with PKZIP or LHA to my CompuServe ID, 73627,3241. Please send only
text files; do not send executable code. If you don't have a CompuServe
account, you can send a disk to me in care of DDJ. Include postage and a
self-addressed mailer if you want your disk back.
Sorry, but I can't return marked rods or popsicle sticks. (I just know some
alien with a sense of humor is going to send me a bag of these.)
Example 1: Pseudocode for Algorithm #6 (insert tabs).
procedure InsertTabs(L: String);
var
 T: String;
 J, K: Integer;
 C, Q: Char;
 Eol: Boolean; { End of line }
 function NextChar(var C: Char): Char;
 begin
 C ← L[J + 1];
 Eol ← C = Q; { True at end of line }
 Return C;
 end;
begin
 Set T to null string;
 Set Q to unique char;
 Append Q to L as sentinel;
 K ← 0; { Column count }
 repeat
 J ← K;
 while NextChar(C) = blank do
 begin
 J ← J + 1;
 if J mod tabWidth = 0 then
 begin
 Append tab to T;
 K ← J;
 end;
 end;
 while (K < J) do
 begin
 Append blank to T;
 K ← K + 1;
 end;
 if not Eol then
 begin
 Append C to T;
 K ← K + 1;
 end;
 until Eol;
 Return T;
end;
Table 1: Results of compressing a text file with and without tabs.

               Original    PKZIP     LHA
Without tabs     16,273    5,337   5,127
With tabs        15,052    5,614   5,351
Bytes saved       1,221     -277    -224
Example 2: Pseudocode for Algorithm #7 (remove tabs).
procedure RemoveTabs(L: String);
var
 T: String;
 I: Integer;
 C: Char;
begin
 Set T to null string;
 for I ← 1 to Length(L) do
 begin
 C ← L[I];
 if C = tab then
 repeat
 Append blank to T
 until Length(T) mod tabWidth = 0
 else
 Append char C to T;
 end;
 Return T;
end;
Example 3: Pseudocode for Algorithm #8 (differential compression).
procedure Compress(var W, P: String);
var
 T: String; { Temporary copy of W }
 I: Integer; { String index }
begin
 Copy W to T;
 Set I equal to 1;
 while (I <= Length(W)) and
 (I <= Length(P)) and
 (W[I] = P[I] ) do
 I ← I + 1;
 Delete I - 1 chars from head of W;
 Insert Chr(I - 1) at head of W;
 Set P equal to T;
end;
Example 4: Pseudocode for Algorithm #9 (differential decompression).
procedure Decompress(var W, P: String);
var
 I: Integer; { String index }
begin
 Set I to value of W[1];
 Delete W[1] from W;
 while (I >= 1) do
 begin
 Append P[I] to head of W;
 I ← I - 1;
 end;
 Set P equal to W;
end;
[LISTING ONE] (Text begins on page 121.)

{ instab.pas -- Algorithm #6: Insert Tabs by Tom Swan }
program InsTab;
const
 outFName = 'TXT.OUT';
 tabWidth = 8; { Try 4 for source code files }
 blank = #32; { ASCII blank character }
 tab = #09; { ASCII tab character }
var
 InFName: String;
 InFile, OutFile: Text;
 Line: String;
procedure InsertTabs(var L: String);
var
 T: String;
 J, K: Integer;
 C, Q: Char;
 Eol: Boolean; { End of line }
 function NextChar(var C: Char): Char;
 begin
 C := L[J + 1];
 Eol := C = Q; { True at end of line }
 NextChar := C
 end;
begin
 T := ''; { Set result T to null }
 Q := #0; { Sentinel }
 L := L + Q; { Append sentinel to L }
 K := 0; { Column count }
 repeat
 J := K;
 while NextChar(C) = blank do
 begin
 J := J + 1;
 if J mod tabWidth = 0 then
 begin
 T := T + tab;
 K := J
 end;
 end;
 while (K < J) do
 begin
 T := T + blank;
 K := K + 1
 end;
 if not Eol then
 begin
 T := T + C;
 K := K + 1
 end;
 until Eol;
 L := T { Return T via parameter L }
end;
begin
 Writeln('Insert tabs');
 Write('Input file name? ');
 Readln(InFName);
 Assign(InFile, InFName);
 Reset(InFile);
 Assign(OutFile, outFName);
 Rewrite(OutFile);
 Write('Inserting tabs...');
 while not Eof(InFile) do
 begin
 Readln(InFile, Line);
 InsertTabs(Line);
 Writeln(OutFile, Line)
 end;
 Writeln;
 Close(InFile);
 Close(OutFile);
 Writeln(InFName, ' -> ', outFName)
end.

[LISTING TWO]

{ remtab.pas -- Algorithm #7: Remove Tabs by Tom Swan }
program RemTab;
const
 outFName = 'TXT.OUT';
 tabWidth = 8; { Try 4 for source code files }
 blank = #32; { ASCII blank character }
 tab = #09; { ASCII tab character }
var
 InFName: String;
 InFile, OutFile: Text;
 Line: String;
procedure RemoveTabs(var L: String);
var
 T: String;
 I: Integer;
 C: Char;
begin
 T := '';
 for I := 1 to Length(L) do
 begin
 C := L[I];
 if C = tab then
 repeat
 T := T + blank
 until Length(T) mod tabWidth = 0
 else
 T := T + C
 end;
 L := T { Return T via parameter L }
end;
begin
 Writeln('Remove tabs');
 Write('Input file name? ');
 Readln(InFName);
 Assign(InFile, InFName);
 Reset(InFile);
 Assign(OutFile, outFName);
 Rewrite(OutFile);
 Write('Removing tabs...');
 while not Eof(InFile) do
 begin
 Readln(InFile, Line);
 RemoveTabs(Line);
 Writeln(OutFile, Line)
 end;
 Writeln;
 Close(InFile);
 Close(OutFile);
 Writeln(InFName, ' -> ', outFName)
end.

[LISTING THREE]

{ diff.pas -- Algorithm #8: Differential Compression by Tom Swan }
program Diff;
const
 outFName = 'TXT.OUT';
var
 InFName: String;
 InFile, OutFile: Text;
 AWord, Prev: String;
procedure Compress(var W, P: String);
var
 T: String; { Temporary copy of W }
 I: Integer; { String index }
begin
 T := W;
 I := 1;
 while (I <= Length(W)) and (I <= Length(P)) and (W[I] = P[I] ) do
 I := I + 1;
 Delete(W, 1, I - 1);
 W := Chr(I - 1) + W;
 P := T
end;
begin
 Writeln('Differential Compression');
 Write('Input file name? ');
 Readln(InFName);
 Assign(InFile, InFName);
 Reset(InFile);
 Assign(OutFile, outFName);
 Rewrite(OutFile);
 Prev := '';
 Write('Compressing...');
 while not Eof(InFile) do
 begin
 Readln(InFile, AWord);
 Compress(AWord, Prev);
 Writeln(OutFile, AWord)
 end;
 Writeln;
 Close(InFile);
 Close(OutFile);
 Writeln(InFName, ' -> ', outFName)
end.

[LISTING FOUR]

{ undiff.pas -- Algorithm #9: Differential Decompression by Tom Swan }
program UnDiff;
const
 outFName = 'TXT.OUT';
var
 InFName: String;
 InFile, OutFile: Text;
 AWord, Prev: String;
procedure Decompress(var W, P: String);
var
 I: Integer;
begin
 I := Ord(W[1]);
 Delete(W, 1, 1);
 while (I >= 1) do
 begin
 W := P[I] + W;
 I := I - 1
 end;
 P := W
end;
begin
 Writeln('Differential Decompression');
 Write('Input file name? ');
 Readln(InFName);
 Assign(InFile, InFName);
 Reset(InFile);
 Assign(OutFile, outFName);
 Rewrite(OutFile);
 Prev := '';
 Write('Decompressing...');
 while not Eof(InFile) do
 begin
 Readln(InFile, AWord);
 Decompress(AWord, Prev);
 Writeln(OutFile, AWord)
 end;
 Writeln;
 Close(InFile);
 Close(OutFile);
 Writeln(InFName, ' -> ', outFName)
end.
End Listings

July, 1993
UNDOCUMENTED CORNER


The PIF File Format, or, Topview (sort of) Lives!




Michael P. Maurice


Mike is the developer of EDOS and other enhancements to DOS sessions under
Windows. He can be reached on CompuServe at 71171,47, or by telephone at
503-694-2221. Additional PIF-related resources are available on the EDOS BBS,
503-643-8396.




Edited by Andrew Schulman


The next time you're in Silicon Valley, visit the Weird Stuff Warehouse in
Sunnyvale, just across the street from Fry's Electronics. In addition to piles
of dead disk drives and boxes full of doohickies, Weird Stuff ("We Buy Excess
Inventories") also has several aisles of defunct software.
Walking these aisles is a sobering experience for any software developer.
Here, for example, you will find the mammoth OS/2 Extended Edition for $14.95
(the disks alone are probably worth more), IBM's Topview software development
kit for $7.95, and Topview itself for only $4.95. This is the software
boulevard of broken dreams.
The fact is, most new technology never goes anywhere. Very often, something to
which many man-years have been devoted ends up as little more than excess
inventory. Equally often, these products were greeted with great fanfare when
they first appeared. How many of today's hot products are going to wind up on
the shelves of Weird Stuff with their price knocked down to $9.95?
So products, technologies, and companies come and go. At the same time,
another aspect of our industry is the way that outmoded technology persists
longer than anyone would expect. The presence of many CP/M-isms in MS-DOS,
even when running on the hottest Pentium processor, is a good example of this
kind of uneven development.
Even unsuccessful products can leave their mark. A case in point is this
month's topic, the Program Information File (PIF) format. Most Windows
programmers and users are probably familiar with PIF files as the mechanism
that Windows uses to exercise some control over how "old" DOS programs are
run. An end user might use the PIF editor to instruct Windows to run his or
her copy of dBase III Plus in a window, with background execution enabled.
Windows NT (which may or may not be destined to wind up at Weird Stuff, next
to the piles of OS/2 Extended Edition) also uses PIF files.
Most Windows programmers probably don't remember that Windows PIF files come
directly from that $4.95 operating environment, Topview. Quarterdeck's
Desqview operating environment, which at one point in its career was a Topview
clone, uses the same PIF file format, under the name DVP. As Stephen Manes and
Paul Andrews note in their biography of Bill Gates, "Microsoft's adaptation of
PIFs would remain long after Topview had withered and died."
So we have a more or less unbroken chain from Topview to NT; the more things
change, the more they stay the same. As Mike Maurice shows this month, a large
portion of this file format really has remained unchanged since the days of
Topview. This portion of the format was documented by Quarterdeck for the
Desqview SDK; you can read more about it in the chapter on Desqview in
Extending DOS, second edition, edited by Ray Duncan (Addison-Wesley, 1992).
Unfortunately, most items of interest to Windows programmers, such as the
flags controlling windowed vs. full-screen display, background vs. foreground
execution, the idle-detection flag, and so on, are in portions of the file
added later by Microsoft, and thus are not described in the Desqview
documentation. Microsoft has documented many other less-important file formats
(such as those used by the Calendar applet), but it has not documented PIF.
Mike (author of EDOS, a popular Windows DOS-box enhancer) has done a nice job
of cracking this format, and even of explaining how the PIF editor in Windows
3.1 can manage NT PIFs. In essence, after the initial Topview/Desqview portion
of the PIF file, there's a linked list of records; each record starts with a
string. The first of these records starts at offset 171h in the PIF file and
has the string "MICROSOFT PIFEX"; this can be used as a signature to determine
that you have a valid Windows PIF file. Most of the interesting PIF flags are
kept in the record that begins with the string "WINDOWS 386 3.0"; the
PIFSTRUC.H header file (described later) refers to the corresponding structure
as DATA386.
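As a taste of what that "sanity check" looks like in practice, here is a modern C++ sketch of my own (not Mike's IsPif() from PIFEXEC.C) that tests for the signature at offset 171h:

```cpp
#include <cstring>
#include <fstream>

// Sketch: a valid Windows PIF carries the string "MICROSOFT PIFEX"
// at offset 171h, the start of the linked-list record area.
bool LooksLikeWindowsPif(const char *filename)
{
    static const char signature[] = "MICROSOFT PIFEX";
    char buf[sizeof signature] = {0};

    std::ifstream f(filename, std::ios::binary);
    if (!f)
        return false;
    f.seekg(0x171);                        // first record header
    f.read(buf, sizeof signature - 1);     // 15 signature bytes
    return f && std::memcmp(buf, signature, sizeof signature - 1) == 0;
}
```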
As an illustration of what you can do once you know the PIF file format, I've
written a small set of functions (PIFEXEC.C, Listing One, page 144) that allow
a few PIF flags to be set programmatically. The PifExec() function is similar
to WinExec(), except that you can specify a window title and set or clear the
background-execution and windowed-display flags. PifExec() is written using
the functions ReadPif(), WritePif(), GetPif386(), and IsPif(), also in
PIFEXEC.C. PifExec() reads in the stock _DEFAULT.PIF file, modifies a few
fields, and writes out a new __TMP.PIF file, which it then passes to
WinExec(). PIFEXEC.C uses Mike's PIFSTRUC.H (Listing Two, page 144).
As always, I welcome your comments and suggestions for this column; send
e-mail via CompuServe (76320,302) or Internet (andrew@pharlap.com).
--Andrew Schulman
By now you are probably thinking, "Who cares about PIFs, anyway?" Program
Information Files: What could be more boring? Well, maybe PIF files are
slightly boring. But knowing how they are constructed can also be very useful!
In case you have forgotten (or never knew), PIF files contain
information--flags, integer and byte quantities, and alphabetic strings--to
guide the operating environment (notice I did not call it an "operating
system") in starting and running a DOS application.
The DOS executable file format does not contain enough information for a
multitasking environment. When writing new applications, you use a new
executable file format. In fact, Windows applications use something called,
appropriately enough, the "new" executable file format, with an NE signature.
But what about running old applications under new environments? The old
executable can't be changed. So a secondary file, called a Program
Information File (PIF), tags along with the information needed by the
operating environment.
If you have ever used the PIF Editor in Windows, you know that PIF files
contain flags that determine whether the target application can run in the
background, what its timeslice priorities are, whether it starts up windowed
or full-screen, how much XMS and EMS memory it uses, and so on.
When Moses came down from the mountain, he said there would be PIFs. Well,
maybe it was someone else. But the effect was the same. The first record of
PIFs seems to have been with IBM's Topview. Remember Topview? It came before
QuarterDeck's Desqview. Both of these systems were designed to run more than
one task. Would you believe that even Windows NT uses PIFs? This just proves
that we are stuck with PIFs, as far as the eye can see.


The PIF Record Structure


Topview used a PIF of about 171h bytes. This basic early PIF contained a
filename and stored some BIOS data variables that needed duplicating, such as
memory and screen information.
When Desqview came out, PIFs were renamed .DVP files, and their size
increased, from 172h up to 18Dh in later versions. This extra room was needed
for more exotic flags to indicate serial-port use and virtualization of other
shared hardware resources (memory, screen, keyboard, and so on). Ralf Brown's
"MS-DOS Interrupt List" (available on CompuServe in the IBMPRO forum and at
Simtel20 on the Internet) and the book PC Interrupts (Addison-Wesley, 1991) by
Ralf Brown and Jim Kyle contain a fair description of the basic
Topview/Desqview PIF format. This information can be used as the starting
point for decoding the layout of the Windows PIF format.
Today, Windows uses the basic 171h data structure with few modifications, but
increases the file, potentially to 3FFh (1023) bytes. These extra bytes are of
course not covered in Topview/Desqview descriptions. The new portion of the
file is used for storing the flags and switches used by standard and 386
Enhanced Mode Windows when running a DOS session. Only about a third or a half
of the original 171h byte space is used by Windows. The PIF Editor reads in
3FFh bytes and will save the number of bytes read, which may be less than
3FFh, as most Windows PIF files are only 221h (545) or 23Ch (572) bytes long.
PIFs created under NT use the same format, but store different information. NT
PIFs can be edited and run using Windows 3.1 and vice-versa. How's that?
Obviously, the only way that a newer file layout can be recognized by an older
editor is if all the formats follow the same guidelines.
The area from 171h to 3FFh in the Windows PIF file is a simple linked-list
record system. A sample hex dump of this area (using _DEFAULT.PIF) is shown in
Figure 1. A block consists of:
A 16-byte string, such as "WINDOWS 386 3.0".
A three-word structure that contains the offsets to the next and current
record and the data-record size.
The data record itself.
The four known record types are: "MICROSOFT PIFEX", "WINDOWS 286 3.0",
"WINDOWS 386 3.0", and "WINDOWS NT  3.1". (Note that there are two spaces
between "NT" and "3.1".)
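Walking this record list can be sketched in portable ANSI C as follows (the type and function names are mine, not Microsoft's; the layout of the 16-byte name followed by a three-word header is as described above):

```c
#include <string.h>

#define PIFEX_OFFSET 0x171

/* Each record: a 16-byte NUL-padded name, then three words: the offset of
   the next record's name (FFFFh = last record), the offset of this record's
   data, and the data size. Offsets are relative to the start of the file. */
static unsigned get_word(const unsigned char *p)   /* little-endian word */
{
    return (unsigned)p[0] | ((unsigned)p[1] << 8);
}

/* Walk the list, copying up to max 16-byte record names into names[].
   Returns the record count, or -1 if the list is malformed. */
static int walk_sections(const unsigned char *buf, long cb,
                         char names[][16], int max)
{
    long off = PIFEX_OFFSET;
    int n = 0;
    for (;;) {
        unsigned next;
        if (off + 16 + 6 > cb)
            return -1;                    /* name + header would overrun */
        if (n < max)
            memcpy(names[n], buf + off, 16);
        n++;
        next = get_word(buf + off + 16);  /* first word of the header */
        if (next == 0xFFFF)
            return n;                     /* last record reached */
        if ((long)next <= off)
            return -1;                    /* guard against a cycle */
        off = (long)next;
    }
}
```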
The MICROSOFT PIFEX record must come directly at offset 171h. (WinOldAp, the
Windows module responsible for running DOS programs, relies on the "MICROSOFT
PIFEX" string at offset 171h as a "sanity check" that it has a valid PIF
file.) However, the number and order of the WINDOWS ?86 3.0 record groups do
not appear to be important. In some files, the first W in the 286 record will
be zeroed. This seems to indicate that it is not being used, in which case
there is normally another 286 record that does not have the W zeroed.
A COMMENT record can be created by using the appropriate string and plugging
in the correct offsets and size. This has been tested and works with the March
1993 NT beta. The resulting records can be read and written by both the
Windows 3.1 and NT PIF editors.
The layout of the Windows PIF format is presented in PIFSTRUC.H (Listing Two).
Byte offsets in hex are noted in the comments. These offsets are correct up to
171h. After that, they are based on the offsets typical of Windows 3.1 PIFs.
These are not correct for all PIFs, but since most readers will only have
access to Windows 3.1, I've also provided them in this manner. It is more
portable to use the structures and fields in PIFSTRUC.H rather than the
hard-coded file offsets.
PIFSTRUC.H uses C bit fields. Oddly enough, in some cases the PIF bit fields
are organized on half-byte boundaries; very strange! For example, the reserved
hotkeys start in the middle of one byte and continue into the next.
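Because C bit-field layout is implementation-defined (and these half-byte boundaries make it worse), a cautious reader may prefer to treat the 386 flag word at 1ADh as a plain little-endian word and test masks built from the bit-order comments in PIFSTRUC.H. A small sketch; the mask names are mine:

```c
/* Masks for the 386-mode flag word at offset 1ADh, following the bit-order
   comments in PIFSTRUC.H. Testing masks against a plain word avoids relying
   on how a particular compiler packs bit fields across byte boundaries. */
#define F386_ALLOW_CLOSE  0x0001
#define F386_BACKGROUND   0x0002
#define F386_EXCLUSIVE    0x0004
#define F386_FULLSCREEN   0x0008
#define F386_DETECT_IDLE  (0x10 << 8)
#define F386_USE_HMA      (0x20 << 8)
#define F386_EMS_LOCKED   (0x80 << 8)

/* Assemble the little-endian flag word from the two raw file bytes. */
static unsigned flags386_word(const unsigned char *p)  /* p -> offset 1ADh */
{
    return (unsigned)p[0] | ((unsigned)p[1] << 8);
}
```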
NT PIF files add a new record type; a typical NT PIF file is 745 bytes long.
The record consists of the ID string ("WINDOWS NT  3.1"), then 12 unknown bytes
followed by two 64-byte fields to hold the pathnames for two files:
session-specific AUTOEXEC.BAT and CONFIG.SYS files. NT totally ignores several
kinds of information used in Windows 3.1 Enhanced Mode. NT automatically sets
the size of conventional memory, priorities, monitor ports, idle detection,
and use of the HMA. Under NT, PIF information is much less environment
sensitive than in Windows 3.1: Options are simplified, and the NT PIF is left
with identifying strings, files, paths, and keyboard issues.
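Based on that description (12 unidentified bytes, then two 64-byte pathname fields), the NT data record can be sketched as follows; the names and constants are mine, and the leading bytes remain exactly as unknown as the text says:

```c
#include <string.h>

/* Data record of a "WINDOWS NT  3.1" section, per the description above:
   12 bytes of unidentified data, then two 64-byte pathname fields. */
#define NT_UNKNOWN_BYTES 12
#define NT_PATH_LEN      64

/* Copy out the session-specific AUTOEXEC.BAT and CONFIG.SYS pathnames.
   data must point at the start of the NT data record (at least 140 bytes). */
static void nt_paths(const unsigned char *data,
                     char *autoexec, char *config)
{
    memcpy(autoexec, data + NT_UNKNOWN_BYTES, NT_PATH_LEN);
    autoexec[NT_PATH_LEN - 1] = '\0';      /* force termination */
    memcpy(config, data + NT_UNKNOWN_BYTES + NT_PATH_LEN, NT_PATH_LEN);
    config[NT_PATH_LEN - 1] = '\0';
}
```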
READPIF.C is a demonstration program that prints the contents of a PIF file
(available electronically; see "Availability," page 5). The program is a DOS
application that will compile under Borland or Microsoft C; it basically reads
the PIF file into a buffer, checks to see if it is a legal MS PIF and then
prints out the defined strings. Using printf statements, it formats and prints
the values of the various flags, bytes, and word variables. The first 171h
bytes are in a fixed format and can be decoded on the spot. The record blocks
that follow require following the linked list; a pointer initialized to the
first record leads to the next record, and so on. At each record block, a
compare is made to find the record type. The record can then be dumped using
an appropriately typedefed pointer. The last record has a next pointer of -1
(FFFFh), at which point the for loop is exited and a success message is
printed to indicate that the record system was successfully decoded.
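The checksum test READPIF.C performs can be shown in isolation: sum bytes 2 through 170h modulo 256 and compare the result with the checksum byte stored at offset 1. This is a portable restatement of the loop at the top of Listing Three's main():

```c
/* Verify the PIF checksum: the byte at offset 1 is the low byte of the sum
   of bytes 2..170h (offsets 0 and 1 themselves are excluded from the sum). */
static int pif_checksum_ok(const unsigned char *buf)
{
    unsigned sum = 0;
    int i;
    for (i = 2; i < 0x171; i++)
        sum += buf[i];
    return (sum & 0xFF) == buf[1];
}
```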



How Windows Uses PIF Settings


It's interesting to look for a few moments at how Windows actually uses PIF
files. This requires an understanding of how Windows starts a DOS session. The
following discussion assumes that Standard mode is a dead issue, and that the
emphasis should be on Enhanced mode.
DOS sessions are started by a call to WinExec. When it sees that it has a DOS
rather than a Windows program, WinExec starts WinOldAp (WINOA386.MOD in
Enhanced mode), giving it the PIF or DOS application name to run. If a DOS
application name is given, WinOldAp will use _DEFAULT.PIF. If an explicit PIF
name is used, it will contain the name of the DOS application or batch file to
be executed (see the prog_path field in Listing Two).
When the first instance of WinOldAp is started, the 386 grabber is loaded.
WinOldAp creates a Virtual Machine (VM), and then WSHELL, a virtual device
driver (VxD) built into WIN386.EXE, forces the DOS application to start in the
new VM. WinOldAp opens the PIF file to be used and at various times reads out
the data it needs to start the session. In general, WinOldAp does not store
the PIF information internally.
The DOS session system consists of the grabber, virtual display driver (VDD),
virtual keyboard driver (VKD), and WinOldAp. There are other virtual drivers
but these are the basic components. The grabber is a Windows DLL that renders
a windowed DOS session into a Windows window. When a DOS session is windowed,
the DOS application is not actually printing to the real display, but only to
a virtual hidden ("shadow") screen. The grabber reads this hidden screen and
displays the results in the window. The VDD maps the physical display memory
in and out of the various DOS sessions (VMs), traps the I/O ports, and
attempts to bring order to the controlled chaos. Many times, when you get a GP
fault while running a DOS application, the VDD is at fault. VKD does similar
work for the keyboard.
PIF options are supported by several function calls documented in the Windows
Device Driver Kit (DDK) Virtual Device Adaptation Guide (VDAG). The following
VxD calls are related in one way or another to PIF settings: VDD_PIF_State,
VKD_Define_Paste_Mode, VMPoll_Enable_Disable, _DOSMGR_Set_Exec_VM_Data, the
Get/Set_Time_Slice_Priority/Granularity functions, SHELL_GetVMInfo, and the
V86MMGR_Get/Set_EMS_XMS_Limits functions. There are also undocumented calls,
and generally the data structures in WinOldAp and the VxDs are all
undocumented.
PIF settings fall into several categories. Those set at VM start-up can't be
changed; this includes settings such as the file to be started. Some PIF
settings are easy to change in mid-session: priority, exclusive, background,
application hotkey, and window title. Some are quite difficult to change in
midstream, including memory locking and reserving hot keys. Most video options
seem impossible to change on-the-fly. Finally, some options are simply not
practical to change once a session starts, such as EMS/XMS memory size.
The Monitor Ports settings in the PIF editor Advanced screen turn on and off
VDD's trapping of I/O for the various display modes. Enabling VDD trapping
causes a performance slowdown and is normally avoided. The Emulating Text Mode
option causes the VDD to replace some video BIOS calls with its own routines.
The performance improvement is substantial.


Changing PIF Settings On-the-Fly


Knowledge of the PIF layout is necessary to change PIF settings; as the
PIFEXEC example shows, this allows much more dynamic DOS-session creation.
However,
the weakness in this approach is that the PIF has to be changed before the DOS
application is started.
A more elegant solution would allow changes on-the-fly, while the DOS
application is running. Changing PIF settings on-the-fly requires building a
Virtual Device Driver (VxD). A VxD can watch PIF setting changes and make its
own changes, in concert or at any time it chooses.
For the past year I've been building just such a system. It's called "EDOS,"
or Enhanced DOS for Windows, and is built from a VxD and a DLL. It supports
changing most PIF settings (for instance, priorities, time slice, exclusive,
background, fast paste, and task switching) either from the command line
(using the undocumented COMMAND.COM interface, INT 2Fh functions AE00h and
AE01h) or by way of a virtual-8086 (V86) entry point. A V86 entry point allows
a DOS application to call into a VxD and execute code running at ring zero.
There is also a protected-mode (PM) entry-point system for use by Windows
applications. The entry-point mechanism is a feature of the VxD system, but
the code that is executed and that provides the useful functionality is the
responsibility of the VxD developer. VxD development is not trivial, but it
can be fun.
If a Windows PIF had additional flag bits defined, then a utility such as EDOS
could be enhanced to examine this information. The VxD could be modified to
enable/disable disk swapping, or perhaps to assign the serial port
automatically. Alternatively, it could be modified to support a V86-mode DOS
call that would assign the serial port in an open/close environment.
A shareware version of EDOS, which demonstrates changing many PIF settings
from the command line, is available on CompuServe (GO WINADV) and is also
included in Brian Livingston's book Windows 3.1 Secrets (IDG Books, 1992).
In closing, we might speculate on why Microsoft does not document the PIF file
format, especially when so many other formats, such as font files, Program
Manager files, and even the file format used by the Calendar applet have been
documented. One developer at Microsoft told us, "We can't document that;
it's going away in the next release." However, as we've seen, the ages-old PIF
format persists in NT, and, for better or worse, definitely is not going away
any time soon.

Figure 1: Hex dump of _DEFAULT.PIF: Boxes indicate next and current pointers.

0000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0020: 80 02 80 00 5F 44 45 46 41 55 4C 54 2E 42 41 54 ...._DEFAULT.BAT
0030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
....
0170: 00 4D 49 43 52 4F 53 4F 46 54 20 50 49 46 45 58 .MICROSOFT PIFEX
0180: 00 87 01 00 00 71 01 57 49 4E 44 4F 57 53 20 32 .....q.WINDOWS 2
0190: 38 36 20 33 2E 30 00 A3 01 9D 01 06 00 00 00 00 86 3.0..........
01A0: 00 00 00 57 49 4E 44 4F 57 53 20 33 38 36 20 33 ...WINDOWS 386 3
01B0: 2E 30 00 FF FF B9 01 68 00 80 02 80 00 64 00 32 .0.....h.....d.2
01C0: 00 00 04 00 00 00 04 00 00 08 10 02 00 1F 00 00 ................
01D0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................

_UNDOCUMENTED CORNER COLUMN_
edited by Andrew Schulman
_THE PIF FILE FORMAT_
by Michael P. Maurice

[LISTING ONE]: PIFEXEC.C

/* PIFEXEC.C -- Dr. Dobb's Journal "Undocumented Corner"
Andrew Schulman, April 1993 -- bcc -WS -DTESTING pifexec.c */

#include <stdlib.h>
#include <string.h>
#include "windows.h"
#include "pifstruc.h"

/* is this a valid Windows PIF file? */
BOOL IsPif(PIF far *fppif)
{
 BYTE far *fp = (BYTE far *) fppif;
 return (lstrcmp(&fp[PIFEX_OFFSET], "MICROSOFT PIFEX") == 0);
}

/* read PIF file into memory */
BOOL ReadPif(char far *name, PIF far *fppif)
{
 HFILE f;
 UINT cb;
 if ((f = _lopen(name, READ)) == HFILE_ERROR)
 return FALSE;
 cb = _lread(f, fppif, MAX_PIFFILE_SIZE);
 _lclose(f); /* close before the checks so the handle can't leak */
 if (cb == (UINT) HFILE_ERROR || cb < PIFEX_OFFSET)
 return FALSE;
 return IsPif(fppif);
}
/* write PIF structure to file on disk */
BOOL WritePif(char far *name, PIF far *fppif)
{
 HFILE f;
 UINT cb;
 if ((f = _lcreat(name, 0)) == HFILE_ERROR)
 return FALSE;
 cb = _lwrite(f, fppif, MAX_PIFFILE_SIZE);
 _lclose(f); /* close before the checks so the handle can't leak */
 if (cb == (UINT) HFILE_ERROR || cb < MAX_PIFFILE_SIZE)
 return FALSE;
 return TRUE;
}
/* return pointer to the Windows 386 section */
DATA386 far *GetPif386(PIF far *fppif)
{
 BYTE far *fp = ((BYTE far *) fppif) + PIFEX_OFFSET;
 SECTIONHDR far *fpsection =
 (SECTIONHDR far *) (fp + sizeof(SECTIONNAME));
 if (! IsPif(fppif))
 return (DATA386 far *) 0;
 for (;;)
 {
 if (lstrcmp(fp+1, "INDOWS 386 3.0") == 0)
 return (DATA386 far *) ((BYTE far *) fppif +
 fpsection->current_section);
 if (fpsection->next_section == 0xFFFF)
 break;
 fp = (BYTE far *) fppif + fpsection->next_section;
 fpsection = (SECTIONHDR far *) (fp + sizeof(SECTIONNAME));
 }
 /* still here */
 return (DATA386 far *) 0;
}
int _dos_delete_file(char far *filename)
{
 _asm push ds
 _asm lds dx, dword ptr filename
 _asm mov ah, 41h
 _asm int 21h
 _asm pop ds
 _asm jc error
 return 0; // success
error:;

 // return error in AX
}
/* WinExec a DOS app, specifying a few PIF settings. This is intended only as
 an example; other PIF settings can similarly be manipulated programmatically.
 For example, the program's command line
 (ppif->prog_param), default directory (ppif->def_dir), and
 idle-detect flag (data386->flags_386.Detect_Idle). */
UINT PifExec(char far *name, char far *title, BOOL background, BOOL windowed)
{
 static char *tmp_pif = "__tmp.pif";
 PIF *ppif;
 UINT retval = 0;
 DATA386 far *data386;
 char *pathname;

 if (! (pathname = (char *) malloc(256)))
 return FALSE;
 if (! (ppif = (PIF *) malloc(MAX_PIFFILE_SIZE)))
 {
 free(pathname);
 return FALSE;
 }
 /* read in the standard _DEFAULT.PIF file */
 GetWindowsDirectory(pathname, 256);
 strcat(pathname, "\\_default.pif");
 if (! ReadPif(pathname, ppif))
 goto done;
 /* modify some fields in the PIF structure */
 if ((lstrlen(name) > 63) || (lstrlen(title) > 30))
 goto done;
 lstrcpy(ppif->prog_path, name);
 lstrcpy(ppif->title, title);
 if (! (data386 = GetPif386(ppif)))
 goto done;
 data386->flags_386.BackgroundON = background;
 data386->flags_386.FullScreenYes = (! windowed);
 /* write out a new __TMP.PIF file, WinExec it, and delete it */
 if (WritePif(tmp_pif, ppif))
 {
 retval = WinExec(tmp_pif, SW_NORMAL);
 _dos_delete_file(tmp_pif);
 }
done:
 free(pathname);
 free(ppif);
 return retval;
}
#ifdef TESTING
/* Standalone test: run with a DOS program name on the command
 line. For example: PIFEXEC \DOS\COMMAND.COM */
int PASCAL WinMain(HANDLE hInstance, HANDLE hPrevInstance,
 LPSTR lpszCmdLine, int nCmdShow)
{
 if (lpszCmdLine && *lpszCmdLine)
 PifExec(lpszCmdLine, "A Test of PifExec", 1, 1);
 else
 MessageBox(0, "usage: pifexec [program name]", "PIFEXEC", MB_OK);
 return 0;
}

#endif


[LISTING TWO] : PIFSTRUC.H

/* PIFSTRUC.H -- Structure of Windows PIF files --
Dr. Dobb's Journal "Undocumented Corner" -- Mike Maurice, July 1993 */

#define MAX_PIFFILE_SIZE 0x3FF
#define PIFEX_OFFSET 0x171

typedef struct {
 char name_string[16];
} SECTIONNAME, *npSECTIONNAME, FAR *fpSECTIONNAME;
typedef struct {
 WORD next_section; /* offset of section after this */
 /* last section if contents = FFFF */
 /* contents = 205, NT = 1A3 */
 WORD current_section; /* offset of data */
 /* contents = 19d */
 WORD size_section; /* sizeof section */
 /* contents = 68, NT = 06 */
} SECTIONHDR, *npSECTIONHDR, FAR *fpSECTIONHDR;
typedef struct {
 int Unused0 :1;
 int Graph286 :1;
 int PreventSwitch :1;
 int NoScreenExch :1;
 int Close_OnExit :1; /* only bit used in 386 mode */ // 0x10
 int Unused001 :1;
 int Com2 :1;
 int Com1 :1;
} CLOSEONEXIT;
typedef struct {
 int AllowCloseAct :1; // 0x01
 int BackgroundON :1; // 0x02
 int ExclusiveON :1; // 0x04
 int FullScreenYes :1; // 0x08
 int Unused1 :1;
 int RSV_ALTTAB :1; // 0x20
 int RSV_ALTESC :1; // 0x40
 int RSV_ALTSPACE :1; // 0x80
 int RSV_ALTENTER :1; // 0x01 << 8
 int RSV_ALTPRTSCR :1; // 0x02 << 8
 int RSV_PRTSCR :1; // 0x04 << 8
 int RSV_CTRLESC :1; // 0x08 << 8
 int Detect_Idle :1; // 0x10 << 8
 int UseHMA :1; // 0x20 << 8
 int Unused2 :1;
 int EMS_Locked :1; // 0x80 << 8
} FLAGS386;
typedef struct {
 int XMS_Locked :1; // 0x01
 int Allow_FastPst :1; // 0x02
 int Lock_App :1; // 0x04
 int Unused3 :5+8;
} FLAGSXMS;
typedef struct {
 int VidEmulateTxt :1; // 0x01

 int MonitorText :1; // 0x02
 int MonitorMGr :1; // 0x04
 int MonitorHiGr :1; // 0x08
 int InitModeText :1; // 0x10
 int InitModeMGr :1; // 0x20
 int InitModeHiGr :1; // 0x40
 int VidRetainVid :1; // 0x80
 int VideoUnused :8;
} VIDEO;
typedef struct {
 int HOT_KEYSHIFT :1; // 0x01
 int Unused4 :1;
 int HOT_KEYCTRL :1; // 0x04
 int HOT_KEYALT :1; // 0x08
 int Unused5 :4+8;
} HOTKEY;
typedef struct {
 int AltTab286 :1;
 int AltEsc286 :1;
 int AltPrtScr286 :1;
 int PrtScr :1;
 int CtrlEsc286 :1;
 int SaveScreen :1;
 int Unused10 :2;
} FLAGS286;
typedef struct {
 int Unused11 :4+2;
 int Com3 :1;
 int Com4 :1;
} COMPORT;
typedef struct {
 /* The offsets are accurate only for Windows -- *NOT* NT! */
 short mem_limit; /* 19d */
 short mem_req; /* 19f */
 WORD for_pri; /* 1a1 */
 WORD back_pri; /* 1a3 */
 short ems_max; /* 1a5 */
 WORD ems_min; /* 1a7 */
 short xms_max; /* 1a9 */
 WORD xms_min; /* 1ab */
 FLAGS386 flags_386; /* 1ad */
 FLAGSXMS flags_XMS; /* 1af */
 VIDEO video; /* 1b1 */
 WORD zero1; /* 1b3 */
 WORD hot_key_scan; /* 1b5 */
 /* any other legal key on the keyboard, as a scan code number. */
 HOTKEY hot_key_state; /* 1b7, alt, ctrl, shift. */
 WORD hot_key_flag; /* 1b9, 0 = no hot key, 0Fh = hot key defined */
 WORD zero2[5]; /* 1ba */
 char opt_params[64]; /* 1c5, 386 mode for opt params */
} DATA386, FAR *fpDATA386;
typedef struct {
 WORD xmsLimit286; /* 237 */
 WORD xmsReq286; /* 239 */
 FLAGS286 flags_286; /* 23b */
 COMPORT com_ports; /* 23c */
} DATA286, FAR *fpDATA286;
typedef struct {
 /* from 0 -170 hex, not used by Windows, unless so indicated. */

 /* Note that in some cases the PIF editor fills in a value, */
 /* even though it does not SEEM to be used */
 BYTE resv1;
 BYTE checksum; /* used by Windows */
 char title[30]; /* 02 used by 286,386 mode for title */
 short max_mem; /* 20h used by 286, 386 mem size */
 short min_mem; /* 22h, these 2 are duplicates see 19c */
 char prog_path[63]; /* 24h used by 286,386 mode for program & path*/
 CLOSEONEXIT close_onexit; /* 63h, 286 and 386 modes */
 BYTE def_drv; /* 64h */
 char def_dir[64]; /* 65h used by 286,386 mode for start dir */
 char prog_param[64]; /* a5, used by 286 */
 BYTE initial_screenMode; /* usually zero, sometimes 7F hex */
 BYTE text_pages; /* always one */
 BYTE first_interrupt; /* always zero */
 BYTE last_interrupt; /* always FF hex */
 BYTE rows; /* always 25 */
 BYTE cols; /* always 80 */
 BYTE window_pos_row;
 BYTE window_pos_col;
 WORD sys_memory; /* always 7 */
 char shared_prog_name[64];
 char shared_prog_data_file[64];
 BYTE flags1; /* 16f, usually zero */
 BYTE flags2; /* 170, usually zero */
 /* Microsoft PIF editor reads up to 3FF hex bytes in. When writing back */
 /* out it writes same number of byte read. This means a PIF file can */
 /* be up to 3FF hex bytes with the assumption that any 3rd party */
 /* utilities take this into account. NOTE 400 hex WILL NOT WORK !! */
 /* Tested under Win 3.1 and NT (Oct 92 beta). */
} PIF, FAR *fpPIF; /* PIF structure */
#ifdef DOCUMENTATION
/* --- 171h Begin of Microsoft Windows Stuff */
SECTIONNAME pifex; /* 171, hard coded "MICROSOFT PIFEX" */
SECTIONHDR section_zero; /* 181 */
SECTIONNAME first_name; /* 187, hard coded "WINDOWS 386 3.0", 286 if NT */
SECTIONHDR section_one; /* 197, points to str_286A, or section_nameNT */
#ifdef NT
DATA286 data_286; /* 19D */
SECTIONNAME section_nameNT; /* 1A3, hard coded "WINDOWS 386 3.0" */
SECTIONHDR section_hdrNT; /* 1B3 */
DATA386 data_386; /* 1B9 */
/* padded with zeros, from 220-22f hex. */
#else
DATA386 data_386; /* 19D */
/* ---205 hex, end of 386 material */
/* start of 286 specific stuff */
SECTIONNAME str286A; /* 205, hard coded " INDOWS 286 3.0" */
SECTIONHDR section_286A; /* 215, */
DATA286 data_286A; /* 21B */
SECTIONNAME str286B;/* 221, hard coded "WINDOWS 286 3.0" */
SECTIONHDR section_286B; /* 231 */
DATA286 data_286B; /* 237 */
/* ends at 23c */
#endif /* NT */
/* 23d */
#endif /* DOCUMENTATION */
typedef struct {
 SECTIONNAME SName;

 SECTIONHDR Hdr;
 DATA386 D386;
} BLOCK386, *npBLOCK386, FAR *fpBLOCK386;
typedef struct {
 SECTIONNAME SName;
 SECTIONHDR Hdr;
 //DATA386 D386;
} BLOCKNT, *npBLOCKNT, FAR *fpBLOCKNT;
typedef struct {
 SECTIONNAME SName;
 SECTIONHDR Hdr;
 DATA286 D286;
} BLOCK286, *npBLOCK286, FAR *fpBLOCK286;
typedef char FAR *fpBLOCKCMNT;
typedef char *npBLOCKCMNT;
typedef struct {
 SECTIONNAME SName;
 SECTIONHDR SHdr;
} BLOCKVOID, *npBLOCKVOID, FAR *fpBLOCKVOID;
typedef struct {
 char AuxName[8+1+3];
} SECTIONAUX, *npSECTIONAUX, FAR *fpSECTIONAUX;
typedef struct {
 BYTE Hdr1[3];
 BYTE HChkSum;
} SECTIONHDR1, *npSECTIONHDR1, FAR *fpSECTIONHDR1;
typedef struct {
 SECTIONHDR1 CHdr1;
 SECTIONAUX CAux;
} COMMENTS, *npCOMMENTS, FAR *fpCOMMENTS;



[LISTING THREE: READPIF.C]

/*
READPIF.C
Copyright 1992,1993 Michael P. Maurice
This is very accurate and moderately tested.
This is originally based on documentation in Ralf Brown's interrupt list.

The bit structures that are passed to printf, etc, are not portable
and give a structure passed by value warning under Borland C.
*/

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

typedef unsigned char BYTE;
typedef unsigned short WORD;

#define FAR _far

#include "pifstruc.h"

void display_NT(BLOCKNT *pPIF);
void display_386(DATA386 * pPIF);
void display_286(DATA286 * pPIF);

void display_comment(char * pPIFCOMMENT);
void usage(void);
void argcheck(int, char **);
void help(void);

/* make sure this is in public space & cleared to ZERO! */
char tbuf[MAX_PIFFILE_SIZE + 10];
char *ifilename;
int lflag; /* list linked records */
int cflag; /* list comment records */
int nflag; /* NT records */
int vflag; /* verbose listing, including the kitchen sink */
int nfile;
int flag2; /* 286 records only */
int flag3; /* 386 records only */

main(int argc, char **argv)
{
 FILE *ifil;
 BYTE tChkSum;
 unsigned int offset;
 int i;
 PIF *pPIF;
 char *pPIFCOMMENT;
 DATA286 *pPIF286;
 DATA386 *pPIF386;
 BLOCKNT *pPIFNT;
 char *pPIFEX;
 SECTIONHDR *pPrevSect, *pCurSect;
 SECTIONNAME *pName;
 char *pLastByte;

 if (argc < 2) usage();
 ifilename = *++argv;

 argcheck(argc,--argv);

 ifil = fopen(ifilename, "rb");
 if(!ifil) {
 printf("Failed Open: %s\n", ifilename);
 exit(1);
 }

 fputs("\n\nReading PIF: ", stdout);
 fputs(ifilename, stdout);
 fputs("\n", stdout);

 fread(tbuf, MAX_PIFFILE_SIZE ,1, ifil);
 fclose(ifil);

 for (i=2, tChkSum=0; i< PIFEX_OFFSET; i++)
 tChkSum += tbuf[i];
 if(vflag)
 printf("calc.chksum = 0x%x\n\n",(tChkSum & 0x00ff));

 pPIF = (PIF *)&tbuf;

#define PIF_String(a, b) if(pPIF->a) printf("%s\n",b)


 if(tChkSum != pPIF->checksum)
 fputs("Checksum ERROR!!\n", stdout);

 if(vflag)
 {
 printf("\
resv1 = 0x%x\n\
checksum = 0x%x\n",
 pPIF->resv1, pPIF->checksum & 0x00ff);

 printf("close_onexit = 0x%x ,",pPIF->close_onexit);
 }/* vflag */

 printf("\n\
title = %.30s\n\
max_mem = %d dec.\n\
min_mem = %d dec.\n\
prog_path = %.63s\n",
 pPIF->title, pPIF->max_mem,
 pPIF->min_mem, pPIF->prog_path);

 if(flag2)
 {
 PIF_String(close_onexit.Graph286, "Graphics 286");
 PIF_String(close_onexit.NoScreenExch, "No Screen Exchange 286");
 PIF_String(close_onexit.PreventSwitch, "Prevent Switch 286");
 PIF_String(close_onexit.Close_OnExit, "Close On exit");
 PIF_String(close_onexit.Com1, "Com 1 - 286");
 PIF_String(close_onexit.Com2, "Com 2 - 286");
 }/* flag2 */

 printf("def_drv = 0x%x\n", pPIF->def_drv & 0x00ff); /* a BYTE, not a string */
 printf("def_dir = %s\n", pPIF->def_dir);
 if(flag2)
 printf("286 opt. param = %.64s\n", pPIF->prog_param); /*vflag*/

 printf("\
initial_screenMode = 0x%x\n\
text_pages = 0x%x\n\
first_interrupt = 0x%x\n\
last_interrupt = 0x%x\n\
rows = %u dec.\n\
cols = %u dec.\n",
 pPIF->initial_screenMode & 0x00ff,pPIF->text_pages& 0x00ff,
 pPIF->first_interrupt& 0x00ff,pPIF->last_interrupt& 0x00ff,
 pPIF->rows & 0x00ff, pPIF->cols& 0x00ff);

 if(vflag) {
 printf("\
window_pos_row = 0x%x\n\
window_pos_col = 0x%x\n\
sys_memory = 0x%x\n\
shared_prog_name = %s\n\
shared_prog_data_file = %s\n\
flags1 = 0x%x\n\
flags2 = 0x%x\n",

 pPIF->window_pos_row& 0x00ff, pPIF->window_pos_col& 0x00ff,
 pPIF->sys_memory, pPIF->shared_prog_name,

 pPIF->shared_prog_data_file,
 pPIF->flags1& 0x00ff, pPIF->flags2& 0x00ff);
 }/* vflag */

 pPIFEX = &tbuf[PIFEX_OFFSET];
 if(strcmp(pPIFEX, "MICROSOFT PIFEX"))
 {
 fputs("NOT a MICROSOFT PIF FILE\n", stdout);
 return 1;
 }

 fputs(" ********** pifex = ",stdout);
 fputs(pPIFEX, stdout);
 fputs(" *********************\n", stdout);

 pPrevSect = (SECTIONHDR *)&tbuf[PIFEX_OFFSET+sizeof(SECTIONNAME)];
 pName = (SECTIONNAME *)&tbuf[pPrevSect->next_section];
 pLastByte = &tbuf[MAX_PIFFILE_SIZE];

 for (i=0;i<40;i++) {
 if(lflag) {
 printf("\n\
next_offset = 0x%x, \
current_section = 0x%x, \
size_section = 0x%x\n",
 pPrevSect->next_section, pPrevSect->current_section,
 pPrevSect->size_section);
 }/* sflag */

 if(pPrevSect->next_section == 0xFFFF)
 break;
 fputs("============================================\n",stdout);

 offset = pPrevSect->next_section+sizeof(SECTIONNAME);
 pCurSect =(SECTIONHDR *) &tbuf[offset];

 fputs("----------------- Record Type: ",stdout);
 if(pName->name_string[0] != 0)
 fputc(pName->name_string[0], stdout);
 else fputc('\x20', stdout);

 /* if the first char position is zero, the section is not in use */
 /* however, since this is a dump program, we will dump the section */

 fputs(pName->name_string+1, stdout);
 fputs(" ----\n", stdout);

 if(!strcmp(pName->name_string+1, "INDOWS 386 3.0"))
 {
 if(flag3)
 {
 pPIF386 = (DATA386 *)&tbuf[pCurSect->current_section];

 display_386(pPIF386);
 if(pPIF386->zero1 != 0) fputs("\7 386.zero1 != 0 \n", stdout);

 if((pPIF386->zero2[0] == 0) &&
 (pPIF386->zero2[1] == 0) && (pPIF386->zero2[2] == 0) &&
 (pPIF386->zero2[3] == 0) && (pPIF386->zero2[4] == 0)) ;

 else
 fputs("\7 386.zero2 != 0 \n", stdout);

 if((pPIF386->hot_key_flag == 0) ||
 (pPIF386->hot_key_flag == 0xF)) ;
 else
 fputs("\7 386.hot_key_flag has strange value\n", stdout);
 }/* flag3 */
 }
 if(!strcmp(pName->name_string+1, "INDOWS 286 3.0"))
 if(flag2)
 {
 pPIF286 = (DATA286 *)&tbuf[pCurSect->current_section];
 display_286(pPIF286);
 }/* vflag2 */

 if(!strcmp(pName->name_string+1, "INDOWS NT  3.1"))
 if(nflag)
 {
 pPIFNT = (BLOCKNT *)&tbuf[pCurSect->current_section];
 display_NT(pPIFNT);
 }/* nflag */

 /* here we document a technique for supporting comments in PIFs */
 if(!strcmp(pName->name_string, "COMMENT"))
 if(cflag)
 {
 pPIFCOMMENT =
 (char *)&tbuf[pCurSect->current_section+sizeof(COMMENTS)];
 display_comment(pPIFCOMMENT);
 }/* cflag */

 pPrevSect = (SECTIONHDR *)&tbuf[pPrevSect->next_section +
 sizeof(SECTIONNAME)];
 pName = (SECTIONNAME *)&tbuf[pPrevSect->next_section];
 if((char *)pPrevSect > pLastByte) break;

 }/* for */

 if(pPrevSect->next_section == 0xFFFF)
 fputs("\n\n----Success: Last Record Found---- \n\n", stdout);
 else
 fputs("\n\n----ERROR: Last Record NOT Found---- \n\n", stdout);

 /* the offset definitions labeled unknown should have some kind of
 code to check for any deviation from the usual contents */

 return 0;
}

void usage(void)
{
 fputs("readpif infile \n", stdout);
 fputs("readpif -?, for help\n", stdout);
 exit(1);
}

void display_NT(BLOCKNT *pPIF)
{

 char *ptr = (char *) pPIF;
 ptr += 12; /* padding ?? */
 fputs(ptr, stdout);
 fputs("\n", stdout);
 ptr += 64; /* start of next string */
 fputs(ptr, stdout);
 fputs("\n", stdout);
}

void display_386(DATA386 * pPIF)
{
 printf("\
mem_req = %d dec.\n\
mem_limit = %d dec.\n\
for_pri = %u\n\
back_pri = %u\n\
ems_min = %u\n\
ems_max = %u\n\
xms_min = %u\n\
xms_max = %u\n",
 pPIF->mem_req, pPIF->mem_limit, pPIF->for_pri,
 pPIF->back_pri, pPIF->ems_min, pPIF->ems_max,
 pPIF->xms_min, pPIF->xms_max);

 if(vflag)
 printf("flags_386 = 0x%x ,", pPIF->flags_386);

 PIF_String(flags_386.AllowCloseAct, "Allow Close while Active");
 PIF_String(flags_386.ExclusiveON,"Exclusive ON");
 PIF_String(flags_386.BackgroundON,"Background ON");
 PIF_String(flags_386.FullScreenYes,"Full Screen YES");
 PIF_String(flags_386.RSV_ALTESC,"RSV_ALT ESC");
 PIF_String(flags_386.RSV_ALTTAB,"RSV_ALT TAB");
 PIF_String(flags_386.RSV_ALTSPACE,"RSV_ALT SPACE");
 PIF_String(flags_386.RSV_ALTENTER,"Reserve ALT-ENTER");
 PIF_String(flags_386.RSV_ALTPRTSCR,"Reserve ALT-PRT-SCR");
 PIF_String(flags_386.RSV_PRTSCR,"Reserve PRT-SCR");
 PIF_String(flags_386.RSV_CTRLESC,"Reserve CTRL-ESC");
 PIF_String(flags_386.Detect_Idle, "Detect Idle");
 PIF_String(flags_386.EMS_Locked,"EMS Locked");

 if(pPIF->flags_386.UseHMA) ;
 else fputs("Use HMA\n", stdout);

 if(vflag)
 printf("flagsXMS = 0x%x ", pPIF->flags_XMS);
 PIF_String(flags_XMS.XMS_Locked,"XMS_Locked");
 PIF_String(flags_XMS.Allow_FastPst,"Allow_FastPst");
 PIF_String(flags_XMS.Lock_App,"Lock_App");
 fputs("\n",stdout);

 if(vflag)
 printf("video = 0x%x, ", pPIF->video);

 if(pPIF->video.MonitorText) ;
 else fputs("Monitor Text\n", stdout);
 if(pPIF->video.MonitorMGr) ;
 else fputs("Monitor Med Gr\n", stdout);
 if(pPIF->video.MonitorHiGr) ;

 else fputs("Monitor Hi Gr\n", stdout);

 PIF_String(video.InitModeText,"Init Vid. Mode Text");
 PIF_String(video.InitModeMGr,"Init Vid. Mode Med Gr");
 PIF_String(video.InitModeHiGr,"Init Vid. Mode Hi Gr");
 PIF_String(video.VidEmulateTxt,"EmulateTxt");
 PIF_String(video.VidRetainVid,"VidRetainVid");

 /* this test for a hot key defined may not be correct */
 /* it may be that the test should be on hot_key_flag */
 if(vflag)
 printf("hot key flag = 0x%x \n", pPIF->hot_key_flag);

 if(pPIF->hot_key_scan == 0)
 fputs("No Hot Key Defined\n",stdout);
 else {
 if(vflag)
 printf("hot_key_state = 0x%x\n", pPIF->hot_key_state);

 PIF_String(hot_key_state.HOT_KEYALT,"HOT-KEY ALT");
 PIF_String(hot_key_state.HOT_KEYCTRL,"HOT-KEY CTRL");
 PIF_String(hot_key_state.HOT_KEYSHIFT,"HOT-KEY SHIFT");

 printf(" - scan code = 0x%x hex\n", pPIF->hot_key_scan);
 }

 if(vflag)
 {
 printf(" zero1 = %x\n", pPIF->zero1);
 printf(" zero2 = %x %x %x %x %x\n", pPIF->zero2[0],
 (pPIF->zero2[1]), (pPIF->zero2[2]),
 (pPIF->zero2[3]), (pPIF->zero2[4]));
 }/* vflag */

 printf("386 optional parameters = %.64s\n",pPIF->opt_params );
}

void display_comment(char *p)
{
 fputs(p, stdout);
 fputs("\n", stdout);
}

void display_286(DATA286 * pPIF)
{
 if(vflag)
 printf("flags_286 = 0x%x - ", pPIF->flags_286);

 PIF_String(flags_286.AltTab286,"286 ALT TAB");
 PIF_String(flags_286.AltEsc286,"286 ALT ESC");
 PIF_String(flags_286.AltPrtScr286,"286 ALT PRT SCR");
 PIF_String(flags_286.PrtScr, "286 PRT SCR");
 PIF_String(flags_286.CtrlEsc286,"286 CTRL ESC");
 PIF_String(flags_286.SaveScreen,"Save Screen");

 printf("286 xms limits=%d req=%d\n",pPIF->xmsLimit286, pPIF->xmsReq286);
 printf("com_ports = 0x%x - \n", pPIF->com_ports);

 PIF_String(com_ports.Com3,"COM 3");

 PIF_String(com_ports.Com4,"COM 4");
 fputs("\n", stdout);
}

void help(void)
{
 fputs("readpif -v -l -n -2 -3 -c filename(.pif)\n", stdout);
 fputs("where -v = verbose\n", stdout);
 fputs("where -l = list linked records\n", stdout);
 fputs("where -n = print NT records\n", stdout);
 fputs("where -2 = print 286 records\n", stdout);
 fputs("where -3 = print 386 records\n", stdout);
 fputs("where -c = print comment records\n", stdout);
 exit(1);
}

#include <ctype.h>
void argcheck(int argc, char **argv)
{
 register char *p;
 register int c, i;
 int gotpattern;

 if (argc <= 1)
 fputs("No arguments\n", stdout);
 if (argc == 2 && argv[1][0] == '?' && argv[1][1] == 0) {
 help();
 return;
 }
 nfile = argc-1;
 gotpattern = 0;
 for (i=1; i < argc; ++i) {
 p = argv[i];
 if (*p == '-') {
 ++p;
 while (c = *p++) {
 switch(tolower(c)) {

 case '?': help(); break;
 case 'l': ++lflag; break;
 case '2': ++flag2; break;
 case '3': ++flag3; break;
 case 'c': ++cflag; break;
 case 'n': ++nflag; break;

 case 'v': ++vflag; ++lflag; ++flag2;
 ++flag3; ++cflag; ++nflag; break;

 default:
 fputs("Unknown flag\n", stdout);
 }
 }
 argv[i] = 0;
 --nfile;
 }
 else if (!gotpattern) {
 ifilename = p;
 argv[i] = 0;
 ++gotpattern;

 --nfile;
 }
 }
}




July, 1993
PROGRAMMER'S BOOKSHELF


Graphics Gems and Fractal Compression




Ray Valdés


This "Programmer's Bookshelf" covers two graphics books--both a bit expensive,
somewhat specialized, but definitely worth your consideration.
The first is unreservedly recommended and has the most general appeal.
Graphics Gems III is part of a series initiated by Andrew Glassner of Xerox
PARC. The current volume, edited by David Kirk, follows on the heels of two
successful volumes (one of which was recommended by Michael Abrash in his July
1992 "Graphics Programming" column), with a fourth planned for later this
year.
Glassner's concept resembles DDJ's mission: to provide "a collection of tips,
techniques and algorithms for the practicing computer programmer. Many ideas
that were once passed on through personal contacts or chance conversations can
now be found here.... Many are illustrated with accompanying source code." The
books consist of short, accessible bits of software expertise that can be used
immediately in solving specific, practical problems--fast stretching of a
bitmap, joining two lines with a circular arc fillet, computing the
intersection between a triangle and a cube, and the like.
There are no tedious tutorials, no mention of hot products or fashionable
paradigms--just no-nonsense, proven chunks of software technology, embodied in
either pseudocode, C/C++, or sometimes just in narrative form. To make use of
these gems, you need a background in computer graphics, but many items are
self-contained, and most are lucidly explained. Chances are, if you know you
need a particular item, you'll have the background to understand its
exposition. If your application programs rely on an API provided by a graphics
library, then you probably won't need this book; in such a case, the coverage
here is of interest only if you want to know what's happening on the other
side of the API boundary.
Volumes subsequent to the original Graphics Gems are the result of
contributions by numerous programmers from the research, industrial, and
university communities. Volume III, for instance, consists of items from about
60 different contributors, subdivided into categories such as image
processing, numerical programming, modeling, rendering, ray tracing, and
radiosity. Space limits preclude mentioning more than a few items: quaternion
interpolation, fast random rotation matrices, interpolating Bézier curves,
decomposing linear and affine transformations, fast circle clipping,
partitioning a 3-D convex polygon with an arbitrary plane, converting Bézier
triangles into rectangular patches, ray tracing with a BSP tree, and
antialiasing polygon edges using bit-masks. (For readers interested in
contributing to future Graphics Gems, each volume includes a postcard to
request an author's packet.)
Glassner's goal of capturing the oral history of high-end computer graphics
into a written record is extremely worthy. I recall 12 years ago when I was a
member of a team designing a high-end electronic-publishing workstation from
the ground up, 1.5 million lines of code in all. Some of us had strong
backgrounds in graphics research, others had lengthy programming experience.
Even so, we had to struggle to solve many problems that fell in between
theoretical research and workaday programming; for example, efficiently
scan-converting a rotated ellipse or a Bézier curve. The standard graphics
textbooks of the day--such as Foley and van Dam, or Newman and
Sproull--provide the basic equations, but neglect critical practical aspects
(such as using recursive subdivision for Bézier scan-conversion). Because
these topics are not substantive enough to merit coverage in the research
journals, we had to invent our own solutions, or ask other working graphics
programmers. In many cases we succeeded, in others we did not. As Glassner
says in the foreword, "[With Graphics Gems] we avoid reinventing the wheel,
and by sharing this information, we help each other move towards a common goal
of amassing a body of useful techniques to be shared throughout the
community."
All the C and C++ code in the series is in the public domain, and you don't
even have to buy the books to get it; you can just ftp it from Internet sites
such as princeton.edu (in the pub/Graphics directory) or
weedeater.math.yale.edu. For those without Internet access, the current volume
comes with either a Macintosh or IBM format diskette containing code from all
three volumes.
Fractal Image Compression, by Michael Barnsley and Lyman Hurd, is of interest
to a much smaller audience, but has been eagerly awaited by that group of
people. Many of us have been reading for years how iterated function systems
(IFS) can be used for fractal compression of scanned images, resulting in
compression ratios of 2500:1 or more.
Much of the allure is that the task of generating an image from a given IFS
specification is very simple, and can be expressed in about 12 lines of C code
and 16 floating-point numbers. Because the images are fractal in nature, the
resolution is effectively infinite--the closer you look, the more detail is
there.
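The decode step really is that small. Here is a hedged sketch (my own code, not Barnsley's) of the standard "chaos game" renderer, using the widely published coefficients for Barnsley's fern--a four-map IFS, so slightly more numbers than the minimal example the text alludes to. The coarse grid-occupancy count is just a cheap way to observe that the attractor emerges:

```c
/* One affine map w(x,y) = (a*x + b*y + e, c*x + d*y + f), plus its
   selection probability p.  These coefficients are the well-known
   published set for Barnsley's fern. */
typedef struct { double a, b, c, d, e, f, p; } Map;

static const Map fern[4] = {
    { 0.00,  0.00,  0.00, 0.16, 0.0, 0.00, 0.01 },
    { 0.85,  0.04, -0.04, 0.85, 0.0, 1.60, 0.85 },
    { 0.20, -0.26,  0.23, 0.22, 0.0, 1.60, 0.07 },
    {-0.15,  0.28,  0.26, 0.24, 0.0, 0.44, 0.07 },
};

/* Small deterministic linear congruential generator so runs repeat. */
static unsigned long rng_state = 42;
static double frand(void)
{
    rng_state = rng_state * 1103515245UL + 12345UL;
    return ((rng_state >> 16) & 0x7fff) / 32768.0;
}

/* The "chaos game": start at the origin, repeatedly pick a map at
   random (weighted by p) and apply it.  Returns how many cells of a
   coarse 40x40 grid the orbit visits -- a crude indication that the
   fern-shaped attractor has been traced out. */
int chaos_game(int iterations)
{
    char grid[40][40] = {{0}};
    double x = 0.0, y = 0.0;
    int i, gx, gy, occupied = 0;

    for (i = 0; i < iterations; i++) {
        double r = frand(), nx, ny;
        const Map *m;
        if (r < 0.01)      m = &fern[0];
        else if (r < 0.86) m = &fern[1];
        else if (r < 0.93) m = &fern[2];
        else               m = &fern[3];
        nx = m->a * x + m->b * y + m->e;
        ny = m->c * x + m->d * y + m->f;
        x = nx; y = ny;
        /* The fern lies roughly in x in [-2.2, 2.7], y in [0, 10]. */
        gx = (int)((x + 2.2) / 4.9 * 39.0);
        gy = (int)(y / 10.0 * 39.0);
        if (gx >= 0 && gx < 40 && gy >= 0 && gy < 40 && !grid[gy][gx]) {
            grid[gy][gx] = 1;
            occupied++;
        }
    }
    return occupied;
}
```

Plotting the visited cells instead of merely counting them draws the familiar fern, and because the attractor is fractal, the resolution is bounded only by how finely you grid the plane and how long you iterate.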
The hard part is arriving at the small set of numbers that characterize a
given image. Or as Barnsley puts it:
If our real world image is one of many basic shapes, such as a leaf or a
letter of the alphabet, or a black-and-white silhouette of a fern, or a black
cat sitting in a field of snow, or a rook's feather on a white starched sheet,
or a black crack in a white teacup, or a snowflake lying on a frozen lump of
coal, or a circle or a square, or a Julia set, or the outline of a pine tree
or many pine trees against the skyline at dusk, or a component of an image
received, or to be sent, via a fax machine, or the graph of a complicated
function, or one of a multitude of familiar shapes and forms such that a model
is appropriately made in black-and-white alone; in such cases, one can achieve
fractal image compression via the IFS compression algorithm, which is an
interactive image modeling method based on the collage theorem. The IFS
compression algorithm starts from a target image T which lies in [the
support]. An affine transformation wi(x) = [matrix formula omitted] = Aix + ti
is introduced...."
Although the preceding reads like Finnegans Wake blended with a mathematics
textbook, most of the book is less verbose but still highly technical,
requiring a background in topology, measure theory, information science, and
image processing. This despite the fact that the authors spend the early
chapters providing formal definitions of basic concepts such as metric space,
affine transformation, Hausdorff space, and Hutchinson metric. Even so, there
aren't enough pages left to define additional terms such as Borel-measurable
function, which are used throughout the discussion.
Those who have heard Barnsley speak shouldn't be surprised. He's a brilliant
dynamo of ideas and allusions, with enough incandescent brainpower to heat a
small winter cabin, his body going through various geometric transformations
as he steps lightly from one subject to another. This pedagogical style
parallels his approach to handling images, which is to take bits and pieces of
an image and paste them in altered form at different places, repeating the
process until a sufficient level of detail is achieved. Although I always walk
away from a lecture having learned something, I'm also left wondering whether
the whole story was presented, or only a lower-resolution "lossy" version, or
perhaps it is just that I don't understand.
So how is this hard-to-believe magic achieved? This is the book in which
Barnsley promised to open the kimono and reveal the secret of the fractal
transform, now that his patent has been approved. The results are mixed.
There's a lot of interesting material here but the secret is only partially
revealed.
I don't think it's just me, either. Thomas Colthurst, who teaches an MIT class
on Advanced IFS Theory, and who has come up with his own methods for fractal
image compression, posted his reaction to this book in the comp.graphics
newsgroup on the Internet:
Rather a disappointment. It contains precious little about how Barnsley's
fractal transform method works, and what it does reveal is nothing more than a
variant of the stuff Jacquin and Fisher, Boss, and Jacobs have been doing for
a couple of years now. Sometimes it seems like the main point of the book is
to allow Barnsley to wave his patents (#5,065,447 and #4,941,193) around.
Colthurst's expectations are probably higher than those of most readers. For
me, there was enough substance in the book to merit struggling through it, but
the list price should certainly give you pause.
Another way in which the book is frustratingly incomplete is that much of the
discussion depends upon theorems such as the IFS, Condensation, Collage,
Hutchinson, Elton, and Attractor Computation theorems, which are stated
without proof. In all these cases, the reader is referred to Fractals
Everywhere (Barnsley's earlier book) for the proofs. Given the high price of
the book, and given the presence of somewhat irrelevant material like sections
on Huffman coding and Discrete Cosine Transform, the proofs should have been
provided in context.
Further confusing matters is the imminent appearance of another book by
Barnsley (this time co-authored with Louisa Anson), entitled The Fractal
Transform, but not yet available. This collage of coverage parallels the
development of IFS technology, which from this vantage point was undertaken by
a collage of overlapping companies, all generally within the domain of
Barnsley.
Moreover, in contrast to the generous collegial spirit behind Graphics Gems, I
counted about a half dozen references to Patent #5,065,447, followed by the
repeated admonition: "If you wish to set up an image compression system on a
digital computer using the fractal transform, you should apply for a license
to: Licensing Department, Iterated Systems, Inc., 5550-A Peachtree Parkway,
Norcross, GA 30092." Keep in mind that patent law restricts your use of a
particular technology even if you have independently come up with the same
algorithm or program.
So how does the fractal transform algorithm work? Ironically, I found the best
summary in Colthurst's Internet posting:
[It works] by dividing the image into domain blocks (typically an 8x8 square
of pixels), and range blocks (bigger than the domain blocks, say 16x16 pixels)
and finding local iterated function system transforms (LIFS transforms) from
range blocks to every single domain block. The LIFS transforms will typically
translate the image, scale it by a fixed factor (in our case of transforming
16x16 blocks into 8x8 blocks, it shrinks each dimension by one-half), rotate
and/or flip it (giving the eight symmetries of the square), and multiply the
color or gray scale intensities by some factor so that the average intensities
of the domain and range blocks match. The trick, of course, is to find the
best range block for every domain block, and to do this quickly. Barnsley does
not address this issue at all, and the C code which he provides merely
implements the brute force algorithm (it computes the [Hausdorff] distances
between every range block and every domain block).
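That brute-force search can itself be sketched in a few dozen lines. The code below is my own illustration of the idea, not Barnsley's: rotations, flips, and the multiplicative contrast factor are omitted, average intensities are matched with a simple additive offset, and the error metric is squared difference rather than a Hausdorff distance. Block-size terminology follows Colthurst's quote (small 8x8 "domain" blocks, larger 16x16 "range" blocks):

```c
#define W  32           /* toy image width  */
#define H  32           /* toy image height */
#define DB 8            /* domain-block size */
#define RB 16           /* range-block size  */

/* Shrink the RBxRB range block at (rx,ry) to DBxDB by 2x2 averaging. */
static void shrink(const unsigned char *img, int rx, int ry, double *out)
{
    int i, j;
    for (j = 0; j < DB; j++)
        for (i = 0; i < DB; i++)
            out[j*DB + i] = (img[(ry+2*j  )*W + rx+2*i  ] +
                             img[(ry+2*j  )*W + rx+2*i+1] +
                             img[(ry+2*j+1)*W + rx+2*i  ] +
                             img[(ry+2*j+1)*W + rx+2*i+1]) / 4.0;
}

/* For the domain block at (dx,dy), exhaustively try every range-block
   position, match average intensities, and return the squared error of
   the best match; *bx,*by receive the winning position. */
double best_match(const unsigned char *img, int dx, int dy, int *bx, int *by)
{
    double best = 1e30, dmean = 0.0;
    int rx, ry, i, j;

    for (j = 0; j < DB; j++)
        for (i = 0; i < DB; i++)
            dmean += img[(dy+j)*W + dx+i];
    dmean /= DB * DB;

    for (ry = 0; ry + RB <= H; ry++)
        for (rx = 0; rx + RB <= W; rx++) {
            double shr[DB*DB], rmean = 0.0, err = 0.0;
            shrink(img, rx, ry, shr);
            for (i = 0; i < DB*DB; i++) rmean += shr[i];
            rmean /= DB * DB;
            for (j = 0; j < DB; j++)
                for (i = 0; i < DB; i++) {
                    double d = (shr[j*DB + i] - rmean)
                             - (img[(dy+j)*W + dx+i] - dmean);
                    err += d * d;
                }
            if (err < best) { best = err; *bx = rx; *by = ry; }
        }
    return best;
}
```

With O(domain blocks x range positions) comparisons, this is exactly the combinatorial explosion Colthurst complains about; making the search fast is the unrevealed part of the trick.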
Graphics Gems III
David Kirk, editor
Academic Press, 1992
631 pp. $49.95
ISBN 0-12-409670-0
Fractal Image Compression
Michael Barnsley and Lyman Hurd
AK Peters, 1993
244 pp. $49.95
ISBN 1-56881-000-8

















July, 1993
OF INTEREST
EdScheme, a Scheme interpreter developed for educational environments, has
been released by Schemers. Supplementing the software, which is available for
PCs, Macs, and Amigas, is the companion textbook The Schemer's Guide.
The book sells for $29.95 and the EdScheme interpreter for $49.95. Reader
service no. 20.
Schemers Inc.
2136 NE 68th St., Ste. 401
Fort Lauderdale, FL 33308
305-776-7376
Media Cybernetics has introduced the HALO Imaging Library, which provides a
means of adding imaging capabilities to applications. Eventually available for
Windows, OS/2, Macintosh, and several UNIX platforms, the toolkit gives
application developers over 100 imaging functions and commands for developing
C programs that read and store image files in several file formats and process
the images in memory.
The HALO Imaging Library is divided into several functional groups covering
image management (for creating and managing images in memory), conversion (to
adjust and refine brightness, contrast, and gamma characteristics, and provide
special effects), transformation (rotation, transposing, flipping, or spatial
distortion), and file formats (to read/write TIFF, JPEG, GIF, TGA, PCX, BMP,
CUT, and PICT).
The HALO Imaging Library for Windows costs $595.00. Reader service no. 21.
Media Cybernetics Inc.
8484 Georgia Ave.
Silver Spring, MD 20910
301-495-3305
Object Graphics has released TIMS Tool Kit, a programmer's toolkit for AutoCAD
12. TIMS R12 uses AutoCAD as a graphics engine and provides a wide range of
functions, including transparent spatial relationships based on an innovative,
object-oriented technology. It also provides easy access to industry-standard
databases and file formats such as Oracle, dBase, and Lotus.
The toolkit's library includes over 200 routines and is written in C, but can
be accessed from AutoLISP. The Toolkit sells for $1000.00 and is royalty free.
Reader service no. 22.
Object Graphics
400 Stierlin Road
Mountain View, CA 94043
415-968-1500
Captain Crunch, a compression chip from Media Vision, offers real-time
compression and decompression of captured video. It delivers video in a
320x245 window at 30 frames per second at standard CD-ROM data rates with
24-bit color. Media Vision says that the chip uses only 20,000 gates compared
to 400,000 gates for similar compression silicon.
Video developed using Captain Crunch allows software-only playback on a 386 PC
or Macintosh IIci (or higher) without degradation of the picture quality. The
video chip sells for $50.00, and an add-in board (available later this year)
will cost about $300.00. Reader service no. 23.
Media Vision
3185 Laurelview Court
Fremont, CA 94538
510-770-8600
The Network Basic Library 2.0 from Automation Software Consultants provides
Basic programs with access to nearly all of the system services available from
Netware. Supported Netware services include services for accounting, bindery
management, connection services, directory and file management, locking,
messages, printing, queue management, security and rights, semaphores, and the
like.
The upgrade adds Netware 386-specific features such as support for encrypted
passwords, file trustees, and Netware 386 rights. Functions providing IPX
peer-to-peer message capabilities have also been added to Version 2.0.
The Library supports QuickBasic and is available for $295.00; upgrades from
1.0 and 1.1 are available for $75.00. Reader service no. 24.
ASCI
8188 South State Route 48
Maineville, OH 45039
513-677-0842
Wiseport Data Systems is shipping C-edit, a new text editor similar to the
UNIX vi text editor. This full-screen text editor for major UNIX platforms
provides split-windows capabilities, cut/paste/move within the same file, and
edit/merge/maintain of script files. Keystrokes are DOS-like for those who
often switch between systems. Shortcuts via custom macros can be created.
Advanced functions include "mark" cursor position for instant return, jump to
beginning/end of line/file, search forward/backward, and scroll
forward/backward within a page or by line. Scrolling between windows is
handled with the PgUp/PgDn keys.
Editing can be disabled for specified users, allowing them to read files only.
Editing is made easy by just moving the cursor and altering the text without
changing modes.
C-edit is offered for $545.00. Reader service no. 25.
Wiseport Data Systems
3900 Birch Street, Ste. 105
Newport Beach, CA 92660
714-250-2981
Now shipping from Matra MHS is an 80C51 microcontroller with an internal
running TAG/ID number. The 80C51T is an 8-bit microcontroller with a
factory-programmable 5K ROM and fully static design. The device operates from
0 MHz (DC) to a maximum clock rate of 20 MHz over the full commercial
temperature range.
The TAG/ID number is a 64-bit serial number contained in a special-functions
register area within the microcontroller. This running serial number is
personalized during manufacture and is unique for each part; no two parts can
have the same identification number. The TAG/ID contains the lot number, a
personalized number, year and laser number, day, month, and serial number.
Typical applications for this device are in vehicle-identification systems,
smart credit cards, electronic keys, electronic banking products, and other
products that require tracking, tracing, and accountability.
In OEM quantities (10,000 and up), the part costs $9.50 for the P80C52T in
40-pin plastic DIP, 12-MHz operation. Reader service no. 26.
Matra MHS
2201 Laurelwood Road, MS 53
Santa Clara, CA 95056-0951
408-748-9362
Storc Gold 1.0, a form-conversion tool from PractiSys, provides a bridge
between Visual Basic and the Windows SDK. The tool converts a Visual Basic
form into a Windows Resource Script, to allow programmers to quickly turn a
Visual Basic demo into a fully functional product developed using C/C++/Turbo
Pascal and the Windows SDK. Storc can also be used to convert any other form
displayed by Windows. The conversion tool is available for $39.93 per copy
(single user). Reader service no. 27.
PractiSys
4767 Via Bensa
Agoura, CA 91301
818-706-8877
A set of Video Decompression Source Kits that support JPEG still images, MPEG
full-motion video, and Px64-based video teleconferencing has been released by
Performance Computing. The platform-independent VDS kits (written in ANSI C)
provide full ANSI compliance for file-to-file conversion and are adaptable to
just about any existing application or tool. The complete package includes a
video-stream editor, test images, documentation, and a source-code option.
The company is offering five kits: MPEG Encoding, MPEG Decoding, MPEG Audio,
JPEG Encoding/Decoding, and Px64 Encoding/Decoding. The individual kits, which
support four API calls (create data stream, create frame, decode, and store
frame into a format), come with a video bitstream editor and encoded
test-image video streams. The VDS kits also sport a unique and efficient
video-bitstream error-detection/recovery scheme that essentially copies the
prior "good" frame in place of the "bad" frame.
The kits are not optimized for specific processors. However, Performance's
Dave Glass told us that the VDSs have run without modification on DOS/Windows
PCs, Macs, Sun UNIX systems, IBM RS/6000s and Cray Superserver S-MPs.
Object-code versions for single-node use sell for $750.00/kit. Contact the
company for OEM, source code, and other licensing information. Reader service
no. 28.

Performance Computing
1815 SW Marlow Ave.
Portland, OR 97225
503-297-2292
CD Net 4.4 for Netware, an NLM which supports Windows and DOS clients, has
been announced by Meridian Data. CD Net extends Netware by providing services
and additional functionality required in multiserver environments unique to
the CD-ROM storage medium and multimedia applications.
The software allows CD-ROMs to be network configured, either as DOS drives or
as native Netware volumes, requiring no redirector TSR application in the
client workstation. When CD-ROMs are configured as DOS drives, the CD Net
redirector minimizes client memory usage, while maintaining compatibility with
Microsoft CD-ROM Extensions (MSCDEX).
CD Net for Netware supports ISA, EISA, and MCA I/O bus architectures,
accessing up to 28 multisession CD-ROM drives (16.8 gigabytes) per file server
through the ASPI SCSI interface. The package includes dynamic load balancing
and intelligent caching.
CD Net for Netware server software requires Netware 3.11 or Novell Netware
4.0, or later. CD Net requires Microsoft Windows 3.1, MS-DOS 5.0, or PC-DOS
5.0 or later.
The software starts at $645.00 for the server with a ten-user license. The
software can support up to 250 concurrent users, and is configurable at the
time of initial purchase or through subsequent license upgrades. Reader
service no. 29.
Meridian Data
5615 Scotts Valley Drive
Scotts Valley, CA 95066
408-438-3100
Silicon Studio, a new professional video and film postproduction system for
the IRIS Indigo platform, has been announced by Silicon Graphics. Central to
the system is a video server called the Challenge--a symmetric multiprocessing
server that supports up to 36 CPUs, 16 gigabytes of main memory, and multiple
concurrent streams of real-time video data. The server stores up to 30 hours
of uncompressed online video.
Other components of the Challenge-based system are the Galileo Video adapter,
the Cosmo Compress JPEG compression system, and Sirius Video digital video
board. The Galileo adapter delivers real-time, frame-by-frame input and output
of analog video as well as S-video and composite formats. Cosmo Compress
enables real-time video capture and playback of full-frame-rate,
full-resolution digital movies. The Sirius board blends real-time digital
video processing, computer-generated graphics, 3-D geometry, and image
processing into a single system. Reader service no. 30.
Silicon Graphics
2011 N. Shoreline Blvd.
Mountain View, CA 94043-1389
415-390-3365
TCS Limited has released the Security Expert's Assistant (SEA), a set of
software utilities for developers who use DES cryptography for financial or
non-financial applications.
SEA brings all the standard and many common proprietary, cryptographic
algorithms together into a single package for DOS PCs so that programmers can
create and use test keys, calculate PINs and PIN blocks, and generate and
verify MACs. All ANSI and VISA standards are supported. SEA also provides a
cryptographic scripting language which allows developers to create custom
algorithms and procedures. Reader service no. 31.
TCS (Canada) Ltd.
Suite 202, Oakville Corporate Centre
700 Dorval Drive
Oakville, Ontario
Canada L6K 3V3
A library of digital photographic textures, called Pixar One Twenty Eight, has
been released by Pixar. The 128 images, which typically would be used as
backgrounds or fills for graphic applications, include a variety of textures
for bricks, fabrics, fences, floors, ground covers, metals, roofs, sidings,
animal skins, stones, walls, wood, and so on. The library can be used by any
application that reads the TIFF file format.
The library, delivered on ISO 9660 CD-ROM, includes both 512x512x24-bit and
128x128x8-bit versions of the images. Each image was photographed from
original material, then processed in a custom texture lab. Plug-ins are
available for both Photoshop on the Macintosh and Photoshop and PhotoStyler on
Windows.
The Pixar One Twenty Eight library sells for $299.00. Reader service no. 32.
Pixar
1001 West Cutting
Richmond, CA 94804
510-236-4000
In a recently released report on C and C++, Lucid claims that a majority of
the 63 organizations surveyed (12 of which were Lucid customers) were either
currently programming in C++ or planning to move to C++ in the near future.
According to Lucid, the study also indicated that "74 percent of the market
has more than 100,000 lines of code with an overall average of 450,000 lines
of code." (It's worth noting that Lucid is a C/C++ vendor.)
For a copy of the report, contact Lucid or request it from dearmon@lucid.com.
Reader service no. 33.
Lucid
707 Laurel Street
Menlo Park, CA 94025
415-329-8400























July, 1993
SWAINE'S FLAMES


Channel Mine


...yeah, well I hear that when the CPU goes into sleep mode the mouse is still
drawing power and that's the whole battery-life problem right there. I
understand they've got a software fix for it that puts the mouse to sleep
whenever the CPU is in sleep mode. They call it "mouse mickey."
What?
Is this thing on?

{:-0

Hello out there and welcome to "Swaine's Flames," the Equal Opportunity
Monthly Visitor; You Don't Want to Miss It. Our theme: Is Artificial
Intelligence Dead Yet?
Sounds dull, you say? Lame and uninteresting?
Well, yes.
It is, actually.
The fact is, the competition for good themes has been getting pretty heavy
since software development became the In Thing. Geraldo's got multithreaded
apps all tied up, and Bill Gates refuses to return my calls since his
unfortunate experience on the Howard Stern Interview. CNN is changing its
format from news to media news. Multimedia, unimedia, it wants to be the media
medium. The latest Larry King Live was on tape. You probably caught Ted
Koppel's in-depth look at algorithms, Rush Limbaugh's expose of groupware, or
Vladimir Posner putting his Russian spin on RPN. Donahue has jumped on softer
software, Maury Povich has got bugs, and Oprah is all over OOPS.
You get the idea.
What's that?
No, Jon, I don't think I'll tell them the one about David Letterman and late
binding. We'll just save that one, okay? Maybe you can use it at Software
Development '94. I don't think I could do it justice.
We're... back. Still here, actually. Didn't go anywhere. And I don't think I can
put it off any longer. Next up: Is Artificial Intelligence Dead Yet?
The death of artificial intelligence has been predicted for as long as I've
been paying attention, but I've been a skeptic. I always thought that AI would
live forever. Now I'm not so sure.
The logic behind my view ran like this: AI will never die because it tackles
ill-specified problems that can never be conclusively solved, like speech and
handwriting recognition, reasoning, learning, and judgment. Since its problems
will never be solved, its grants will never expire and there will always be
pickings for those who toil in the AI field. Or, AI will never die because its
label keeps getting peeled off and affixed to new goods when the old ones
spoil.
But I look around me, and I see expert systems and fuzzy logic and neural nets
solving real problems. Medical diagnosis, microwave cooking, orange-juice
quality control. But the fuzzians and neural netniks and even the expertians
seem less willing than past practitioners of the heuristic arts to
characterize their magic as artificial intelligence. Even in academia the
label seems to be losing its luster.
Is AI dead? I'm a fan, but if the current trend away from using the term
continues for another year or two, I'd say AI is kaput. Muerte. Finito.
And there you have it. Dead? Alive? Or in some sloppy limbo that is neither
life nor death? You decide. Until next month, for "Swaine's Flames" and
Channel Mine this is Mike Swaine saying, Good Night and Good Bytes.

;-]

I can't believe I said "sloppy limbo." You know Rush will count that as an
additional mention.
Michael Swaine, editor-at-large





























August, 1993
EDITORIAL


PCs and the 3Rs


Sitting here sweltering in the dog days of summer, no one really wants to be
thinking about reading, 'riting, and 'rithmetic. Still, it won't be long
before school bells are ringing, locker doors slamming, and, if the PC
industry has its way, disk drives whirring.
Education is fad happy (although cynics might contend "fad plagued"). From
whole-language strategies to school vouchers, another new liniment for
educational aches and pains is around the corner. Computers in the classroom
is one such teacher's pet. Computers, say the experts, will make teachers more
effective, students more attentive, and learning more meaningful.
The problems that fads are supposed to solve range from high truancy rates to
low test scores. According to the results of the International Assessment of
Education Progress exams given to students in 15 nations, U.S. students ranked
7th in science and 14th in math. At the same time, U.S. SAT scores show little
improvement, with the verbal scores dipping from 426 to 423 while math scores
improved slightly from 467 to 476. PCs, say the PC industry, will improve
this.
Schools have long been among the crown jewels for computer companies. The
software vendor who cracks the halls of academia with a killer app will move
to the front of the class, finance-wise. Book publishers have known this for
years--that's why they wine-and-dine textbook-selection committees,
particularly in California and Texas, whose leads other states follow.
Education is big business, and technology vendors--from PC makers like Apple
and IBM to Chris Whittle's Channel One in-school TV commercials--want a piece
of it. The U.S. alone spends more than $215 billion a year to educate over 46
million students in pre-university schools. Even a small piece of a
multibillion dollar market is worth going for, and marketing efforts are on
the upswing. (At a recent education technology conference, one investment
banker was quoted as saying that "educational multimedia will be the
investment opportunity of the '90s.")
PC companies make their pitch to schools in a variety of ways, including
studies like the Software Publishers Association's "Report on the
Effectiveness of Technology in Schools 1990--1992" conducted by Interactive
Educational Systems Design, an educational-technology consulting firm. It
comes as no surprise that the report, ultimately funded by software companies,
concludes that technology has a positive effect on student achievement,
self-concept, and attitudes.
However, the U.S. arguably is ahead of other countries in implementing
technology in schools (about half the states are on the road to requiring
computers in the classroom), even while SAT scores are down from 20 years ago
with no improvement in reading and writing skills. None of the other countries
in the International Assessment study rely on computers to the extent the U.S.
does.


So how do we go about fixing education?


Lewis Perelman, author of School's Out: Hyperlearning, the New Technology, and
the End of Education, says, "Education cannot be reformed, it needs to be
replaced." He adds that "schools don't need to be housed in buildings at
all--most can be accessed through a portable, personal telecomputer terminal,"
explaining that through telelearning "schools will be transformed from a
centralized architectural and bureaucratic structure to a dispersed
information and service channel."
Less extreme are research projects like Vivarium, which has been exploring
technology-based solutions for several years. Vivarium, sponsored by Alan Kay
and Apple Computer, studies computers as "amplifiers for learning" at the Open
School in Los Angeles. According to Kay, Vivarium seeks "to better understand
the value computers might have as supporting media."
Ultimately, however, many arguments for technology-based solutions to
education woes don't wash. Hardware and software are often too expensive and
ill-designed, and teachers too busy and inadequately trained to use them
effectively. Technology can help in many situations, but it's not the sole
answer, as Perelman and others might profess.
PCs do have a place in schools but we need to look at where computers have
proven successful, then model education use accordingly. In business, for
instance, PCs have clearly made a difference in record keeping and
communication. Spreadsheets and databases for calculating and recording grades
and maintaining student records can free up teacher time, enabling more
teacher-student interaction. Networking PCs with administrative offices would
make even more time available. Adding communication capabilities so that
teachers can communicate with parents is another step. Simulation, which has
proven successful in business and engineering, lends itself to student
exploration in science and math. (The Vivarium project has had great success
with ecological simulations.)
Technology is not the solution to education's problems. Instead, technology
is, as Alan Kay says, "supporting media" that makes it possible for teachers
to more closely interact with students and parents and more closely focus on
individual needs--elements that even Perelman admits are critical.
Jonathan Erickson, editor-in-chief



































August, 1993
LETTERS


Windows Instance Return




Dear DDJ,


Just about the most common programming task required for each application I
write for Windows is located at the beginning of WinMain, namely, code to
ensure that, at most, one instance of the application can exist. All this code
has to do is return to the previous instance, if any. Sounds simple, right?
Well, there's no Windows function to do it! Not only that, but the existing
published examples I've seen fail in a number of simple situations. See
Example 1 for some generalized code that you can plug right into your
applications to accomplish this common task. This code has been tested in
several applications.
David Spector
Waltham, Massachusetts


Copyright Questions




Dear DDJ,


Having owned three IBM-compatible computers since 1982 and having used several
such computers at work, I'm concerned about the issues raised in your January
"Editorial." In this ten-year period, I've probably spent more dollars on
software than on hardware. I have also upgraded software numerous times and
changed to new products when old products failed to stay competitive or when
their update costs were out of line with those of their competitors. I also
have made a conscientious effort to regularly back up my systems at home and at
work using both floppy- and tape-based techniques. Combining this experience
with your editorial has prompted several concerns. Therefore, I would be
interested in an article addressing the following:
Ignorance of the law. Publishing a copy of the law and an interpretation of it
in plain English would be most helpful. Along with this interpretation, the
impact of the law on some typical examples would also be useful. Only with
this type of assistance can the user community understand how the law relates
to their use of computer software.
Credibility of vendor records. During this ten-year period I have made an
effort to register all of my software. Unfortunately, when I moved recently, I
found out that many vendors do not make a very strong effort to track their
users. In many cases their records were incomplete or non-existent, or they
had mixed up registrations for work and personal copies. In one case, the
vendor had no way to track copies owned at both work and home by the same
individual. Their response in many cases was to ask me to send in the
registration card (which had been sent to them years before). Therefore, how can the user
community be held responsible for proof of license when vendors are not
diligent in their record keeping?
Number of copies made. Your article states that if more than two copies are
made for personal use, penalties can be imposed. Does this cover copies
contained in system backups? If so, probably every user of backup software
could be sent to jail. Most long-term users have learned that incremental
backups are fine over the short term, but that full backups are required
periodically if system restoration is to be feasible in any sort of reasonable
manner. In ten years I have probably accumulated numerous copies of key
software on my floppy and tape backups.
The other aspect of this question is, what about previous versions of the
software--do they count as additional copies? In many cases as software has
changed over the years the new versions lack direct compatibility with data
created using previous versions. Therefore, to provide reliable access to this
data, all versions of the software must be retained.
If the software license allows you to make more than one copy, do the
numbers stated in the law still apply?
Purpose of backup copies. We back up software to ensure that if something
happens to the original diskettes, we can still use the software. What is our
position if the originals are lost or destroyed? Having a legitimate copy of
software without the original disks may place the user in a difficult
situation. Also, in such a situation, can the user then legally make an
additional copy? Further, does the working copy of the software on the hard
disk count as one of the two backup copies?
Burden of proof. Your editorial implies that the powers that be must first
establish reasonable proof that you have violated the law to enter your office
or house. Could they use the fact that you own an option board or Fastback as
an indication of generic guilt and search for any duplication, or must they
specifically state that they are looking for duplicate copies of WordPerfect?
If they must be specific, what happens if they find other copied software
during their search? If the first case is true, then maybe the industry should
track the sale of all companies selling backup software more closely.
Out-of-print software. Several of the programs I have used over the last ten
years have ceased to be available. Can I make copies of these programs to
allow others to access the old data created with them?
Define availability for use. If you have a computer at home and one at work,
can you use one copy of a software package in both locations under any of the
following conditions?
1. You delete and reinstall the package with each use.
2. You delete and reinstall only the main .exe file with each use.
3. Your computers are password protected or locked and only you have the means
of access.
4. Your computers are used only by you at both locations, but there is no
physical or software control that prevents others from using them. For
instance, no one at home knows how to use the computer and your office is
locked when you are not at work and the computer is assigned for your use
only.
Existing copyright procedures. How does the new law relate to the ability of
individuals to record video and audio works for personal use?
I hope that you find these issues useful in the development of an article
addressing the critical issues. As for me, I wonder why a nation with an
increasingly permissive society supports the Gestapo-like tactics of the SPA.
Because of my lack of understanding regarding the full implementation of the
new and/or existing laws and the SPA's use of fear to motivate software users, I
must request that you not print my name in your magazine.


OS Resource Management: Another Country Heard From




Dear DDJ,


I take issue with Andrew D. Todd's assertion that it is the role of the
operating-system developer (OSD) to manage computer resources ("Letters," DDJ,
February 1993). It is the role of the OSD to provide computer resources.
Managing those resources should be the role of the application-program
developer (APD), which implies that the APD be given the opportunity to manage
them. When an OSD provides resources that it uses routinely in its own
software products, but refuses to make available to the programming community
at large, that is management of computer resources of a particularly invidious
sort.
Furthermore, the whole point of Undocumented DOS (and, I assume, Undocumented
Windows) is not that Microsoft should put more into DOS in order to reduce the
workloads of APDs, but that DOS already has more in it than Microsoft is
willing to divulge. The work of the authors of the Undocumented books, and the
thousands of others who mine for hidden nuggets in the operating systems we've
bought into, is based on the premise that everyone who owns a stake in the
claim should start digging with the same-sized shovel, and have an equal
chance at striking a rich vein.
Microsoft's claim that it doesn't document some features because it would then
be committed to supporting them in the future rings a bit hollow. Many of the
undocumented features exist in the first place because Microsoft needed to
circumvent DOS's deficiencies in its own products. Refusing to document those
workarounds subjects everyone else's products to the same nightmare of
incompatibility, besides making it impossible to do some of the things that
users have come to expect of their software. The fait accompli
with which Microsoft has been presented is entirely of its own making.
Is Microsoft, having been forced to document and maintain some of the features
that developers (including myself) found indispensable, better off than it
would be if it had presented those features in an orderly manner in the first
place? Would Microsoft enjoy its present status among OS vendors, and would
Bill Gates be one of the wealthiest men in America, if some of the first APDs
hadn't found unmapped routes around the roadblocks that DOS erected and, as a
consequence, made their products so useful that they sold in the hundreds of
thousands? When we buy an operating system, do we buy the whole thing or just
those parts of it that the OSD is sure won't allow us to create products that
could conceivably compete with its own? If I don't have the right to use
everything the OSD put into the product, where is my discount from the
purchase price? Would you buy a car, Mr. Todd, if the manufacturer hid the
fuel filter and then refused to tell you where it was?
Mr. Todd's proposal for dealing with undocumented features, if followed to its
logical conclusion, presents a chilling prospect. In fact, part of it is
already in place. A system of [leaks] "to the APDs of known probity" is not
exactly a new idea, though it isn't always clear just how an APD's probity is
assessed. The idea that "APDs may commission new features in the operating
system," of which "interested parties" would then be notified, and for which
they would presumably be billed, is a prescription for turning the dull
headache of version incompatibilities into an epidemic of unimaginable
virulence.
Finally, I would say to Mr. Todd that if he has ever written a DOS TSR that
made more than trivial use of OS resources without resorting to undocumented
features, he should write a book. It is sure to find a ready market among
serious programmers, who aren't looking for thrills when they use undocumented
features, but for ways to accomplish what would otherwise be impossible.
Richard Zigler

Marion, Michigan
Example 1
#include <windows.h>

/* find_prev_wnd(): Find the main window of the previous
 * instance by searching all top-level windows
 */
HWND prev_wnd;
FARPROC find_prev_wndi;

BOOL CALLBACK _export find_prev_wnd(HWND wnd, LPARAM lParam)
{
 /* Look for a top-level window belonging to the previous instance */
 if (!GetParent(wnd)
 && (HINSTANCE)GetWindowWord(wnd, GWW_HINSTANCE) == (HINSTANCE)lParam)
 {
 prev_wnd = wnd;
 return FALSE; /* Stop enumerating */
 }
 return TRUE; /* Continue enumerating */
}

/* Main application function */
int PASCAL
WinMain(HINSTANCE hInst, HINSTANCE hPrevInst, LPSTR pCmdLine, int Show)
{
 /* If there is a previous task instance, return to it */
 if (hPrevInst)
 {
 find_prev_wndi =
 MakeProcInstance((FARPROC)find_prev_wnd, hInst);
 EnumWindows((WNDENUMPROC)find_prev_wndi, (LPARAM)hPrevInst);
 FreeProcInstance(find_prev_wndi);
 if (!prev_wnd)
 return 0; /* Window not found; fail silently */
 /* Return to the last active popup window, if any,
 * of the previous instance
 */
 prev_wnd = GetLastActivePopup(prev_wnd);
 /* If the existing window is minimized, restore it */
 ShowWindow(prev_wnd, SW_SHOWNORMAL);
 SetActiveWindow(prev_wnd);
 return 0;
 }
 . . .
} /* end WinMain() */

















August, 1993
The History of Programming Languages


HOPL features top language designers




 K.N. King


K.N. King is an associate professor of mathematics and computer science at
Georgia State University. He is the author of Modula-2: A Complete Guide (D.C.
Heath) and is currently at work on C and C++ books for W.W. Norton. He can be
reached at king@prism.gatech.edu.


It's official. C and C++ have come of age, or at least grown old enough to be
featured at the second History of Programming Languages conference (HOPL-II)
held in Cambridge, Massachusetts this past spring. HOPL-II, which provided
programming-language designers a forum for discussing their languages, staking
their claim to immortality, and (occasionally) taking a swipe at competing
languages, brought together the likes of Dennis Ritchie, Bjarne Stroustrup,
Niklaus Wirth, and Alan Kay under one roof--a rare occasion indeed.
In particular, I was curious to see how the designers of C and C++ would be
received by the proponents of languages such as Ada and Pascal. Despite their
enormous popularity (or perhaps because of it), there are still academics who
view C and C++ with suspicion. Would Wirth exchange pleasantries with Ritchie
or engage him in hand-to-hand combat? I couldn't wait to find out.


Preserving History


The computing field hasn't always done a good job of preserving its own
history. As time passes, our pioneers pass away and valuable artifacts are
lost with them. For instance, according to HOPL-II chair J.A.N. Lee, the first
Fortran compiler now exists only in binary form as boxes of punched cards; the
source code has been lost or destroyed. To help remedy this situation, the
Special Interest Group on Programming Languages (SIGPLAN) of the Association
for Computing Machinery (ACM) sponsored the first HOPL conference in 1978.
This conference, which covered languages in use by 1967, included speakers
such as John Backus (Fortran), Alan Perlis and Peter Naur (Algol 60), John
McCarthy (Lisp), Jean Sammet (Cobol), Kristen Nygaard (Simula), Thomas Kurtz
(Basic), George Radin (PL/I), and Ken Iverson (APL).
This year's HOPL-II picked up where HOPL left off. For a language to be
eligible for HOPL-II, the program committee required that "preliminary ideas
about the language were documented by 1982 and the language was in use or
being taught by 1985." Thus, HOPL-II was able to cover languages developed in
the '70s as well as upstarts like C++.


The Worst Programming Language Ever


In his keynote address, "Language Design as Design," Fred Brooks (project
manager for the IBM System/360 and author of The Mythical Man-Month) gave a
software designer's view of programming-language design.
Brooks acknowledged that his credentials in the programming-language arena are
shaky. He called OS/360 JCL, which was developed under his supervision, the
"worst programming language ever designed." He also ruefully admitted being
"the person who tried to displace Fortran with PL/I."
In his view, the data types and data structures a language provides must come
from its intended application area; the operations are then determined by the
data types. Everything else, according to Brooks, is "languagehood."
Brooks also discussed "rationalism versus empiricism" in language design. As
he sees it, rationalism (that is, designing a language by "pure thought") is
doomed to fail, since we have "no hope of getting our complex designs right
the first time."


Languages, Languages, Everywhere


Most of the other presentations were by people involved in the original design
of various languages: C.H. Lindsey (Algol 68), Niklaus Wirth (Pascal), Per
Brinch Hansen (Concurrent Pascal), Alain Colmerauer (Prolog), Jean Sammet
(FORMAC), Barbara Liskov (CLU), Alan Kay (Smalltalk), Ralph Griswold (Icon),
Dennis Ritchie (C), and Bjarne Stroustrup (C++).
Several talks didn't fit this pattern, however. William Whitaker, who oversaw
the development of Ada for the Department of Defense, discussed the management
of the Ada project, not the design of the language itself. Guy Steele and
Richard Gabriel traced the evolution of Lisp, while Richard Nance gave a
history of discrete-event simulation languages. The presentation on Forth was
made by Elizabeth Rather instead of Chuck Moore, the actual designer of the
language.
A shorter talk, by someone involved with the language yet with a different
viewpoint, immediately followed each main presentation. Kay was followed by
Adele Goldberg, for example, while Stu Feldman, an early user of C and C++ and
author of the UNIX f77 compiler and make utility, followed Ritchie and
Stroustrup.
Steele and Gabriel employed a "tag-team" approach in their presentations on
Lisp. Using two overhead projectors, the pair alternately traced the
development of Lisp on an amazingly detailed schematic. Still, honors for the
most unusual presentation go to Alan Kay. Although his paper in the conference
proceedings discussed the history of Smalltalk, his talk had little to do with
the language. Instead, Kay focused on issues that he finds important,
including the importance of general education instead of specialized training
and the need for creative freedom. His overheads were unique, to say the
least, featuring pithy sayings ("You can make a doghouse out of anything"--in
other words, it's easy to write small programs), a picture of an E. coli
bacterium, and a copy of The Federalist Papers (labeled "Best Book on Complex
Systems Design?").
One potential speaker was conspicuous by his absence from the podium. Jean
Ichbiah, the chief designer of Ada, had been invited to write a paper on the
history of Ada, but declined, citing lack of time. Nonetheless, he attended
and participated in question-and-answer sessions.


Second-guessing


Language designers spent a lot of time discussing the reasons for their
decisions--and even indulged in a bit of second-guessing. Ritchie's paper on C
identified its major problems, including "the failure of the original language
to include argument types in the type signature of a function." He called ANSI
C "a noticeably better language" and acknowledged in later questioning that he
uses ANSI C himself.
Stroustrup was fairly happy with the decisions made in the design of C++. He
did, however, confess to making one major mistake: not providing a basic class
library from the beginning. As he put it, "Release 1.0...should have been
delayed until a larger library including some simple classes such as singly-
and doubly-linked lists, an associative array class, a range checked array
class, and a simple string class could have been included."
Wirth, however, refused to second-guess himself: "It is... fruitless to
question and debate early design decisions; better solutions are often quite
obvious in hindsight. Perhaps the most important point was that someone did
make decisions, in spite of uncertainties."


Secrets of Success


Designers also spent time analyzing the reasons for their success. Ritchie was
modest about C's success, even somewhat embarrassed by his own celebrity.
(When asked if there were anything he'd do differently, Ritchie replied,
perhaps only half in jest, "I'd become a monk!") He summed up the reasons for
C's success on two overheads, titled "How to Succeed in Language Design
without Really Trying." The first noted C's widespread availability, its
ability to interact with its environment, and its adaptability to unexpected
situations. But the second gave the real secret: "Be lucky."
For his part, Stroustrup noted "the need for a programming language and the
code written in it to be just a cog in a much larger machine" as a key factor
in the success of C++, while Wirth said that the most important reason for
Pascal's success was that "many people capable of recognizing its potential
actively engaged themselves in its promotion."



Exploding Myths


Speakers often cleared up common misconceptions about their languages.
Ritchie, for instance, pointed to the widespread belief that C's ++ and --
operators were added to take advantage of the PDP-11's autoincrement and
autodecrement addressing modes. In fact, these operators were present in B,
the immediate predecessor of C, which was designed before the PDP-11 existed.
Stroustrup said that he chose C as the basis for C++ because it was "the best
systems programming language available," not because he worked at Bell Labs,
as people have often assumed. He also attacked the belief that C++ is
successful because of AT&T's marketing clout. "We once had a marketing budget;
it was $3,000. It lasted for three years."


Oddities


HOPL-II was a goldmine for trivia buffs. What do Scheme and Forth have in
common? (Both languages originally had longer names that were shortened to
satisfy operating-system restrictions. Scheme was originally named Schemer,
after the AI languages Planner and Conniver, while Forth was supposed to be
Fourth, as in "fourth-generation.")
Which language was the first not to have goto statements? (Concurrent Pascal,
according to Brinch Hansen, who added "What are you supposed to do with goto's
in a concurrent programming language? Where are you going?")
What was the origin of the // comment convention in C++? (Far from being an
innovation, it was lifted directly from BCPL, a predecessor of C.)
What did the Department of Defense do before officially choosing Ada as the
name of their new language? (According to Whitaker, they contacted the heirs
of Ada Lovelace for permission to use her name. He jokingly wondered whether
Wirth had gotten similar permission from Pascal's descendants.)
How much code is written for the DoD each day? (Two million lines!)


Warfare in the Hallways?


So how did these luminaries get along? For the most part, just fine. During
coffee breaks, Ritchie talked to Wirth while Stroustrup chatted with Ichbiah.
So much for my visions of warfare in the hallways. Similar respect was evident
during the talks, with speakers generally avoiding direct attacks on competing
languages. There were exceptions, however.
In his talk on Concurrent Pascal, Brinch Hansen referred to Ada as "large" and
"incomprehensible" and C as "small" but "insecure." Later, he again criticized
Ada and C, going so far as to accuse them of not being programming languages
at all.
Wirth likewise attacked Ada, saying it lacked "an economy of design without
which definitions become cumbersome and implementations monstrous." He
refrained from criticizing C, however, until the closing panel, where he said
that "hacking is in" and claimed that "most programmers enjoy working by trial
and error." Looking at Ritchie, who was sitting next to him, Wirth continued:
"The most important promoter of this trend: C." He said that languages such as
C are useful for bootstrapping software onto a new machine, but their use
should be only "temporary."
Ritchie, taking these attacks graciously, noted that Wirth's points were
"well-taken" and acknowledged that "it is possible to use C in a better way
than people do." But he also said that "one sometimes has to make compromises"
in the real world.
In his earlier talk on C, Ritchie actually said good things about Pascal, even
admitting that Pascal is "elegant." He listed many similarities between C and
Pascal, which were developed at about the same time but without contact
between the designers. He observed that the languages even share some of the
same problems, such as handling arrays with varying bounds.
Even Stroustrup couldn't resist a few shots at C. Of all the languages he used
in the '70s, Algol 68, not C, he said, was his favorite, noting in particular
his dislike for C's syntax and the loopholes in its static type checking.


Improving SEX


An excursion to Boston's Computer Museum one evening featured the opening of
what was billed as the "first-ever museum exhibit on programming languages."
The museum had asked conference attendees to bring buttons, T-shirts, and
other pieces of "programming language ephemera" which would later be added to
the exhibit's "Tower of Babel," a tall structure listing hundreds of
programming languages.
At a banquet the next evening, Cobol pioneer Jean Sammet told of being
upbraided by a visiting IBM executive, who complained that members of her
group were discussing sex on company time. It turned out that the visitor had
overheard the staff referring to a subroutine named FMCSEX by the last three
letters of its name, which stood for "symbolic expression." At the time of the
visit, unfortunately, the group had been discussing ways to improve SEX and
make it faster.


Survival of the Fittest


While history was clearly the focus of the conference, the future was not
ignored.
One important issue, of course, is the survival of programming languages as we
know them. William Whitaker rhetorically asked, "Will there be languages such
as Ada, Fortran, and Pascal in 15 years?" He answered his own question with a
resounding "yes," admitting, however, that the "growth of new languages will
probably slow down" and "evolution will produce dominant languages" that will
"force out" weaker ones.
Other speakers concurred. Early in the conference, for instance, Sammet noted
that, of the approximately 1000 languages implemented up to 1993, 700 are
dead. Even well-known languages are not immune, as evidenced by Steele's
comment that "Lisp has been on the decline for 3--5 years."
Not that every language designer would be upset if his or her language
disappeared. Alan Kay admitted that he "wouldn't shed a tear" if Smalltalk
disappeared tomorrow. Languages shouldn't hinder progress by outliving their
usefulness, he said.
What will drive the development of future languages? Opinions varied, but
several speakers singled out the need for better parallel computing support.
In his paper on Concurrent Pascal, Brinch Hansen stated, "I don't think we
have found the right programming concepts for parallel computers yet. When we
do, they will almost certainly be very different from anything we know today."
Both Ritchie and Kay pointed out the growing number of people using computers,
and Kay also mentioned the influence of "pervasive, worldwide networking."
Both trends could have a long-term effect on programming languages.
What should students learn as a first programming language? Many colleges are
beginning to teach C as a first language. Ritchie didn't endorse this trend.
Any approach that tends to produce dependence on a particular language is bad,
he said, suggesting that Scheme might be a good choice. Kay refused to pick a
particular language. Wirth asked "Are you teaching a skill or [providing]
general education?" In the former case, he recommended Ada; in the latter, he
advised using "a simpler language"--but not C. "I view the landslide of C use
in education as rather a calamity," he said.


Another HOPL?


Will there be a HOPL-III? Probably. When will it be held? Maybe 5 years from
now, maybe 15.
Preprints of the HOPL-II papers appear in the ACM SIGPLAN Notices, March 1993.
A more complete record of the conference is to be published in 1994 as the
book, History of Programming Languages-II. (The original History of
Programming Languages, based on the 1978 HOPL conference, is still in print.)
Reading these histories is the next best thing to a long chat with Dennis or
Bjarne.


References



Second ACM SIGPLAN History of Programming Languages Conference Preprints.
Published as ACM SIGPLAN Notices (March, 1993).
Wexelblat, Richard L., ed. History of Programming Languages. New York:
Academic Press, 1981.




























































August, 1993
C/C++ Standardization: An Update


Is the future of C spelled "C++"?




Rex Jaeschke


Rex is a member of X3J11, a U.S. International Representative to ISO C (WG14),
and convener of X3J11.1, the Numerical C Extensions Group. His most recent
books are The Dictionary of Standard C and C++: An Introduction for
Experienced C Programmers (CBM Books). You can reach Rex at rex@
aussie.com or 703-860-0091.


If the number of recently published books and magazine articles is any gauge,
you'd think the future of C is spelled "C++." To paraphrase Mark Twain,
however, reports of C's death are greatly exaggerated. Books and articles
aside, C remains a proven, reliable, precisely defined workhorse, still used
by many more programmers than C++. Furthermore, use of C continues to grow,
especially in internationalization applications, where C is the language of
choice.
Since my article "Standard C: A Status Report" (DDJ, August 1991), C has been
buffeted on several fronts, including that of internationalization and, most
notably, the surge of C++. In this article, I'll look at the current status of
Standard C and Draft Standard C++. I'll also examine factors that could affect
the future of these languages and discuss what is and is not technically
and/or politically feasible.


C and C++: Current States and Standards


The ANSI C standard (X3.159-1989) was ratified in December 1989, although
technically it was completed a year earlier. An ISO C standard followed,
but--except for a number of minor editorial and formatting differences--it was
equivalent to that accepted by ANSI. In 1992 the ANSI standard was officially
withdrawn, as control of the C standard passed from ANSI to ISO, so that we
now have a single standard. The upshot is that C is truly an internationally
managed language.
In December 1992, the ISO C committee distributed for balloting an addendum
that added several headers and a large group of associated macros, typedefs,
and functions--all of which enhance the support for multibyte characters. Some
digraphs also were added to overcome readability problems with terminals using
the ISO 646 character set. This addendum will likely be approved by the end of
1993.
The ANSI and ISO C committees are now in interpretations mode. Any new
development comes through the ISO committee although national committees (such
as that from the U.S.) could be authorized to do technical work on ISO's
behalf.
There are a number of issues pertaining to internationalization still pending
at the ISO level, among them the use of national characters in identifiers.
Other issues will surely arise, particularly as we gain experience with
existing components of the standard. (It's significant to note that
Microsoft's Windows NT uses a 16-bit-based character set, not ASCII. With its
extensive multibyte support, C will grow in this direction even further.)
X3J11.1, known informally as the "Numerical C Extensions Group" (NCEG), has
issued final drafts of three parts of its technical report. (For background
information on NCEG's work, see "Numerical Extensions to C," by Robert Jervis,
DDJ, August 1992.) While this report does not have the power of a standard, it
nonetheless lays the groundwork for additions to the language. The final
report is expected by the end of 1994.


Enter C++


In December 1989, an independent C++ standards effort (X3J16) was launched.
It's important to note that the X3J11 C committee declined to take on the job
of standardizing C++. Certainly C++ and Standard C have common ancestors, but
they were viewed as different languages. For more information on C++
standardization, see "Standard C++: A Status Report" by Dan Saks (special
supplement to Dr. Dobb's Journal, December 1992.)
Within the first year or so of deliberation, the committee decided that the
C++ standard would be a joint ANSI/ISO effort. Currently, meetings are held
jointly, alternating between U.S. and non-U.S. locations, resulting in
considerably more international participation than the C standard had in its
infancy.
The C++ committee is also inventing much more than did the original Standard C
committee. The additions being considered and the set of standard classes that
will be defined may well result in the final standard being delayed beyond the
current goal of 1996. And don't forget that when the draft standard goes out
for public review, the committee must respond in writing to every comment
received. This process, which could easily take a couple of years, is repeated
until no more substantive changes are made and no appeals are pending.
(Standard C went through three cycles.) As a result, a C++ standard isn't
going to become official any time soon.
It is important to note that C++ is not a proper superset of Standard C; there
are numerous differences. The C++ standards committee has a C Compatibility
subcommittee that identifies the incompatibilities and either rationalizes
them or suggests changes. ISO JTC1/SC22 decreed that Standard C and Standard
C++ should have no gratuitous differences; but some differences are permitted,
meaning that C++ might never be a proper superset of C.
A number of other ANSI and ISO committees are busy defining language bindings
for C and C++. Some are also working on language-independent issues--parallel
processing, internationalization, language-independent arithmetic,
procedure-passing methods, and the like--that will likely have some impact on
the C and C++ standards.


Possible Directions


We may see a number of possible scenarios played out regarding C and C++,
among them:
Freeze C based on the latest ISO addendum, thereby placing it in maintenance
mode. This is quite restrictive in that it ignores the efforts of X3J11.1,
doesn't permit further work on internationalization support or the addition of
some interesting and useful parts of C++, and doesn't cater to the extensions
that will likely be possible (if not mandated) by related standards bodies.
Extend C, but only in the direction of C++. This is also restrictive because
it, too, ignores the efforts of X3J11.1 and doesn't permit further work on
internationalization support.
Extend C in a number of directions, including C++ and internationalization.
This isn't restrictive and allows for the efforts of X3J11.1 and other
subcommittees to be considered.
Formally coordinate the C and C++ standards committees with the long-term goal
of merging the two languages. The result could be a single language, C++, or
the language C++ with a distinct subset called C.
The third approach has the greatest potential for C and C++ diverging forever.
Any changes and/or additions to the language and preprocessor that aren't also
adopted by the C++ committee will cause both technical and political problems.
Providing new headers is a non-issue since these do not require linguistic
support and can readily be adopted by the C++ committee, provided they are
well thought out and address a real problem. (Hopefully this will be the case
with the ISO C addendum.)
It's worth noting that some commercial C implementors would much rather extend
C "just a little bit more" (for example, adding a complex type rather than
defining a complex class) than buy into the whole of C++.
The fourth option--making one language--requires that differences between the
languages be eliminated completely. Realistically speaking, this is where we
are now, except the subset is not a proper subset, and the subset is the
responsibility of a separate standards committee.


Probable Directions


Because C and C++ are far more similar than different, a more formal
synchronization between them would be beneficial if, for no other reason than
to reduce the resources required to participate in standards activities.
Synchronization does, however, have its own problems. Currently, the C
committees are busy interpreting their standard with an eye on at least some
minor additions in the internationalization arena. Meanwhile, the C++
committee is working on its first standard using a specific project proposal
endorsed by ANSI and ISO. It's almost certainly a bad idea to derail the
current C++ standardization effort to include synchronization with Standard C.
The setback in time and inertia would likely produce a lose/lose situation.
The more likely alternative is to wait until the C++ standard is approved
before attempting some formal synchronization. Let's say, for argument's sake,
that C++ is standardized in 1996. Will Standard C stand still in the interim?
Not likely. Will it be extended in the C++ direction only? Again, probably
not. In the meantime, if C is extended in ways that are incompatible with C++,
synchronization will be even more difficult.

It's all well and good to say that if the C committee wants to extend C, it
should be consulting the C++ committee. That's good advice, but the C++
committee has enough to do without other standards bodies bothering them. This
isn't to suggest the C++ committee is ignoring outside input; they simply have
more interest in their own charter, and rightly so.


Closing Thoughts


Even if it were generally agreed that the future of C is really C++, the
transition is problematic. It's one thing to be starting new projects with
newly trained people and a new design methodology; it's quite another to have
a nontrivial investment in code and training already in place.
The issue of object-oriented design and programming is separable from the
language that implements it. While C++ is the commercial leader for OOP
technology, in the next decade most of the popular procedural languages will
likely also have OOP extensions.
Until a C++ standard is completed, C will probably be extended in the
direction of C++ as well as in other ways, some of which will also be picked
up by C++. Once the C++ committee has met their original goal, some kind of
formal synchronization plan is likely. Certainly informal discussion can, and
probably should, occur long before then, but it's unreasonable to expect
anything formal until then.
So, is the future of C spelled "C++"? Quite possibly, but not until the end of
this decade at least, and maybe never. Formal synchronization will require
compromise, and we all know how bloody territorial disputes can get. And while
we all might want there to be one way, each of us wants it to be our way.
Just what kind of extensions should we make to C? While you could certainly
consider completely new ideas--perhaps adding packed-decimal type or I/O
statements--many proven extensions are already in existence. Since the list
(and merits) of possible extensions is endless and very subjective, I'll not
discuss it further. Instead, I'll identify some obvious and incremental ways
in which C could be improved.
A number of small extensions could be made. For example:
Require float and long-double math libraries.
Add extra E* value macros for errno (for fopen failures, for example).
Add I/O primitives such as "get character without terminator" and "get
character without echo."
Add the ability to flush an input stream.
Add more LC_* macros to setlocale, and more locale machinery in general.
Add more multibyte library support.
Include binary integer constants.
Allow nonconstant expressions in auto aggregate initializers.
Add new bit-field types (such as char, short, long, enum).
The following C++ facilities might be considered as additions to C:
//-style comments.
Type-safe linkage (encoding function signature in generated name).
Declarations at other than the beginning of a block.
The extended syntax for the first expression in a for loop.
Functional notation casts.
Scope resolution operator.
Extra semantics of const.
Anonymous unions.
Stricter compatibility checking of enumerated types.
Overloaded functions.
Inline functions.
Default function arguments.
Operators new and delete.
References (although they aren't much use without operator overloading).
Requirement of a diagnostic on failure to return a value from a nonvoid
function.
Drop support for old-style function declarations and definitions.
Requirement of a prototype in the scope of a function call.
The following numeric extensions (from X3J11.1) might be considered as
additions to C:
More FP/IEEE support.
Extended initializers.
Aliasing control via keyword restrict.
Complex data types and associated libraries.
Variably sized arrays.
Extended integer precision.
Data-parallel constructs.
--R.J.














August, 1993
Strategies for Better Linked Lists


A C toolkit that saves time and memory




Garyl Hester


Garyl is the author of programming tools, connectivity products, and related
subsystems. He can be reached via CompuServe at 76507,1503 or through DDJ.


Linked lists are fundamental tools used by any application that deals with
variable types and data. The problem with linked lists, however, is that when
more than one is being used in an application, duplicate code is required to
handle each list. This is because the type of data being managed by each list
is different, and there is no simple way to educate the logic about the kind
of data being managed at any particular moment. Linked lists also tend to
become memory-inefficient when small pieces of data are being stored.
This article discusses linked-list theory and presents a generic linked-list
toolkit written in C that will reduce, if not eliminate, management problems.
(The entire toolkit is available electronically; see "Availability" on page
5.) This discussion focuses on doubly linked lists, but with a little effort
the algorithms can be used for singly linked lists. This package was developed
using Microsoft C 6.0; the Microsoft extensions were disabled for ANSI C
conformance.


Linked-list Strategies


Figure 1 shows a traditional linked list with three links, called "atoms." An
atom has three parts: the previous pointer, which contains the address of the
predecessor or Nil if this is the first atom; the next pointer, which contains
the address of the successor or Nil if this is the final atom; and the atom
data itself. The previous and next pointers form the bindery of the atom.
The atom's data area can be of any type and size, and is usually composed of
multiple fields. An atom is easily defined in C using a structure definition
like that in Example 1.
Head and tail pointers define the ends of a list. The head pointer is an
anchor whose value records the address of the first atom in the list, while
the tail pointer records the address of the last atom in the list. Even though
the tail pointer is not required (the terminating link can be determined by
traversing the list and looking for an atom whose next pointer is Nil), it is
worth the extra code and processing time to implement a tail pointer. With
doubly linked lists, the tail pointer allows you to start at the end of the
list and traverse upward. Head and tail pointers both must be declared as
pointers to their specific atom types, as in FOOATOM *pHead.
The algorithm for creating and adding an atom to the end of a linked list is
shown in Example 2. In this example, h and t are the head and tail pointers, a
is a pointer to an atom, and s is the atom size. Note that the data portion of
the atom is never referenced. Using this basic template, we can create a
routine to process any type of atom.
To implement more generic code, start by defining a generic bindery type, then
include that type as the first member of any atom definition; see Example
3(a). When processing the list, the atom-specific pointer is cast to a pointer
of type BIND and subsequently passed to the generic logic shown in Example
3(b).
By placing the bindery as the first member of any atom structure, we can fool
the generic logic into handling the head and tail as pointers to simple BIND
structures rather than complex FOOATOM structures. This method works well in
most situations but is rather tedious. As an alternative, define a standard
structure to hold head and tail pointers, as in Example 4. This way, only the
address of the ENDS structure--not the two separate pointers--need be passed.
While this method is easier, it requires you to cast the head and tail
pointers to a pointer of the appropriate type whenever needed. We can avoid
this inconvenience by declaring pointers as type void, indicating they have no
assumed type. Pointers of type void can be assigned to a pointer of a specific
type without casting. Notice in Example 5 that the value of void pointer pHead
is assigned directly to the pFooAtom pointer (a pointer to a specific data
type): Believe me when I say that even lint won't complain about these types
of assignments.


Linked-list Density


Linked-list algorithms make heavy use of dynamic memory functions like
malloc() and free(). A single call to malloc() returns a pointer to a memory
block n bytes long. However, for free() to work properly, the block must
contain information describing the nature of the block. This additional
overhead is referred to as a "memory control block" (MCB). If k is the size of
the MCB, then the actual amount of core being occupied by a block of n bytes
is n+k bytes. The actual size of the MCB is system dependent; in some
implementations it is as small as one word, while in others it is as large as
a paragraph (eight words)!
MCB size is important because it affects the list density, which is calculated
by contrasting the size of the data being stored (the bang) with the amount of
memory required to store it (the buck). This relationship can be expressed as
Density=nd/(k+n(b+d)), where n is the number of atoms stored in a malloc()ed
block, d the size of the stored data, k the MCB size, and b the bindery size.
This is the generic formula. Since n is almost always 1, the accepted form is
d/(k+b+d). As the density approaches 1, the list becomes more efficient. For
example, a list of integers kept in an array has a density of 1 because there
is no overhead: The amount of core required to maintain the list is exactly
that of the data being stored. When the value of n is 1, the only means of
achieving higher densities is to increase the size of d by storing larger
pieces of data. This makes linked lists impractical for storing small pieces
of data (integers, for instance) because the buck drastically outweighs the
bang.


Requirements


Taking all of this into account, a generic linked-list toolkit should:
1. Provide a generic set of routines for handling any type of atom data.
2. Not require special considerations of atom data structures on the part of
the user (that is, a BIND structure as the first member).
3. Increase list density.
4. Make best use of system memory.
The void data type satisfies requirement #1. By passing void pointers instead
of pointers to specific data types, we can make the data appear as anything we
want without lint complaining.
For requirement #2, the linked-list manager handles the allocation and
management of binderies. In effect, you request that an instance of the atom
data be allocated. The linked-list manager then tacks on the size of the
bindery, requests a block of that size from malloc(), and retains that
pointer. The pointer returned to the caller is to the atom data only; see
Figure 2. This lets the user employ the linked list as a subsystem, requesting
next and previous pointers, adding and deleting atoms, and so forth. However,
this approach does not fully satisfy requirement #3 because multiple malloc()
calls are required.
Requirements #3 and #4 are best satisfied by allocating space for more than
one atom at a time. By creating blocks of atoms at once, we increase the value
for n, and thus the list density. However, there is a trade-off in that memory
is allocated and can go unused. Therefore, you should specify the appropriate
size of a block for the application. An application that will require hundreds
of atoms will have a larger block size than one using only a few. Also, it is
possible to store an entire list in a single block, thus achieving maximum
density but requiring the user to estimate the maximum size of the list.


Design


The linked-list control block (LLCB) and the list boundary (LLST) govern the
linked list. The LLCB contains two sets of head/tail pointers, the size of the
atom data, and the number of atoms to create per block. The LLST contains one
set of head/tail pointers and a pointer to the LLCB; see Listing One, page
100.
There are three types of verbs in the toolkit: primitives, management, and
insert/move/swap. The primitives, defined in llprim.c (Listing Two, page 100),
are ll_open(), ll_close(), ll_mklist(), ll_rmlist(), ll_alloc(), and
ll_free(). ll_open creates a list and ll_close destroys it. (Table 1 describes
these calls.) ll_open() accepts the address of an LLCB, the size of the atom
data, and the number of atoms to create per block, and returns the address of
the LLCB passed to it.
Once the LLCB is set up, one or more linked lists can be created using
ll_mklist(), which associates a LLST with an LLCB. ll_mklist() accepts the
address of a LLCB and LLST, returning the address of the LLST. Multiple lists,
consisting of individual lists of separate atom chains, can be associated with
a single LLCB. Neither ll_mklist() nor ll_open() allocate memory; they only
initialize structures passed to them.
Atoms are created and destroyed using ll_alloc() and ll_free(), respectively.
ll_alloc() accepts a pointer to a LLST, and returns a pointer of type void.
Whenever ll_alloc() is called, the LLCB (taken from the LLST) is checked for
atoms that may exist in its free chain. If available, a free atom is removed
from the free chain and appended to the list managed by LLST. The address of
the atom data is returned.

If free atoms are not available, an attempt is made to create a fresh block of
atoms. A pointer to the atom data is returned if successful, else NULL to
signal that no atoms are available nor could be created. A block is organized
as shown in Figure 2. The atoms in a block are logically created, and because
they are all unused, are "sewn" together by linking their binderies. The first
atom in the block is added to the end of the free list in the LLCB. Figure 3
shows the results of a few ll_alloc() and ll_free() calls.
The ll_free() call accepts as arguments the address of an atom and the address
of the LLST to which the atom belongs. The atom is removed from the current
list and added to the end of the free chain maintained in the LLCB. Note that
a usage count is maintained within the block bindery information. When an atom
is allocated from the block, the usage count is increased. When an atom is
returned to the free chain, the usage count is decreased. When this usage
count reaches 0, the block itself is removed.


Atom Blocks and Pointer Arithmetic


The file llbase.c (available electronically) defines the management verbs
employed by the primitives to manage a linked list properly. The management
verbs should never be referenced directly by the application program. The
management verbs are ll_appatom (append atom to end of list), ll_mkblock
(create atom block), ll_unbatom (unbind atom from list), ll_inblock (find
block that contains an atom), ll_frstatom (return the first atom in a list),
ll_lastatom (return the last atom in a list), ll_getbind (get the bindery of a
neighbor), and ll_rmblock (remove a block of atoms).
The verbs ll_mkblock() and ll_rmblock() deserve some explanation. Using
malloc(), ll_mkblock() allocates a block of memory large enough to hold the
desired number of atoms (from the LLCB) and the block-control information
(LLBC). This block must be carved up into many logical atoms, then added to
the end of the free atom chain for the LLCB. This is done with simple pointer
arithmetic. Start by calculating the size of an atom block. The atom block
contains three basic types of objects: the block header, the atom bindery (one
per atom), and the atom data itself (again, one per atom). Figure 2
illustrates the internal arrangement of an atom block. The formulas for
calculating the overall size of a block of atoms are shown in Example 6(a).
The SizeofAtomData and AtomCount (which were specified when the LLCB was
created) are taken from the LLCB. Given that a call to malloc() will return a
pointer to a block of memory of exactly BlockSize bytes, our task is to carve
the region into usable atoms. The first step is to calculate the address of
the first atom bindery. This will be the first byte past the LLBC, as in
Example 6(b), which should give us the address of the first bindery, right?
Wrong.
Now is a good time to point out a dangerous thing about pointer arithmetic:
The compiler is very smart, and to make things easier for the programmer,
adding an offset to a pointer is the same as taking a subscript to the
pointer: pBlock + sizeof(LLBC) is equivalent to &pBlock[sizeof(LLBC)]. Because
the elements of an array have an inherent size associated with them, the
actual calculation performed by the compiler is
pBind = (PLLBIND)(pBlock + (sizeof(LLBC) * sizeof(LLBC))), which is not what we
wanted. To get a better feel for what is happening here, consider an array of
integers. The size of an integer on 80x86 computers is two bytes. Let's assume
that the base address of the array (the address of the first integer in the
array) is 0x8000, and let's also assume that we want the address of the fourth
integer in the array. In C, we have to do as shown in Example 6(c).
The inherent size of an integer is two bytes, so the compiler multiplies the
size of the array item by the index to get the relative offset, and then adds
that offset to the base address to achieve the absolute address; from our
example, 3*2 = 6, and 6 + 0x8000 = 0x8006.
If we obtained the address of the fourth element with the statement int *j = a + 3,
nothing would be different. The compiler knows that a is a pointer to an
integer, and the index (3) would be modified by the size of the item being
pointed to. This fact about pointer arithmetic gives us pause before
proceeding.
How do we calculate the true address of the first bindery? Again we use the
power of type casting. By changing the compiler's assumption about the type of
item being pointed to, we can avoid the nasty problem of inherent sizes. By
casting the pointer pBlock into a character pointer (char*), the compiler
believes that pBlock points to a character, which has an inherent size of 1.
Therefore, the calculation pBind = (PLLBIND)((char *)pBlock + sizeof(LLBC));
will indeed yield the address of the first bindery. Further, the address of
the first atom data is sizeof(LLBIND) bytes from pBind. However, we must again
cast pBind into a character pointer so that we aren't surprised:
void *pAtom = (void *)((char *)pBind + sizeof(LLBIND));.
At this point, however, we are not interested in the atom data area. Our task
is to ascertain the addresses of each of the binderies, link them to each
other, and then append them to the end of the free atom chain in the LLCB.
Because we are creating a set of new atoms, it follows that they are all
unused and will end up in the free chain anyway. By doing a little magic here,
we avoid calling ll_appatom() for each of the logical atoms as we locate them,
which would use up many more cycles than it is worth.
Now that we have discovered the address of the first bindery, and because we
know that this is the first of many atoms to be created, we can go ahead and
append this bindery to the end of the free list with a call to ll_fappend().
This done, we "sew" the remaining atoms together with the logic in Example
6(d). We do this once for each atom. When we are done, pBind contains the
address of the last bindery in the block. Because we know this to be the final
bindery of all free binderies, we will set the free-list tail to pBind after
setting pBind->pNext to NULL.
The destructor function, ll_rmblock(), has a similar task--only in reverse.
The atoms contained within a block must be removed from the free chain before
the block is destroyed--or else, disaster! The problem is that the atoms
contained within the block will more than likely not be in sequential order,
as they were when they were created. They must, therefore, be removed one at a
time using the ll_funbind (unbind atom from free chain) call. The problem then
becomes how to identify which atoms in the free chain belong to the block
being destroyed so they can be removed in a graceful manner.
One method is to traverse the free chain, using the ll_inblock() function to
determine if an atom is a member of the doomed block, and if so, remove it.
With the potential of a rather lengthy free chain, this approach is
undesirable.
But remember that we would have all of the atoms readily available if only we
could calculate their addresses. We did it once when we created them, and we
do it again. We calculate the address of the first bindery, then call
ll_funbind() with that address. Next we calculate the address of the next
bindery, and so on until all of the atoms have been removed from the free
chain. Finally, we remove the block from the block chain itself, and free()
the block.


Using the Toolkit


The first step in using the linked-list toolkit is to declare the LLCB and the
LLST variables and a structure to hold the atom data; see Example 7(a).
The next step is to initialize the LLCB using ll_open(), passing the address
of the LLCB, the size of the atom data, and the number of atoms to create per
block, as in Example 7(b). Once the LLCB is initialized, we can attach an LLST
structure to it using ll_mklist(), as in Example 7(c).
Now the linked list is ready to use. To get an atom from the list, use
ll_alloc(); Example 7(d). To traverse the list, use ll_first() and ll_next(),
as in Example 7(e). To remove a certain atom from the list, use ll_free() and
specify the atom to remove; see Example 7(f). To remove all of the atoms from
the list, destroy the list using ll_rmlist(); see Example 7(g). Finally, to
destroy all lists associated with a particular LLCB, use ll_close(); see
Example 7(h).
The test program, lltest.c (available electronically), exercises every
function in the toolkit and provides an excellent template for its use.


Conclusions


In the final analysis, the worst case of the blocking method is as good as a
nonblocking one, and it is generally better with smaller atom sizes. Even that
consideration pales when compared to the benefits of using a standardized
toolkit vs. customizing each linked list from scratch.

Example 1: Traditional definition of an atom structure in C.

typedef struct FOO_ATOM
{
 struct FOO_ATOM *pPrev; /* pointer to previous atom */
 struct FOO_ATOM *pNext; /* pointer to next atom */
 char szName[ 30 ]; /* foo name */
 char szPhone[ 15 ]; /* foo phone number */
} FOOATOM;


Example 2: Algorithm for creating a new atom.
if(( a = malloc( s )) != NULL )
{
 if( h == NULL )
 h = a;
 if( t != NULL )
 t->next = a;
 a->prev = t;
 a->next = NULL;
 t = a; /* update the tail pointer */
}


Example 3: (a) Defining and using a generic bindery structure; (b) application
of an atom using a generic bindery structure.

(a)

typedef struct BINDERY
{
 struct BINDERY *pPrev; /* pointer to previous bindery */
 struct BINDERY *pNext; /* pointer to next bindery */
} BIND;

typedef struct FOO_ATOM
{
 BIND b; /* atom bindery */
 char szName[ 30 ]; /* foo name */
 char szPhone[ 15 ]; /* foo phone */
} FOOATOM;


(b)

void MyFunc( ... )
{

 FOOATOM *pHead, *pTail, *pAtom;
 pHead = pTail = NULL;
 pAtom = (FOOATOM *) AddAtom((BIND **)&pHead,
 (BIND **)&pTail, sizeof(FOOATOM));
 [ other program statements ]
}

BIND *AddAtom(BIND **ppHead,BIND **ppTail,unsigned usDataSize)
{
 BIND *pRet;
 if( ( pRet = malloc( usDataSize ) ) != NULL )
 {
 if( *ppHead == NULL )
 *ppHead = pRet;
 if( *ppTail != NULL )
 (*ppTail)->pNext = pRet;
 pRet->pPrev = *ppTail;
 pRet->pNext = NULL;
 *ppTail = pRet;
 }
 return( pRet );
}


Example 4: ENDS structure definition.
typedef struct LIST_ENDS
{
 BIND *pHead;
 BIND *pTail;
} ENDS;


Example 5: Traversing a linked list using ENDS structure.
FOOATOM *pFooAtom;
ENDS FooEnds;
for( pFooAtom = (FOOATOM *)FooEnds.pHead;
 pFooAtom; pFooAtom = pFooAtom->pNext )
{
 [other program statements]
}


 Figure 1: Traditional doubly linked lists with three atoms.
 Figure 2: The linked-list manager tacks on the size of the bindery, requests
a block of that size from malloc(), and retains that pointer. The pointer
returned to the caller is to the atom data only.
Table 1: Linked-list toolkit verbs.
LLCB *ll_open( LLCB *pList, unsigned usAtomSize, unsigned usAtomCnt );
Initializes a linked-list control structure. Must be the first call performed
for a linked-list control. usAtomSize indicates the size of the atom data, and
usAtomCnt indicates the number of atoms to create per block. This call does
not allocate any memory; it only initializes the linked-list control structure
with the values supplied.
LLST *ll_mklist( LLCB *pListCntl, LLST *pList );
Creates a linked list as part of the linked-list control addressed by
pListCntl.
void *ll_alloc( LLST *pList );
Allocates a new atom in pList. Returns a pointer to atom data area if
successful, or NULL if no atoms are available and none can be generated by
creating new blocks.
void ll_free( LLST *pList, void *pAtom );
Releases an atom to the free list.
void ll_rmlist( LLST *pList );
Closes pList and releases all resources currently allocated by pList.
void ll_close( LLCB *pListCntl );
Closes the linked-list control. Any resources owned by any linked lists
associated with the control are released to the system pool.
void *ll_first( LLST*pList );
Returns the first atom in pList, or NULL if pList is empty.
void *ll_last( LLST *pList );
Returns the last atom in pList, or NULL if pList is empty.
void *ll_next( void *pAtom );
Returns the next atom relative to pAtom or NULL if pAtom is the last atom in
the list.
void *ll_prev( void *pAtom );
Returns the previous atom relative to pAtom, or NULL if pAtom is first on the
list.
void ll_swap( LLST *pList, void *pSrcAtom, void *pTrgAtom );
Effectively swaps the places of pSrcAtom and pTrgAtom in pList. Both atoms are
assumed to be members of pList, and are not checked. This can cause
significant problems if atoms from two different lists are swapped.
void ll_mvbefore( LLST*pList, void *pSrcAtom, void *pTrgAtom );
Moves pSrcAtom before pTrgAtom in pList. Both atoms are assumed to be members
of pList, and are not checked.
void ll_mvafter( LLST *pList, void *pSrcAtom, void *pTrgAtom );
Moves pSrcAtom after pTrgAtom in pList. Both atoms are assumed to be members
of pList, and are not checked.
Example 6: (a) Formulas for calculating true atom and block sizes; (b) formula
to calculate the address of the first bindery, with errors; (c) getting the
address of a vectored integer; (d) "sewing" the atoms of a block together.
(a)

TrueAtomSize = sizeof(LLBIND) + SizeofAtomData
BlockSize = sizeof(LLBC) + (AtomCount * TrueAtomSize)

(b)
pBlock = malloc( BlockSize );
pBind = (PLLBIND)( pBlock + sizeof(LLBC) );


(c)

int a[ 10 ];
int *j = &a[ 3 ];
 /* get address of fourth integer */


(d)

for( ... )
{
 /* calc address of next bindery */
 pBind->pNext = (PLLBIND)( (char *)pBind + TrueAtomSize );
 /* setup back reference */
 pBind->pNext->pPrev = pBind;

 /* go on to the next bindery */
 pBind = pBind->pNext;
}


Example 7: (a) Setting up list control; (b) initializing the list control
(LLCB); (c) creating a list; (d) obtaining a fresh atom; (e) forward traversal
of a list; (f) returning an atom to the free chain; (g) destroying a list; (h)
releasing the LLCB.
(a)

#include "ll.h"
LLCB ListCB;
LLST List;
typedef struct
{
 char szName[ 30 ];
 char szPhone[ 15 ];
} FOOATOM;


(b)

ll_open( &ListCB, sizeof ( FOOATOM ), 10 );


(c)

ll_mklist( &ListCB, &List );


(d)

FOOATOM *pAtom;
pAtom = ll_alloc( &List );


(e)

for( pAtom = ll_first( &List );
 pAtom; pAtom = ll_next( pAtom ))
 printf("Name:%-30s Phone:%s\n",
 pAtom->szName,
 pAtom->szPhone);


(f)

ll_free( &List, pAtom );


(g)

ll_rmlist( &List );


(h)

ll_close( &ListCB );


 Figure 3: The results of a few ll_alloc() and ll_free() calls.

[LISTING ONE] (Text begins on page 32.)
/* LINKED LIST TOOLKIT -- Copyright (C) 1990 by Garyl Lee Hester
 * ll.h -- Global header file */
#ifndef TRUE
#define TRUE 1
#define FALSE 0
#endif

#define _LL_NEXT 1
#define _LL_PREV 2
#define _LL_BEFORE 3
#define _LL_AFTER 4
typedef void * PATOM; /* pointer to atom */
typedef void * PGEN; /* generic pointer */
/* Link List Bind Structure (LLBIND) */
typedef struct LL_BIND
{
 struct LL_BIND *pPrev; /* reference to prev link */
 struct LL_BIND *pNext; /* reference to next link */
} LLBIND, *PLLBIND;
/* Link List Head & Tail Structure (LLENDS) */
typedef struct LL_ENDS
{
 PGEN pHead; /* generic head pointer */
 PGEN pTail; /* generic tail pointer */
} LLENDS, *PLLENDS;
/* Link List Block Control Structure (LLBC) */
typedef struct LL_BLOKCNTL
{
 struct LL_BLOKCNTL *pPrev; /* ref to prev block */
 struct LL_BLOKCNTL *pNext; /* ref to next block */
 unsigned usUsageCnt; /* usage count */
} LLBC, *PLLBC;
/* Link List Control Structure (LL) */
typedef struct LL_CNTL
{
 LLENDS Block; /* ends of block list */
 LLENDS Free; /* ends of "free" atom list */
 unsigned usAtomSize; /* size of link Atom */
 unsigned usAtomCnt; /* number of Atoms per block */
} LLCB, *PLLCB;
typedef struct LL_LIST
{
 PLLCB pLL; /* reference to master list */
 LLENDS Ends; /* ends of the current sub-list */
} LLST, *PLLST;
typedef int (*COMPFUNC)( void *, void * );
/* Macros used by ll package only: MAKEPBIND(p), converts an atom ptr to a
 * bind ptr; MAKEPATOM(p), converts a bind ptr to an atom ptr */
#define MAKEPBIND(p) ((PLLBIND)((char *)(p) - sizeof(LLBIND)))
#define MAKEPATOM(p) ( (PATOM) ((char *)(p) + sizeof(LLBIND)))
/* User Functions supplied by Macros */
#define ll_next(p) ll_getbind(_LL_NEXT,(PATOM)(p))
#define ll_prev(p) ll_getbind(_LL_PREV,(PATOM)(p))
#define ll_mvbefore(l,s,t) ll_mvatom((s),_LL_BEFORE,(t),&(l)->Ends)
#define ll_mvafter(l,s,t) ll_mvatom((s),_LL_AFTER, (t),&(l)->Ends)
#define ll_insbefore(l,s,t) ll_insatom((s),_LL_BEFORE,(t),&(l)->Ends)
#define ll_insafter(l,s,t) ll_insatom((s),_LL_AFTER, (t),&(l)->Ends)
#define ll_first(p) ll_frstatom((p),(p)->Ends.pHead)

#define ll_last(p) ll_lastatom((p),(p)->Ends.pTail)
#define ll_swap(l,s,t) ll_swapatom((s),(t),&(l)->Ends)
#define ll_append(l,s) ll_appatom((s),&(l)->Ends)
#define ll_unbind(l,s) ll_unbatom((s),&(l)->Ends)
#define ll_memoff(p,m) (unsigned)((char *)(&(p)->m)-(char *)(p))
/* Similar Functions for Handling the Free List */
#define ll_fmvbefore(l,s,t) ll_mvatom((s),_LL_BEFORE,(t),&(l)->Free)
#define ll_fmvafter(l,s,t) ll_mvatom((s),_LL_AFTER, (t),&(l)->Free)
#define ll_finsbefore(l,s,t) ll_insatom((s),_LL_BEFORE,(t),&(l)->Free)
#define ll_finsafter(l,s,t) ll_insatom((s),_LL_AFTER, (t),&(l)->Free)
#define ll_ffirst(p) ll_frstatom((p),(p)->Free.pHead)
#define ll_flast(p) ll_lastatom((p),(p)->Free.pTail)
#define ll_fappend(l,s) ll_appatom((s),&(l)->Free)
#define ll_funbind(l,s) ll_unbatom((s),&(l)->Free)
/* Function prototypes: llprim.c - linked list primitives */
PLLCB ll_open( PLLCB, unsigned, unsigned );
void ll_close( PLLCB );
PLLST ll_mklist( PLLCB, PLLST );
void ll_rmlist( PLLST );
PATOM ll_alloc( PLLST );
void ll_free( PLLST, PATOM );
/* llbase.c - linked list management */
void ll_appatom( PATOM, PLLENDS );
void ll_mkblock( PLLCB );
void ll_rmblock( PLLCB, PLLBC );
PLLBC ll_inblock( PLLCB, PATOM );
void ll_unbatom( PATOM, PLLENDS );
PATOM ll_frstatom( PLLST, PLLBIND );
PATOM ll_lastatom( PLLST, PLLBIND );
PATOM ll_getbind( unsigned, PATOM );
/* llins.c - insert and move atom routines */
void ll_mvatom( PATOM, unsigned, PATOM, PLLENDS );
void ll_insatom( PATOM, unsigned, PATOM, PLLENDS );
/* llswap.c - atom swap routine */
void ll_swapatom( PATOM, PATOM, PLLENDS );
/* llsort.c - shell sort routine */
void ll_sort( PLLST, short, unsigned, COMPFUNC );

[LISTING TWO]

/* LINKED LIST TOOLKIT -- Copyright (C) 1990 by Garyl Lee Hester
 * llprim.c -- Routines for Linked List Primitives */
#include <stdio.h>
#include <memory.h>
#include <malloc.h>
#include "ll.h"

/* ll_open - initialize a new linked list control */
PLLCB ll_open( PLLCB pLL, unsigned usAtomSize, unsigned usAtomCnt )
{
 if( pLL )
 {
 /* if the list is already active, destroy it */
 if( pLL->Block.pHead )
 ll_close( pLL );
 memset( (PGEN)pLL, 0x00, sizeof( LLCB ) );
 pLL->usAtomSize = usAtomSize;
 pLL->usAtomCnt = usAtomCnt;

 }
 return( pLL );
}
/* ll_close - closes a linked list */
void ll_close( PLLCB pLL )
{
 if( pLL )
 {
 while( pLL->Block.pHead )
 ll_rmblock( pLL, (PLLBC)( pLL->Block.pHead ) );
 memset( (PGEN)pLL, 0x00, sizeof( LLCB ) );
 }
}
/* ll_mklist - make a new sub list */
PLLST ll_mklist( PLLCB pLL, PLLST pList )
{
 if( pList && pLL )
 {
 pList->pLL = pLL;
 pList->Ends.pHead = NULL;
 pList->Ends.pTail = NULL;
 }
 return( pList );
}
/* ll_rmlist - destroys a sub list */
void ll_rmlist( PLLST pList )
{
 if( pList )
 {
 while( pList->Ends.pHead )
 ll_free( pList, MAKEPATOM( pList->Ends.pHead ) );
 }
}
/* ll_alloc - return ptr to new atom, else NULL */
PATOM ll_alloc( PLLST pList )
{
 PLLBIND pRet = NULL;
 PLLBC pBlock;
 PLLCB pLL;

 if( pList )
 {
 pLL = pList->pLL;
 /* if pFreeHead is NULL, then a new block needs to be created. */
 if( pLL->Free.pHead == NULL )
 ll_mkblock( pLL );
 /* if pFreeHead is STILL NULL, then there is no more memory */
 if( pLL->Free.pHead )
 {
 pRet = (PLLBIND)( pLL->Free.pHead );
 if( ( pBlock = ll_inblock( pLL, pRet ) ) )
 pBlock->usUsageCnt++;
 ll_unbatom( MAKEPATOM( pRet ), &pLL->Free );
 ll_appatom( MAKEPATOM( pRet ), &pList->Ends );
 pRet = MAKEPATOM( pRet );
 }
 }
 return( (PATOM) pRet );
}

/* ll_free - remove pAtom from pList and add to global free chain */
void ll_free( PLLST pList, PATOM pAtom )
{
 PLLBIND pBind;
 PLLBC pBlock;
 PLLCB pLL;
 if( pList && pAtom )
 {
 pLL = pList->pLL;
 pBind = MAKEPBIND( pAtom );
 ll_unbind( pList, pAtom );
 ll_appatom( pAtom, &pLL->Free );
 if( ( pBlock = ll_inblock( pLL, pAtom ) ) )
 if( !( --pBlock->usUsageCnt ) )
 ll_rmblock( pLL, pBlock );
 }
}
End Listings


August, 1993
C++ Templates


Simplify your C++ code with templates




Pete Becker


Pete is a software engineer at Borland International and is Borland's
principal representative to the ANSI C++ Committee. At Borland he works on
Object Windows Library, class libraries, and sometimes linkers. He can be
contacted at 1800 Green Hills Road, Scotts Valley, CA 95066.


Where would the automobile industry be without custom tools? An assembly line
in Detroit turns out hundreds of cars each day, each having its own particular
variations of color, engine size, and accessories, all built according to a
common plan. Since they use the same plan every time, automobile engineers can
create custom tools specifically for that design, simplifying the
manufacturing of cars and multiplying the productivity of factories.
C++ templates can multiply your coding productivity by making it easier to
write your own custom tools. You're not producing hundreds of classes each
day, so your productivity gains probably won't be as dramatic as those in the
automobile industry. Still, a few well-designed templates in your toolkit can
make your job much easier.


Life Without Templates


Suppose you've written a class that implements a stack of integers like
Example 1. Of course, programming being what it is, the next time you have to
write a stack, it won't be a stack of integers but, say, a stack of strings.
You could reuse IntStack by copying it, changing its name, and replacing int
with a string where appropriate. However, this approach has drawbacks.
First, it's error prone. You can't just do a global search and replace. If you
do, you'll end up with the member function ItemsInStack returning a string,
which probably isn't what you want. Doing this transformation correctly
requires you to understand every use of int in the original class.
Transforming a larger, seldom-used class could prove to be an enormous
undertaking. Not to mention that once you've done this a few times, there'll
be several variations of stacks used by different programs on your hard disk.
If you find and fix a logic error in one, you ought to fix it in the others as
well. Unless you've kept scrupulous notes on where these variations are used,
you'll have a hard time finding them all.
Some programmers reject templates, saying a stack class isn't very big and you
can write one when needed. Sure you can copy the definition of IntStack and,
with some care, change it to a FloatStack in minutes--and there's a good
chance it will work correctly. However, it takes less time to create a
FloatStack from a template: You #include "stack.h" and typedef Stack<float>
FloatStack. These two lines of code do the same as the 28 lines in Example 1.
Don't let anyone tell you that the other way is better.


On the Inside


These two lines of code don't tell the whole story. Consider the STACK.H
header in Example 2, the heart of the Stack template. If you compare this
template with the IntStack class, you'll see the two are almost identical. The
template begins with template <class Type>, which tells you it's a template
that takes one parameter we'll refer to as Type. Most of the ints in the
original class have been replaced with Types in the template, the exception
being the int specifying the return type of ItemsInStack. Does that sound
familiar? It's the same transformation that changes an IntStack into a
FloatStack. In fact, it's common to develop a template by first writing a
class that does what's needed for a single data type, then changing that class
into a template by replacing that data type with a parameterized type as we
did here. When you create a template, you only have to do the transformation
once. After that, the compiler can do it for you.


A Brief Diversion For Philologists


The chapter on templates in The Annotated C++ Reference Manual (ARM) includes
some rather confusing terms, such as "class templates," "template classes,"
"function templates," and "template functions." My Stack example is a class
template--a template used to create classes. I used it to create a
Stack<float>, a class created from a template--a template class.
The same applies to function templates, which, as the name suggests, are
templates that create functions. The template that you write is a function
template. When you use that template to create a function, that function is a
template function.


Template Parameters


Every template definition begins with the keyword template followed by a
template parameter list. As in the Stack example, the template parameter is
delimited by a less-than (<) and a greater-than (>) symbol. The alternative of
overloading the meanings of {}, [], or () results in confusing code because
these delimiters already have so many other meanings.
The parameters in a template definition fall into two broad categories: type
parameters, which tell the compiler to expect the name of a type when the
template is used; and nontype parameters, which tell the compiler to expect a
value. For instance, Example 3 changes the Stack template, so you could
specify how large the stack should be. The difference between this and Example
2 is that I've added a parameter named Size to the template definition and
removed the enum that defined Size in the earlier version.
In Example 3, the first parameter is a type parameter, the second, a nontype
parameter. When the class keyword is used in a template parameter list, it
indicates that the name following it is a type parameter. This isn't the same
as the use of class in a class definition. It doesn't mean that when the
template is used, the argument must be the name of a class; it only means that
it must be the name of a type.
When using a template, you put an argument list after the name of the
template. That argument list contains the actual names or values that the
compiler should use when it expands the template. Where the parameter list
contains type parameters, the argument list must contain type names. Where the
parameter list contains nontype parameters, the argument list must contain
values of the type specified for the parameter. You'd use the two-parameter
version of the Stack by specifying the type of the values and the maximum
number of entries that the Stack holds: Stack<int,20> is a stack of integers
with a capacity of 20, just like the original definition.


Class Templates Create Types


When you use a class template, you're creating a type name. Don't get confused
because it's a template--it acts just like any other type name. All the
statements in Example 4 are legal, for example, and they mean just what you'd
expect. When using templates, keep in mind that they may seem new and strange
and have some odd-looking syntax, but they usually fit into your code without
requiring you to do anything differently. That's the beauty of good tools:
Once you understand how to use them, they get out of your way and let you do
your job.


Gotchas



Templates have a couple of syntactic quirks you should watch out for. The last
line in the list of declarations in Example 4 creates a stack of stacks. It
would be an error to declare Stack<Stack<int>> StackOfStacks. The problem is
that the two greater-thans (>>) after int in this declaration don't terminate
the two template-argument lists. C++ uses C's "maximum-munch" rule, so the >>
is interpreted as a shift-right operator. The compiler will give you strange
messages if you omit the space between > and >.
Parameter-list delimiters can also be a problem if you use arithmetic
expressions for nontype parameters. Using the two-parameter version of the
Stack class, you might try to write Example 5(a). The problem here is that the
first > is the end of the argument list, so everything after that is nonsense.
The way around this is to put the size expression in parentheses; see Example
5(b).
Finally, you're not allowed to overload a class-template name. Consequently,
you couldn't use the Stack<class Type> and Stack<class Type,int Size>
templates in one program.


Function Templates


Function templates can be used to create functions. The rules for dealing with
function templates are more complicated than those for class templates because
functions and function templates can be overloaded and because we don't use an
explicit template-parameter list when calling a template function. Consider
Example 6: the compiler has to figure out what the template parameters are
from the arguments used in the function call. In Show(f), the argument f is of
type float. Looking back at the template definition, you see that the only
parameter used in the function call is the template parameter Type. The
compiler infers that Type must correspond to float and acts accordingly.
Show(f) acts just as if you'd written it explicitly to handle floats; see
Example 6(b).


Parameter Types


One problem you can run into with function templates is the rule that the
compiler does not do conversions on the arguments to a function template. This
is a case where your intuition may lead you astray. The template in Example
7(a) calculates the lesser of two values. Example 7(b) uses this template for
some calculations. Here, min() is called with two arguments of type int. The
compiler looks at the template, sees that it takes two parameters of the same
type, and figures out that if it replaces T with int, it has a match. The
compiler knows there's a function int min(int,int) and calls it. Example 7(c)
is similar to 7(b): The compiler knows from the template that there's a
function long min(long,long) and calls it.
Example 7(d), however, isn't like the previous examples. The two types in the
call to min() are different. The function template takes two parameters of the
same type, so the compiler cannot use the template and the call is an error.
The compiler doesn't know of any function named min() that takes a long and an
int. If I'd simply written the usual #define min(a,b) ((a)<(b)?(a):(b)), this
call would have worked. The compiler would see that it was comparing an int
and a long, and would perform the usual arithmetic conversions, converting the
int argument to a long so that the two matched, and the result would be a
long. That doesn't happen with templates because of the "no-conversion" rule.
It's generally agreed that the no-conversion rule is too harsh, and will
likely change when the ISO C++ committee reviews the rules on templates.
Indeed, many of today's compilers, including cfront, allow some conversions
even though they're technically illegal; don't be surprised if some compilers
complain about them.


Overloading


Still, it's possible to use the example min() function template to compare an
int and a long. To understand how, look at how the compiler handles overloaded
functions. Consider the best-match rules in section 13.2 of the ARM. These
rules define the algorithm the compiler is supposed to use to determine which
function to call when several functions have the same name. This rule was
written before templates were added to the language. Templates don't change
the rule, they extend it. The new rule states:
A template function may be overloaded either by (other) functions of its name
or by (other) template functions of that same name. Overloading resolution for
template functions and other functions of the same name is done in three
steps:

[1] Look for an exact match (sec. 13.2) on functions; if found, call it.
[2] Look for a function template from which a function that can be called with
an exact match can be generated; if found, call it.
[3] Try ordinary overloading resolution (sec. 13.2) for the functions; if a
function is found, call it.
According to the first sentence, the declarations in Example 8(a) can all be
used in the same program. You end up with two different templates with the
same name, min(), and an ordinary function prototype for a function with this
name. That they may conflict isn't important at this point. Only when you try
to use one of these functions must the compiler worry about ambiguities. The
rest of the rule addresses this point. For example, suppose the program
contains two string variables, s1 initialized to "abcd" and s2 initialized
to "efgh", and calls the function min( s1, s2 ). The call to min() supplies
two arguments, both of type string. Declaration #2 in Example 8(a) says
there's a function, min(), that takes two arguments of type string. So, you
have a function that exactly matches the call. According to Step #1 of the
overloading rule, you don't have to go any further.
Next, consider two integer variables, i1=3 and i2=4, and a call to min(i1,
i2). Because the two arguments are of type int, Step #1 of the overloading
rule doesn't help here--we don't have any function that exactly matches the
call. Moving to Step #2, there's "a function template from which a function
that can be called with an exact match can be generated," namely, the template
min(T t1, T t2). The compiler can use that template to generate min(int,int).
That function exactly matches the call, so that's the function that will be
called.
Now consider a variable string s1="abcd" and a variable char *s2="defg". The
call string min( s1, s2 ) doesn't satisfy either Step #1 or #2. We haven't
seen the definition of this string class, but from its use you've probably
figured that it has a constructor that takes an argument of type char*. The
constructor is used to create the auto strings we've been passing as
parameters. The compiler could use that constructor to call min(string,string)
by creating a temporary object of type string and using that temporary object
as the second parameter. That's just an ordinary function call, and it's done
just as if the template weren't present. Step #3 says that's what the compiler
should do in this case.
Step #3 also tells us what we need to do to persuade the compiler that it's
okay to call min() with an int and a long--tell the compiler there's a version
of min() that takes two longs. Once that's done, the compiler will use Step #3
of the overloading rule and promote the int parameter to a long. You don't
have to supply a definition of this particular variation as long as the
compiler can generate it from the template. By supplying a prototype, you make
the call legal. By supplying the template definition, you give the compiler a
way of producing the actual function.


Template Specializations


Step #2 of the overloading rule says to, "look for a function template from
which a function that can be called with an exact match can be generated; if
found, call it." It doesn't say "generate the function and call it." That's
quite deliberate. The compiler simply generates a call to the function that
has that name. You can provide your own version of that function somewhere
else in the code; if you do, that function will be called when the program is
run.
Consider Example 8(b), a variation of the min() example, which, though
admittedly contrived, is supposed to take two command-line parameters (text
representations of numbers) and return the lower value of the two.
Unfortunately, it doesn't work because the min() template, when applied to two
char*s, compares the pointers, not the C strings they point to. This is where
specialization comes to the rescue: You can provide your own definition of
min(char *, char *) that calls strcmp() instead of comparing pointers; see
Example 8(c).
If this function appears anywhere in the program, even in a different module
from the one that contains the call, then it should be invoked at run time.
The template should not be used. Think of template instantiation as a
desperate measure, to be used only when nothing else works. The compiler uses
the overloading rules to figure out what call to make, then tries to find an
ordinary function to handle the call. If there's no ordinary function, it will
instantiate the template.


Forward References


Notice in Example 8 that I used a forward reference to the min() template.
This works much like a function prototype, telling the compiler there's a
function template named min() that takes two parameters of the same type and
returns a result of that type. That's all the compiler needs to know to call
functions created with this template. Of course, it eventually has to see the
actual definition, but the definition doesn't have to appear until later; see
Example 9. When you don't want to or can't expose the template definition, you
can tell the compiler that some name is, in fact, the name of a template. Of
course, if you haven't given the compiler the definition of the Stack
template, you can't use that template in any way that depends on its
definition.


Template Instantiation


But what happens when the compiler actually expands the template? Where does
it put the code that the template creates? These details depend on the
compiler.
cfront, for example, requires you to put class-template declarations and
corresponding inline functions in a header file, and non-inline
member-function definitions in a file with the same base name as the header
file, located in a special directory so that the compiler can find it. Each
time the compiler instantiates a template, it sticks the code into a special
area known as a "repository." During linking, the linker looks in the
repository for the template instantiations that it needs. If a particular
instantiation isn't there, the linker has to call the compiler to create it.
Borland C++, on the other hand, requires you to put everything in the header
file. It compiles the code directly into the .OBJ file where it's used. The
linker combines duplicate template instantiations, so your executable file
only ends up with a single copy of the code for any particular instantiation.
Whatever compiler you use, once you've set things up properly, you shouldn't
have to do anything special to get templates instantiated. The compiler should
take care of it.


Templates in Your Tool Kit



One of the greatest advantages of C++ over C is that it supports inheritance,
which allows you to factor common features into a single class. This improves
maintainability and gives you a set of reusable building blocks for future
use.
Templates extend this form of generalization by enabling the creation of
families of similar classes and functions. Your understanding of a problem
often changes during the development of an application, which means you
constantly adapt building blocks to the changing requirements. Having a well
thought-out set of templates makes these adaptations much easier and much less
error prone.


References


Ellis, Margaret and Bjarne Stroustrup. The Annotated C++ Reference Manual,
second edition. Reading, MA: Addison-Wesley, 1992.

Example 1: A C++ class that implements an integer stack.
class IntStack
{
public:
 IntStack() : Current(Data) {}
 int ItemsInStack() const
 {
 return Current - Data;
 }
 int Top() const
 {
 assert( ItemsInStack() != 0 );
 return *(Current - 1); // Current points one past the top
 }
 void Push( int i )
 {
 assert( ItemsInStack() < Size );
 *Current++ = i;
 }
 int Pop()
 {
 assert( ItemsInStack() != 0 );
 return *--Current;
 }
private:
 enum { Size = 20 };
 int Data[Size];
 int *Current;
};

Example 2: Stack.h contains the Stack template.

template <class Type> class Stack
{
public:
 Stack() : Current(Data) {}
 int ItemsInStack() const
 {
 return Current - Data;
 }
 Type Top() const
 {
 assert( ItemsInStack() != 0 );
 return *(Current - 1); // Current points one past the top
 }
 void Push( Type t )
 {
 assert( ItemsInStack() < Size );
 *Current++ = t;
 }

 Type Pop()
 {
 assert( ItemsInStack() != 0 );
 return *--Current;
 }
private:
 enum { Size = 20 };
 Type Data[Size];
 Type *Current;
};

Example 3: Changing the Stack template so that its size can be specified.

template <class Type, int Size> class Stack
{
public:
 // same as above
private:
 Type Data[Size];
 Type *Current;
};

Example 4: Legal declarations using the Stack template.

#include "stack.h"

Stack<int> S1, S2;
Stack<int> *Ptr;
extern Stack<int> S3;
void f( Stack<int>& );
class DerivedStack : public Stack<int> {};
Stack< Stack<int> > StackOfStacks;


Example 5: (a) Attempting to use arithmetic expressions in the two-parameter
version of the Stack class; (b) placing the size expression within parentheses
avoids the parameter-list delimiter problem in 5(a).
(a)

const int Count = 17;
Stack< int, Count>10?Count:10 > Data;


(b)

Stack< int, (Count>10?Count:10) > Data;

Example 6: (a) Usage for function templates; (b) writing this function once
with a template is much less time consuming than writing it for every data
type.
(a)

template <class Type> void Show( Type T )
{
 Type times2 = T*2;
 cout << T << '\t' << times2 << endl;
}

int main()
{
 float f = 3.14159;
 double d = 2.1;

 int i = 3;
 Show(f);
 Show(d);
 Show(i);
 return 0;
}

(b)

void Show( float f )
{
 float times2 = f*2;
 cout << f << '\t' << times2 << endl;
}


Example 7: (a) Function template to calculate the lesser of two values; (b)
the compiler looks at the template in 7(a), replaces T with int, and calls int
min(int,int); (c) in this case the compiler calls long min(long,long); (d)
because the two types in the call to min() are different, the compiler cannot
use the template and the call is an error.

(a)

template <class T> T min( T t1, T t2 )
{
 return t1<t2 ? t1 : t2;
}


(b)

int GetMin( int i1, int i2 )
{
 return min( i1, i2 );
}


(c)

long GetMin( long l1, long l2 )
{
 return min( l1, l2 );
}


(d)

long GetMin( long l1, int i1 )
{
 return min( l1, i1 );
}




Example 8: (a) These declarations can all be used in the same program; (b) a
variation on the min() function template in 8(a); (c) a definition of
min(char *, char *) that calls strcmp() instead of comparing pointers.
(a)

template <class T>
T min( T t1, T t2 ); // 1
string min( string,string ); // 2
template <class T>

T min( T t1, T t2, T t3 ); // 3


(b)

template <class T>
T min( T t1, T t2 )
{ return t1 < t2 ? t1 : t2; }

int main( int argc, char *argv[] )
{
 return atoi( min(argv[1],argv[2]) );
}

(c)

char *min( char *s1, char *s2 )
{
 if( strcmp( s1, s2 ) < 0 )
 return s1;
 else
 return s2;
}



A Virtual-array Class using C++ Templates




Douglas Reilly




Doug owns Access Microsystems, a software-development house specializing in
C/C++ software development. He is also the author of the BTFILER and BTVIEWER
Btrieve file utilities. Doug can be contacted at 404 Midstreams Road, Brick,
NJ 08724, or on CompuServe at 74040,607.


Class templates allow the use of a single set of functions to perform
identical operations on an unlimited set of types. Templates were first
described as "experimental" in The Annotated C++ Reference Manual by Margaret
Ellis and Bjarne Stroustrup (Addison-Wesley, 1990). Templates are currently
available in Borland C++ 3.x, and should soon be available in all C++
compilers. It's important to note that templates may not provide any savings
in the resulting object code, and in this way might be no more efficient than
macros. But machine efficiency isn't the only concern in programming. In the
machine/programmer cost equation, the programmer is more expensive, and
eliminating problems associated with macros makes the programmer more
productive.
On more than one occasion, I've hit the memory "wall" when adding features or
capacity. I'm primarily an MS-DOS programmer, and have long envied programmers
in other environments who could declare arrays much larger than would fit in
the physical memory of the machine. Templates, combined with creative use of
operator overloading, are the answer.
virtArray is a virtual-array template class; see Listings One and Two (page
102). It buffers a user-selectable number of elements, but stores the balance
of the array on disk. Because of the ability to overload the subscript
operator ([]) and return references, these virtual arrays can replace existing
arrays with only minor modifications to the source code. After a while, you
can even forget they're not part of the C++ language proper.
To help test performance and determine the degree of difficulty in integrating
virtArray into an existing application, I turned to an existing MS-DOS
application that used an array for storing moderate-sized structures that
defined fields in a file. This program was the perfect candidate for a virtual
array, since there was a need to use more fields than was possible with
traditional arrays.
The first change was the declaration. I used a typedef to hide the details
(template arguments, and so on) and to ensure that exactly the same template
arguments were specified each time the virtual array was declared. This is
important since template arguments must match exactly in the definitions and
declarations if you are to reference the same object.
Notice the nullOut() member function. A common way to initialize an array is
to use memset(). This won't work with a virtual array. Given a virtual array
declared as virtArray<int,256> intArray(20000);, if you memset() &intArray[0]
for 20,000 elements times sizeof(int) bytes, you'll initialize the 256 data
elements of the cache and then clobber the next 39,488 bytes of whatever
happens to follow in memory. After rebooting your machine, replace the
memset() with a call to the nullOut() member function, which initializes the
cache as well as the data elements in the temporary disk file.
Another change in behaviors between traditional arrays and the virtual array
is that you cannot simply send the address of element 0 to a function and
expect that to act as a pointer to the whole array. At best, this will enable
the receiving function to get at cacheSize elements of the array, and will
then access whatever is next in memory, just as in the case of the memset()
described earlier.
An implementation detail that significantly affects performance is the way
the cache is filled. If an array element not currently in the cache is
requested through the operator [] member function, the call to the get()
member function fetches the requested element and the next cacheSize-1
elements. This works well in many cases, since a common construct is to loop
through an array starting at 0, looking for the end of the array or simply
processing each element. If the loop instead runs from the end of the array
back to 0, every call to operator [] forces a disk read. A reasonable
alternative would be to fill the cache so that half the cached elements fall
before the current element and half after.
Example 9: A forward reference to a template works much like a function
prototype.
template <class T>
T min( T t1, T t2 );

int test( int i1, int i2 )
{
 return min( i1, i2 );
}
template <class T>
T min( T t1, T t2 )
{
 return t1 < t2 ? t1 : t2;

}


[LISTING ONE] (Text begins on page 44.)

#ifdef __cplusplus
#ifndef VIRTARRA_H
#define VIRTARRA_H

template <class T,int cacheSize=100>
class virtArray
{
private:
 int initialized;
 unsigned long cacheStart;
 unsigned long curEl;
 unsigned long numEls;
 size_t size;
 tfile *tempFile;
 T data[cacheSize];
 void put(unsigned long i);
 void get(unsigned long i);
public:
 virtArray(unsigned long tnumEls,T *defVal=0);
 ~virtArray();
 T& operator [](unsigned long i);
 T& current() { return operator [](curEl); };
 T& next() {
 if ( (curEl+1L<numEls) )
 {
 curEl++;
 }
 return(current());
 }
 T& prev() {
 if ( (curEl) )
 {
 curEl--;
 }
 return(current());
 }
 void nullOut();
 void grow(unsigned long numToAdd);
};
#endif
#endif

[LISTING TWO]

#include "stdio.h"
#include "stdlib.h"
#include "string.h"
#include "tfile.h"
#include "virtarra.h"

#define EOS '\0'
template <class T,int cacheSize>
virtArray<T,cacheSize>::virtArray(unsigned long tnumEls,T *defVal)
{

 if ( tnumEls )
 {
 unsigned long totsize;
 char *tptr;
 initialized=1;
 tempFile=new tfile;
 numEls=tnumEls;
 curEl=0L;
 size=(sizeof(T));
 totsize=numEls*(long)size;
 // no default value. Just set everything to NULLS
 if ( defVal==0 )
 {
 memset(&data,EOS,size);
 totsize=numEls*(long)size;
 int grain=20480;
 tptr=new char[grain];
 if ( tptr==0 )
 {
 grain=10240;
 tptr=new char[grain];
 if ( tptr==0 )
 {
 grain=2048;
 tptr=new char[grain];
 }
 if ( tptr==0 )
 {
 initialized=0;
 delete tempFile;
 return;
 }
 }
 memset(tptr,EOS,grain);
 while ( totsize>grain && !(tempFile->getErrno()) )
 {
 tempFile->write(tptr,grain);
 totsize-=grain;
 }
 if ( totsize && !(tempFile->getErrno()) )
 {
 tempFile->write(tptr,totsize);
 }
 delete [] tptr;
 if ( tempFile->getErrno() )
 {
 delete tempFile;
 initialized=0;
 }
 }
 else
 {
 // Use the default value sent to constructor
 memcpy(&data,defVal,size);
 for ( unsigned long loop=0L ;
 loop<numEls && tempFile->getErrno()==0 ;

 loop++ )
 {
 tempFile->write(&data,size);
 }
 if ( tempFile->getErrno() )
 {
 delete tempFile;
 initialized=0;
 }
 }
 cacheStart=0L;
 get(0L);
 }
}
template <class T,int cacheSize>
virtArray<T,cacheSize>::~virtArray()
{
 if ( initialized )
 {
 delete tempFile;
 initialized=0;
 }
}
template <class T,int cacheSize>
T& virtArray<T,cacheSize>::operator [](unsigned long i)
{
 // cause reads past end to result in element 0
 if ( i>=numEls )
 {
 i=0L;
 }
 if ( i>=cacheStart && i<(cacheStart+(unsigned long)cacheSize) )
 {
 curEl=i;
 return(data[i-cacheStart]);
 }
 else
 {
 put(cacheStart);
 cacheStart=i;
 get(i);
 curEl=i;
 cacheStart=curEl;
 return(data[0]);
 }
}
template <class T,int cacheSize>
void virtArray<T,cacheSize>::put(unsigned long i)
{
 if ( i<numEls )
 {
 int putSize=cacheSize;
 tempFile->seek(i*(long)size);

 if ( (i+putSize)>numEls )
 {
 putSize=(numEls-i);
 }
 tempFile->write(&data[0],putSize*size);

 }
}
template <class T,int cacheSize>
void virtArray<T,cacheSize>::get(unsigned long i)
{
 if ( i<numEls )
 {
 int getSize=cacheSize;
 cacheStart=i;
 tempFile->seek(i*(long)size);
 if ( (i+getSize)>numEls )
 {
 getSize=(numEls-i);
 }
 tempFile->read(&data[0],getSize*size);
 }
}
// Use this function to do something like a memset()...
template <class T,int cacheSize>
void virtArray<T,cacheSize>::nullOut()
{
 char *tptr;
 unsigned long totsize;
 memset(&data,EOS,size);
 totsize=numEls*(long)size;
 int grain=20480;
 tptr=new char[grain];
 if ( tptr==0 )
 {
 grain=10240;
 tptr=new char[grain];
 if ( tptr==0 )
 {
 grain=2048;
 tptr=new char[grain];
 }
 if ( tptr==0 )
 {
 grain=512;
 tptr=new char[grain];
 }
 }
 memset(tptr,EOS,grain);
 while ( totsize>grain && !(tempFile->getErrno()) )
 {
 tempFile->write(tptr,grain);
 totsize-=grain;
 }
 if ( totsize && !(tempFile->getErrno()) )
 {
 tempFile->write(tptr,totsize);
 }
 delete [] tptr;
 get(curEl);
 return;
}

// allow the array to grow
template <class T,int cacheSize>
void virtArray<T,cacheSize>::grow(unsigned long numToAdd)
{
 unsigned long bytesToAdd=0L;
 unsigned int grain=10240;
 char *tptr;
 bytesToAdd=numToAdd*(long)size;
 for ( tptr=0,grain=10240 ; tptr==0 && grain!=0 ; )
 {
 tptr=new char[grain];
 if ( tptr==0 )
 {
 grain-=2048;
 }
 }
 if ( grain!=0 && tptr!=0 )
 {
 tempFile->seek(0L,SEEK_END);
 while ( bytesToAdd>grain && !(tempFile->getErrno()) )
 {
 tempFile->write(tptr,grain);
 bytesToAdd-=grain;
 }
 if ( bytesToAdd && !(tempFile->getErrno()) )
 {
 tempFile->write(tptr,bytesToAdd);
 }
 delete [] tptr;
 numEls+=numToAdd;
 }
}

End Listings




























August, 1993
Calling C Functions with Variably Dimensioned Arrays


Bringing C up to snuff, Fortran-wise




John W. Ross


John is a computational scientist in the University of Toronto's
high-performance computing group. His interests include scientific programming
on vector and massively parallel supercomputers. John can be reached through
the DDJ offices.


With the growing popularity and acceptance of UNIX, many scientific and
engineering programmers are switching from Fortran to C to code their
technical applications. Both languages allow for compilation of individual
source modules and are therefore ideal for creating general-purpose
subroutines that can be incorporated into libraries that can then be used in a
wide variety of applications.


The Multidimensional-array Problem


However, when it comes to passing multidimensional arrays to a function or
subroutine, C is drastically inferior to Fortran. Fortran has an elegant
mechanism that allows a calling program to pass an array with adjustable
dimensions to a subroutine--the actual dimensions can be passed to the
subroutine as one of the subroutine parameters. This allows us to write a
completely general-purpose Fortran subroutine.
Such is not the case for C functions, however. The C compiler wants to know
the absolute size of all but the first dimension of any arrays passed to a
function. For instance, you can get away with calc(int n, float x[]) {...} but
not calc(int n, float x[][]) {...}.
C can deal with variably dimensioned vectors (one-dimensional arrays), but
when an array has more than one dimension, the C compiler has to know the
sizes of all but the first dimension, expressed as constants. This is a glaring shortcoming
to scientific programmers, whose entire data structures are often based
naturally on two- and three- (and higher) dimensional arrays. It's not really
surprising, though, when we consider that C was conceived as a sort of
high-order systems-programming language. Systems programmers generally don't
need multidimensional arrays--certainly not those with variable dimensions.


Unsatisfactory Solutions


This problem has long been recognized, of course, and those who work with
numerical applications in C have devised various solutions to the problem. An
obvious solution is to go ahead and declare the arrays in the functions to be
an absolute size and hope that no bigger arrays will ever be required. This
can be wasteful of memory in most cases, and seldom will you be sure what the
maximum array size should be.
The usual solution is to construct multiple-dimension arrays as vectors of
pointers, eventually pointing to a vector of the required data type. For
instance, a matrix of floating-point values (a two-dimensional array) would be
declared as a vector of type float *, with each element pointing to a vector
of type float.
This works, but has serious shortcomings. It's probably acceptable in a
program being written from scratch, but not in designing a subroutine that
will find its way into a library or will be reused in other programs. The
problem is that the calling program must be aware that the function expects
arrays to be defined this way and must define all arrays in this fashion. This
means that any other computations done on the arrays must take this special
structure into account. What we get is a case of the tail wagging the dog--the
array structure imposed by the function influences whatever else the main
program does with these arrays. This limits portability--someone who wants to
use a function that assumes arrays are defined in this way may be unwilling or
unable to change the definition in his or her main program. It also violates
the principle of information hiding, which is fundamental to the top-down
structured design process. High-level routines should not have to know or care
what data structures are used by lower-level routines.


The Easy Solution


Still, it is possible to pass to a C function multidimensional arrays which
the function can treat as having variable dimensions, and to do so in a
fashion transparent to the calling program. It is not necessary to define the
arrays in a special way in the main program. Experience has shown me that this
can be done by adopting the following principles:
Pass the array to the function as though it were a pointer to a vector of
floats (or the appropriate data type), no matter how many dimensions the array
actually has, along with the dimensions of the array.
Reference individual array elements as offsets from this pointer. For example,
for a two-dimensional array A, element a(i,j) has an offset of ncol*i + j from
*a, where ncol is the number of columns in the array. This idea can be extended
in a straightforward manner to arrays with more than two dimensions.
Write your algorithm so that array elements are accessed in storage order.
This is really the key principle to making this technique work. It allows us
to access subsequent elements of the array using the postincrement operator
(for instance, *a++) and eliminates the need to calculate offsets for
individual array elements. As a side benefit, algorithms written this way are
generally faster than those using array subscript notation.
To apply these principles successfully, it helps to be comfortable programming
and reading code written in C. Otherwise, applying the aforementioned
principles can make the code appear somewhat obscure.
To illustrate the procedure, I'll use an example of multiplying two matrices
together--a basic linear-algebra subroutine that has many arithmetic
operations.


Matrix Multiplication


Before looking at the example, I'd like to briefly consider the topic of
matrix multiplication on computers in general. Our task is to compute the
product of two matrices A and B and store the result in C. From linear algebra
we know that if A and B have dimensions mxn and nxp, respectively, then C will
have dimensions mxp. The generic matrix-multiplication algorithm consists of
three loops; see Example 1(a).
The loop indexes and termination points have been left blank. They will have
names i, j, and k and values m, n, and p, respectively. There are six possible
permutations for arranging the loops. Each does the same operations, but they
have very different memory-access patterns. The pattern used in standard
linear-algebra texts, which we would use if multiplying the matrices by hand,
is shown in Example 1(b).
Elements in matrices A and C are accessed in row order. However, since Fortran
stores arrays in column-major order, this does not access consecutive storage
locations. We would prefer to access all matrix elements as consecutive
storage locations, so we should use the algorithm in Example 1(c). Note that
this algorithm performs exactly the same operations as the first, but in this
case we access successive array elements by going down the matrix columns.
Since this is the way Fortran stores arrays, we're also accessing consecutive
locations in memory. This algorithm does very well on vector supercomputers,
such as those from Cray Research, since it eliminates memory-access conflicts
and also lends itself well to vectorization and chaining. This algorithm also
does better than the first on scalar machines, like workstations or PCs, since
we maintain coherency of the data cache and minimize page faulting.


An Example in C


We would like to maintain the same memory-access pattern in C, but since C
stores arrays in row-major order, we have to use a different permutation of
the algorithm; see Example 2. Now that we have an algorithm that accesses its
arrays in storage order, we can implement it as a C function; this is done in
Example 3.
Note how the principles outlined earlier are adhered to. The dimensions of the
arrays m, n, and p and the arrays themselves are passed to the function. The
arrays are treated as pointers to vectors of floats even though they are
two-dimensional arrays. We would treat them the same if they had three or more
dimensions.

Within the function, wherever possible, we set pointers to the arrays (or
subsets of the arrays) and access subsequent array elements by incrementing
these pointers. This works because we have designed our algorithm to access
array elements in storage order.
In this case we have a very compact function that will work with any size
arrays, just as though we were working in Fortran. As a bonus, this version
executes about twice as fast as the same algorithm coded using explicit
subscript references (as tested on a RISC workstation and a PC clone).
Example 4 shows a sample main C program that defines storage for three
matrices, initializes two of them, and calls the matrix-multiplication
function matmul to multiply them together. Note that we don't have to do
anything special when defining the arrays--they are just defined as standard
two-dimensional arrays.
One thing you might want to change about the main program is the line that
calls the matrix-multiplication routine. C compilers that support prototyping
will warn you that the argument types don't match the prototype descriptions,
though the program will still compile and run. Call the function casting the arrays as
pointers to vectors of floats; for example, matmul(M, N, P, (float *)a, (float
*)b, (float *)c); and everything will be fine.


Summary


It's possible to write functions in C that allow you to treat arrays as being
variably dimensioned, the same way Fortran does. Just pass the arrays to the
function as though they were pointers to vectors. With some thought, you
should be able to write your algorithm so that it accesses the array elements
in storage order (that is, by row). This can then be coded by incrementing
pointers to the arrays, resulting in a fast and efficient implementation.
Example 1: (a) The generic matrix-multiplication algorithm consists of three
loops; (b) the pattern used if multiplying the matrices by hand; (c) the
algorithm for accessing all matrix elements as consecutive storage locations
under Fortran's array-storage pattern.
(a)

for ___= 1 to ___
 for ___ = 1 to ___
 for ___ = 1 to ___
 c(i,j) = c(i,j) + a(i,k) x b(k,j)
 end
 end
end


(b)

for i = 1 to m
 for j = 1 to p
 c(i,j) = 0
 for k = 1 to n
 c(i,j) = c(i,j) + a(i,k) x b(k,j)
 end
 end
end


(c)

for j = 1 to p
 for i = 1 to m
 c(i,j) = 0
 end
 for k = 1 to n
 for i = 1 to m
 c(i,j) = c(i,j) + a(i,k) x b(k,j)
 end
 end
end

Example 2: An algorithm that accesses its arrays in storage order using C's
array-storage scheme.
for i = 1 to m
 for j = 1 to p
 c(i,j) = 0
 end
 for k = 1 to n
 for j = 1 to p
 c(i,j) = c(i,j) + a(i,k) x b(k,j)
 end
 end
end
Example 3: The C function matmul, which multiplies two matrices together.



void matmul(int m, int n, int p, float *a, float *b, float *c)
{
 float *bp, *cp;
 int i,j,k,nc;

 nc = m*p;
 cp = c;
 while (nc--)
 *cp++ = 0;

 while (m--)
 { bp = b;
 k = n;
 while (k--)
 { cp = c;
 j = p;
 while (j--)
 *cp++ += *a * *bp++;

 a++;
 }
 c += p;
 }
}


Example 4: C main program to test matrix-multiplication function.

#include <stdio.h>

#define M 80
#define N 120
#define P 160

float a[M][N], b[N][P], c[M][P];
void matmul(int , int , int , float *, float *, float *);

main()
{
 int i,j;

 for (i=0; i<M; i++)
 for (j=0; j<N; j++)
 a[i][j] = i+j;
 for (i=0; i<N; i++)
 for (j=0; j<P; j++)
 b[i][j] = i+j;

 matmul(M, N, P, a, b, c);
 printf ("%f %f %f\n",c[0][0], c[39][79],c[M-1][P-1]);
}











August, 1993
Indexing Image Databases


A search algorithm implemented in C++




Art Sulger


Art works in the Technical Services Group for the New York State Office of
Mental Health. He can be reached on CompuServe at 75730,3076.


Imaging systems require a way to store and retrieve large amounts of
unstructured data. At the New York Office of Mental Health, for instance, we
estimate that at each of our 30 facilities there might be as many as
100,000,000 documents to be archived. However, the stability of vendors
offering archiving tools is often suspect. (Indeed, some of the software we're
using was provided by a leading vendor who has since filed for Chapter 11.)
Consequently, we needed an indexing system that would allow many different
image- file formats, be simple enough to test and implement within the project
time allowed, but not lock us into any vendor-specific solution.
Typically, a document image system uses at least two files to store and
retrieve documents. The first is a traditional file that has a text
description of the image along with a key to a second file. The second file
contains the document location. The user selects a record from the first file
using a search algorithm. This front end can be complex, as when the system
supports keyword searches, or even icons of the documents. Once the user
selects a record, the application keys into the location index, finds the
document, and displays it. The name of the image to be displayed is not
important to the user and is often generated by the system. The second file
can be any traditional indexing structure. In this article, I'll discuss the
search algorithm used in this second file to locate images.
Our first attempts used published B-tree code. Because the image filenames did
not have to be readable or intelligent in any way, we generated a meaningless
sequential name, similar to tmpnam in stdio.h. We wrote some batch-testing
programs that generated thousands of filename keys and stored them in our
B-tree. The filenames were 14 characters long, a length that caused some
concern. Large data items in B-trees cause the tree to be deeper and larger.
Disk thrashing ensued, and all we proved was that B-trees don't handle ordered
data very well. Obviously, we needed to build random names. Before heading
down the path of hash tables and random generators, however, we found a better
way.


The Mapper


One of the fastest ways to locate data in a file is by going to a direct
offset into a file. In C, you can do this with the lseek or fseek functions.
We could store any location directions we wished at the chosen offset. This
would yield very fast search times, and there would be less worry about the
size of the location entry. Even the size of the location entry is reduced by
using the image filename itself to derive an offset into a "mapper" file which
contains directions to locate the image.
For example, to store the location of a file named C:\AA\12345678.TIF, we
create an entry in the mapper file at the byte offset the name encodes
(0x12345678). At this location, we store the device, subdirectory, and file
extension. Encoding and decoding the image filename to and from a meaningful
offset is simple. Fortuitously, the 4-byte long value needed by lseek can be
represented by the eight hex characters which form the filename. They're
into the mapper file. We've encapsulated the functions to manipulate these
index entries in the mapper class written in C++.
The Mapper class header (see Listing One, page 104) contains a member struct
that describes the layout of a mapper record. You should tailor this to your
needs. The struct we're using is shown in Example 1. The device member is the
drive letter. The path is a 2-byte directory entry. A DOS path label may be
any alphabetic character or digit, as well as one of the following: ! _ - @ #
$ % ^ & ( ) ~ ` { }. In other words, there are 50 possible single-character
paths available: 50x50+50=2550 directories on each device. We can tell which
image viewer to use based on the FileExtension member. A .TIF extension will
call the TIFF viewer, a WP5 extension will call the WordPerfect viewer, and so
on. If your application uses a more intelligent viewer, you could drop this
member.
The larger this struct is, the fewer mapper entries you'll be able to write.
But don't get too ambitious in keeping the struct slim. Few systems have the
need or the space to store as many files as the 6-byte struct given in this
example allows (2^32/sizeof(struct), about 715 million); see the text box
entitled "How Much Can We Store?".
At minimum, the struct should indicate whether the document is stored on a
magnetic drive or on a removable, probably optical, drive. If you're using a
removable optical drive, also known as an "autochanger" or a "jukebox," you'll
need to store the directory information. Jukeboxes are treated as a single
device, and each platter looks to the programmer like a subdirectory. Note
that by storing the device in a file separate from the user interface you can
easily update storage-location changes. If you write an archiving application,
for example, you will only need to update one byte in the mapper file.


Multipage Documents


It would be nice if our imaging system could handle documents that have more
than one page. Treating multipage documents is a little more complex. We still
store the location information for each document page in the mapper file. A
second file contains a doubly linked list. Each linked-list entry also points
at a single mapper-file entry. This allows us to scan forward and backward
through the document's pages. A separate LinkedList class handles file I/O to
a linked-list file. The header is in Listing One (page 104), and the methods
of this class and the Mapper class are in Listing Two (page 104). The layout
of the linked-list file is shown in Example 2. Notice that the linked-list
structure is larger than that of the mapper. You'll have to estimate the ratio
of multipage to single-page documents. If every document is multipage you may
consider using a singly linked list, which will eliminate one-third of the
space required for the linked-list entries. The drawback, of course, is that
the application will have to write its own routine for scanning the pages of a
document in reverse.
Listing Three (page 105) contains the code for a program that exercises the
Mapper and LinkedList classes. It writes out 1000 single-page document entries
and 1000 four-page document entries, then reads them back.
The Mapper and LinkedList objects are contained in and contain other objects.
I've omitted a class that provides extensive error messages. If there are
run-time errors in this example, the objects will return a NULL if a character
value was expected and a -1 if an integer value was expected; a 0 returned
usually signals success.
The locking calls are compiler specific (Zortech), but I left them in to
indicate likely places you should lock out other users. Scanners take several
seconds, during which time you won't want your initialized index space updated
by another scanner user. Turn off locking by defining NOLOCK, as Listing One
does; remove the #define to compile in the locking calls.
You use the Mapper class to create the name of the image file. The LockSpot
method does this and also locks the new record space in the mapper file. This
might be necessary if more than one scanner is in operation. The filename
returned does not include the extension.
You retrieve the location of the image file with the Read method which will
construct a fully qualified filename. Essentially, this method returns the
location instructions. You could expand these instructions beyond a simple
file specification.
You should call Write after LockSpot. First your application will get the
filename via LockSpot. Then you will scan the document using that filename.
Finally, you'll write this information in the already locked and initialized
MAPPER.DAT file.
At some time during the creation and storage of the image, you should get the
long value that is the offset into the mapper file. This is done with the
lMAPSSLOT number. If you're combining single- and multiple-page images, you'll
indicate that the offset is to the Mapper by keeping it a positive value;
offsets to the LinkedList are converted to a negative number.
Your application must know before creating the index whether the document is
to have a single or multiple pages. Multipage documents can be stored and then
retrieved in the same order. If the scanner operator picks multiple pages,
your application will create an instance of the LinkedList class and call the
LockSpot method. After getting a LinkedList slot, LockSpot gets the next
available mapper slot and saves the value in the MapperAddress member of the
LinkedListBuffer. The mapper value in hex is the filename you will use when
scanning. This function returns a pointer to the Mapper value.
For the second and all subsequent pages, after you have called LockSpot and
stored the next page of the image file, use Linkin to link the previous page
with a call to this member.
You'll want several members to make it easy to traverse the linked list.
LastImage is one that retrieves the fully qualified filename of the last image
in a multipage document. To save space, I've not listed the others here.
The application will store the offset in a database somewhere. If the offset
is a negative value, pass it to the Read member of the LinkedList class, which
returns a pointer to a fully qualified filename. Read assumes you know that
the value is in the LinkedList. Whether you pass a positive or negative
number, LinkedList will save it as a positive value.
Open and Close could be put into the constructor and destructor. Having
separate members will allow you to limit the number of open files. Close
allows you to free the two file handles while maintaining the internal
variable values. Open will automatically create a new file if one doesn't
already exist. Open is called by almost every member function just to check if
the file handle is valid. You can avoid this extra function call by checking
for the existence of a valid file pointer.


What We'd Do Differently Next Time


The current design never cleans up space released by deleted images. We felt
that going to optical storage would obviate the need to have a delete
function. However, there were more scanning errors than we anticipated. The
linked list and mapper should incorporate a single linked-list of deleted
records. I first saw this in the B-tree code in Al Stevens's C Database
Development (MIS Press, 1987). Both files would have a header that would
indicate the first available slot. That slot would have a pointer to the next
deleted slot, and so on.


Summary


A custom-tailored mapper file can open up new avenues for storage. For
instance, if you have access to a mainframe, you could store location
information for 9-track tape or high-density pack storage. You can use any
location information that you can encode in your customized mapper struct. You
can store images on a LAN, WAN, tape, optical disk, mainframe, or jukebox, and
this system will locate and retrieve each image.

Example 1: The member struct that describes the layout of a mapper record.
struct Mapper
 {
 char Device ;
 char Path [2] ;
 char FileExtension [3] ;
 };

Example 2: Layout of the linked-list file.

struct LinkedList
 {
 unsigned long Prev ;
 unsigned long MapperOffset ;
 unsigned long Next ;
 } ;


The purpose of the mapper structure is to hold location information. Each
mapper entry points to a single file. You want to make your structure flexible
enough to describe many different types of storage, because when you get into
image processing, disk sizes suddenly seem quite small. Imaging people buy
storage big and often. If you plan to store your million or so images on
magnetic disk, you may be in for a surprise. DOS uses a file-allocation table
(FAT) to locate disk clusters. Each file will occupy at least one cluster. The
FAT entry on DOS fixed disks larger than 17 megabytes is 16 bits long. So even
the largest drives available, 2 gigabytes, will limit you to about 64K files
(assuming that each file is <=32K). You will require 16 2-gigabyte drives to
store a million images.
Even if the FAT entry were increased, a 2-gigabyte disk would not have the
data space to store many more images. This is one of the reasons that
removable storage becomes important in imaging systems.
On the other hand, removable optical storage, even jukeboxes, can be very slow
when more than one user at a time requests data. So you must find a balance
between fixed and removable storage. In an ordinary system, volatility is the
most important part of the equation. Our image storage and retrieval
applications do not ordinarily update images, so read demand becomes the more
important ingredient.
--A.S.
[LISTING ONE]
// Mapper.hpp
#define NOLOCK 1
// these constants would be in a default file
// or WIN.INI (for a Windows app):
char cMap[] = "C:\\MAPTEST" ; // Mapper.Dat location
char cLinkL[] = "C:\\MAPTEST" ; // Linkedl.Dat location
#ifndef MAPPER
#define MAPPER
#include <stdio.h>
#include <sys\locking.h>
#include <share.h>
#include <io.h>
#include <fcntl.h>
#include <string.h>

class Mapper
 {
 private :
 int fp ; // file pointer
 long lBytePosition ;
 char cBytePosition [9] ; // hex representation of long value
 char * cMapperFileSpec ; // Mapper.dat full file name
 struct MapperBuffer // layout of the Index entry
 {
 char cDevice [1] ; // device where image is
 char cPath [2] ; // directory or Jukebox platter
 char cFileExtension [3] ; // type of image
 } Map ;
 void WipeMapper(){strnset((char *)&Map,'\0',sizeof(Map));}
 char cMwholefilename[19];
 public :
 Mapper() ;

 ~Mapper() ;
 int Close() ;
 char * Hexbytes(){return cBytePosition ;}

 long lMapSlot() { return lBytePosition ; }
 char * LockSpot() ;
 int Open() ;
 char * Read(long lOffset) ;
 int Write() ; // commit to disk
 int Write(char * Extension, // Image type
 char * Path, // subdirectory or jukebox platter
 char * Device) ;
 } ;
#endif // MAPPER

#ifndef LINKEDLIST
#define LINKEDLIST
class LinkedList
 {
 private :
 int fp ; // file pointer
 long lBytePosition ;
 char * cLinkedListFileSpec ;
 struct LinkedListBuffer
 {
 long Previous ;
 long Next ;
 long MapperAddress ; // points at Mapper Index entry
 } ll ;
 Mapper * M;
 public :
 LinkedList() ;
 ~LinkedList() ;
 int Close() ;
 int Linkin(long OldEntry) ;
 long lLinkSlot() { return lBytePosition ; }
 char * LastImage(long lOffset) ;
 char * LockSpot() ;
 long MapAddress() {return ll.MapperAddress ;}
 long Next(){return ll.Next ;}
 int Open() ;
 char * Read(long) ;
 int Write(char * Extension, // Image type
 char * Path, // subdir or jukebox platter
 char * Device) ;
 } ;
#endif // LINKEDLIST

[LISTING TWO]

#include "Mapper.hpp"
// M A P P E R M E T H O D S
//-----------------constructor-----------------------------
Mapper::Mapper()
 {
 fp = lBytePosition = 0 ;
 Open() ;
 }
//------------------------oblivion-------------------------
Mapper::~Mapper()
 {
 if (fp)
 {

 delete [] cMapperFileSpec ;
 close(fp) ;
 }
 fp = 0 ;
 }
//-------------------open and close members-----------------
int Mapper::Open()
 {
 if (!fp)
 { // get file location defaults:
 cMapperFileSpec = new char [strlen(cMap) + 13] ;
 sprintf(cMapperFileSpec, "%s\\MAPPER.DAT", cMap) ;
 // append if exists, otherwise create :
 if (access(cMapperFileSpec, F_OK) == -1)
 {
 FILE * fd = fopen(cMapperFileSpec, "w+") ;
 fclose(fd) ;
 }
 if ((fp = sopen (cMapperFileSpec,
 O_RDWR,

 SH_DENYNO)) == -1)
 return -1 ;
 }
 return 0 ;
 }
//------------------Close-----------------------------------
int Mapper::Close()
 {
 if (fp)
 {
 close(fp) ;
 fp = 0 ;
 }
 return 0 ;
 }
//-------------------LockSpot-------------------------------
char * Mapper::LockSpot()
 {
 if (!fp) Open() ;
 lBytePosition = lseek(fp, 0L, SEEK_END) ;
 WipeMapper() ;
 if (write(fp, &Map, sizeof(Map) ) == -1)
 return NULL ;
 lseek(fp, lBytePosition, SEEK_SET) ;
 #ifndef NOLOCK
 // don't let anyone else append
 if (locking(fp, LK_LOCK, (long)sizeof(Map)) == -1)
 return NULL ;
 #endif
 sprintf(cBytePosition,"%8.8lx", lBytePosition) ;
 return cBytePosition ;
 }
//---------------------------Read---------------------------
char * Mapper::Read(long lOffset)
 {
 if (!fp) Open() ;
 if (lseek(fp, lOffset, SEEK_SET) == -1)
 return NULL ;

 if (read(fp, &Map, 6 ) == -1) // device,dir, & extension
 return NULL ;
 sprintf(cBytePosition, "%8.8lx", lOffset); // filename
 sprintf(cMwholefilename, "%1.1s:\\%1.2s\\%8.8s.%3.3s",
 Map.cDevice, // Image device
 Map.cPath, // Image path (or Jukebox disk)
 cBytePosition, // filename/offset in hex
 Map.cFileExtension ) ; // type (TIF,WP4,WP5...)
 return cMwholefilename;
 }
//--------------------------Write--------------------------
int Mapper::Write(char * Extension,// Image type
 char * Path, // Image subdir or juke platter
 char * Device)// Image device (single letter)
 {
 if (!fp) Open() ;
 memcpy((char *)&Map.cDevice, Device, 1) ;

 memcpy((char *)&Map.cPath, Path, sizeof(Map.cPath)) ;
 memcpy((char *)&Map.cFileExtension, Extension,
 sizeof(Map.cFileExtension)) ;
 return Write() ;
 }
//----------------------------------------------------------
Mapper::Write()
 {
 lseek(fp, lBytePosition, SEEK_SET) ;
 if (write(fp, &Map, sizeof(Map) ) == -1)
 return -1 ; // should return error code here
 #ifndef NOLOCK
 if (locking(fp, LK_UNLCK, (long)sizeof(Map)) == -1)
 return -1 ; // should return error code here
 #endif
 return 0 ;
 }
// L I N K E D L I S T M E T H O D S
//-------------------------constructor----------------------
LinkedList::LinkedList()
 {
 fp = 0 ;
 Open() ;
 M = new Mapper();
 }
//------------------------oblivion--------------------------
LinkedList::~LinkedList()
 {
 if (fp)
 close(fp) ;
 delete M ;
 }
//-------------------open and close members-----------------
LinkedList::Open()
 {
 int ok ;
 if (!fp)
 {
 cLinkedListFileSpec = new char [strlen(cLinkL) + 13] ;
 sprintf(cLinkedListFileSpec,"%s\\LINKEDL.DAT", cLinkL) ;
 // append if exists, otherwise create :

 if (access(cLinkedListFileSpec, F_OK) == -1)
 {
 FILE * fd = fopen(cLinkedListFileSpec, "w+") ;
 fclose(fd) ;
 if ((fp = sopen (cLinkedListFileSpec,
 O_RDWR,
 SH_DENYNO)) == -1)
 return -1 ;
 lBytePosition = lseek(fp, 0L, SEEK_END) ;
 // Write a -1 header because a 0 file name * -1 = 0
 ll.Previous = ll.Next = 0 ;
 ll.MapperAddress = -1 ;
 lBytePosition = 0 ;
 lseek(fp, lBytePosition, SEEK_SET) ;

 if (write(fp, &ll, sizeof(ll) ) == -1)
 return -1 ;
 return 0 ;
 }
 else // file already exists
 if ((fp = sopen (cLinkedListFileSpec,
 O_RDWR,
 SH_DENYNO)) == -1)
 return -1 ;
 }
 return 0 ;
 }

LinkedList::Close()
 {
 if (fp)
 {
 delete cLinkedListFileSpec ;
 close(fp) ;
 fp = 0 ;
 }
 return 0 ;
 }
//---------------------LastImage----------------------------
char * LinkedList::LastImage(long lOffset)
 {
 Read(lOffset) ;
 while (ll.Next)
 Read(ll.Next);
 return (M->Read(ll.MapperAddress)) ;
 }
//--------------------------Read---------------------------
char * LinkedList::Read(long lOffset)
 {
 char lbuf[9];
 char buffer[7];
 if (lOffset < 0)
 lOffset *= -1 ;
 lBytePosition = lOffset ;
 if (!fp) Open() ;
 if (lseek(fp, lOffset, SEEK_SET) == -1)
 return NULL ;
 if (read(fp, &ll, sizeof(ll) ) == -1)
 return NULL ;

 return (M->Read(ll.MapperAddress));
 }
//---------------------LockSpot-----------------------------
char * LinkedList::LockSpot()
 {
 if (!fp) Open() ;
 lBytePosition = lseek(fp, 0L, SEEK_END) ;
 ll.Next = ll.Previous = 0 ;
 if (write(fp, &ll, sizeof(ll) ) == -1)
 return NULL ;
 lseek(fp, lBytePosition, SEEK_SET) ;

 #ifndef NOLOCK
 if (locking(fp, LK_LOCK, (long)sizeof(ll)) == -1)
 return NULL ;
 #endif
 M->LockSpot() ;
 ll.MapperAddress = M->lMapSlot() ;
 return M->Hexbytes() ;
 }
//---------------------------Write--------------------------
LinkedList::Write(char * Extension, // Image type
 char * Path, // subdir or platter
 char * Device)
 {
 if (!fp) Open() ;
 lseek(fp, lBytePosition, SEEK_SET) ;
 if (write(fp, &ll, sizeof(ll) ) == -1)
 return -1 ;
 #ifndef NOLOCK
 if (locking(fp, LK_UNLCK, sizeof(ll)) == -1)
 return -1 ;
 #endif
 M->Write(Extension, Path, Device) ;
 return 0 ;
 }
//--------------------------Linkin--------------------------
LinkedList::Linkin(long LLPr)
 {
 if (!fp) Open() ;
 lseek(fp, LLPr, SEEK_SET) ;
 if (read(fp, (char *)&ll, sizeof(ll) ) == -1)
 return -1 ;
 lseek(fp, LLPr, SEEK_SET) ;
 ll.Next = lBytePosition ;
 if (write(fp, (char *)&ll, sizeof(ll) ) == -1)
 return -1 ;
 lseek(fp, lBytePosition, SEEK_SET) ;
 if (read(fp, (char *)&ll, sizeof(ll) ) == -1)
 return -1 ;
 lseek(fp, lBytePosition, SEEK_SET) ;
 ll.Previous = LLPr ;
 if (write(fp, &ll, sizeof(ll) ) == -1)
 return -1 ;
 #ifndef NOLOCK
 if (locking(fp, LK_UNLCK, sizeof(ll)) == -1)
 return -1 ;
 #endif


 return 0 ;
 }

[LISTING THREE]
/* this program creates a Mapper.dat and LinkedL.dat and
writes 1,000 single-image entries and 1,000 entries of 4-page
documents, then reads them back. The entries are stored in
a sequential file as 4 byte character strings. */
char cKey[] = "C:\\MAPTEST\\KEYS.X" ;
char cDev[] = "X"; // where images are stored
const long TestCount = 1000 ;
const int MultiPage = 4 ; // # Images in multipage docs.
union value // converts 4 byte chars to long and vice versa
 {
 long lValue ;
 char cValue[sizeof(long)] ;
 } uValue ;
Mapper * Map ;
LinkedList * LL ;
char * AvailableMapper ;
#include "stdlib.h"
int main(int argc, char * argv[])
 {
 char cmd [32] ;
 FILE * fd ;
 int fp ;
 long i ;
 long lLong ;
 char szDir[3] ;
 sprintf(cmd, "DEL %s\\MAPPER.DAT", cMap) ;
 system(cmd) ;
 sprintf(cmd, "DEL %s\\LINKEDL.DAT", cLinkL) ;
 system(cmd) ;
 fd = fopen(cKey, "w") ;
 fclose(fd) ;

 fp = open(cKey, O_WRONLY) ;
 Map = new Mapper ;
 itoa(1, szDir, 10) ; // make up directory names
 for (i = 0; i < TestCount; i++)
 {
 Map->LockSpot() ;
 Map->Write("TIF", szDir, cDev) ;
 uValue.lValue = Map->lMapSlot() ;
 lseek(fp, 0, SEEK_END) ;
 write(fp, (char *)&uValue.cValue, sizeof(long)) ;
 if ((i / 100) * 100 == i)
 {
 itoa(i, szDir, 10) ; // change directory name
 printf("\t%ld", i) ;
 }
 }
 delete Map ;
 close(fp) ;
 // Build some Multi-page:
 fp = open(cKey, O_RDWR) ;
 printf("\nMulti-page Documents\n") ;
 LL = new LinkedList ;
 itoa(1, szDir, 10) ; // make up directory names

 for (i = 0; i < TestCount; i++)
 {
 LL->LockSpot() ;
 LL->Write("TIF", szDir, cDev) ;
 uValue.lValue = LL->lLinkSlot() ;
 uValue.lValue *= -1 ; // say we are a linked list entry
 lseek(fp, 0, SEEK_END) ;
 write(fp, (char *)&uValue.cValue, sizeof(long)) ;
 lLong = LL->lLinkSlot() ;
 for (int j = 1; j < MultiPage; j++)
 { // next pages:
 LL->LockSpot() ;
 LL->Write("TIF", szDir, cDev) ;
 LL->Linkin(lLong) ;
 lLong = LL->lLinkSlot() ;
 }
 if ((i / 100) * 100 == i)
 {
 itoa(i, szDir, 10) ; // change directory name
 printf("\t%ld", i) ;
 }
 }
 close(fp) ;
 delete LL ;
 // we can read them all back now:
 Map = new Mapper ;
 LL = new LinkedList ;
 fp = open(cKey, O_RDONLY) ; // open the keys
 lseek(fp, 0, SEEK_SET) ;
 printf("\n'Long'\tFilename\n") ;
 while (read(fp, (char *)&uValue.cValue, sizeof(long)))
 {

 if (uValue.lValue >= 0)
 printf("\n%ld\t%s",
 uValue.lValue, Map->Read(uValue.lValue)) ;
 else
 {
 printf("\n%ld\t%s",
 uValue.lValue, LL->Read(uValue.lValue)) ;
 while (LL->Next())
 printf("\n%ld\t%s",
 uValue.lValue, LL->Read(LL->Next())) ;
 }
 }
 close(fp) ;
 delete Map ;
 delete LL ;
 return 0 ;
 }
End Listings











August, 1993
Programming the Motorola 68332


Tackling the queued serial peripheral interface




Jack J. Woehr


Jack is a freelance programmer and contributing editor to Embedded Systems
Programming magazine. He can be contacted at either P.O. Box 51, Golden, CO
80402, jax@well.sf.ca.us., or by fax at 303-277-9497.


Among Motorola's 68300-family microcontrollers, the 68332 is probably the most
applicable to embedded-systems programming. The 68332 offers a slightly
abbreviated 68020 instruction set along with a few added instructions. The
68332's time-processing unit (TPU) allows 16 digital I/O lines to be employed
in a variety of timer-event functions. A third functional block of the 68332,
the queued serial module (QSM), processes serial transactions on two channels:
The first is the serial communications interface (SCI), which is an on-chip
UART; the second is the queued serial peripheral interface (QSPI), a 4-wire
synchronous serial bidirectional transfer mechanism.
Synchronous serial protocols are simultaneous, bidirectional clockings of bits
between two stations. Synchronous serial data communications effect a design
trade-off between data-transfer speed on one hand, and economy on the other.
While a synchronous serial exchange between a CPU and a peripheral is slower
than parallel bus communication, the smaller pin package of the synchronous
serial peripheral demands fewer printed wires and less board space. Another
advantage of the synchronous serial peripheral is that the circuit designer,
once having become familiar with such a peripheral device, is able to employ
it in various designs without regard to the host microcontroller. In
programmer's parlance, the synchronous serial peripheral is more portable than
the bus peripheral.
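As a minimal illustration of this simultaneous, bidirectional clocking, the following C++ sketch models one 8-bit SPI exchange in software. It captures only the shift-register behavior described above; real hardware drives MOSI and MISO pins on each clock edge.

```cpp
#include <cstdint>

// Model of one SPI byte exchange: on each clock, master and slave each
// shift out their MSB and shift in the other station's bit. After eight
// clocks the two shift registers have swapped contents.
struct SpiPair { uint8_t master, slave; };

SpiPair spi_exchange(uint8_t master, uint8_t slave)
{
    for (int i = 0; i < 8; ++i) {
        uint8_t mosi = (master >> 7) & 1;   // master drives MOSI with its MSB
        uint8_t miso = (slave  >> 7) & 1;   // slave drives MISO with its MSB
        master = (uint8_t)((master << 1) | miso);
        slave  = (uint8_t)((slave  << 1) | mosi);
    }
    return { master, slave };
}
```

After eight clocks each side holds the byte the other side started with, which is why a full-duplex SPI "write" always yields a "read" for free.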
Among the popular synchronous serial protocols for which off-the-shelf silicon
devices are available are Philips/Signetics' I2C, National Semiconductor's
Microwire, and Motorola's SPI. While the I2C is attractive because it provides
synchronous, full-duplex communications and determinate arbitration of
multiple masters using two wires, the other two protocols (which require four
wires and depend upon single-mastering via chip selects and station addresses)
are useful, inexpensive, and available.


Not Just SPI


The queued serial peripheral interface (QSPI) submodule of the 68332 QSM is
highly programmable, enough so that it can be used not only with the various
and sometimes divergent implementations of Motorola SPI, but also with
National Semiconductor Microwire devices.
In discussing the 68332, I'll refer to Vesta Technology's SBC332, a
stand-alone, 68332-based, single-board computer. The SBC332 is a Motorola BCC
pin-alike/work-alike with socketed memory and certain other features intended
to make the SBC332 more generally useful than its ancestor, the BCC.
The SBC332 has been used to control an LCD/beeper/keypad (LBK) daughterboard
via SPI, using the QSPI on the 332 side and two Harris CDP68HC68P1 8-bit SPI
parallel ports on the LBK itself. If you're a data-sheet junkie and don't have
the Harris book handy, note that the CDP68HC68P1 is a part acquired from
GE/Intersil and appears in the latter organization's last few annual catalogs
also.
In certain applications we've made good use of the 93(C)46, a second-sourced
Microwire EEPROM which offers 64 16-bit words of readable/writable storage
with a life cycle ranging from 100,000 writes per cell to over 400,000 writes
per cell, depending on the manufacturer's specification. Larger devices in
this family, with address spaces of up to 11 bits, are essentially
code-compatible with the smaller part.
Communication with these two parts, the CDP68HC68P1 and the 9346, is the
subject of the code examples presented here.


Queued Serial


The QSPI queue has 16 entries for commands and synchronous replies that travel
on this bidirectional facility. The commands are written to command RAM, and
the replies returned in receive RAM.
Once the desired setup is achieved by programming the appropriate QSM
registers (for instance, by setting up the baud, master/slave relationship,
appropriate clock edge for data latching and desired chip-select levels as in
the accompanying source-code examples), conducting one or more QSPI
transactions is a matter of:
1. Writing commands to command RAM.
2. Setting the new queue pointer (NEWQP) and end queue pointer (ENDQP) to
point to the first and last command of a series of commands which are to be
transmitted. This is done after establishing whether the commands between
NEWQP and ENDQP should loop or execute in sequence once only by our settings
of serial peripheral control register 2 (SPCR2).
3. Sending a "go" message to the QSPI by setting the serial peripheral enable
bit (SPE) high in serial peripheral control register 1 (SPCR1).
The queue can also be used to store two or three multiple-message
transactions. When one of these transactions is desired, you simply set the
new/end pointers to the appropriate entries in the 16-entry queue and let 'er
rip. In this way, a number of short transactions can be used many times
without reprogramming each transaction every time it's needed.
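In C++ terms, the three-step launch might look like the sketch below. The register addresses assume a QSM module base of 0xFFFC00, and the bit positions follow the 68332 QSM programming model; treat both as assumptions to be checked against the data book rather than as the article's published code.

```cpp
#include <cstdint>

// Hypothetical QSM register locations for a module base of 0xFFFC00.
constexpr uintptr_t SPCR1_ADDR = 0xFFFC1A; // SPE is bit 15
constexpr uintptr_t SPCR2_ADDR = 0xFFFC1C; // ENDQP bits 8-11, NEWQP bits 0-3

// Compose an SPCR2 image selecting queue entries newqp..endqp, single pass
// (wrap/loop control bits left clear).
uint16_t spcr2_for_queue(unsigned newqp, unsigned endqp)
{
    return (uint16_t)(((endqp & 0xFu) << 8) | (newqp & 0xFu));
}

// Steps 2 and 3 of the sequence above: point at the commands, then go.
void launch_queue(unsigned first, unsigned last)
{
    *(volatile uint16_t *)SPCR2_ADDR = spcr2_for_queue(first, last);
    *(volatile uint16_t *)SPCR1_ADDR |= 0x8000;   // set SPE: QSPI starts
}
```

Reusing a canned transaction is then just a matter of calling launch_queue with different first/last entries; command RAM itself is untouched.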


The Code


The source code presented here is written for Vesta Forth Standard Edition
(VFSE), a fully orthogonal, 32-bit, subroutine-threaded, ROM-resident Forth,
optimized for generation of embedded-control applications.
LBK332.F (Listing One, page 106) illustrates the use of the QSPI to control
the Vesta LBK LCD/beeper/keypad board. While notable as a model of setting up
the various registers and activating queued operations, once they are
initialized the LBK's CD68HC68P1 ports are typical SPI protocol parts and
require no especially neat programming tricks, other than appropriate delays
for the controlled peripherals to react to commands. When I wrote the code, it
wasn't clear to me why the serial peripheral finished flag always seemed to
come True before the receive RAM that contains the response from the addressed
SPI peripheral becomes valid. It turns out that the SPF flag doesn't
self-reset; software must do this at the commencement of every queue
launching. Well, I'll be! I therefore resorted to long loops (see the word
LBK332! in the listing) between the instant QSPI asserts that a transaction is
finished by setting the serial peripheral interface finished (SPIF) bit in the
serial peripheral status register (SPSR) and the time I actually read the
receive RAM. Sounds silly, but it worked at the time.
QSM332.F (Listing Two, page 108) provides the generalized constructs for
programming the registers of the QSM. All the registers concerned with QSM
control are defined, along with their relevant bits. Where multiple-bit mask
fields appear in the programming model, both the field and the shift required
to move a right-justified mask into the proper position for writing to the
register are defined.
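That field-plus-shift pattern can be expressed in C++ as a small helper. The mask values in the example comments are taken from the listing's definitions (SPCR2.ENDQP, QILR.ILQSPI); the helper itself is an illustrative sketch, not part of the published Forth code.

```cpp
#include <cstdint>

// Merge a right-justified value into a masked field of a 16-bit register
// image: shift it into position, constrain it to the field, and leave the
// rest of the register alone.
uint16_t set_field(uint16_t reg, uint16_t fieldMask, int shift, uint16_t value)
{
    return (uint16_t)((reg & ~fieldMask)
                      | ((uint16_t)(value << shift) & fieldMask));
}

// e.g. SPCR2.ENDQP is the 0x0F00 field with a shift of 8;
//      QILR.ILQSPI is the 0x3800 field with a shift of 11 (0x0B).
```

This is exactly what words such as QSM-IARB! and QSM-INT-LEVEL do in Forth: AND to constrain, shift into place, mask the field out of the register, OR in, store.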
Also defined are words to manipulate QSM interrupt structure, processor mode
of access, activity, and status during 68332 BACKGROUND mode.
In 9346_332.F (available electronically), the requirements for emulating
Microwire are more demanding. While it is easy enough to program data
transitions on the correct clock edge as suited to Microwire by setting the
CPOL and CPHA bits in serial peripheral control register 0 (SPCR0), the
problem of message length is not as tractable.
The 68332 QSPI is designed for multiple transactions of the same data width.
While that data width is programmable, it is difficult to reprogram on the
fly, but that is precisely what is required to interface a Microwire device to
the QSPI.
Certain operations upon the 9346 take one start bit, a 3-bit command, a 6-bit
address, and a 16-bit data value. This 10-bit+16-bit bidirectional data
exchange isn't native to QSPI: The bit-width of an SPI transaction is not part
of the command RAM entry. Instead, it's a bit mask in SPCR0. Therefore, a
transaction must be completed before this setting is switched from 10-bit to
16-bit transactions (the maximum) for the second portion of the Microwire
message.
The problem is that when the transaction is suspended after the preliminary
10-bit start+command+address is transmitted to the 9346, QSPI lets the
chip-select for the device go inactive, terminating (from the point of view of
the 9346) the transaction.
The trick is that the program must explicitly assert the chip-select for the
desired device by writing the appropriate mask to QPDR prior to activating the
SPI. If the chip select is asserted explicitly, SPI doesn't de-assert it upon
completing the single preliminary 10-bit transaction. Then the bit-width field
in SPCR0 can be changed quickly and the second portion of the transaction sent
on its way. The 9346 doesn't care that the serial clocking is thus a little
longer between the 10th and 11th bit than at other times, since transaction
clocking is the responsibility of the host processor. At the termination of
the complete transaction, the chip-select is explicitly de-asserted.
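A C++ sketch of that split transaction follows. The preamble packer implements the 10-bit layout described above (one start bit, 3-bit command, 6-bit address); the register steps are shown only as comments, since their exact encodings depend on the setup done elsewhere in the listings.

```cpp
#include <cstdint>

// Pack the 9346 preamble: one start bit, a 3-bit command, and a 6-bit
// address, right-justified in a 10-bit word.
uint16_t mw_preamble(unsigned cmd, unsigned addr)
{
    return (uint16_t)((1u << 9) | ((cmd & 7u) << 6) | (addr & 0x3Fu));
}

// The split Microwire transaction then proceeds roughly as:
//   1. write the chip-select mask to QPDR (explicitly assert CS)
//   2. set the SPCR0 BITS field to 10 and send mw_preamble(cmd, addr)
//   3. wait for SPIF, set the BITS field to 16, send/receive the data word
//   4. clear the chip-select bit in QPDR (explicitly de-assert CS)
```

Because the CS stays asserted across step 3, the 9346 sees one continuous 26-bit transaction despite the pause while SPCR0 is rewritten.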


Conclusion



"Sixers" who've watched in frustration as the 68K family has been relegated
upwards to VME can relax now that Motorola has finally provided a part
suitable for control applications, the bread and butter of embedded-systems
programming. Just as 68332 on-chip discrete I/O and TPU help justify the cost
of the part, 68332 QSPI alleviates, in part, the often-difficult task of
selecting appropriate peripheral devices for a 68K-based embedded system.
[LISTING ONE]
DOWNLOAD

\ Filename: lbk332.f
\ Author: jack j. woehr jax@well.UUCP JAX on GEnie SYSOP, RCFB (303) 278-0364
\ Copyright (c) 1991, Vesta Technology, Inc. ALL RIGHTS RESERVED.
\ Platform: SBC332 w/ VFSE Purpose: Primitives & exercise LBK332 board.
\ Dependencies: Platform; W@ W! (16-bit ops)
\ TRUE FALSE 0<> from Core Extensions
\ YREG332.F and QSM332.F must be loaded
\ Some code meant to execute only in Supervisor mode.
\ References: MC68332 USER'S MANUAL [MC68332UM/AD] Motorola, 1990
\ GE Solid State Data Book SSD-260C Printed 12-87 [GE-SSD]

BASE @
HEX

\ ** Utilities -- Kludgy delays, may have to be adjusted when PAUSE is
\ implemented. Reason for this delay: 1) Receive RAM doesn't seem to be
\ written yet even when SPI says it's done in the SPSR.SPIF register, and
\ 2) This stuff seems to be too fast for the LCD display.

: 20-MS ( --)
 2000 0 DO ( PAUSE) LOOP ;
: 100-MS ( --)
 10000 0 DO ( PAUSE) LOOP ;
\ ** Setting up for QSPI for the LBK332
\ These routines can serve as models for other types of QSPI setup, though
\ actual values, e.g., baud, int vects & int levels might be different in a
\ different application. These are specific to CDP68HC68P1 used on the LBK332.
\ [MC68332UM/AD] 5.4.3.1 Write first to set output state. Specifically, we had
\ better make sure that the LBK332 Chip Select is not active (low) (PCS0/_SS).
: QPDR-SETUP ( --) QPDR.PCS0/_SS QPDR TRUE SET-YREG ;

\ [MC68332UM/AD] 5.4.3.3
\ Then configure direction bits MISO input, PCS0 MOSI output
: QDDR-SETUP ( --)
 QDDR.MISO QDDR FALSE SET-YREG
 QDDR.PCS0/_SS QDDR.SCK OR QDDR.MOSI OR QDDR TRUE SET-YREG
;
\ [MC68332UM/AD] 5.5.4.6
\ Init RAM
: QSPI-RAM-INIT ( --) ;

\ [MC68332UM/AD] 5.4.3
: QPAR-BITS ( chip-sel-mask --)
 [ QPAR.PCS0/_SS QPAR.PCS1 OR QPAR.PCS2 OR QPAR.PCS3 OR ] LITERAL
 DUP ROT AND \ constrain chip select mask
 SWAP INVERT QPAR W@ AND \ NAND off CS bits in current reg mask
 OR \ OR in new chip select mask
 QPAR.MOSI QPAR.MISO OR OR \ OR in these bits
 QPAR W! \ store new mask
;
\ [MC68332UM/AD] 5.5.4.1 .. Set up as master of 8 bit xfers, 2.1MHz,
\ Data changes on leading edge, sampled on falling edge.
: SPCR0-SETUP ( --)
 SPCR0.MSTR \ SBC332 is master
 8 >SPCR0.BITS LSHIFT OR \ CDP68HC68P1 is 8 bit xfers

 SPCR0.CPHA OR \ [GE-SSD] pg. 537 Fig. 3
 FF OR \ SPCR0.SPBR 33 KHz frequency (Table 5-9)
 SPCR0 TRUE SET-YREG \ install mask register
;
\ [MC68332UM/AD] 5.5.4.2 .. Set up but don't start up
: SPCR1-SETUP ( --)
 SPCR1.DSCKL >SPCR1.DSCKL LSHIFT
 SPCR1.DTL OR SPCR1.SPE OR SPCR1 FALSE SET-YREG
;
\ [MC68332UM/AD] 5.5.4.3 .. Reads to this register get actual reg contents,
\ not necessarily last-written which latter is buffered during serial xfer,
\ taking effect next xfer. We setup here for no interrupts, no wraparound,
\ etc.
\ Of course, application will change this ... this is just init,
\ and is powerup default.
: SPCR2-SETUP ( --) 0 SPCR2 W! ;

\ [MC68332UM/AD] 5.5.4.4 .. We don't need any of this stuff.
: SPCR3-SETUP ( --) 0 SPCR3 W! ;

\ ** Setup of QSM in order prescribed by [MC68332UM/AD] 5.4.1
: QSM-SETUP ( --)
 FALSE QSM-SUPERVISOR
 1 QSM-IARB!
 FALSE QSM-STOP
 FALSE QSM-FREEZE
 40 QIVR W!
 1 1 QSM-INT-LEVEL
 QPDR-SETUP
 QDDR-SETUP
 QSPI-RAM-INIT
 QPAR.PCS0/_SS QPAR-BITS
 SPCR0-SETUP
 SPCR1-SETUP
 SPCR2-SETUP
 SPCR3-SETUP
;
\ Set QSPI a-runnin'
: QSPI-START ( --) SPCR1.SPE SPCR1 TRUE SET-YREG ;
\ Orderly QSPI shutdown
: QSPI-SHUTDOWN ( --)
 SPCR3.HALT SPCR3 TRUE SET-YREG
 BEGIN ( PAUSE) SPSR W@ SPSR.HALTA AND 0<> UNTIL
 SPCR1.SPE SPCR1 FALSE SET-YREG
;
\ Clear a HALT condition
: UNHALT ( --) SPCR3.HALT SPCR3 FALSE SET-YREG ;
\ Return indicates whether QSPI was inactive and thus able to conduct one
two-byte
\ transaction. Presumes only PCS0/_SS from among chip selects is assigned to
\ QSPI; otherwise, command masks should contain states for other chip selects.
: LBK332! ( u1 u2 -- flag)
 SPCR1 W@ SPCR1.SPE AND 0=
 IF
 TRAN-RAM 2 + W!
 TRAN-RAM W!
 COMD-RAM.CONT COMD-RAM C! \ PCS0/_SS low
 0 COMD-RAM 1+ C! \ ditto
 SPCR2.NEWQP FALSE SPCR2 SET-YREG
 [ 1 >SPCR2.ENDQP LSHIFT SPCR2.ENDQP AND ]
 LITERAL SPCR2 TRUE SET-YREG

 QSPI-START
 TRUE
 20-MS \ kludge delay for Receive RAM to get written
 ELSE
 2DROP FALSE
 THEN
;
: QSPI-DONE? ( -- flag)
 SPSR W@ SPSR.SPIF AND 0<> ;
: QSPI-READY? ( -- flag)
 SPCR1 W@ SPCR1.SPE AND 0= ;
: WAIT-QSPI-DONE ( --)
 BEGIN ( PAUSE) QSPI-DONE? UNTIL ;
: WAIT-QSPI-READY ( --)
 BEGIN ( PAUSE) QSPI-READY? UNTIL ;

\ Constants for controlling the two CDP68HC68P1 chips, addressed
\ 0 and 1. Port 0 is all outputs, except for KEY*. Port 1 is an output to
\ write to the LCD, and an input to read the keypad. [GE-SSD] pp. 537-539
00 CONSTANT PORT0
40 CONSTANT PORT1
00 CONSTANT DATA-REG
20 CONSTANT DIR-REG
00 CONSTANT READ
10 CONSTANT WRITE
00 CONSTANT WRITE-BITS
08 CONSTANT RESET-BITS
0C CONSTANT SET-BITS

\ These only need to be set if you want to read the compare flag
00 CONSTANT ONE-NON-MATCH
01 CONSTANT ALL-MATCH
02 CONSTANT ALL-NON-MATCH
03 CONSTANT ONE-MATCH

: PORT0-SETUP ( --)
 [ PORT0 DIR-REG WRITE WRITE-BITS OR OR OR ] LITERAL
 7F LBK332! DROP \ port 0 all output except KEY*
;
: PORT1-IN ( --)
 [ PORT1 DIR-REG WRITE WRITE-BITS OR OR OR ] LITERAL
 00 LBK332! DROP
;
: PORT1-OUT ( --)
 [ PORT1 DIR-REG WRITE WRITE-BITS OR OR OR ] LITERAL
 FF LBK332! DROP
;
: LBK332-CONFIG ( --)
 PORT0-SETUP PORT1-IN \ must be changed to write LCD
;
\ ** IO Port Control
\ Bits from Port 0
01 CONSTANT R/W* \ Read/Write for LCD Port
02 CONSTANT ENLCD0 \ LCD 0 Enable
04 CONSTANT ENLCD1 \ LCD 1 Enable
08 CONSTANT ENKEY* \ CS for the 74HC541 which reads keypad
10 CONSTANT ENBEEP* \ Beeper enable
20 CONSTANT TONE0/RS \ RS for LCD .. D0 of Beeper pitch
40 CONSTANT TONE1 \ n/a for LCD .. D1 of Beeper pitch

80 CONSTANT KEY* \ Active low when there is a keypress
05 CONSTANT TONE-SHIFT \ Shift for pitch bitmask
\ Only reads data, discards compare mask.
: PORT0@ ( -- u)
 WAIT-QSPI-READY
 [ PORT0 DATA-REG READ OR OR ] LITERAL
 DUP LBK332! DROP
 WAIT-QSPI-DONE
 REC-RAM 2 + W@
;
\ Writes data
: PORT0! ( u --)
 WAIT-QSPI-READY
 [ PORT0 DATA-REG WRITE WRITE-BITS OR OR OR ] LITERAL
 SWAP LBK332! DROP
;
\ sets data
: PORT0-SET! ( u --)
 WAIT-QSPI-READY
 [ PORT0 DATA-REG WRITE SET-BITS OR OR OR ] LITERAL
 SWAP LBK332! DROP
;
\ resets data
: PORT0-RESET! ( u --)
 WAIT-QSPI-READY
 [ PORT0 DATA-REG WRITE RESET-BITS OR OR OR ] LITERAL
 SWAP LBK332! DROP
;
\ Only reads data, discards compare mask.
: PORT1@ ( -- u)
 WAIT-QSPI-READY
 [ PORT1 DATA-REG READ OR OR ] LITERAL
 DUP LBK332! DROP
 WAIT-QSPI-DONE
 REC-RAM 2 + W@
;
\ Writes data
: PORT1! ( u --)
 WAIT-QSPI-READY
 [ PORT1 DATA-REG WRITE WRITE-BITS OR OR OR ] LITERAL
 SWAP LBK332! DROP
;
\ ** Keypad Control -- Enable, disable keypad reader. Necessary so that keypad
\ data doesn't trash LCD writes.
: KEYPAD-ENABLE ( --) ENKEY* PORT0-RESET! ;
: KEYPAD-DISABLE ( --) ENKEY* PORT0-SET! ;

\ Is a key currently pressed?
: PKEY? ( -- flag) PORT0@ KEY* AND 0= ;
\ @PKEY returns a mask in which D0-D2 are octal row and D3-D5 are octal
\ column.
\ The mask is only valid if KEY* is currently reading HI
: @PKEY ( -- mask)
 PORT1-IN KEYPAD-ENABLE PORT1@ KEYPAD-DISABLE
;
\ ** LCD Control -- Delays may have to be written into these LCD words
\ depending upon serial clock rate.
: LCD0-CMD! ( u --)
 KEYPAD-DISABLE
 PORT1-OUT PORT1!

 R/W* PORT0-RESET!
 TONE0/RS PORT0-RESET!
 ENLCD0 PORT0-SET!
 ENLCD0 PORT0-RESET!
 100-MS \ kludge delay for slow LCD
;
: LCD0-DATA! ( u --)
 KEYPAD-DISABLE
 PORT1-OUT PORT1!
 R/W* PORT0-RESET!
 TONE0/RS PORT0-SET!
 ENLCD0 PORT0-SET!
 ENLCD0 PORT0-RESET!
 100-MS \ kludge delay for slow LCD
;
: LCD1-CMD! ( u --)
 KEYPAD-DISABLE
 PORT1-OUT PORT1!
 R/W* PORT0-RESET!
 TONE0/RS PORT0-RESET!
 ENLCD1 PORT0-SET!
 ENLCD1 PORT0-RESET!
 100-MS \ kludge delay for slow LCD
;
: LCD1-DATA! ( u --)
 KEYPAD-DISABLE
 PORT1-OUT PORT1!
 R/W* PORT0-RESET!
 TONE0/RS PORT0-SET!
 ENLCD1 PORT0-SET!
 ENLCD1 PORT0-RESET!
 100-MS \ kludge delay for slow LCD
;
\ ** Generic LCD0 Stuph ... RTFM if this doesn't work
: LCD0-HOME ( --) 02 LCD0-CMD! ;
: LCD0-CLEAR ( --) 01 LCD0-CMD! ;
: LCD0-INIT ( --)
 38 LCD0-CMD!
 0F LCD0-CMD!
 LCD0-HOME
 LCD0-CLEAR
;
: LCD0-TYPE ( c-addr u --)
 OVER + SWAP ?DO I C@ LCD0-DATA! LOOP ;
\ ** Beeper
\ Deactivate the #@$!* beeper
: UNBEEP ( --) PORT0@ ENBEEP* OR PORT0! ;
\ Beep at pitches 0 - 3
: BEEP ( u --)
 4 MOD TONE-SHIFT LSHIFT
 PORT0@ [ TONE0/RS TONE1 OR ] LITERAL INVERT AND
 OR ENBEEP* INVERT AND PORT0!
;
\ ** Initialize
: LBK332-SETUP ( --) QSM-SETUP PORT0-SETUP ;
\ ** Quicky Test
: TEST-LBK332 ( --)
 LBK332-SETUP LCD0-INIT
 BEGIN

 BEGIN UNBEEP PKEY? EKEY? OR UNTIL
 PKEY?
 IF
 LCD0-HOME LCD0-CLEAR
 @PKEY
 1F AND DUP 3 RSHIFT
 DUP 4 MOD BEEP
 S" Row: " LCD0-TYPE
 30 OR LCD0-DATA!
 7 AND 30 OR
 DUP 4 MOD BEEP
 S" Column: " LCD0-TYPE
 LCD0-DATA!
 BEGIN PKEY? 0= EKEY? OR UNTIL
 ELSE
 KEY DROP UNBEEP EXIT
 THEN
 AGAIN
;
BASE !
END-DOWNLOAD

[LISTING TWO]

DOWNLOAD

\ Filename: qsm332.f
\ Author: jack j. woehr jax@well.UUCP JAX on GEnie SYSOP, RCFB (303) 278-0364
\ Copyright (c) 1991, Vesta Technology, Inc. ALL RIGHTS RESERVED.
\ Platform: SBC332 w/ VFSE
\ Purpose: Defines and primitives for SBC332 Queued Serial Module (QSM)
\ Dependencies: Platform board; VFSE-332 operators W! W@ (16-bit ops).
\ Needs: YREG332.F
\ References: MC68332 USER'S MANUAL [MC68332UM/AD] Motorola, 1990
\ $Log: V:/vestasrc/forth/68332/tpuqsm/vcs/qsm332.f_v $
\ Rev 1.1 14 Jul 1992 11:26:26 jax
\ cleanup edits
\ Rev 1.0 17 Jun 1992 10:09:26 jax
\ Initial revision.
\ ** QSM registers and bits from MC68332UM/AD Tables 5-1 and 5-2.

BASE @
HEX
FFC00 YREG QMCR
 8000 CONSTANT QMCR.STOP
 4000 CONSTANT QMCR.FRZ1
 2000 CONSTANT QMCR.FRZ0
 80 CONSTANT QMCR.SUPV
 0F CONSTANT QMCR.IARB \ field, not bit mask
\ FFC02 YREG QTEST \ Not used in VFSE-332
\ 08 CONSTANT QTEST.TSBD
\ 04 CONSTANT QTEST.SYNC
\ 02 CONSTANT QTEST.TQSM
\ 01 CONSTANT QTEST.TMM
FFC04 YREG QILR \ 8 msbs
\ FFC05 YREG QILR \ byte address, but SET-YREG uses W@ W!
 3800 CONSTANT QILR.ILQSPI \ field
 0B CONSTANT >QILR.ILQSPI \ bit shift
 700 CONSTANT QILR.ILSCI \ field

 08 CONSTANT >QILR.ILSCI \ bit shift
FFC04 YREG QIVR \ 8 lsbs
 FF CONSTANT QIVR.INTV \ number
\ FFC06 YREG RESERVED

FFC08 YREG SCCR0
 1FFF CONSTANT SCCR0.SCBR \ field
FFC0A YREG SCCR1
 4000 CONSTANT SCCR1.LOOPS
 2000 CONSTANT SCCR1.WOMS
 1000 CONSTANT SCCR1.ILT
 800 CONSTANT SCCR1.PT
 400 CONSTANT SCCR1.PE
 200 CONSTANT SCCR1.M
 100 CONSTANT SCCR1.WAKE
 80 CONSTANT SCCR1.TIE
 40 CONSTANT SCCR1.TCIE
 20 CONSTANT SCCR1.RIE
 10 CONSTANT SCCR1.ILIE
 8 CONSTANT SCCR1.TE
 4 CONSTANT SCCR1.RE
 2 CONSTANT SCCR1.RWU
 1 CONSTANT SCCR1.SBK
FFC0C YREG SCSR
 100 CONSTANT SCSR.TDRE
 80 CONSTANT SCSR.TC
 40 CONSTANT SCSR.RDRF
 20 CONSTANT SCSR.RAF
 10 CONSTANT SCSR.IDLE
 8 CONSTANT SCSR.OR
 4 CONSTANT SCSR.NF
 2 CONSTANT SCSR.FE
 1 CONSTANT SCSR.PF
FFC0E YREG SCDR
 100 CONSTANT SCDR.R8/T8
 80 CONSTANT SCDR.R7/T7
 40 CONSTANT SCDR.R6/T6
 20 CONSTANT SCDR.R5/T5
 10 CONSTANT SCDR.R4/T4
 8 CONSTANT SCDR.R3/T3
 4 CONSTANT SCDR.R2/T2
 2 CONSTANT SCDR.R1/T1
 1 CONSTANT SCDR.R0/T0
\ FFC10 YREG RESERVED
\ FFC12 YREG RESERVED
FFC14 YREG QPDR \ 8 lsbs only .. 8 msbs reserved
 80 CONSTANT QPDR.TXD
 40 CONSTANT QPDR.PCS3
 20 CONSTANT QPDR.PCS2
 10 CONSTANT QPDR.PCS1
 8 CONSTANT QPDR.PCS0/_SS
 4 CONSTANT QPDR.SCK
 2 CONSTANT QPDR.MOSI
 1 CONSTANT QPDR.MISO
FFC16 YREG QPAR \ 8 msbs
\ FFC17 YREG QPAR \ byte address, but SET-YREG uses W@ W!
 4000 CONSTANT QPAR.PCS3
 2000 CONSTANT QPAR.PCS2
 1000 CONSTANT QPAR.PCS1

 800 CONSTANT QPAR.PCS0/_SS
 200 CONSTANT QPAR.MOSI
 100 CONSTANT QPAR.MISO
FFC16 YREG QDDR \ 8 lsbs
 80 CONSTANT QDDR.TXD
 40 CONSTANT QDDR.PCS3
 20 CONSTANT QDDR.PCS2
 10 CONSTANT QDDR.PCS1
 8 CONSTANT QDDR.PCS0/_SS
 4 CONSTANT QDDR.SCK
 2 CONSTANT QDDR.MOSI
 1 CONSTANT QDDR.MISO
FFC18 YREG SPCR0
 8000 CONSTANT SPCR0.MSTR
 4000 CONSTANT SPCR0.WOMQ
 3C00 CONSTANT SPCR0.BITS \ field
 0A CONSTANT >SPCR0.BITS \ shift
 200 CONSTANT SPCR0.CPOL
 100 CONSTANT SPCR0.CPHA
 FF CONSTANT SPCR0.SPBR \ field
FFC1A YREG SPCR1
 8000 CONSTANT SPCR1.SPE
 7F00 CONSTANT SPCR1.DSCKL \ field
 08 CONSTANT >SPCR1.DSCKL \ shift
 FF CONSTANT SPCR1.DTL \ field
FFC1C YREG SPCR2
 8000 CONSTANT SPCR2.SPIFIE
 4000 CONSTANT SPCR2.WREN
 2000 CONSTANT SPCR2.WRTO
 F00 CONSTANT SPCR2.ENDQP \ field
 08 CONSTANT >SPCR2.ENDQP \ shift
 0F CONSTANT SPCR2.NEWQP \ field
 00 CONSTANT >SPCR2.NEWQP \ shift
FFC1E YREG SPCR3 \ 8 msbs
\ FFC1F YREG SPCR3 \ byte address, but SET-YREG uses W@ W!
 400 CONSTANT SPCR3.LOOPQ
 200 CONSTANT SPCR3.HMIE
 100 CONSTANT SPCR3.HALT
FFC1E YREG SPSR \ 8 lsbs
 80 CONSTANT SPSR.SPIF
 40 CONSTANT SPSR.MODF
 20 CONSTANT SPSR.HALTA
 0F CONSTANT SPSR.CPTQP \ field
\ FFC20 YREG RESERVED \ YFFC20 - YFFCFF
FFD00 YREG REC-RAM \ YFFD00 - YFFD1F
FFD20 YREG TRAN-RAM \ YFFD20 - YFFD3F
FFD40 YREG COMD-RAM \ YFFD40 - YFFD4F
 80 CONSTANT COMD-RAM.CONT
 40 CONSTANT COMD-RAM.BITSE
 20 CONSTANT COMD-RAM.DT
 10 CONSTANT COMD-RAM.DSCK
 08 CONSTANT COMD-RAM.PCS3
 04 CONSTANT COMD-RAM.PCS2
 02 CONSTANT COMD-RAM.PCS1
 01 CONSTANT COMD-RAM.PCS0/_SS
\ ** QSM Functionality
\ *** [MC68332UM/AD 5.4.2.1]
\ TRUE stops, FALSE enables QSM. Supervisor mode only.
: QSM-STOP ( flag --)

 QMCR.STOP QMCR ROT SET-YREG ;
\ To avoid complications at restart & prevent data corruption, first disable
\ all submodules: SCI rx/tx should be disabled, and operation completion
\ verified before asserting STOP. QSPI submodule should be stopped by
\ asserting
\ SPCR3.HALT and asserting STOP after HALTA flag set. TRUE causes QSM to HALT
\ on transfer boundary when FREEZE asserted on IMB, FALSE causes QSM to ignore
\ said signal. FREEZE is asserted when 332 enters the background mode.
: QSM-FREEZE ( flag --)
 QMCR.FRZ1 QMCR ROT SET-YREG ;
\ TRUE sets QSM supervisor-only access, FALSE (from supervisor mode,
\ of course) resets to QSM user access permitted.
: QSM-SUPERVISOR ( flag --)
 QMCR.SUPV QMCR ROT SET-YREG ;
\ 0 causes 332 to view QSM int as spurious, 1 (lo) to 15 (hi) are
\ interrupt arbitration priority levels.
: QSM-IARB! ( 0-15 --)
 QMCR.IARB AND \ acceptable values 0 - 15
 QMCR W@ \ fetch reg contents
 QMCR.IARB INVERT AND \ mask off IARB bits
 OR \ OR in desired mask
 QMCR W! \ store reg
;
\ *** [MC68332UM/AD 5.4.2.2] is for when MCU is in test mode.
\ Not used in Vesta Forth Standard Edition
\ *** [MC68332UM/AD 5.4.2.3]
\ Int levels 0-7; if same, QSPI given priority.
: QSM-INT-LEVEL ( QSPI-level SCI-level --)
 >QILR.ILSCI LSHIFT QILR.ILSCI AND \ shift 0-7 mask and constrain
 SWAP
 >QILR.ILQSPI LSHIFT QILR.ILQSPI AND \ ditto
 OR \ form combination desired mask
 QILR W@ \ get register
 QILR.ILQSPI QILR.ILSCI OR INVERT AND \ mask off IL fields from reg
 OR \ OR in desired mask
 QILR W! \ store reg
;
\ *** [MC68332UM/AD 5.4.2.4]
\ Int vectors for QSPI/SCI are adjacent; bit D0 of int vect is 0 for SCI
\ and 1 for QSPI.
: QSM-INT-VECTOR ( vector --)
 QIVR.INTV AND QIVR W! ;
\ *** [MC68332UM/AD 5.4.3]
\ In general, bit assignments can be handled directly by the application
\ using SET-YREG from YREG332.F

BASE !
END-DOWNLOAD
End Listings














August, 1993
Network Access to CD-ROMs


Client/server software for extending CD-ROM access across a NetBIOS-based
network




John H. McCoy and Wuhsiung Lu


John and Wuhsiung are members of the Mathematical and Information Sciences
faculty of Sam Houston State University, P.O. Box 2206, Huntsville, TX 77341.


There are two commonly used methods of providing network access to CD-ROMs.
One is to use Microsoft's MS-DOS CD-ROM Extensions (MSCDEX) or equivalent
redirector to make the CD-ROM appear to be a hard disk, which is then mapped
across the network. This approach works with LAN servers that run as MS-DOS
applications and do I/O with standard INT 21 services to access the drive
letter instead of bypassing MS-DOS. To the client, the CD-ROM drives are
indistinguishable from other server drives. The client, however, cannot
communicate with MSCDEX or access its ancillary functions.
A second approach to sharing CD-ROMs is shown in Figure 1. Here MSCDEX runs on
each client workstation along with a pseudo CD-ROM driver that accepts normal
CD-ROM driver requests from MSCDEX. These requests are transmitted over the
network to a pseudo redirector on a server, which then submits the request to
a bona fide CD-ROM device driver. The response from the CD-ROM is returned via
the network to the client pseudo driver that, in turn, responds to MSCDEX. So
long as the client pseudo CD-ROM driver responds appropriately, MSCDEX will be
unaware that the actual drives are located on a remote machine. This is the
scheme implemented by the client/server program presented here.
Jim Harper provided an in-depth look at implementing a DOS redirector (see "A
DOS Redirector for SCSI CD-ROM," DDJ, March 1993) to make a CD-ROM appear to
be a conventional, read-only hard disk. Jim wrote his redirector to interface
to his own SCSI device driver. MSCDEX is designed for use with drivers that
have a standard DOS-character device interface and respond to an extended set
of DOS-device driver commands. Either combination will make High Sierra and
ISO-9660 CD-ROMs appear as read-only hard disks to DOS.
This article describes a client/server software package which extends CD-ROM
access across a NetBIOS-based network. It supports file redirection and the
ancillary MSCDEX functions as well as digital sound files that are treated as
data files. Thus, programs such as Desert Storm work as they should. Normal CD
audio is not supported, so some programs run properly but silently.
The server program was written in Ada. It runs in real mode under DOS (or in a
specific DOS session under OS/2) and takes advantage of Ada's built-in
multitasking to support concurrent access to seven CD-ROMs by up to 32 users.
The client program is written in MASM. It runs under DOS in conjunction with
MSCDEX. Multiple copies of the client can be loaded for simultaneous access to
multiple servers. The client can be loaded high with QEMM and can be unloaded.
Unfortunately, MSCDEX, which loads after the client, is itself not unloadable,
so the client cannot safely be removed once MSCDEX is running.


The CD-ROM Device Driver


CD-ROM device drivers have the usual DOS format for device drivers. That is, a
device header followed by the strategy and interrupt procedures.
The device header is an extension of the normal character device header; see
Figure 2. The Attributes field identifies the device as a character device
supporting IOCTL and OPEN/CLOSE.
The extension consists of the three fields after the device name: Reserved,
DriveLetter, and NumberOfUnits. DriveLetter is a read-only field for the
driver and both it and Reserved should be initialized to 0. MSCDEX uses
DriveLetter when it assigns the devices supported by the driver to a drive
letter. According to my documentation, for drivers which support more than one
drive, the drive letter will indicate the first unit and each successive unit
will be assigned the next-higher drive letter. My observation, however, is
that MSCDEX puts the last, rather than the first, drive letter assigned to
this driver in the drive-letter field.
The device driver sets NumberOfUnits to the number of devices attached when
the driver loads. This is not the same as the NumberOfUnits returned to DOS
during the INIT call when a device is installed from CONFIG.SYS. A character
device must always return a value of 1 to DOS during the INIT call. The
number of units field in the device header, however, is read and used by
MSCDEX and never seen by DOS.
A program requests the services of a device driver by sending the driver a
request header message containing a command code and related data. The request
is passed to the driver in two calls. The first call is to the driver strategy
routine with the ES:BX registers pointing to the request header. The strategy
routine is expected to save the pointer in local storage for use by the
interrupt routine when it processes the request. Immediately following the
call to the strategy routine a second call is made to the interrupt routine,
which retrieves the request and processes it.
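This two-call handshake can be sketched in C. The struct below is a hypothetical, simplified header (the real one carries more fields), and 0x0100 is the DONE status value that appears as the DeviceDone constant in the listings later on:

```c
#include <stdint.h>

/* Request header, after the DOS layout described above:
   length, subunit, command code, then a status word. */
struct req_hdr {
    uint8_t  length;
    uint8_t  subunit;
    uint8_t  command;
    uint16_t status;
};

/* The strategy routine does nothing but save the pointer
   DOS passes in ES:BX... */
static struct req_hdr *pending;

static void strategy(struct req_hdr *req)
{
    pending = req;                /* stash for the interrupt routine */
}

/* ...and the interrupt routine, called immediately afterward,
   retrieves the saved request and processes it. */
static void interrupt_routine(void)
{
    switch (pending->command) {
    default:                      /* stub dispatch: mark request done */
        pending->status = 0x0100; /* DONE bit set, no error */
        break;
    }
}
```

The fragility the article goes on to describe follows directly from this design: whoever calls `strategy` last owns `pending`.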
DOS rigidly adheres to this strategy/interrupt calling sequence; MSCDEX does
not. Instead, MSCDEX assumes it has exclusive control of the driver. It uses a
DOS call to open the device driver and gets the driver's header address
through an IOCTL input request. The device is then closed to release the
handle.
MSCDEX uses the device-header address to locate the driver's interrupt-routine
entry point. Subsequent calls are directed to the interrupt routine and assume
the request-header address was stored on the original open call. Consequently,
calls to the driver which bypass MSCDEX change the stored request-header
address and invalidate this assumption. The system usually crashes on the next
MSCDEX call. Hence, communications with the driver should use the
MSCDEX-supported send-device-driver request rather than calling the driver
directly.
In the usual DOS sequence that I've observed, the call to the interrupt
routine immediately follows the call to the strategy routine and the ES:BX
register values are not changed. While MSCDEX bypasses the strategy call, it
does point ES:BX at the request header when it calls the interrupt routine.
Thus, the strategy routine can be reduced to a return statement and the
request-header address stored upon entry to the interrupt routine. This is
nonstandard but it will allow direct calls to the driver without fatal
conflicts with MSCDEX.
Another common problem is the failure of many programs to fill in the
request-header information correctly. Numerous programs do not correctly
specify the control-block length for IOCTL input and output and require a
"lookup" fix in the client. Guinness Disk of Records (a CD-ROM version of the
standard Guinness Book of Records, available from Compton's New Media) doesn't
even get the request-header length correct on the IOCTL output commands that
control the analog audio settings.
In addition to the fields added to the device header, the minimal CD-ROM
driver must support five of the standard DOS-driver commands and two of the
extended commands for locating and reading CD-ROM sectors. These commands are
shown in Table 1, where codes 128 and higher are unique to CD-ROMs.


The Pseudo CD-ROM Driver


On a client workstation, the pseudo CD-ROM driver replaces the CD-ROM device
driver supplied with the controller card, as in Figure 1. It installs from the
command line like a TSR rather than being loaded from CONFIG.SYS.
The executable file consists of approximately 3K of code and is divided into
two parts: a resident part that implements the driver functions, and a
transient part that loads/unloads and initializes the resident part and is
then discarded.
During installation, command-line parameters, if present, are parsed and used
to replace default values for the driver name, the CD-ROM server name, and/or
the client's net name. Installation fails if the net name cannot be added to
the NetBIOS name table for any reason except a duplicate name in the table.
This exception allows the driver to be installed more than once using the same
net name for simultaneous access to more than one CD-ROM server. A different
driver name must be used for each instance or the second installation will
hide the first in the driver chain.
The resident part of the client driver implements the driver functions that
simulate the actions of a true CD-ROM driver. It is essentially a large CASE
statement which switches on the request-header command code. All required
CD-ROM driver functions and the optional READ LONG PREFETCH function are
supported. The CLOSE command is processed locally. The others transfer the
request to the CD-ROM server through NetBIOS calls implemented in an include
file.
When a device driver is loaded from CONFIG.SYS, DOS calls INIT as part of the
installation procedure. When the client driver is loaded as a TSR, as it is
done here, the normal initialization is done by the transient loader. The
first (and only) call that DOS makes to the driver is when MSCDEX opens it to
locate the device header. Thus INIT isn't needed and is omitted. The OPEN
function establishes a connection with the server and uses a dummy INIT call
to the server to obtain the number of CD-ROMs available. This number is put in
the client driver header for later use by MSCDEX, and DOS is simply told that
the driver has been opened.
The network connection is simple. The request header is sent to the server
first, followed by any accompanying data. After the server has processed the
request, the request header is returned to the client first, followed by the
data. Dynamically allocated buffers are used in the server. Only cooked-data
read mode is supported. The NetBIOS maximum message size (64K--1) restricts
READ LONG requests to 31 sectors in a single transfer. When a request exceeds
this, the client makes multiple requests to the server.
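The splitting logic amounts to a short loop. A sketch, assuming 2048-byte cooked sectors (31 x 2048 = 63,488 bytes, which leaves room for the returned request header under the 64K-1 message limit); the function name and array interface are illustrative:

```c
/* Largest READ LONG that fits in one NetBIOS message. */
enum { MAX_SECTORS = 31 };

/* Split a read of `count` sectors into transfers of at most
   MAX_SECTORS sectors each; records each transfer's sector
   count in `chunks` and returns the number of requests sent. */
static unsigned read_long_chunks(unsigned count, unsigned chunks[])
{
    unsigned n = 0;
    while (count > 0) {
        unsigned c = (count > MAX_SECTORS) ? MAX_SECTORS : count;
        chunks[n++] = c;      /* one client-to-server request */
        count -= c;
    }
    return n;
}
```

A 70-sector read, for example, goes out as three requests of 31, 31, and 8 sectors.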


The Server Side


The server is implemented as a group of non-preemptive tasks, as shown in
Figure 1. The SCHEDULER handles new calls and assigns a separate, available
SESSION to each client. NET handles the interaction between NETBIOS and the
SESSIONs, and CD is the interface task between the SESSIONs and the CD-ROM
driver. The net names of the attached clients are displayed by the CONSOLE
task, which is not shown in the figure. The main program, SERVER, locates the
CD-ROM driver and statically activates all the tasks. After initialization,
the session tasks call the scheduler and are queued to be assigned to client
calls on a FIFO basis. The scheduler in turn makes a listen call to NetBIOS.
When a client call is received, it is passed on to the session task. This is
repeated until all the sessions have been assigned clients. When the scheduler
queue is empty, there will not be an active LISTEN and the server will not
respond to new client calls. The session tasks receive the request header and
any associated data from the client via NetBIOS packets and pass it on to the
CD-ROM task. The CD-ROM's response is in turn encoded in NetBIOS packets and
returned to the client. When a session terminates, the session requeues itself
with the scheduler.
Non-preemptive tasking is used. To ensure the needed cooperation, the calls to
NetBIOS are all nonblocking calls and pauses are inserted at appropriate
points in the various tasks.


Real World Problems



Most of the server is written in the usual, straightforward style
characteristic of Ada. However, Meridian Software's implementation (the system
we used) of certain data structures makes interfacing to externally generated
data streams difficult. If you remain totally within an environment of
"correct" Ada programs, there's no problem. Unfortunately, the real world
seldom affords such luxury.
The first problem relates to alignment holes. Alignment holes occur when space
is inserted between the components of a record to satisfy alignment
requirements of the individual components. For instance, a far pointer which
uses four bytes of storage may have an alignment requirement that its storage
address be divisible by four. When an object of such a type is included in a
record type it is aligned at an offset divisible by four with filler bytes
inserted as needed to effect the alignment. This makes it difficult to match
externally defined records which ignore the alignment constraints.
To work around the alignment-hole problem, first you declare two new numeric
data types: byte (range 0..255) and word (range 0..65535). A third type bytes
is then declared to be an unconstrained Byte array. Type bytes can then be
used to declare multibyte fields without introducing alignment holes. The
declaration of W to be a 2-byte subtype of Bytes and DW to be a 4-byte subtype
accommodates the two most common multibyte fields encountered in externally
generated records. Typecasting functions convert objects of type W to/from
type Word and objects of type DW to/from type system.address. Long_Integers
are also declared to simplify the handling of these object types. These
declarations are in the TYPES package shown in Listing One (page 113). Their
use in record descriptions is shown in the Drivers package in Listing Two
(page 113). Other required files are provided electronically; see
"Availability," page 5.
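The same trick can be illustrated in C, where compilers may likewise pad records. The struct names here are illustrative, and the 4-byte array plays the role of the DW subtype:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* A 32-bit field after a single byte may be padded out to a
   4-byte boundary, opening an "alignment hole" at offsets 1..3. */
struct with_hole {
    uint8_t  len;
    uint32_t far_ptr;       /* compiler may insert filler before this */
};

/* The workaround: declare multibyte fields as byte arrays, which
   need no alignment, so the layout matches the externally
   generated record exactly. */
struct no_hole {
    uint8_t len;            /* offset 0 */
    uint8_t far_ptr[4];     /* offset 1, no filler */
};

/* Typecasting helper, analogous to the unchecked_conversions. */
static uint32_t dw_to_u32(const uint8_t dw[4])
{
    uint32_t v;
    memcpy(&v, dw, sizeof v);
    return v;
}
```

With `struct no_hole`, `sizeof` is exactly 5 bytes and the external bytes can be copied straight in.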
The second problem relates to discriminant records. Discriminant records are
used when a record may have any one of several different descriptions,
depending upon the value in a specified field called the discriminant. (A
strongly typed version of Cobol's redefine facility.) The restriction, imposed
by some Ada implementations, that the discriminant field must be the first
component of the record can often be worked around easily through judicious
type declarations. Meridian Ada introduces a further complication for discriminant
records whose discriminant may change during the execution of the program by
preceding the discriminant with a constrained flag byte. This flag will have a
value of 0 unless the record object is constrained.
A discriminant record is used for the internal representation of the
externally generated device-driver request header. The command code is the
discriminant that determines the variant-record description. The external form
of the request header is treated as a packet of type bytes (that is, a byte
array). When the packet arrives from the client via NetBIOS, the command code
and all subsequent bytes of the packet are treated as a byte slice and are
shifted right one byte in the array. The third byte, formerly the command-code
byte, is zeroed and becomes the constrained-flag byte when an
unchecked_conversion is used to typecast the packet as a request header; see
Figure 3.
The request header must be converted back to a packet before being returned to
the client via NetBIOS. Similarly, when the request header is sent on to the
CD-ROM, it is first converted to a packet before it is sent, and back to a
request header when it is returned. Fortunately, request headers are only
about 30 bytes in length and there is not a great deal of overhead associated
with the conversion.
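In C terms, the conversion amounts to a one-byte shift each way. PKT_LEN is an assumed buffer size here (the real headers are only about 30 bytes):

```c
#include <stdint.h>
#include <string.h>

#define PKT_LEN 32

/* Incoming packet layout: [0] length, [1] subunit, [2] command
   code, [3..] command-specific fields. Shift the command code and
   everything after it right one byte, then zero the vacated third
   byte so it can serve as Meridian Ada's constrained-flag byte
   (0 = unconstrained). */
static void pkt_to_internal(uint8_t pkt[PKT_LEN])
{
    memmove(&pkt[3], &pkt[2], PKT_LEN - 3);
    pkt[2] = 0;
}

/* Inverse conversion before the header goes back out on the wire:
   drop the flag byte and shift the tail back left. */
static void internal_to_pkt(uint8_t pkt[PKT_LEN])
{
    memmove(&pkt[2], &pkt[3], PKT_LEN - 3);
    pkt[PKT_LEN - 1] = 0;
}
```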


Putting It All Together


The client was assembled using MASM 6.0b, and the server was compiled and
linked using Meridian Software's Ada 286 Version 4.1.4. The compiler switch,
-K, was used to retain the internal form information needed for optimization
in the link step and the switch, -fs, used to turn off runtime checks. The
latter improves performance significantly.
The link parameters, -G -s 27500 -m 38000, were used to invoke global
optimization and to increase the tasking stack and main-program stack space,
respectively.
The server must be started before MSCDEX is started on the client workstation
but after NetBIOS and the CD-ROM driver are loaded on the server machine. The
CD-ROM driver should be installed with the device name CD001 (for example, use
the command-line parameter /d:CD001). When the server starts, it attaches to
CD001 and installs itself on the network using the net name SHSU-CD-SERVER.
There are no command-line parameters and the server will have to be recompiled
to change either the server name or the driver name.


Mapping the Request Header


The client program, CDNET, can be loaded at any time after NetBIOS is started
on the client machine. A list of the command-line parameters supported is
displayed if CDNET is started with the /? switch. Normally, at least the /c:
switch will be needed to assign the workstation a unique net name. When CDNET
is loaded, it installs itself and validates the client name. This takes a few
seconds while the name is broadcast. The client software will not load if the
client name is already in use on the network. The client does not attempt to
establish a session with the server until MSCDEX is loaded.
Assuming NetBIOS is installed, a typical sequence to load CDNET and MSCDEX
might look something like this (case is not important):
CDNET /c:Monique
MSCDEX /d:CD-NET
Additional drivers, either local or network, can be included using the /d:
switch on the MSCDEX command line. Prior to loading MSCDEX, CDNET can be
unloaded with the /U switch. Unfortunately, MSCDEX is not unloadable, and it
is best to not unload CDNET after MSCDEX has been loaded.
Both the client and the server have been tested on stand-alone, DOS-based
machines and in specific DOS sessions under OS/2 2.0 using Novell's NetBIOS.
However, David Schwaderer's C Programmer's Guide to NetBIOS, IPX, and SPX
(Sams, 1992), based upon IBM's NetBIOS implementation, was the source book for
coding both the Ada and MASM NetBIOS interfaces. Thus, the programs should run
with any NetBIOS implementation that supports the 5Ch interrupt.
The server is configured to handle 32 simultaneous users. To do so, however,
NetBIOS must be configured to handle at least 32 sessions and 32 outstanding
commands. Novell's NetBIOS defaults to handling a maximum of 12 outstanding
commands and thus must be configured to handle additional commands. The test
servers were configured by placing the commands NETBIOS COMMANDS=32 and
NETBIOS SESSIONS=32 in the NET.CFG file. The commands are holdover SHELL
commands and should be left-justified. If the commands are placed at the end
of the file, make certain that the last line is terminated with a return.
NetBIOS configuration commands written in the newer NET.CFG format are shown
in some books written about NetWare. Perhaps they will work for you. NetBIOS
displays the configuration information in its installation message (that is,
if it's configured for 32 commands, it will say so).
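As placed in NET.CFG on the test servers (left-justified, SHELL-style, with the final line terminated by a return):

```
NETBIOS COMMANDS=32
NETBIOS SESSIONS=32
```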
 Figure 1: Networked CD-ROMs.
Figure 2: CD-ROM driver header.
 DeviceHeader label word
 NextDriver dword -1
 Attributes word 0C800h
 word Strategy
 word Interrupt
 DeviceName byte 'CD-NET  '
 Reserved word 0
 DriveLetter byte 0
 NumberOfUnits byte 1


Table 1: DOS driver commands (required commands are in upper case, optional
commands in lower case, inapplicable commands in italics).
 0 INIT
 1 media check
 2 build bpb
 3 IOCTL INPUT
 4 input
 5 input nowait
 6 input status
 7 input flush
 8 output
 9 output with verify
 10 output status
 11 output flush
 12 IOCTL OUTPUT
 13 DEVICE OPEN
 14 DEVICE CLOSE
 15 removable media
 16 output until busy

 128 READ LONG
 129 reserved
 130 read long prefetch
 131 SEEK
 132 play audio
 133 stop audio
 134 write long
 135 write long verify
 136 resume audio
 Figure 3: Mapping the request header (a = length, b = sub unit, c =
constrained flag, d = command, e = status).
[LISTING ONE] (Text begins on page 72.)

--*****************************************************
-- TYPES.ADS
-- A copyright-reserved, free use program.
-- (c)John H. McCoy, 1993
-- Sam Houston St. Univ., TX 77341-2206
--*****************************************************

with system, memory;
with unchecked_conversion;

package Types is

type byte is range 0..255;
type word is range 0..65_535;

-- The following types and functions are declared for use
-- instead of integer and system.address to avoid "alignment
-- holes" in record declarations. This may not be a problem
-- with other than Meridian's implementation of Ada.

type bytes is array (word range <>) of byte;
type words is array (word range <>) of word;
subtype W is bytes(1..2);
subtype DW is bytes(1..4);
function W_to_Word is new unchecked_conversion(W, Word);
function DW_to_SA is new unchecked_conversion(DW, system.address);
function DW_to_Long is new unchecked_conversion(DW, Long_Integer);
function Word_to_W is new unchecked_conversion(Word, W);
function SA_to_DW is new unchecked_conversion(system.address, DW);
function Long_to_DW is new unchecked_conversion(Long_Integer, DW);


subtype string8 is string(1..8);
subtype string16 is string(1..16);


function "+"(left, right: byte) return byte;
function "OR"(left, right: byte) return byte;
function "+"(left, right: word) return word;
function "OR"(left, right: word) return word;
function "+"(left, right: W) return W;
function "OR"(left, right: W) return W;

end Types;

[LISTING TWO]


--******************************************************************
-- DRIVERS.ADS Excerpted Listing
-- A copyright-reserved, free use program.
-- (c)John H. McCoy, 1993, Sam Houston St. Univ., TX 77341-2206
--******************************************************************
with system, memory, unchecked_conversion;
with Types; use Types;

package drivers is

type DEV_CommandCodes is new byte;
 DeviceReadLong : constant DEV_CommandCodes := 128;
type DEV_ReturnCodes is new W;
 DeviceDone : constant DEV_ReturnCodes := Word_to_W(16#0100#);
type rhIoctlIns is record
 Status : DEV_ReturnCodes;
 reserved : bytes(6..13);
 MediaDesc : byte;
 CBPtr : DW;
 TransferCount : W;
 Start : W;
 VolIdPtr : DW;
 end record;
type rhXs (Command: DEV_CommandCodes := DeviceReadLong) is record
 case Command is
 when DeviceInit => Init : rhInits;
 when DeviceIoctlInput => IoctlIn : rhIoctlIns;
 when DeviceIoctlOutput => IoctlOut: rhIoctlOuts;
 when DeviceReadLong |
 DeviceReadLongPrefetch => ReadLong: rhReadLongs;
 when DeviceSeek => Seek : rhSeeks;
 when others => Other : rhOthers;
 end case;
 end record;
type rhs is record
 Length : byte;
 SubUnit : byte;
 rhX : rhXs;
 end record;
subtype pkts is bytes(1..rhs'size/8);

function Rhs_to_Pkts is new unchecked_conversion(Rhs, Pkts);
function Pkts_to_Rhs is new unchecked_conversion(Pkts, Rhs);
end drivers;
End Listings
















August, 1993
Integrating Desktop Mapping with Geographic Data


A programmer's toolkit for geographical information systems




Peter D. Varhol


Peter is an assistant professor of computer science and mathematics at Rivier
College in Nashua, New Hampshire.


At first glance, people working in fields such as geology, civil engineering,
aviation, telecommunications, sales, real estate, and insurance appear to
have little in common. Nonetheless, they do share the need to manipulate
geographical data, then integrate that data with electronic maps.
Conventional mapping software such as MapInfo meets some of their needs and,
for small geographical areas, maps can be added to AutoCAD drawings. Neither
approach, however, is flexible enough for you to customize maps to include
highly detailed information or to integrate maps with other applications.
What's really needed is a mapping engine that provides a programmer's toolkit
for combining mapping functions with traditional database or spreadsheet data.
TerraLogics' TerraView is a toolkit that does just that by providing a library
of C function calls for displaying and manipulating maps. The library works
with several different UNIX platforms (VAX/VMS, ULTRIX, SCO, and X Window),
Macintosh, Microsoft Windows, and text environments such as DOS (for which it
provides its own basic windowing systems). In addition to C, the library
functions can be called from Basic, Fortran, Pascal, and Cobol.


Putting TerraView to Use


To examine TerraView, I collected information on manufacturing facilities in
Southern New Hampshire. This is typical of a database maintained by a traffic
engineer who's looking at employers to determine rush-hour traffic patterns,
or who's looking for new manufacturing facility sites.
My project involved displaying facility locations and identifications on a
TerraView map of the area (TerraLogics provides street-level maps) as shown in
Figure 1. This required getting data to a TerraView application, bringing up
the appropriate map, and displaying the symbols in the proper locations. Since
TerraView doesn't include data management or full-fledged user interface
layout capabilities, it's necessary to either build them in by hand (a full
development effort), or to make use of already-existing database development
environments. I opted for the latter.
I chose Microsoft Access as the target platform for the database portion of
the application because of its powerful tools for passing data to external
functions and applications. To interact with a TerraView mapping application
from Access, I could either compile my TerraView application as a DLL and link
it to my Access application through the Call statement, or create a
stand-alone TerraView application and call it from Access using either the
RunApp or Shell instructions; see Figure 2.
Because I hope eventually to build DDE capabilities into the TerraView
application, I decided to take the RunApp approach. While the TerraView
library doesn't specifically support DDE, the library is a regular Windows
application, and the DDE support can be coded within the C-based application
environment. For the time being, I decided to interact the old-fashioned
way--by writing a file out to disk and calling my TerraView application
directly from Access. I was able to add DDE support by #include-ing DDE.H,
which comes with Borland's Turbo C++ 3.1. (Turbo C++ had no problems compiling
the Terraview shell as is.)


Preparing the Database Application in Access


I began by creating Access database tables representing the company name and
its location. While I had additional information (company CEO and primary
products, which I included in other tables), it was name and location data I
was going to plot on the map.
Because plotting by address involves a layer of complexity I hoped to avoid, I
represented company locations using approximate latitudes and longitudes
obtained manually from standard U.S. Geological Survey topographic maps
rather than by street addresses. Plotting by street addresses makes it
necessary to query the map to determine the range of addresses along a street,
then match the known address to an approximate location on the map. The amount
of code I'd have to write would be substantial. So that I might be able to
plot with street addresses at a later time, I included street address and
latitude/longitude in the database table. I considered writing my own
algorithm to perform the conversion in Access, but without using the map I
couldn't think of an appropriate computational approach.
I created Access input screens so that a user could easily enter, modify, and
delete members of the database. One button I included in my Access user
interface initiated the mapping sequence. The code behind this button created
a text file, wrote out the company name and its location to the file, and
called my TerraView application. This sequence was little more than a few
lines of code in the Access programming language. Once the Access application
was laid out, I turned to TerraView.


Inside TerraView


Creating a TerraView application is like creating any other Windows
application. The WinMain function provides the basic window, processes the
command-line arguments, and provides the event loop. TerraLogics includes a
simple WinMain shell with the package, which I used as the basis of my own
application.
While most of the other windowing routines come from TerraView, the actual
application (see Listing One, page 114) is a sometimes confusing mixture of
Windows and TerraView calls to drive the mapping functionality. Envision
TerraView as a set of library functions. These functions can do things such as
open windows, display maps, place locations (either latitude/longitude or
street addresses) on the maps, and incorporate text information into the place
locations. TerraView does all this (and more) from within the Windows
application that you provide.
After setting up the appropriate Windows routines and interpreting the command
line, I started the mapping portion of the application by initializing a
window and bringing up a blank map in it. Then I loaded a map. This can be
done in several ways. The WinMain function includes an argv call to read from
the command line, so a map can be loaded automatically at program launch.
Alternately, the shell provided by TerraLogics displays a simple command line
in the empty TerraView window, which prompts the user to load a map file.
Another, more elegant, approach is to create a Windows dialog box within the
TerraView shell that lets the user select a mapping file from the dialog file
list. Even for a developer with no experience in doing so, there's enough
sample code in Turbo C++ that adding a Select File dialog takes only a few
minutes. This is both a better-looking and easier way of prompting the user
for a map.
However, since the mapping application was to be opened from an Access
application, I supplied the name of the map from Access, and placed it into
the Call statement as an argument. Given a more robust database
application, it's possible to set up a dialog box to let the user select the
appropriate map, or even to have the application itself choose the proper map,
based on the locations included in the data to be mapped.
Maps are loaded as overlays onto the blank map set up with the window. It's
possible to load several different maps into the same window using overlays,
but the most common use of the overlay is to add features to a base map. These
features may include highways, railroads, natural features, or structures such
as buildings or power lines. I used an overlay to add company locations to the
map. It's also possible to include code so that the user can control the
addition, deletion, or layering of overlays from a pull-down menu.
The TerraView application reads the data file, using a simple fopen and scanf,
and loads the data into a structure inside the application. Next, the
TerraView calls in Example 1 place the appropriate symbol and accompanying
information onto a map overlay, which is then placed on the display. Like any
Windows application, the menu structures are set up in the application
resource (.RC) file. The menu operations are structured in the normal Windows
fashion in C, usually with a switch statement containing calls to the
appropriate functions. The functions are usually TerraView calls, although
they can also be interspersed with regular Windows calls.
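The data-file read described above can be sketched as follows, assuming one comma-separated "name, latitude, longitude" record per line; the struct, field order, and function name are illustrative, not TerraView's API:

```c
#include <stdio.h>

struct site {
    char   name[40];
    double lat, lon;
};

/* Read records from the interchange file Access wrote out, until
   the file runs out or `max` entries are filled. Returns the count
   of records loaded, or -1 if the file cannot be opened. */
static int load_sites(const char *path, struct site *out, int max)
{
    FILE *fp = fopen(path, "r");
    int n = 0;

    if (fp == NULL)
        return -1;
    /* " %39[^,]" skips leading whitespace, then reads the company
       name up to the comma; the two %lf fields take the position. */
    while (n < max &&
           fscanf(fp, " %39[^,], %lf, %lf",
                  out[n].name, &out[n].lat, &out[n].lon) == 3)
        n++;
    fclose(fp);
    return n;
}
```

Each loaded record would then be handed to the TerraView overlay calls shown in Example 1.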
The final step was to use TerraLogics' IconEdit utility to create a simple
triangle icon to represent the location of the companies on the map. IconEdit
is a simple pixel paint application that enables you to create custom graphics
to display on a map overlay. Presumably the bitmap painting capability in
Windows Paintbrush can also be used to create icons, although I wasn't able to
load a .BMP file into IconEdit.


Beyond Simple Mapping Applications


I've already discovered enhancements to add to the mapping application. In
addition to other ways of calling a mapping application and getting data into
the application, you can scale and rotate maps, select and manipulate specific
street segments, inquire about specific map features, and scroll a map.
For those of you who have used Maxis' SimCity, it is also possible (though
quite ambitious) to use TerraView to create a city or geographical region
simulation along these same lines. A TerraView application can be combined
with a discrete-event simulator to include models of automobile traffic for
examining traffic congestion. A city or regional planner could use such an
application to study the effects of new development on traffic, or to examine
the best locations for freeways or public transportation.
The best way to implement many of these enhancements would be via DDE. It
would be nice to have a TerraView library that brings the tedious work of DDE
to a higher level, making it more convenient for the programmer. Such a library
would be an ideal way to create a generic mapping application
that could display geographic data from a wide variety of existing
applications, such as spreadsheets, other relational database packages, and
even word processors.



Using TerraView with Commercial Applications


Developing a mapping application using TerraView isn't for the weak of heart.
It's a full-fledged Windows application, with the addition of numerous library
calls to open and display data on a map. It requires programming talent, along
with a normal application development effort, to bring this type of mapping to
the desktop.
The positive side of this cautionary statement is that TerraView is extremely
flexible. I can envision the libraries being used in three different ways.
First, users can construct a TerraView application, in much the same way I
did, and communicate with other applications via a DDE interface. The
TerraView application will probably be written to work with a specific
application or even single data set, although it is also possible to imagine a
generic mapping capability that can work with any data-driven application.
Second, TerraView calls may be compiled into a DLL and called by one or more
applications specifically for mapping functions. This approach may be provided
by the commercial application vendor, or it may be added on by a third party
or even the end user through DLL interfaces increasingly being included in
commercial products.
Third, and perhaps most interesting, the TerraView functions may be
incorporated seamlessly into a commercial application to include integrated
mapping. TerraLogics has a demo that includes a TerraView application
integrated as a user-defined SmartButton in Lotus 1-2-3 for Windows. While
this demo SmartButton macro simply calls an external TerraView application,
future versions of Lotus may be compiled with this feature built in.
The combination of the TerraView mapping engine and libraries, along with
inexpensive and widely available electronic maps provided by the Federal
government and by commercial vendors, may herald a new type of application for
handling geographic data. Not everyone needs geographic data, but of those who
do, many lack the technical ability to work with TerraView today.
However, commercial developers may, in time, make these capabilities available
to every user of ordinary commercial software.
 Figure 1: Map from the sample TerraView application.
 Figure 2: Calling the TerraView application from Access using the RunApp
macro.
Example 1: TerraView calls to place symbols onto map overlay.

 stat = TvBeginFeature(ovl_sym, 0, 0, name, id);
 stat = TvInsertSymbol(ovl_sym, symbol, 1, point);
 stat = TvEndFeature(ovl_sym);
 TvRefresh(map);



For More Information
TerraView
TerraLogics Inc.
76 Northeastern Blvd., Suite 25B
Nashua, NH 03062
603-889-1800

[LISTING ONE] (Text begins on page 84.)
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include <string.h>
#include <windows.h>
#include "TvTypes.h"
#include "Tv.h"
#include "windemo.h"

#define MAX_OVERLAYS 8
#define MAX_ARGS 10

struct WinEvent
 {
 HWND hWnd;
 int code;
 int x;
 int y;
 };

DOUBLE zoom_factor = 2.0;
FILE *companies;
int grid_up = 0, overlays = 0, first_screen = 1, curshow, stat;
char txtbuf[132];
TvWindowID nowin;
TvMapID map;
HANDLE hInst, hAccTable;

static DOUBLE proj_data[16];
static LONG proj_type = 0;
static TvOverlayID overlay_user[MAX_OVERLAYS], overlay_tmp = 0,
 locator = 0, grid = 0, compass = 0, scale = 0;
int PASCAL WinMain( HANDLE, HANDLE, LPSTR, int );
BOOL InitTvApplication( HANDLE );
TvWindowID InitTvWindow( VOID );
LONG FAR PASCAL TV_process( HWND, UINT, WPARAM, LONG );
BOOL FAR PASCAL Tv_About( HWND, UINT, WORD, LONG );

/***********************************************************************
This is a standard WinMain function that initializes the appropriate Windows
variables and checks and interprets the command line.
***********************************************************************/
int PASCAL WinMain( hInstance, hPrevInstance, lpCmdLine, nCmdShow )
 HANDLE hInstance;
 HANDLE hPrevInstance;
 LPSTR lpCmdLine;
 int nCmdShow;
 {
 TvWindowID win;
 MSG msg;
 LONG i, atol();
 int argc = 0;
 char *argv[MAX_ARGS], cmdline[80], *cp1, *cp2;

 if ( !hPrevInstance )
 if ( !InitTvApplication( hInstance ) )
 return( (int)NULL );
 hInst = hInstance;
 curshow = nCmdShow;

 strcpy( cmdline, "tv" );
 argv[0] = cmdline;
 argc++;

 for( cp1 = &cmdline[strlen(cmdline)+1], cp2=lpCmdLine;
 *cp2 && (argc <= MAX_ARGS);
 )
 {
 argv[ argc++ ] = cp1;
 while( *cp2 && (*cp2 != ' ') )
 *cp1++ = *cp2++;
 *cp1++ = '\0';
 if ( *cp2 == ' ' ) cp2++;
 }
 TvInitialize();

 /* Create blank map. All actual maps will be written as overlays to this. */
 if ( !(win = InitTvWindow()) ||
 !(map = TvCreateMap( win, "Map Demo Application" )) ||
 !(nowin = TvCreateWindow( TvNoDevice, 0.0, 0.0, 1.0, 1.0 )) )
 {
 printf( "Cannot create map!\n" );
 TvExit( 0 );
 }

/* If no map file is entered on the command line, this opens a simple prompt
 in the TerraView window and lets the user enter the name of a map. My
 application shouldn't display this, since I'm passing a map name from the
 Access database. */
 {
 char line[132];

 while( (overlays == 0) || line[0] )
 {
 TvPrompt( map, "Enter overlay name: ", line );
 if ( line[0] &&
 (overlay_user[overlays++] = TvReadOverlay( line )) == TvStatusError )
 {
 sprintf( txtbuf, "Cannot open overlay %s. Press ENTER to exit.", line );
 TvPrompt( map, txtbuf, line );
 TvExit( 0 );
 }
 }
 }
 /* Read overlays from command-line */
 while( (overlays < MAX_OVERLAYS) && (overlays < (argc-1)) )
 if ( (overlay_user[overlays] = TvReadOverlay( argv[overlays+1] ))
 != TvStatusError )
 overlays++;
 else
 {
 sprintf( txtbuf, "Cannot open overlay %s. Press ENTER to exit.",
 argv[overlays+1] );
 TvPrompt( map, txtbuf, txtbuf );
 TvExit( 0 );
 }
 /* Now add the overlays to the blank map created above */
 for( i = 0; i < overlays; i++ )
 TvAddOverlay( map, overlay_user[i] );
 proj_type = TvInqOverlayProjection( overlay_user[0], 16, proj_data );

 /* Generate a grid, compass, and scale overlays */
 grid = TvGenerateGrid( TvGridStyleLabeled );
 compass = TvGenerateCompass( TvPositionBottomRight );
 scale = TvGenerateScale( TvPositionBottomRight );
 TvAddOverlay( map, compass );
 TvAddOverlay( map, scale );

 /* Display the map on the screen */
 TvRefresh( map, TvAllOverlays );

/* This reads the file and puts the data onto the map overlay. */
companies = fopen("outfile.txt", "r");
while( !feof( companies ) )  /* loop over company records in the file */
{
stat = TvBeginFeature(ovl_sym, 0, 0, name, id);
stat = TvInsertSymbol(ovl_sym, symbol, 1, point);
stat = TvEndFeature(ovl_sym);
TvRefresh(map);
}
/* This is a very simple main event loop that is used to process menu
selections and mouse movements. */
 while( GetMessage(&msg, (HWND)NULL, 0, 0) )
 {
 TranslateMessage( &msg );
 DispatchMessage( &msg );
 }
 return( msg.wParam );
 }
/* InitTvApplication initializes window data and registers window class. */
BOOL InitTvApplication( hInstance )

 HANDLE hInstance;
 {
 WNDCLASS wc;
 wc.lpszClassName = "TerraView";
 wc.hInstance = hInstance;
 wc.lpfnWndProc = TV_process;
 wc.style = CS_HREDRAW | CS_VREDRAW;
 wc.hbrBackground = GetStockObject( BLACK_BRUSH );
 wc.hCursor = LoadCursor( (HINSTANCE)NULL, IDC_ARROW );
 wc.hIcon = LoadIcon( hInstance, (LPSTR)"TvIcon" );
 wc.lpszMenuName = "ViewMenu";
 wc.cbClsExtra = sizeof( LONG );
 wc.cbWndExtra = sizeof( LONG );
 return( RegisterClass( &wc ) );
}
/* InitTvWindow() creates MS Window and register it with TerraView */
TvWindowID InitTvWindow( VOID )
 {
 HWND hWnd;
 TvWindowID win;
 DOUBLE left, right, top, bottom;

 hWnd = CreateWindow( (LPSTR)"TerraView", (LPSTR)"TerraView for Windows",
 (DWORD)WS_OVERLAPPEDWINDOW, 0, 0,
 GetSystemMetrics( SM_CXSCREEN ),
 GetSystemMetrics( SM_CYSCREEN ),
 (HWND)NULL, (HMENU)NULL,
 (HANDLE)hInst, (LPSTR)NULL );
 if (!hWnd) return( (TvWindowID)NULL );
 win = TvUseWindow( TvDefaultDevice, (void *)&hWnd );
 TvPDCtoNDC( TvDefaultDevice, GetSystemMetrics( SM_CXFRAME ),
 GetSystemMetrics( SM_CYFRAME ) + GetSystemMetrics( SM_CYCAPTION ) +
 GetSystemMetrics( SM_CYMENU ), &left, &top );
 TvPDCtoNDC( TvDefaultDevice, GetSystemMetrics( SM_CXFRAME ),
 GetSystemMetrics( SM_CYFRAME ), &right, &bottom );
 TvSetWindowBorderSize( win, left, right, top, bottom );
 ShowWindow( hWnd, curshow );
 curshow = SW_SHOWNORMAL;
 UpdateWindow( hWnd );
 return( win );
 }
/* TV_process(HWND, UINT, WPARAM, LONG) processes messages */
LONG FAR PASCAL TV_process( hWnd, message, wParam, lParam )
 HWND hWnd;
 UINT message;
 WPARAM wParam;
 LONG lParam;
 {
 FARPROC lpProcAbout;
 struct WinEvent event;

 event.hWnd = hWnd;
 switch( message )
 {
 case WM_CREATE:
 hAccTable = LoadAccelerators( hInst, "FunctionKeys" );
 break;
 case WM_DESTROY:
 PostQuitMessage( 0 );
 break;

 case WM_COMMAND:
 switch( wParam )
 {
 case IDM_EXIT:
 case IDM_EDIT:
 case IDM_HELP:
 MessageBox( GetFocus(), " ",
 "TerraView Mapping Application", MB_ICONASTERISK | MB_OK );
 break;
 case IDM_ABOUT:
 lpProcAbout = MakeProcInstance( Tv_About, hInst );
 DialogBox( hInst, "AboutBox", hWnd, lpProcAbout );
 FreeProcInstance( lpProcAbout );
 break;
 }
 case WM_PAINT:
 event.code = KBD_WINDOW_RESIZE;
 event.x = event.y = 0;
 TvQueueEvent( TvDefaultDevice, &event );
 TvResizeWindows( TvDefaultDevice );
 return( DefWindowProc( hWnd, message, wParam, lParam ) );
 case WM_SIZE:
 TvResizeWindows( TvDefaultDevice );
 if (map) TvRefresh(TvDefaultDevice, TvAllOverlays);
#if 0
 case WM_PAINTICON:
#endif
 case WM_QUIT:
 case WM_CLOSE:
 case WM_SYSCHAR:
 case WM_SYSDEADCHAR:
 case WM_DEADCHAR:
 return( DefWindowProc( hWnd, message, wParam, lParam ) );
 case WM_CHAR:
 event.code = wParam;
 event.x = event.y = 0;
 TvQueueEvent( TvDefaultDevice, &event );
 break;
 case WM_LBUTTONDOWN:
 event.code = KBD_MOUSE1_DOWN;
 event.x = LOWORD( lParam );
 event.y = HIWORD( lParam );
 TvQueueEvent( TvDefaultDevice, &event );
 break;
 case WM_LBUTTONUP:
 event.code = KBD_MOUSE1_UP;
 event.x = LOWORD( lParam );
 event.y = HIWORD( lParam );
 TvQueueEvent( TvDefaultDevice, &event );
 break;
 case WM_MBUTTONDOWN:
 event.code = KBD_MOUSE2_DOWN;
 event.x = LOWORD( lParam );
 event.y = HIWORD( lParam );
 TvQueueEvent( TvDefaultDevice, &event );
 break;
 case WM_MBUTTONUP:
 event.code = KBD_MOUSE2_UP;
 event.x = LOWORD( lParam );

 event.y = HIWORD( lParam );
 TvQueueEvent( TvDefaultDevice, &event );
 break;
 case WM_MOUSEMOVE:
 event.code = KBD_MOUSE_MOVED;
 event.x = LOWORD( lParam );
 event.y = HIWORD( lParam );
 TvQueueEvent( TvDefaultDevice, &event );
 break;
 default:
 return( DefWindowProc( hWnd, message, wParam, lParam ) );
 }
 return( (LONG)NULL );
 }
/* Tv_About processes messages for "About" dialog box */
BOOL FAR PASCAL Tv_About( hDlg, message, wParam, lParam )
 HWND hDlg;
 UINT message;
 WORD wParam;
 LONG lParam;
 {
 switch (message)
 {
 case WM_INITDIALOG:
 return (TRUE);
 case WM_COMMAND:
 if ( (wParam == IDOK) ||
 (wParam == IDCANCEL) )
 {
 EndDialog( hDlg, TRUE );
 return (TRUE);
 }
 break;
 }
 return (FALSE);
 }
End Listing

























August, 1993
A Quick Port with QuickWin


Moving your graphics app from DOS to Windows has its payoffs--and pitfalls




Al Williams


Al is the author of several books, including DOS 6: A Developer's Guide (M&T
Books,1993) and Commando Windows Programming (forthcoming from
Addison-Wesley). He can be reached at 310 Ivy Glen Court, League City, TX
77573 or on CompuServe at 72010,3574.


If you have a DOS program, chances are you've thought of porting it to
Windows. However, rewriting your entire application to work with Windows can
be difficult and expensive. To ease that burden, Microsoft's Visual C++
provides QuickWin, a library that allows you to port many text- and
graphics-based DOS programs to Windows. But, it's not a panacea--some programs
require major changes to work with QuickWin. To examine the QuickWin library,
I decided to dust off Turtle, a Turtle-graphics program I developed in a
previous DDJ article; see "Programming with Phar Lap's 286DOS-Extender," (DDJ
February, 1992). Turtle uses Microsoft C with Phar Lap's 286DOS Extender.
Since QuickWin allows programs to access more than one Mbyte (thanks to
Windows), it shouldn't make much difference that the program originally used a
DOS extender. However, Turtle uses some techniques that won't work with
QuickWin. In this article, I'll discuss what it took to port Turtle (a
moderately complicated Microsoft C program) to QuickWin.


About QuickWin


In its simplest form, QuickWin creates a single text window inside a
multiple-document interface (MDI) frame window. This window is a surrogate for
the DOS console device. All output to stdout and stderr appears in the window.
Keyboard input appears as data in the stdin stream. When your program exits,
QuickWin leaves the window on the screen until the user closes it.
Simple programs that use stdin and stdout for I/O work well with QuickWin.
However, there are some restrictions:
 You can't spawn other programs.
 QuickWin doesn't allow you to call the Windows API.
 Your program can't use console I/O functions such as getch().
 You can't control the buffering of stdin--you can't read a single character
of input.
 You can't directly access the screen.
 The QuickWin menu can't be changed.
 QuickWin's help can't be changed.
If you have source code that you can compile and run with basic QuickWin, it
should also compile for DOS or UNIX. If your program uses the Microsoft
graphics library, QuickWin will create a graphics-output window to contain
your program's graphics. Once your program works with these default windows,
the next step is to use QuickWin-specific features to make improvements to
your program.
Table 1 shows QuickWin's enhanced calls (all are defined in STDIO.H). Some
calls pertain to the QuickWin environment. The _wabout() function, for
example, sets the text that appears in the about box. The _wsetexit() function
controls how QuickWin responds when the program exits. By default, QuickWin
retains the program's windows until the user closes them explicitly. With
_wsetexit() you can override this behavior so that your windows close upon
exit.
Other QuickWin calls allow you to create new text and graphic windows. You can
also change the size, position, and text capacity of any text window. If you
use _wopen(), QuickWin returns a file handle that corresponds to a new text
window. You pass this handle to other calls (like _wsetsize()) to refer to the
window. You use file handles with calls like read() and write(), not calls
like fprintf(), which take a stream pointer. To get a stream pointer for a
window, use _fwopen() instead of _wopen(). You can get a file handle from a
stream pointer using _fileno().
Both _wopen() and _fwopen() take three arguments. The first two are structures
that determine the window's title, text capacity, size, and position. The
third argument determines the access mode to the pseudofile.
By default, a QuickWin window can hold 2048 characters (a little more than a
25x80 display). However, you can change this limit when you create a window or
by using _wsetscreenbuf(). You can set the text capacity by using a specific
number, or you can use one of two special constants: _WINBUFDEF or _WINBUFINF.
The _WINBUFDEF constant sets the default buffer size (2048); _WINBUFINF sets
no limit on the window's text capacity. If the window contains more text than
it can display, QuickWin automatically provides and manages scroll bars.
You can close a window using _wclose(). If you pass _WINNOPERSIST as an
argument, QuickWin removes the window from view when you close it. If you want
the window to remain, you can pass _WINPERSIST, instead. If you want your
program to have a custom look, you might want to close the stdout window by
specifying _wclose(_fileno(stdout),_WINNOPERSIST).
Graphics windows are not quite as sophisticated as text windows. You can open
a new graphics window by calling _wgopen(). This call returns a file handle.
You then have to activate the new window with _wgsetactive(). You can close a
graphics window with _wgclose(). Closing a graphics window always causes it to
disappear. Like text windows, you can close the default graphics window so
that you can open a new one with your own title by specifying
_wgclose(_wgetactive()). You can't control the size or position of graphics
windows. You can set any legal video mode using the standard _setvideomode()
call. However, the window may be larger than the virtual graphics screen, and
this can be disconcerting. For example, if you fill a 640x480 graphics mode
with a color, it will look odd if the window is 700x500 pixels. The extra
pixels form a border that doesn't fill.


About Turtle


Turtle is a graphics language similar in spirit to Logo. The original Turtle
program only supports the VGA's 320x200x256 mode. Since it uses Phar Lap's
286DOS Extender, Turtle takes advantage of DLLs to provide an extensible
command set. Although most Turtle commands are built in, you can add
additional commands with a DLL. The default Save and Load commands are in a
DLL.
Turtle's commands are straightforward. You should be able to figure out its
operation by referring to the online help. If you want more details about the
original Turtle program, see the article in the February 1992 issue. To avoid
confusion, I'll refer to the original version as Turtle 1.0 and the new
Windows version as Turtle 2.0. The source code for the QuickWin version of
Turtle is available electronically; see "Availability," on page 5. The source
files are the same as those for Turtle 1.0 (also available electronically),
although there are many significant changes.


Problem Areas


Porting Turtle to QuickWin presents several immediate challenges. The original
program switches between a text and graphics screen. To do this, Turtle 1.0
directly accesses video memory. It also reads and sets the text screen's
cursor position. None of these are possible under QuickWin. But thanks to
Windows, sometimes less is more. Since QuickWin uses different windows for
graphics and text, the new Turtle adopts this model for output. All the
cursor-positioning code and most of the direct screen accesses vanish with
separate windows. The gotext() function, which switches to text mode, becomes
an empty stub; gograph() initializes the graphics window the first time you
call it. Subsequent calls to it do nothing. The Show command becomes
unnecessary with this scheme, so Turtle no longer supports it. The only other
direct screen accesses occur when you save or load the screen to a file or
buffer. Since these operate on the graphics window, we can use Microsoft's
standard _putimage() and _getimage() calls to effectively read and write the
graphics window.
Huge pointers. While _putimage() and _getimage() will work to read and write
the graphics screen, they present a problem for the new version of Turtle.
Turtle 1.0 operates on a 320x200x256 screen. Therefore, it always deals with
blocks of memory exactly 64,000 bytes long. The Microsoft calls require more
than 64K to store an entire 320x200x256 screen, due to some overhead in the
image (more about that later). Although QuickWin programs can malloc() large
amounts of memory, they still suffer from the 64K-per-object limit. Therefore,
the new Turtle has to resort to huge pointers to deal with screen buffers. You
can learn the exact size of a screen buffer from the _imagesize() function.
Help windows. Turtle 1.0's help system uses getch() to pause between help
screens. QuickWin, of course, doesn't support any console I/O calls. Again,
less is more--the new Turtle opens a separate window to contain the help
information. Turtle sets the help window's buffer size to _WINBUFINF, so the
window will retain all text sent to it. You can use the window's scroll bars
to browse the help information. Subsequent help commands simply bring the
existing help window to the foreground. Turtle 2.0 doesn't need getch().
That's good, because I couldn't find a direct way to replace getch() or
kbhit().
The help system causes each command to print its own help to the stdout
stream. Therefore Turtle needs a way to redirect stdout to the help window.
Ordinarily, you would use the dup() and dup2() calls to redirect a stream.
However, you can only pass QuickWin's special file pointers to a few
functions. Unfortunately, dup() and dup2() are not among them. Turtle resorts
to the method shown in Figure 1 to change the stdout stream.
DOS shell. Another problem is that the original Turtle could run a DOS shell
and a text editor, and display a DOS directory. QuickWin doesn't allow any of
these operations. However, under Windows, these are not very important. You
can always switch to a DOS window or an editor by using Windows. Turtle 2.0
simply deletes these commands.
Interrupting execution. Turtle can execute script files. While a script file
is running under Turtle 1.0, you can press Control-Break to interrupt
execution. Under QuickWin, you can't catch Control-Break. If Turtle gets stuck
in an endless loop, you're simply out of luck. However, the new Turtle will
break out of all running scripts if you enter a Q (upper or lower case) while
Turtle is waiting for input (the %i special variable).
DLLs. Even though Windows supports DLLs, QuickWin doesn't allow you to call
the Windows API. Therefore, I didn't even attempt to port Turtle's DLL system
to QuickWin. Since the system depends on some Phar Lap-specific functions, it
probably would have been difficult to port anyway. The new Save and Load
commands, of course, have to move to TCMDS.C.



Other Changes


The new Turtle program sports several minor changes. The Delay command, for
example, couldn't use the special Phar Lap interrupt calls. Some variables and
functions that Turtle no longer needs were deleted. However, some things from
the old program still remain, but are unused. For example, Turtle 1.0 uses
large model, but Windows can't load multiple copies of programs that use large
model. In an ordinary Windows application, you could prevent multiple copies
from running, but not in QuickWin. It was a simple matter to change Turtle to
use small model. Only the screen buffers need huge pointers, anyway. This
prevents the new Turtle from using fread() and fwrite() inside the Save and
Load commands, however.
Some changes were not strictly necessary, but were worth the small effort
required. By default, when your program exits, QuickWin retains your windows
until the user closes them. By calling _wsetexit(_WINEXITNOPERSIST), QuickWin
will close your windows when you exit, which is a more natural method for
Turtle.
By default, while Turtle is running, other Windows programs can't execute.
(Remember, Windows multitasking is cooperative.) To make Turtle more polite,
the XCI module calls _wyield() periodically while processing scripts. This
allows other programs to run while Turtle is performing lengthy operations.
Since Turtle can no longer directly access graphics memory, it doesn't depend
on the 320x200x256 mode anymore. As an experiment, I added a mode command that
allows you to try different graphics modes. However, Turtle always assumes
256-color mode, regardless of the mode you are using.


Turtle in Action


Turtle certainly looks good as a Windows application; see Figure 2. By using
the standard QuickWin menus, you can copy text or graphics to the clipboard,
and manage the child windows as you would expect in a MDI application. Of
course, the help menu only explains QuickWin's operation--you can't customize
it. Also, Microsoft doesn't document this, but under QuickWin, _putimage() and
_getimage() operate on standard Windows bitmaps. That means that Turtle can
now read and write BMP files! This was certainly an unexpected bonus; however,
QuickWin may not set the color palette correctly for some files.
Prior to Visual C++, QuickWin did not support graphics. While the graphics
support is welcome, it is also a major weak point in the package. When a
window is larger than the virtual graphics screen it represents, the effect is
disconcerting. The user (but not the program) can tell QuickWin to scale the
graphic to fill the window, but this slows processing and can distort the
image. You can't control the placement or size of graphics windows from within
your program.
Considering its complexity, Turtle was surprisingly easy to port to QuickWin.
Of course, tweaking the design--creating separate text and graphics windows,
for example--helped ease the transition. These changes improved Turtle's look
and feel, too. If Turtle had originally been a true DOS program instead of a
DOS-extended program, the DLL use would not have been a problem. Also, the
break handling was a disappointment. The workaround--detecting the Q key on
input--isn't a very good solution. If your program sticks in a loop that
doesn't do input, you won't have any recourse but to close the application.


Is QuickWin for You?


Turtle is probably about as complex a program as you would want to port to
QuickWin. Even though the port was successful, there were a few quirks. If you
just want to get your DOS program over to Windows as soon as possible,
QuickWin works well. However, if you plan on continuing development of your
program under Windows, QuickWin is essentially a dead end. Although you can
make certain improvements (separate output windows, for example) to a QuickWin
program, you can't call the Windows API. Once you move your program to
QuickWin, you can't make many Windows-oriented improvements.
By using a custom DEF file, you can bind a DOS version of your program with
the QuickWin version. Then you will have a single executable (made from the
same source code) that runs under DOS or Windows. Consult the Microsoft Visual
C++ documentation for more details about binding DOS programs to Windows
applications.
Certainly QuickWin is not a long-term answer to developing Windows
applications. Still, it will make your DOS program behave better under Windows
(especially in standard mode) and give it a better look and feel. Your
QuickWin program will have access to more memory than a DOS program.
Nevertheless, QuickWin is no DOS extender (although Windows actually is).
QuickWin is probably too limited to port most real DOS applications to
Windows, but it's useful for writing quick programs that you would ordinarily
write for DOS.


References


Microsoft Visual C++ Version 1.0 Reference Manual. Microsoft Corp., 1993.
Williams, Al. "Programming with Phar Lap's 286DOS-Extender." Dr. Dobb's
Journal (February, 1992).
Williams, Al. Commando Windows Programming. Reading, MA: Addison-Wesley, 1993.
Table 1: QuickWin Calls.
 Function Description
 _wopen() Opens text window; returns file handle.
 _fwopen() Opens text window; returns file pointer.
 _wgopen()* Opens graphics window.
 _wclose() Closes text window.
 _wgclose()* Close graphics window.
 _wsetexit() Sets exit behavior.
 _wgetexit() Gets exit behavior.
 _wsetsize() Sets text window's position and size.
 _wgetsize() Gets window's position and size.
 _wsetscreenbuf() Sets text window's screen-buffer size.
 _wgetscreenbuf() Gets text window's screen-buffer size.
 _wsetfocus() Makes window active.
 _wgetfocus() Gets current active window.
 _wgsetactive()* Sets active graphics window.
 _wggetactive()* Gets active graphics window.
 _wmenuclick() Simulates menu action.
 _wyield() Yields processing time to other applications.
 _wabout() Sets custom message for About box.
 _inchar()* Reads a character in a graphics window.
 *Graphics-only functions
Figure 1: Redirecting stdout to the help window.
FILE old_stdout; /* Not a pointer */
FILE *helpwin; /* set by _fwopen() somewhere else */
old_stdout=*stdout; /* save stdout */
*stdout=*helpwin;
printf("This goes to help window");

/* go back to old stdout */
*stdout=old_stdout;


 Figure 2: QuickWin version of Turtle.

























































August, 1993
PROGRAMMING PARADIGMS


Apple Changes its Stripes




Michael Swaine


I would like to respectfully suggest that we have had enough research into the
harmful effects of tobacco smoking.
I make this suggestion for two reasons.
First, we need the Federal revenue produced by cigarette taxes to reduce the
deficit. There seems to be no practical limit on cigarette taxes. Smokers will
basically pay whatever they have to pay for cigarettes.
Don't believe it?
Consider: Yes, we've all heard ex-smokers claim that they just couldn't afford
it any more. But have you ever met an ex-smoker for whom you truly believed
that price was the decisive factor in kicking the habit? There are so many
other reasons to quit smoking: social ostracism, the need to climb stairs, the
health of family members, inconvenience, the spot on the X-ray, dead
relatives. According to my research, there has never been a case of a smoker
giving it up solely because of the cost. But what about cost as a contributing
factor?
Consider: Canada has raised taxes on cigarettes with absolutely no effect on
cigarette sales. Granted, it did create a thriving black market in cigarettes
imported from the Lower 48. But adjustments to U.S. and Mexican tax codes and
a little tweaking of the North American Free Trade Agreement should take care
of Canada's black market problem and head off any such problem for the U.S.
Consider: Nicotine is an addictive drug, and the street price of other
addictive drugs suggests that addicts will always find the money to support
their habit. My research suggests that a tax of $100 per pack would have zero
effect on cigarette consumption, and would bring in an extra $0.3 trillion per
year in Federal revenues. (Or, to put it in more meaningful terms, it would
pay for 20 haircuts per second for the President for the duration of his
term.)
What does seem to have some slight effect on consumption is all these
discouraging studies. Eliminating them and instituting a $100/pack tax would
put quite a dent in the deficit. That's my first reason. Lest it appear that I
am placing economic issues above public-health issues, my second reason will
make clear that I am very concerned about public health.
Reason two: We should take the money currently being spent on research into
the harmful effects of tobacco smoking on physical health and spend it on
research into the harmful effects of tobacco growing on mental health. This
area of research has been shamefully neglected, and it's obviously a serious
problem.
What other explanation could there be for Senator Jesse Helms?
All the examples of Helmsmanship look pretty much the same. Here's the latest,
as of press time.
1. Helms effectively admits to a journalist that his opposition to a HUD
undersecretary nominee is based on his own intolerance. (Actually, Helms
invited the journalist to characterize his attitude as bigotry, but I will not
take up the invitation. While I find Helms and his views as repugnant as rat
vomit, I see no need to be insulting.)
2. Days later, Helms accuses this nominee of intolerance. His reason,
translated into English, is that the nominee is an unabashed civil rights
activist. In Helms logic, being an activist for civil rights ipso facto makes
one intolerant. Intolerant of intolerance, you see.
Sounds like the same logic that says that all animals are equal but some are
more equal than others, as articulated by a pig in George Orwell's Animal
Farm. There are Orwellian pigs in every state in the union, but only North
Carolina pigs are so well represented in the Senate of the United States.


MacOpen


The Senator from North Carolina probably wouldn't care for what's going on in
operating systems today. The key word is "open". PowerOpen, Open Systems
Foundation, everything is openness. Especially at Apple, where, it would seem,
a long-standing tradition of proprietariness has been trash-canned.
The most dramatic evidence of how far things have gone was a statement made by
Apple's Senior Vice-President and General Manager of the Macintosh Software
Architecture Division and the Advanced Technology Group, David Nagel, at
Apple's Worldwide Developer's Conference this May.
Nagel said, "From now on, no major technology will be introduced from Apple
that is not cross-platform. The goal is near-simultaneous release."
Now, there are at least three ways one could read this statement.
You could take it to mean that Apple will henceforth follow a model of
developing system tools and making them simultaneously available on the Mac
and on Windows. This seems to be a safe reading, supported by a recent
announcement from Kaleida, the Apple-IBM joint venture in multimedia
technology. Kaleida decided to delay the release of its ScriptX multimedia
scripting technology until next year so that it could be released
simultaneously on Mac and Windows.
You could also take Nagel's statement to mean that Apple is decoupling its
operating system and user-interface software from its hardware. I'd say that's
a safe reading, too. In a dramatic break with the past, Apple is licensing its
operating system technology to others, and porting or supporting the porting
of the Mac OS and/or GUI to DOS, Windows, and various UNIX platforms.
You could also read Nagel's statement to mean that Apple is getting out of the
hardware business. Well, couldn't you? "From now on...no major
technology...that is not cross-platform." Ever heard of a cross-platform
computer? Taken literally, Nagel's statement means either that Apple is out of
the hardware business as of today, or that any new machines it introduces in
the future will not be major technology.
But this is nonsense. Apple will be introducing PowerPC Macs within the next
year, based not only on a new chip family (the PowerPC) but on a different
chip technology (RISC vs. CISC). That's surely new hardware technology. Nagel
didn't mean that Apple was through pushing iron, and he didn't mean that its
hardware releases wouldn't be "major technology." Anyway, Nagel is a
system-software guy; he was clearly talking about his domain, not all of
Apple.
But the assertion wasn't qualified in any way. So isn't it just a little
suggestive that it didn't raise any eyebrows within Apple? This major
announcement surely passed under many eyes before being delivered at the
conference. Yet nobody thought to mention that Apple was still, of course,
going to produce machines, for a while at least. Which shows, perhaps, where
their minds are. Apple really is starting to think of itself as something
other than a computer manufacturer. It's comforting, I guess, to know that
some things don't change, and that the Apple corporate mind remains farther
out on the curve than the Apple market reality.
In any case, the uncoupling of the Macintosh operating system and user
interface from the Mac hardware is radical. What does it mean?


MacDOS


First, there's the Apple-Novell plan to put the Mac operating system and UI on
Intel-based machines. This sounds like it would have been a stunning coup for
Apple two years ago, but today it seems a little late to challenge Windows on
its home court. What's the deal?
Well, it's supposedly a plan to offer an alternative to DOS users who are
ready to bite the bullet and move up to a graphical UI, or Windows users
considering the move to NT. It's due out in the first half of '94, and here's
what it consists of:
The Mac Finder and Toolbox, displaying a System 7 interface, running on top of
Novell's DR DOS, with the Mac file system mapped onto the DOS file system, the
whole thing capable of running on 386, 486, and Pentium boxes, and supporting
multitasking of Mac and DOS applications. Not Windows apps, at least not yet.
And that's not just existing DOS applications, but ported Mac applications. So
the deal for developers is this: You port your Macintosh apps to this
Intel platform and you'll have an expanded market for them. Since the Toolbox
and some other Mac components are already ported, the port won't be too
daunting a job, and Apple and Novell will explain to you what steps you need
to take to do the port.
I'm sorry, but I just don't see this as a terribly compelling deal for
application vendors.
If they are considering porting their apps to Windows, they now get to compare
the currently nonexistent market for applications on this MacDOS platform with
the existing huge market for Windows apps, weighed against the relative
porting costs. Unless the port is very easy or Apple and Novell push this
platform very strenuously, it's hard to see the comparison working out in
Apple's and Novell's favor. Vendors will port to Windows first.
And if the vendors have already ported to Windows, then any success that this
platform has in cutting into the Windows market will arguably be at the cost
of the application vendors' sales on Windows. If they do find it necessary to
port to what I'm calling "MacDOS," it will be grudgingly, to win back market
share that they were already counting as theirs. Not an inviting scenario.
There are scenarios for success with this thing. If porting is extremely easy
and if enough vendors port their apps to it soon enough that Apple is able to
say convincingly, "You can now have the virtues of the splendiferous Macintosh
interface rather than that inferior copy of haphazard design and inconsistent
execution, and run true Macintosh applications on those inelegant dirt-cheap
clone boxes from the Orient that you're so fond of," then maybe there's a
decent-sized market. And that market would be even more decent-sized if Apple
and/or Novell could figure out how to give an affirmative answer to what you
just know will be the first question that anybody who owns an Intel machine
will ask about this platform; to wit: "Will it run [fill in your favorite
Windows app]?"
But how likely is any of that?


Macnix



Then there are the plans, very plural, to port the Mac system and/or interface
to UNIX.
First of all, there's Apple Services for Open Systems. This approach lets Mac
applications run without modification on UNIX platforms in an X window. The
pitch for the user is, you keep the benefits of the UNIX operating system, and
the existing pool of Mac applications is open to you as well.
Second, there's the plan to port the Toolbox and APIs to various UNIX
platforms. This would mean, if I've got it right, that in addition to porting
existing apps to UNIX, you could write something that is obviously a Mac app,
but that's written specifically for the UNIX platform you're working on. A
weird trip. The pitch to the developer is, you want to work in UNIX but you
like QuickDraw or AppleShare or Apple Open Collaborative Environment? We'll
sell you the technology you want, ported to your particular hardware platform.
Then there are the Quorum approaches. Quorum Software Systems (Menlo Park,
California) has two products that let Mac software run on UNIX machines. One
is a $6,000 porting tool named Latitude; it lets you port Mac apps to UNIX
without touching source code and reportedly delivers good performance. The
other is a $695 end-user product that lets several specific popular Mac
applications run on UNIX systems, using the host system's user interface;
Motif, for example.
If you're one of those who thinks that the worst thing about the Mac is the
operating system, then some of these UNIX moves may sound a lot more
interesting than the Apple-Novell plan of grafting the Mac UI onto DOS.


MacRISC


One almost gets the feeling that these ports of major pieces of the Macintosh
to existing platforms are just practice for the main event, which is the port
to the PowerPC. In fact, Apple is telling vendors that making their apps
portable to the Intel platform will help in porting them to the PowerPC. This
IBM/Apple/Motorola-engendered RISC chip is Apple's future (and not so distant
future, at that) hardware platform. At the developer's conference, Apple
executives were gloating that PowerPC was delivering better performance than
expected, and drawing a lot of comparisons with the bigger, doubtless more
expensive, more heat-producing Pentium. More than double the Specmarks per
watt was one comparison. Apple expects to sell a million PowerPC Macs in '94;
this is the first mass-market delivery of RISC technology, and Apple really
plans to mass-market it.
Unported Mac software can run on Apple's PowerPC machines via emulation, an
approach that has rarely in the past produced acceptable performance on any
platform. But at the developer's conference, Apple demonstrated Mac software
running with impressive speed in 680x0 emulation mode on a PowerPC machine.
Because the Toolbox has been ported to the PowerPC, and because of the
advantages of the PowerPC over existing Motorola CPUs, the emulated software
can actually run as fast as or faster than it would run on a fast 68040 Mac.
If ported rather than emulated, it could run several times faster.
There are currently two ways to port Mac apps to the PowerPC. EchoLogic's
FlashPort is a direct binary-to-binary translator from 680x0 code to PowerPC
code. That doesn't sound ideal, but Apple is using it to port elements of the
Mac OS to PowerPC, with apparently excellent results. And Apple and Symantec
should have a C compiler for PowerPC out about the time you read this. Pascal
programmers will have to use FlashPort or wait longer, I guess. And then
Bedrock is the next step. It'll be possible to get your hands on Bedrock late
this year; the final release is scheduled for mid-'94.
Bedrock is the replacement for Apple's unloved MPW, which is going away
unmourned. Bedrock is Symantec's and Apple's C++ application framework for
writing portable applications. The idea is that you'll have a single source
for your entire program, including all resources, and it'll be written with
Bedrock, and that source will compile to a Macintosh application for a 680x0
Mac, a Windows application for an Intel machine, or a PowerPC Mac application
for a PowerPC Mac. It's important to note that: 1. Apple intends this to be
used not only by commercial developers but also by corporate developers; and
2. Mac and PowerPC Mac and Windows platforms are just the first platforms that
Bedrock is intended to support.
Bedrock isn't Apple's (or Symantec's) last word on multiplatform development.
Bedrock has been designed to let you use the parts of it you like and bypass
it where there is some advantage in doing so. Apple and Symantec are also
working on various native PowerPC development tools, including, I suspect,
MacApp. Some of these will be out this year.


MacNeXT?


Why all this porting madness all of a sudden? Is it just a prelude to PowerPC,
or is there another reason? Maybe, mmm, oh I don't know, could it be--SATAN?!?
Yep, Apple's responding to the latest campaign by the Evil Empire to take over
the world.
And one of the most dramatically open technologies is a direct response to
Microsoft's OLE. (Well, some might call it a response to NeXTstep.)
I'm talking about Amber.
Amber is a technology for the development of a new form of software product
for a new model of computer use. The model is document-centered computing, and
the form of software is the component tool.
Document-centered computing means that, rather than working with application
programs that support specific kinds of documents, the user works with a
document and calls up particular tools or suites of tools to operate on
different kinds of data in that document. The user never leaves the document
to get something done. Any document can support any kind of data. All tools
are editing tools of some sort, and all editing tools are usable in all
documents.
This model implies a different form of software product, the small, focused
tool or suite of tools, as opposed to the monster app. Since the
document-centered model takes away from the third-party developer most of the
interface and all of the integration of capabilities, it pretty much destroys
the whole concept of the application as we know it today. Writing
document-centered software tools won't be hard, but it will take some guidance
and specialized tools from Apple, and that's what Amber is about.
Amber is like Microsoft's OLE and will compete with it, but there are a couple
of interesting twists. First, Amber will fully support OLE. Second, Amber
source code will be public domain. Apple is going to release all source for
any use without restriction. Apple is setting up a nonprofit Amber association
to promote Amber, discuss standards, port the code, and the like. Amber is
intended to be available for Windows and UNIX as well as Mac. The Mac source
code will be seeded to developers by the end of this year.
The final twist is that Amber itself, Apple admits, is just a stepping stone
to programming for Taligent, Apple's next operating system.
Next month: Satan's side of the story.



August, 1993
C PROGRAMMING


Embedding a Keyboard and Singin' the Blues


It's August, the annual C issue and my anniversary. I began writing this
column five years ago this month. I'd regale you with the story of my rise to
eminence, but at a recent conference a young reader took me to task for my
occasional trips down memory lane. "Who cares about the good old days?" he
wanted to know. I realized then that someone too young to have a past might
very well have no appreciation or reverence for tradition and nostalgia.
So, instead of an inconsequential reminiscence, I'll tell a simple story that
leads into this month's project, in which I'll build a simple hardware device
to emulate the PC keyboard, then develop the necessary driver software in C.
First the story.
My brother, Fred, is a consulting engineer who, among other things, designs
hardware. One of his clients is an entrepreneur named Wolfie, and some of
Wolfie's enterprises involve the application of color terminals in embedded
systems. The color screens that announce flight arrivals and departures at
some airports are Wolfie's. Another of Wolfie's systems is a script-driven
automated driver's-license test that enables applicants to answer
multiple-choice questions displayed on the color screens. At the time of this
story, the system used a popular color terminal and a five-key keyboard with
A, B, C, D, and Enter keys. Wolfie's prototype employed the terminal's
standard QWERTY keyboard, but the final product had to have only the five
keys. Building the custom keyboard was Fred's job. Fred called one day and
asked me to help assemble some keyboards. They were due the next day, he was
late getting started, and Wolfie was fretting. We stayed up all night building
them. Each keyboard had five embossed keys, cable, and keyboard-controller
electronics built into a nice professional-looking box. It was mindless work
and we had a great time soldering, assembling, drinking coffee, and talking
until dawn. Wolfie shall be, he said, eternally grateful for my assistance.
That's how Wolfie talks. End of story; now on to the project.
Those custom keyboards came to mind recently when I began to prepare a lecture
for the Borland International Conference, held in San Diego in May. Usually I
use overhead transparencies, but this particular lecture called for a number
of slides interspersed with online demonstrations of compiles and executions
of the demonstration program. The conference speaker liaison said that it
might not be easy to switch projection systems during the speech. So I decided
to prepare everything on a laptop, plug into their VGA projector, and proceed
with one projection system.
When speaking to programmers, I like to walk around the room with the briefing
slides in the background and my attention on the attendees. The conference
rooms usually come with a wireless microphone, which lets me be Phil Donahue,
running all over the room, even going down into the audience. That has always
worked, because I could stroll back to the podium to change slides. I figured
on using the same technique with the automated slide show but found that in
rehearsal I'd often hit the wrong laptop key. They don't lay those things out
very well, and during a speech my attention is anywhere but on that cramped
keyboard. That's when I remembered Wolfie's five-key keyboard. With one of
those to carry around, I could control the slides from anywhere in the room
and have fewer wrong keys to press. A keyboard like that would work in many
embedded applications as well--arcade games, shopping-mall directories,
electronic catalogs, computer-assisted instruction, and the like--so it seemed
to be a project worth investigating.


The Hardware


Building a PC keyboard is easy enough if you know how the keys interact with
the electronics and you have the necessary parts and cabling. I didn't want to
do anything that fancy in hardware, though, because I prefer using software to
solve problems. Besides, I wanted to build the keyboard with inexpensive parts
off the Radio Shack rack.
Not using a keyboard-controller chip means I can't use the keyboard port
itself. The mouse has the serial port tied up, so that leaves the printer
port. The next problem is figuring out how to use it.
A PC's printer port has 3 input/output addresses and about 13 lines that could
be used for data. Eight lines carry the eight bits of printer data, but those
are write-only signals on some machines. The other lines are control signals.
Whether those lines are readable or not, and how you read them differs from
machine to machine. The Centronics protocol and the standard PC printer
specification leave a lot of room for unique implementations. Devices exist
that use the printer port for bidirectional data exchange. I have a Xircom
network adaptor that uses the printer port as an Ethernet port. The infamous
dongle sends an encoded value from the printer port to serialize protected
software. Many laptops use the printer port for an external disk-drive port.
So, it must be possible to connect something other than a printer to that port
and read data from it.
The printer connector has no voltage-supply pin, which means that there's no
power supply for any digital circuitry in a device. The Xircom adaptor uses an
external AC adaptor power supply of its own. I didn't want to do that.
Instead, I decided to try to read meaningful signals by grounding the data and
control pins at the printer connector, which is a mechanical solution that
involves no electronics. You can do this, but differences in printer-port
implementations mean there's no common denominator. Not even one pin. So,
before you begin, you must determine how your particular printer port works.
Listing One, page 148, is port.c. It reads and displays the hexadecimal values
of the two sets of three printer I/O ports. To use the program, you need a
25-pin cable with a male end to plug into the computer's printer port and a
female end for your test. You also need two metal paper clips and a test lead
with alligator clamps. Plug the cable's male end into the printer port. Bend
out one end of each paper clip, and insert one of them into the hole for pin
25 of the female end of the cable. Connect one end of the test lead to this
paper clip and the other end to the other, loose, paper clip.
Compile and run port.c. When you run it, type a 0 as the only command-line
parameter. The 0 is the value that the program will write to the printer ports
before reading them. Many printer cards leave the signals "floating," which
means they're neither +5 volts nor grounded, but depend on the printer to pull
them up and ground them. Without a +5 line to pull these floating lines up,
their values at any particular time are unreliable. You can, however,
condition the initial value by writing all 0s or all 1s to the port before you
read it. The idea is to pull the port's lines up to +5 volts and then ground
pins one at a time to see if you can read the effect by reading the ports.
Whether you write 0s or 1s depends on which value sets the port lines "high."
You have to experiment to find out what works with your system. There are many
unconventional implementations, particularly in laptops, that manage to look
conventional at the BIOS level and also work at higher levels such as DOS and
Windows.
The port.c program continuously displays the output in Figure 1(a). The
display shows the two sets of three ports for LPT1 and LPT2 with their port
addresses across the top and the data values read from each port under its
address. In Figure 1(a), the first three addresses, which address LPT1, have
values; the second three, which address LPT2, are all 0xff, which means that
the second printer port probably isn't installed. The association between port
addresses and the logical LPT1 and LPT2 devices is made by DOS at startup
time. If DOS finds only one set of active ports, it sets those ports up as
LPT1, regardless of which port addresses it finds.
While port.c is running, insert the free paper clip into the hole for pin 1 in
the cable's female connector, then the hole for pin 2, and so on, all the time
watching the program's output. When a grounded pin produces a readable change
in one of the ports, the hexadecimal data value under the port's address
changes. Only one bit position of one port changes. If you don't get enough
satisfactory results, rerun the program with 255 as a command-line parameter
to write all 1s to the ports before you read them. Some signals are asserted
high, others low. Record the results of the program for all the pins that
change the display when you short them. The signals that change can be used
for keystroke values on your remote keyboard. The output will tell you which
pins to connect to the buttons on the keyboard and how to translate the port
values into simulated keystrokes. Press Ctrl-Break to exit from the program.
Next you must build the keyboard. A trip to Radio Shack serves up a hobby
project box (#270-230), some momentary, normally open SPST push buttons
(#275-1556), and a 100-foot spool of six-conductor telephone wire (#278-874)
with enough conductors for five buttons. The only part I couldn't get at Radio
Shack was the 25-pin male EIA connector and hood, which I found at a local
electronics supply house. Drill holes in the hobby box for the buttons and
cable to go in. Mount the buttons in the hobby box, run the cable into the
box, tie a knot in the cable on the inside of the box to keep it from pulling
out, and connect one end of the black wire from the telephone cable to one
side of all of the buttons. This wire will be the ground wire. Connect the
other end of the black wire to pin 25 of the EIA connector. Connect the other
five wires to the other sides of the push buttons, one wire per connection.
Connect the other ends of the five wires to other pins on the EIA connector
choosing those pins that provided readable signals when you grounded them in
your test. Make a diagram of which buttons are connected to which pins.
After the keyboard is built, you must decide which buttons correspond to which
simulated keystrokes. Plan this part carefully. You'll want to arrange the
buttons in intuitive positions with respect to one another, depending on how
you use them. I use the five buttons to simulate F1, PgUp, PgDn, and the up
and down arrow keys.


The Software


So much for the hardware. Now you must determine which PC keys to simulate.
That depends on the application that uses the keyboard. To get the keyboard
scan codes for the selected keys, compile and run kb.c (Listing Two, page 148)
which reads the keyboard and displays the scan codes of the keys you type.
Type the keys you want to simulate and write down their scan codes. Type Esc
to exit from the program.
Next build an ASCII file with a table that describes the keys to simulate; I
use the table in Figure 1(b). The first line specifies how many keys to
simulate, five in this example. Each of the remaining lines defines a key. The
first four-digit value is the hexadecimal keyboard scan code, as reported by
kb.c. The second value is the hexadecimal address of the printer port that will
deliver a changed value when the corresponding button grounds a pin. The third
value is a hexadecimal value that identifies the bit position that will
change. The program that uses this table will AND this value with the value
read from the printer port. The fourth value is either the same as the third
or 0, depending on whether the bit changes from 0 to 1 or 1 to 0,
respectively, when you push the button. The last value is either 00 or ff, to
specify the value that conditions the port before the program reads it.
Listing Three, page 148 (pb.c), converts the push buttons to keystrokes. It is
a TSR, and it intercepts the BIOS 0x16 keyboard interrupt vector and the
system-timer interrupt vector. To run the program, enter its name and specify
the table file as standard input like this:
C:>pb <keytable
When the program loads, it reads the table into memory. Then it hooks and
chains the two interrupt vectors and becomes resident.
Application programs call interrupt 0x16 to read the keyboard. pb.c intercepts
that interrupt and uses the button-definition table to test the printer ports
to see if one of the programmed buttons has been pressed. If so, the
programmed scan code is returned. Otherwise, the program passes control to the
chained 0x16 interrupt handler so that the real keyboard works as usual.
The program uses the timer interrupt to regulate the typematic repeat rate of
the keyboard. As with the standard keyboard, an initial delay is followed by a
shorter one for subsequent repeats for the same, prolonged key press.
The keyboard and pb.c program are not perfect. For example, the program does
not work with Windows. You'd need to write a specific Windows-keyboard device
driver program for that. Your meanderings around the lecture hall are limited
to the length of the cable, too, which tends to get tangled if you go too far
or if you stuff the box into your briefcase. Wireless devices that use radio
frequencies or infrared technology (like a TV's remote control) can emulate
more keys and the mouse, but they're more expensive than our home-built
device. Even with its shortcomings, though, the keyboard simulator solves a
problem for me and is an inexpensive way to add custom keyboards to other
embedded systems.


The Tech-support Blues


Some of you are responsible for technical support in your companies. That's
where programmers do penance for having written programs that other people
want to use. It's called, "paying your dues with the tech-support blues."
Besides providing tech support, most of you, as users of compilers and such,
will at one time or another be on the receiving end of a tech-support service,
and that experience can give you another version of the blues.
I've had those tech-support blues three times in the recent past, as a
consumer rather than a supporter, and I have some observations that might help
you the next time you consider buying a product or the next time your own
tech-support phone rings.
There are several ways to provide tech support. You can provide a phone
number, preferably an 800 number so your users don't have to pay to be on
hold. You can use a fax. You can put a sysop on a CompuServe forum, a method
that I like because other users can chime in to help, and the phone is never
busy. But regardless of which method you use, if you offer tech support, you
ought to be prepared to deliver on your promise. Here's how well all three
methods worked for me recently.
I bought a new computer from Gateway 2000. They're one of the high-visibility
clone makers with multipage, full-color, centerfold ads in magazines. They
have an 800 number for taking orders. You never have any trouble getting
through to that number. There's always an eager sales person ready to take
your order. My new computer is a belch-fire, neck-snapper 66-MHz 486, a
programmer's dream because it bangs out long makefiles quicker than you can
watch for warning messages. Gateway has a 30-day return policy. If you don't
like it, send it back. I liked it, but it had a problem: It wouldn't run Brief
3.1. Every time I saved a file and exited, the computer displayed a
divide-by-zero error message and locked up. Surely, says I, they've heard
about this one. They have an 800 number for tech support. Trouble is, it's
always busy. That's odd, the sales line is always available. It seems to me
that a company should have more sales than problems to stay in business. If a
line's going to be busy, it ought to be the one getting the most calls. But
for two weeks I dialed tech support, and for two weeks I got a busy signal.
Finally I called the company's reception line and they promised me a
"guaranteed call-back." Two more weeks went by. No call-back. I kept trying
the tech-support line, which was always busy, and my 30 days were running out.
So I called reception again and talked to Heather. Right off, she wanted to
give me a guaranteed call-back. No soap, I said, I've tried that and it
doesn't work. Let me talk to someone right now or I'm sending this turkey
back. Enter the Catch-22. You can't get a return authorization, said Heather,
until you've discussed your problem with tech support, and their line is busy.
Let me talk to your supervisor. He's busy, too. Let me talk to the president
of the company. We don't have his number. Where does he live? Don't know.
Well, I'm not hanging up until I talk to someone who can resolve this problem.
Please hold. After a while, Heather, who has the patience of a saint, returned
to report that her supervisor had authorized her to interrupt a very busy tech-support
person, putting me ahead of other poor souls who were also waiting, obviously
more patiently than I. After about ten minutes of unbearable elevator
soft-rock music, the tech-support person came on line. I described the problem
and two minutes later had a Debug patch script to fix the Brief 3.1 executable
so that it wouldn't do whatever bad thing it was doing. The problem is fixed
and the computer is fine, but it took me a month to get a two-minute solution
to a show-stopping, wall-hitting, dead-in-the-water problem. And I would not
have gotten it if I had not obstinately insisted on the support that was my
due.
The lesson? Before you buy an expensive product from anyone, make a test run
through their tech-support service. Call them to see how easy it is to get to
talk to someone. If you're satisfied, proceed. If not, buy elsewhere.
Second case in point: Recently I revised a C++ book to include templates. To
make sure that my programs were not compiler specific, I ran them through
Borland C++ 3.1, Zortech C++ 3.1, and Comeau C++ 3.0. The Zortech compiler
wouldn't compile some of the exercises. In one case, it reported an internal
error with nothing more than an error number. The other cases were perfectly
good C++ template code, which the other compilers accepted. I reduced the four
problems to four very short code samples and posted a message on CompuServe in
the Symantec developers' forum. When you join that forum, a cheery message
greets you and promises 48-hour turnaround time on problem reports. I was on
deadline and needed an answer. My message was ignored. After about two weeks I
posted an inquiry and was greeted by a new sysop, who asked me to repost the
problems. I did. It's three weeks later now, and there's no response. A final
message asked them to comment. No response. My book will go to press with
caveats for Zortech users. But worse, I am left with a bad taste that
compromises my objectivity about Symantec products. I don't like having my
problems ignored, and I don't trust a company that does that.
The lesson? Don't ignore someone who has a tech-support problem. At best
you'll promote ill will. At worst, you'll be reading about yourself in the
press, probably in letters to the editor, but sometimes in a column. By the
way, the best tech support often comes from small companies. You may even get
to talk to the programmer who built the product. Comeau C++ is an example.
Greg Comeau is always available on CompuServe or by phone.
This final tale could have had an unhappy ending. I could be writing this
column from the confines of a Federal country club in the good company of
lawyers, S&L executives, drug runners, and high-ranking government officials.
For years I used TurboTax from ChipSoft to prepare my income-tax returns. This
year I did not upgrade the venerable DOS version because that new 486 computer
included a bundled copy of TurboTax for Windows. Not wanting to wait until the
last minute, I cranked it up fully one week before the April 15 filing
deadline. I've never seen a released software product less ready for release.
The user interface is a joke, unlike that of any CUA-standard Windows
application I have seen, and the program just doesn't work. I was able to get enough out of
it to file for an extension, but I cannot file that return with our favorite
uncle. It's no surprise that ChipSoft's tech-support line was always busy. So
was their fax line. There are so many problems that I wrote them a three-page
report. I couldn't get through to their fax until a week after April 15th.
Finally, the report went out, and I waited to hear from them. Nothing. Not
even an acknowledgment. So how does one get their attention? Perhaps this will
work. If all of you fax a copy of this column to ChipSoft, maybe someone there
will take notice. Their fax number is 1-800-766-8890.
Figure 1: (a) port.c output; (b) table that describes the keys to simulate.

(a)

0378 0379 037a 03bc 03bd 03be
000b 007f 00C0 00ff 00ff 00ff


(b)

5
5100 379 40 00 00
4900 379 80 80 00
4800 379 20 00 00
5000 379 10 00 00
3b00 37a 01 01 00


[LISTING ONE]

#include <stdio.h>
#include <dos.h>
#include <stdlib.h>

int ports[] = {
 0x378,
 0x379,
 0x37a,
 0x3bc,
 0x3bd,
 0x3be
};
void main(int argc, char *argv[])
{
 int i;
 for (i = 0; i < 6; i++)
 printf("%04x ", ports[i]);
 putchar('\n');
 while (1) {
 putchar('\r');
 for (i = 0; i < 6; i++) {
 if (argc > 1)
 outportb(ports[i], atoi(argv[1]));
 printf("%04x ", inportb(ports[i]));
 }
 }
}

[LISTING TWO]

#include <stdio.h>
#include <bios.h>

void main()
{
 int c = 0;
 while ((c & 255) != 27) {
 c = bioskey(0);
 printf("\n%04x", c);
 }
}


[LISTING THREE]

#include <dos.h>
#include <stdio.h>

// -------- typematic values
#define DELAY 10
#define TYPEMATIC 1
/* ------- the interrupt function registers -------- */
typedef struct {
 int bp,di,si,ds,es,dx,cx,bx,ax,ip,cs,fl;
} IREGS;
// ---- map signals to keystrokes
struct pmap {
 int KeyCode;
 int Port;
 int bit;
 int mask;
 int prime;
};
int keycount; // number of keys simulated
#define MAXKEYS 20 // maximum keys to simulate
struct pmap PTbl[MAXKEYS]; // table of simulated keys
// ---------- interrupt vectors
#define KEYBOARD 0x16
#define TIMER 0x1c
#define ZEROFLAG 0x40
// -------- 0x16 BIOS functions
#define READKEY 0
#define KEYSTATUS 1
// ------ interrupt vectors
static void (interrupt *oldtimer)(void);
static void (interrupt *old16)(void);
static void interrupt newtimer(void);
static void interrupt int16(IREGS);
// ------- TSR stuff
static unsigned highmemory;
static unsigned sizeprogram;
extern unsigned _heaplen = 1;
extern unsigned _stklen = 1024;

void main()
{
 int i;
 // ---- read in the number of keys to simulate
 scanf("%d", &keycount);
 if (keycount <= MAXKEYS) {
 // ---- read the key simulation table
 for (i = 0; i < keycount; i++)
 scanf("%X %X %X %X %X",
 &PTbl[i].KeyCode,
 &PTbl[i].Port,
 &PTbl[i].bit,
 &PTbl[i].mask,
 &PTbl[i].prime);
 /* ----- attach interrupt vectors ------ */
 old16 = getvect(KEYBOARD);
 oldtimer = getvect(TIMER);

 setvect(KEYBOARD, int16);
 setvect(TIMER, newtimer);
 /* ------ compute program size ------- */
 highmemory = _SS + ((_SP+8) / 16);
 sizeprogram = highmemory - _psp + 1;
 /* ----- terminate and stay resident ------- */
 keep(0, sizeprogram);
 }
}
static int Timer = 0;
/* ----- timer interrupt service routine ------- */
static void interrupt newtimer(void)
{
 if (Timer > 0)
 --Timer;
 (*oldtimer)();
}
// ---- test for simulated key pushed
static int pushbutton(void)
{
 int i, b, key = 0;
 static int TimerValue = DELAY;
 if (Timer == 0) {
 for (i = 0; i < keycount; i++) {
 struct pmap pm = PTbl[i];
 outportb(pm.Port, pm.prime);
 b = inportb(pm.Port);
 if ((b & pm.bit) == pm.mask) {
 Timer = TimerValue;
 TimerValue = TYPEMATIC;
 key = pm.KeyCode;
 break;
 }
 }
 if (i == keycount) { /* no simulated key found */
 Timer = 0;
 TimerValue = DELAY;
 }
 }
 return key;
}
/* ----- Keyboard BIOS ISR ------- */
static void interrupt int16(IREGS ir)
{
 int func = (ir.ax >> 8) & 0xff;
 static int newkey = 0;
 static int rdflags = 0;
 static int stflags = 0;
 /* -- for read key bios call, loop until key pressed -- */
 if (func == READKEY) {
 int flg = ZEROFLAG;
 while (flg & ZEROFLAG) {
 _AH = KEYSTATUS;
 geninterrupt(KEYBOARD); /* recursively re-enters this ISR */
 flg = _FLAGS;
 }
 }
 if (func == READKEY || func == KEYSTATUS) {
 if (!newkey)

 newkey = pushbutton();

 if (newkey) {
 ir.ax = newkey;
 if (func == READKEY) {
 newkey = 0;
 ir.fl = rdflags;
 }
 else
 ir.fl = stflags & ~ZEROFLAG;
 return;
 }
 }
 _BX = ir.bx;
 _CX = ir.cx;
 _AX = ir.ax;
 (*old16)();
 ir.ax = _AX;
 ir.fl = _FLAGS;
 if (func == READKEY)
 rdflags = ir.fl;
 if (func == KEYSTATUS)
 stflags = ir.fl;
}
End Listings



August, 1993
ALGORITHM ALLEY


Diving into Windows Bitmaps: Part One




Tom Swan


Until you look below the surface, you never know what you'll find. It was a
lesson I learned the hard way a few months ago, when my wife Anne and I
chartered a sailboat in the British Virgin Islands for a vacation with my
parents. Despite getting us lost three times--not easy to do where the islands
are so close and the weather so heavenly--I managed to locate a former pirate
cove, now called "The Bight," where we decided to anchor for the evening. By
unanimous vote, the crew ordered the skipper (that's me, by virtue of having
read Basic Sailing twice) to snorkel out to the anchor and check that the hook
was well buried in the sand. Wanting a swim, I happily agreed. But as I
paddled away from the boat, I was distracted by a large sea turtle swimming an
underwater ballet along with several silvery fish. What a glorious creature
that turtle was! I spent many minutes enjoying her beauty and imagining that I
truly belonged there in paradise with her.
Finally, remembering my job, I continued searching for the anchor. When I
reached it, I saw to my horror that the heavy plow was sitting upside down in
the sand and had dragged along the bottom, allowing 38 feet of rented yacht to
drift steadily toward the shore, now only a few yards away. Friends, you have
never seen anyone swim back to a boat with greater haste. I should probably
get an honorary Olympic medal. There I was, wasting my time watching a sea
turtle and her fishy companions while unknowingly facing untold calamities in
the next few minutes. Put it this way: Running aground definitely would have
wrecked dinner.
Eventually, I managed to secure the anchor and avert disaster, but the event
reminded me of an important lesson--not just in sailing, but also in other
endeavors such as programming and debugging. In the course of investigating a
potential problem, I had allowed myself to become sidetracked by other
interests, causing me to forget my original goal. The moral is: Stay focused.
Get that anchor down before swimming off on new adventures. Look below the
surface of your code for potential problems, and resist the urge to implement
new features before completing less-interesting chores such as testing and
debugging. Boats should float and software should run cleanly. With a little
care, neither should ever end up on the rocks.


Bitmaps Down Below


Peering beneath the surface of a Windows bitmap (you don't need a snorkel,
just a debugger) reveals several interesting facts about bitmap files and the
algorithms for compressing them. The techniques are briefly described in the
Microsoft Windows Programmer's Reference, Volume 4: Resources, but to my
knowledge, they have never been expressed in algorithm form. Because I had
only that book's sketchy descriptions to guide me, my first attempts to lay
down the steps failed. Finally, I cracked the puzzle by devising a suite of
test programs to compress and decompress make-believe pixel values stored in
plain text files. This approach made debugging easier, and because my sample
test files were in ASCII, I could easily create and modify them with a text
editor--far easier than creating real bitmap files using a drawing program. It
also was no trouble to write another program for creating sample bitmaps at
random--part of a "Monte Carlo" test that I'll describe next month. All of
this advance work paid off in the final compression utility, which
successfully packed a bitmap on the program's maiden voyage. (Okay, maybe I
got lucky, but anyway, it's a real ego booster to write code that works on the
first try.)
This month, I'll describe the Windows bitmap compression and decompression
algorithms and list two of the programs in the test suite. Next month, I'll
list the remaining test programs and the final bitmap compressor. Although in
past columns, I've used pseudocode for algorithms and Pascal for sample
listings, this time I'll present the algorithms in Pascal and the programs in
C++ (which this project required for other reasons). The sample programs are
not object oriented, so it shouldn't be difficult to convert them to other
languages.
A device-independent bitmap (DIB) file begins with various headers that
describe the image and its colors. Following this information is an array of
pixels in one of several formats. Monochrome bitmaps are stored one bit per
pixel; 16-color bitmaps take four bits per pixel; 256-color bitmaps use one
8-bit byte per pixel; and so-called "true-color" bitmaps use a whopping 24
bits for each pixel. Most pixel values are actually indexes into the bitmap's
color table--a pixel p's color equals colorTable[p]. In a true-color image,
which lacks a color table, pixels directly represent red-green-blue (RGB)
colors.
You can find more information on bitmap file formats in many Windows
programming books. I'll focus here on a bitmap's array of pixels and ignore
the other information in the DIB file. I'll describe the compression method
for 8-bit, 256-color images, but the techniques are very nearly the same for
4-bit, 16-color files.


Bitmap Compression


Bitmap pixels are compressed using a combination of three modes. In
run-length-encoded (RLE) mode, a 2-byte "compression unit" represents a run of
1 to 255 pixels, all of the same color. The unit 04 07, for example, packs
4 pixels of color 07. In escape mode, the first byte is 0 and the next byte
signifies one of three special instructions: 0 to mark the end of the scan
line (the proper term for a horizontal row of pixels), 1 for the end of the
bitmap, or 2 for a delta command, sometimes called a "move instruction." Delta
commands are followed by two more bytes that represent horizontal and vertical
offsets to where the next pixel should appear. For example, the delta unit 00
02 05 08 indicates that the next pixel is to be drawn five positions to the
right and eight down from the current location. Deltas are useful for
compressing foreground images to be drawn on a fixed background (an important
animation technique), but they aren't that valuable for compressing
run-of-the-mill bitmaps.
In the third and final compression unit, called absolute mode, the first byte
is 0 and the next byte is 3 or greater, representing the number of
uncompressed pixels that follow. For instance, the absolute-mode unit 00 03 09
08 07 encodes the three pixel values 09, 08, and 07. Because absolute-mode
runs must have three or more pixels--just one of several undocumented quirks
in the algorithms--different-color runs of one or two pixels must be expanded
using RLE pairs. For example, an absolute run of two adjacent pixels 09 and 07
must be encoded as 01 09 01 07 (one 09 and one 07). This means that compressed
files with few same-color runs might actually grow in size. Obviously, in such
cases, there would be nothing to gain from compressing the file.
Based on these descriptions, I devised algorithms in Pascal to pack and unpack
bitmaps. (I did not implement delta escape codes.) As Listing One (page 150),
Algorithm #10, shows, unpacking is the simpler of the two jobs. The method
reads bytes from an open file f and requires three subroutines not shown:
GetByte(f) returns the next byte from the file, PutByte(p) displays or writes
one pixel, and StartNewScanLine performs the graphical equivalent of a
carriage return and linefeed.
The algorithm for packing a bitmap is fairly complex, so I decided to
implement the parser using a state machine, which helped organize the code
into manageable chunks. Several constants at the beginning of Listing Two
(page 150), Algorithm #11, describe various states that the machine can
assume. The algorithm requires two input values: an Integer np set to the
number of pixels per scan line and an array of uncompressed pixel bytes; and
sl, representing one scan line of the image. The method uses a While loop and
a Case statement to implement the state machine. Each labeled case represents
one state: READING examines the current pixel and determines what to do next;
ENCODING packs same-color pixels into RLE units; ABSMODE packs different-color
pixels as absolute-mode runs; SINGLE handles absolute-mode runs of one or two
pixels; and ENDOFLINE marks the end of a compressed scan line.


Test Suite


Like most complex algorithms, the two listed here might seem obscure if you
simply read them from top to bottom. To better understand the methods, do as I
did: Process sample bitmap data in text files, as illustrated in Example 1.
All values are hexadecimal bytes. The file begins with an information line of
two bytes representing the number of pixels in a scan line (0A hex here) and
the number of scan lines in the image (04). The next several lines give the
pixel values for each scan line.
Listing Three, TUNPACK.CPP (page 150), implements Algorithm #10. Listing Four,
TPACK.CPP (page 151), implements Algorithm #11. Both programs write to the
standard output file. To run the tests, store Example 1 in a text file named
TEST.DAT, then enter the DOS command:
TPACK TEST.DAT >PACKED.DAT
To unpack the result, enter:
TUNPACK PACKED.DAT >FINAL.DAT
If the test passes, the values in FINAL.DAT should be the same as those in
TEST.DAT, minus the original file's information line, which TUNPACK does not
recreate.
Although TPACK and TUNPACK do not operate on real bitmap files, they simplify
debugging and testing, and they might also be useful for porting the
algorithms to other languages and operating systems. Writing data-compression
software is a nerve-wracking business, and it's vital to do everything
possible to ensure that compressed files are completely recoverable. You can't
perform too many tests of a data-compression utility.


Your Turn


I wrote the sample programs in this column for DOS in order to keep the
listings short. If anyone cares to rewrite the programs for Windows, I'd be
interested in receiving them in care of DDJ. Or, you can upload files
(compressed text only, please) to my CompuServe ID, 73627,3241. Next month,
I'll list the rest of the programs in the test suite along with the final
utility that can compress real Windows bitmap files.
Meanwhile, I wonder where that sea turtle is right now and whether she noticed
my panicked race back to the boat. Call me paranoid, but I can't shake the
image of a smart-aleck turtle going around harbors, pulling up anchors, and
watching the fun. You don't suppose that's possible, do you? Nah.
Example 1: Sample "fake" bitmap text file.
0A 04
01 01 01 02 02 02 02 02 03 03
01 02 03 04 05 06 07 08 09 0A
01 02 03 04 04 04 01 02 03 04
01 02 01 02 01 02 01 02 01 02



[LISTING ONE]





{ bunpack.txt -- Algorithm #10: Unpack 8-bit Windows bitmap by Tom Swan }

const
 ESCAPE = 0;
 ENDOFLINE = 0;
 ENDOFBITMAP = 1;
 DELTA = 2;

procedure UnpackRLE8(var f: FILE);
var
 done: Boolean;
 count: Integer;
 oldCount: Integer;
 byt: Byte;
begin
 done := False; { must be initialized before the loop }
 while ((not done) and
 (not eof(f))) do
 begin
 byt := GetByte(f);
 if (byt = ESCAPE) then
 begin { Unpack escape code }
 byt := GetByte(f);
 case byt of
 ENDOFLINE:
 StartNewScanLine;
 ENDOFBITMAP:
 done := True;
 DELTA:
 begin { Not implemented }
 Writeln('Delta escape!');
 Halt
 end;
 else begin
 { Absolute-mode run }
 count := byt;
 oldCount := count;
 while count > 0 do
 begin
 byt := GetByte(f);
 PutByte(byt);
 count := count - 1
 end;
 if Odd(oldCount) { padding }
 then byt := GetByte(f)
 end { else }
 end { case }
 end else
 begin { Unpack RLE unit }
 count := byt;
 byt := GetByte(f);
 while count > 0 do

 begin
 PutByte(byt);
 count := count - 1
 end { while }
 end { else }
 end { while }
end; { UnpackRLE8 }


(*
// --------------------------------------------------------------
// Copyright (c) 1993 by Tom Swan. All rights reserved
// Revision 1.00 Date: 04/27/1993 Time: 08:52 am
*)

[LISTING TWO]
{ bpack.txt -- Algorithm #11: Pack 8-bit Windows bitmap by Tom Swan }

const
 READING = 0;
 ENCODING = 1;
 ABSMODE = 2;
 SINGLE = 3;
 ENDOFLINE = 4;
 NOSTATE = 999;

procedure PackRLE8(
 np: Integer; sl: ScanLine);
var
 slx: Integer;
 state: Integer;
 pixel: Integer;
 count: Integer;
 done: Boolean;
 oldcount: Integer;
 oldslx: Integer;
begin
 slx := 0; { Scan line index }
 state := READING;
 done := False;
 while not done do begin
 case state of

 READING:
 (* Input:
 np = # pixels in scan line
 sl = scan line
 sl[slx] = next pixel *)
 begin
 if slx >= np then
 state := ENDOFLINE
 else if slx = np - 1 then
 begin
 count := 1; { 1 pixel left }
 state := SINGLE
 end else
 if sl[slx] = sl[slx + 1] then
 state := ENCODING
 else

 state := ABSMODE
 end;

 ENCODING:
 (* Input:
 slx <= np - 2 (Run of 2+ pixels)
 sl[slx] = first pixel of run
 sl[slx] = sl[slx + 1] *)
 begin
 count := 2;
 pixel := sl[slx];
 slx := slx + 2;
 while ((slx < np) and
 (pixel = sl[slx]) and
 (count < 255)) do
 begin
 count := count + 1;
 slx := slx + 1
 end;
 PutByte(count); { RLE unit }
 PutByte(pixel);
 state := READING
 end;

 ABSMODE:
 (* Input:
 slx <= np - 2 (Run of 2+ pixels)
 sl[slx] = first pixel of run
 sl[slx] <> sl[slx + 1] *)
 begin
 oldslx := slx; { Save index }
 count := 2;
 slx := slx + 2;
 { Compute # bytes in run }
 while ((slx < np) and
 (sl[slx] <> sl[slx - 1]) and
 (count < 255)) do
 begin
 count := count + 1;
 slx := slx + 1
 end;
 { Back up on same-color run }
 if ((slx < np) and
 (sl[slx] = sl[slx - 1]))
 then if (count > 1)
 then count := count - 1;
 slx := oldslx;
 if (count < 3 ) then
 state := SINGLE {short run}
 else begin {normal run}
 PutByte(0);
 PutByte(count);
 oldcount := count;
 while (count > 0) do
 begin
 PutByte(sl[slx]);
 slx := slx + 1;
 count := count - 1
 end;

 if Odd(oldcount) then
 PutByte(0); {word padding}
 state := READING
 end { else }
 end;

 SINGLE:
 (* Input:
 count = # pixels to output
 slx < np
 sl[slx] = first pixel of run
 sl[slx] <> sl[slx + 1] *)
 begin
 while count > 0 do
 begin
 PutByte(01);
 PutByte(sl[slx]);
 slx := slx + 1;
 count := count - 1
 end;
 state := READING
 end;

 ENDOFLINE:
 begin
 PutByte(0);
 PutByte(0);
 done := TRUE;
 state := NOSTATE
 end;
 else
 begin
 Writeln('Unknown state');
 Halt
 end
 end { case state of }
 end { while }
end; { PackRLE8 }

begin
 Read(np);
 Read(ns);
 while (ns > 0) do
 begin
 GetNextScanLine(sl);
 PackRLE8(np, sl);
 ns := ns - 1
 end;
 PutByte(0); { Mark bitmap end }
 PutByte(1)
end.


(*
// --------------------------------------------------------------
// Copyright (c) 1993 by Tom Swan. All rights reserved
// Revision 1.00 Date: 04/27/1993 Time: 09:00 am
*)


[LISTING THREE]

/* ----------------------------------------------------------- *\
** tunpack.cpp -- Test unpacking compressed bitmap data **
** ----------------------------------------------------------- **
** **
** Note: Delta escape-code unpacking is not implemented. **
** **
** ----------------------------------------------------------- **
** Copyright (c) 1993 by Tom Swan. All rights reserved. **
\* ----------------------------------------------------------- */

#include <iostream.h>
#include <iomanip.h>
#include <fstream.h>
#include <stdlib.h>

typedef unsigned char Byte;

#define FALSE 0
#define TRUE 1

// Compression escape codes
#define ESCAPE 0
#define ENDOFLINE 0
#define ENDOFBITMAP 1
#define DELTA 2

void Error(const char *msg);
void DecompressFile(const char *fname);
int Odd(int v);
int GetByte(ifstream &ifs);
void PutByte(unsigned char b);

int main(int argc, char *argv[])
{
 if (argc <= 1)
 Error("filename missing");
 DecompressFile(argv[1]);
 return 0;
}

// Display error message and halt
void Error(const char *msg)
{
 cerr << endl << "Error: " << msg << endl;
 exit(1);
}

// Decompress fake bitmap file fname
// Write decompressed results to stdout
void DecompressFile(const char *fname)
{
 int endOfBitmap; // True if 00 01 found at end of input
 int count; // Used at various locations in function
 int oldCount; // Copy of count for absolute-mode units
 unsigned byte; // Input bytes

 ifstream ifs(fname, ios::in);
 if (!ifs)
 Error("unable to open file");

 endOfBitmap = FALSE;
 while (!endOfBitmap && !ifs.eof()) {
 byte = GetByte(ifs);
 if (byte == ESCAPE) {
 // Unpack escape code unit
 byte = GetByte(ifs);
 switch (byte) {
 case ENDOFLINE:
 cout << endl;
 break;
 case ENDOFBITMAP:
 endOfBitmap = TRUE;
 cout << endl;
 break;
 case DELTA:
 Error("Delta escape codes not implemented");
 break;
 default: // Absolute-mode run
 count = byte;
 oldCount = count;
 while (count > 0) {
 byte = GetByte(ifs);
 PutByte(byte);
 count--;
 }
 if (Odd(oldCount))
 byte = GetByte(ifs); // Read word-boundary padding byte
 break;
 }
 } else {
 // Unpack run-length-encoded unit
 count = byte;
 byte = GetByte(ifs);
 while (count > 0) {
 PutByte(byte);
 count--;
 }
 }
 }
 if (!endOfBitmap)
 Error("End of bitmap marker (00 01) not found");
}

// Return next byte from input file ifs
// Count number of bytes read
int GetByte(ifstream &ifs)
{
 int b;
 ifs >> hex >> b;
 return b;
}

// Return true if v is odd
int Odd(int v)
{
 return v & 0x01;
}

// Write byte b in hex in 2 columns with leading 0

// plus one blank to cout
void PutByte(unsigned char b)
{
 cout << setiosflags(ios::uppercase);
 cout << setw(2) << setfill('0') << hex << (int)b << ' ';
 cout << setfill(' ') << dec;
 cout << resetiosflags(ios::uppercase);
}


// --------------------------------------------------------------
// Copyright (c) 1993 by Tom Swan. All rights reserved.
// Revision 1.00 Date: 04/27/1993 Time: 08:55 am

[LISTING FOUR]
/* ----------------------------------------------------------- *\
** tpack.cpp -- Test packing uncompressed bitmap data **
** ----------------------------------------------------------- **
** **
** Data file format (in ASCII hex): **
** **
** np ns # pixels per line; # scan lines **
** pp pp pp pp pp ... pp pixel byte values in hex **
** pp pp pp pp pp ... pp " " " **
** pp pp pp pp pp ... pp " " " **
** **
** ----------------------------------------------------------- **
** Copyright (c) 1993 by Tom Swan. All rights reserved. **
\* ----------------------------------------------------------- */

#include <iostream.h>
#include <iomanip.h>
#include <fstream.h>
#include <stdlib.h>

typedef unsigned char Byte;

#define FALSE 0
#define TRUE 1

// State-machine definitions
#define READING 0 // General reading mode
#define ENCODING 1 // Encoding same-color pixel runs
#define ABSMODE 2 // Encoding different-color pixel runs
#define SINGLE 3 // Encoding short absolute-mode runs
#define ENDOFLINE 4 // End of scan line detected

void Error(const char *msg);
void CompressFile(const char *fname);
int Odd(int v);
Byte *NextScanLine(ifstream &ifs, int np);
void PutByte(Byte b);
void PackRLE8(int np, const Byte *sl);

int main(int argc, char *argv[])
{
 if (argc <= 1)
 Error("filename missing");
 CompressFile(argv[1]);

 return 0;
}

// Display error message and halt
void Error(const char *msg)
{
 cerr << endl << "Error: " << msg << endl;
 exit(1);
}

// Compress test bitmap file fname
// Write compressed results to stdout
void CompressFile(const char *fname)
{
 int np, ns; // Number of pixels per line, number of scan lines
 Byte *sl; // Pointer to scan line

 ifstream ifs(fname, ios::in);
 if (!ifs)
 Error("unable to open file");
 ifs >> hex >> np >> ns;
 if ((np <= 0) || (ns <= 0))
 Error("bad file format");
 while (ns-- > 0) {
 sl = NextScanLine(ifs, np);
 PackRLE8(np, sl);
 delete[] sl;
 }
 PutByte(0); // Mark end of bitmap
 PutByte(1);
 cout << endl;
}

// Return true if v is odd
int Odd(int v)
{
 return v & 0x01;
}

// Read next scan line of np pixels from file ifs
Byte *NextScanLine(ifstream &ifs, int np)
{
 if (np <= 0) Error("np zero or less in NextScanLine()");
 Byte *sl = new Byte[np]; // Allocate scan line
 if (!sl) Error("out of memory");
 int j = 0;
 int k;
 while (np-- > 0) {
 ifs >> hex >> k;
 sl[j++] = k;
 }
 return sl;
}

// Write byte b in hex in 2 columns with leading 0
// plus one blank to cout
void PutByte(Byte b)
{
 cout << setiosflags(ios::uppercase);

 cout << setw(2) << setfill('0') << hex << (int)b << ' ';
 cout << setfill(' ') << dec;
 cout << resetiosflags(ios::uppercase);
}

// Compress np pixels in sl
// Write compressed line to stdout
void PackRLE8(int np, const Byte *sl)
{
 int slx = 0; // Scan line index
 int state = READING; // State machine control variable
 int pixel, count; // Used by various states
 int done = FALSE; // Ends while loop when true
 int oldcount, oldslx; // Copies of count and slx

 while (!done) {

 switch (state) {

 case READING:
 // Input:
 // np == number of pixels in scan line
 // sl == scan line
 // sl[slx] == next pixel to process

 if (slx > np) Error("READING in PackRLE8()");

 if (slx >= np) // No pixels left
 state = ENDOFLINE;
 else if (slx == np - 1) { // One pixel left
 count = 1;
 state = SINGLE;
 } else if (sl[slx] == sl[slx + 1]) // Next 2 pixels equal
 state = ENCODING;
 else // Next 2 pixels differ
 state = ABSMODE;
 break;

 case ENCODING:
 // Input:
 // slx <= np - 2 (at least 2 pixels in run)
 // sl[slx] == first pixel of run
 // sl[slx] == sl[slx + 1]

 count = 2;
 pixel = sl[slx];
 slx += 2;
 while ((slx < np) && (pixel == sl[slx]) && (count < 255)) {
 count++;
 slx++;
 }
 PutByte(count); // Output run-length-encoded unit
 PutByte(pixel);
 state = READING;
 break;

 case ABSMODE:
 // Input:
 // slx <= np - 2 (at least 2 pixels in run)

 // sl[slx] == first pixel of run
 // sl[slx] != sl[slx + 1]

 oldslx = slx;
 count = 2;
 slx += 2;
 // Compute number of bytes in run
 while ((slx < np) && (sl[slx] != sl[slx - 1]) && (count < 255)) {
 count++;
 slx++;
 }
 // If same-color run found, back up one byte
 if ((slx < np) && (sl[slx] == sl[slx - 1]))
 if (count > 1) count--;
 slx = oldslx; // Restore scan-line index
 // Output short absolute runs of less than 3 pixels
 if (count < 3 )
 state = SINGLE;
 else {
 // Output absolute-mode run
 PutByte(0);
 PutByte(count);
 oldcount = count;
 while (count > 0) {
 PutByte(sl[slx]);
 slx++;
 count--;
 }
 if (Odd(oldcount))
 PutByte(0); // End run on word boundary
 state = READING;
 }
 break;

 case SINGLE:
 // Input:
 // count == number of pixels to output
 // slx < np
 // sl[slx] == first pixel of run
 // sl[slx] != sl[slx + 1]

 while (count > 0) {
 PutByte(01);
 PutByte(sl[slx]);
 slx++;
 count--;
 }
 state = READING;
 break;

 case ENDOFLINE:
 PutByte(0);
 PutByte(0);
 done = TRUE;
 break;

 default:
 Error("unknown state in PackRLE8()");
 break;

 }
 }
}


// --------------------------------------------------------------
// Copyright (c) 1993 by Tom Swan. All rights reserved.
// Revision 1.00 Date: 04/27/1993 Time: 09:07 am




August, 1993
UNDOCUMENTED CORNER


The Windows .RES File Format




by Alex G. Fedorov and Dmitry M. Rogatkin


Alex G. Fedorov is a freelance programmer and an executive editor for
ComputerPress magazine in Moscow, Russia. Alex can be contacted at
alex@computerpress.msk.su via the Internet. Dmitry M. Rogatkin, a freelance
programmer specializing in Windows software, is a lecturer at the Moscow
Institute of Electronic Machinery. Dmitry can be contacted at
datasc@adonis.ias.msk.su.




Introduction




by Andrew Schulman


The "Undocumented Corner" seems to be on a Windows file-format roll. Last
month, Mike Maurice explored the .PIF file format. Next month, Peter Davis and
Ron Burk will uncover the long-concealed .HLP file format. This month's
"Undocumented Corner," which comes to us from Russia (the authors are regular
contributors to the Russian-language magazine ComputerPress, published in
Moscow), reveals the Windows .RES file format.
Windows .RES files are produced by the Windows resource compiler (RC) and
contain the binary images for Windows resources (menus, dialogs, and so on)
prior to their inclusion in an executable file. While Microsoft has documented
"Resource Formats Within Executable Files" (Microsoft Windows 3.1 SDK,
Programmer's Reference, Volume 4: Resources, Chapter 7), it has not publicly
documented the .RES format--that is, the resource format outside executable
files. While resources within an executable are located using the "Resource
Table" (see section 6.2.3 of the SDK, volume 4), .RES files contain no such
resource table. As Alex and Dmitry explain, the .RES file format is slightly
but fundamentally different from the format of resources within executable
files.
Instead of a resource table, a .RES file is simply a concatenation of
individual resources, each of which has its own small variable-length header.
Alex and Dmitry document the format for this header. Since there's no central
header, there's no way to determine with 100 percent reliability that you
actually have a genuine .RES file. Yuck! In any case, aside from the header on
each individual resource, the resources in a .RES file are identical to those
in an .EXE file.
Originally, Alex and Dmitry's article also discussed the format of .BMP
(bitmap), .CUR (cursor), and .ICO (icon) files, but I've skipped over this
discussion, as this material is already documented by Microsoft in Chapter 1
("Graphics File Formats") of the 3.1 SDK, volume 4.
The .RES file format makes an interesting case study of why some important
interfaces are undocumented. First, Microsoft has already produced a
limited-circulation document ("Microsoft Windows 3.0 Internal Resource
Formats," November 12, 1990) on this subject, so the lack of public
documentation can't be attributed to a lack of sufficient resources (as it
were) in Microsoft's documentation department. The documentation already
exists, and has simply not been made public. Yet, for reasons that will soon
become clear, I don't think this is a simple case of "they're hiding it from
us."
The Microsoft "Internal Resource Formats" document was made available to
independent software vendors (ISVs) on a limited basis as part of the
Microsoft Open Tools effort. Despite its name and good intent, Open Tools
appears to be limited to only the largest and most important ISVs. Many of you
have complained about never having been able to pry the (therefore somewhat
misnamed) Open Tools materials out of Microsoft. I've previously dismissed
this as unimportant, since the Windows 3.1 SDK incorporated almost everything
previously available only through Open Tools, but I now see that at least the
.RES file-format documentation from Open Tools was not brought over to the 3.1
SDK.
The Open Tools document refers to the .RES file format as a "proprietary
binary format," and explains that "Until now, the .RES file format has been
undocumented. This was mostly because its structure was version-dependent and
changed frequently." Fair enough. Apparently, the reason for documenting it as
part of Open Tools was that, since much of the .RES file contents are copied
directly into .EXE files, this helped document the more-important format of
resources within .EXE files. Presumably, because Microsoft later went all the
way and properly documented resources within .EXE files, it didn't bother
publicly releasing the documentation for resources within .RES files, which it
seems to consider merely an "intermediate file" of little intrinsic
importance.
My favorite Microsoft product, the Microsoft Developer Network (MSDN) CD-ROM,
does provide several crucial snippets of information about the .RES file
format, yet still fails to fully document it (rather amazing, considering what
we'll see is the utter simplicity of this format). First, the MSDN CD does
properly document the Win32 .RES file format, including a brief note on how
(using illegal type and name ordinal numbers) to distinguish Win32 from Win16
.RES files. (This same document, by Floyd Rogers, may be found in RESFMT.TXT
on the Win32 CD-ROM.) Second, the MSDN CD includes the source code for
RESTOOL, a program that reads in Windows .RES files and generates C++ class
declarations for any dialog boxes in the .RES. While no other resource types
are handled, this demonstrates the basics of walking through a .RES file.
All this is fairly typical of Microsoft's extremely annoying (but not very
nefarious), half-hearted approach to documentation. Rather than some
deliberate master-minded attempt to deprive software developers of
information, we see almost the exact opposite: an almost complete lack of
deliberateness. Unlike the example of, say, the MS-DOS network redirector,
Microsoft has in the case of the .RES file format made no particular effort to
either hide or provide this information. It's not a conspiracy; it's just
stupidity.
That Microsoft dances around the issue so much, inadvertently documenting
little bits of the .RES file format on a piecemeal basis, but (very likely
also inadvertently) not documenting the whole thing once and for all, does
indicate that this simple little format is at least somewhat important.
Besides its use by programs such as RESTOOL, knowing the .RES file format
would also allow interactive programming environments to load .RES files on
the fly.
Once a Windows program has located a resource in a .RES file, how can it load
the resource? The Windows API includes resource-handling functions such as
CreateDialog, LoadMenu, FindResource, and LoadResource, but these work only
with executable files. How can a resource be loaded from something
other than an executable file? The Windows API includes documented functions
such as CreateDialogIndirect and LoadMenuIndirect to create certain resources
from binary resource images in memory. For example, Microsoft Excel uses these
functions to let a user's macros create menus and dialogs on the fly.
Similarly, once a menu or a dialog in a .RES file has been located and read
into memory, these same Indirect functions can be used to transform the binary
data into a bona fide Windows resource.
Unfortunately, there isn't an equivalent Indirect function for every Windows
resource type. However, Alex and Dmitry's sample code (too long to include
with this article, but available electronically, see "Availability," page 5)
provides functions such as LoadRESAccelerators and LoadRESBitmap that fill in
the gaps. This code may also be useful to those working with self-loading
Windows executables.
As usual, please send me comments, suggestions, criticisms, and any
interesting gossip about Microsoft trade practices. My e-mail address is
76320,302 on CompuServe, and andrew@pharlap.com on the Internet; you can also
reach me on MCI Mail.
While most of the file formats used in the Microsoft Windows environment are
documented (although not always adequately), the format of .RES files, which
hold resources before their inclusion in a Windows executable file, has been
obscured by a curtain--until now! Knowing this format can help developers to
load and unload resources on the fly, in interactive programming environments,
for example, or in commercial applications with user-programmable macro
languages. This article explains how stand-alone Windows resources are
organized and provides a set of utilities that shows how to use them.
It is important to note that much of what .RES files contain is already
documented. The files contain the binary images of resources, in the format
documented in the Windows 3.1 SDK. What needs to be discovered in .RES files
are the surroundings for the already-documented binary resource images.
To uncover these magic cookies, we can use a text editor to create a resource
script, pass it through the resource compiler (RC), and use a hex-dump utility
to study the contents of the resulting .RES file. Listing One (page 153) shows
a simple TEST.RC file, and Figure 1 shows a partial hex dump of the resulting
TEST.RES file produced by RC. In the ASCII display on the right side of the hex
dump, many of the strings from TEST.RC, such as BARFOO and FOOBAR, are plainly
visible.
Figure 2 shows an analysis of the hex dump. Each resource within the .RES file
may start with the byte 0xFF, which tells us that the next word is a numeric
resource type. The predefined resource types, such as RT_CURSOR (1), RT_BITMAP
(2), RT_ICON (3), and so on, are listed in WINDOWS.H. (Type 16 is listed in
VER.H.) Unlisted resource types are user defined. Any starting byte other than
0xFF is the first character in an ASCIIZ (0-byte terminated) resource-type
name.
Following the resource type comes either the resource name, stored as an
ASCIIZ string, or an ordinal ID, stored as a number. Just as with the resource
type, if the first byte is 0xFF, the resource is identified by number rather
than name, and the next word is the ID itself.
The next two fields are memory flags, stored as a WORD, and the resource
length, stored as a DWORD. The memory flags are usually 0x1030 (MOVEABLE PURE
DISCARDABLE) or 0x0030 (MOVEABLE PURE). The
resource length is for the actual resource, and does not include the header;
this length is used to get to the next resource in the file. These two fields
complete the resource header. They are followed immediately by the resource
data itself, which is in the same format as documented in the Windows 3.1 SDK,
Programmer's Reference, Volume 4: Resources, Chapter 7.
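The entry header just described can be parsed in a few lines. The following is a minimal, portable C sketch (the names res_hdr, parse_item, and parse_res_hdr are ours, not from the article's listings); it reads one entry header from a memory buffer and returns the header's size, so the resource data begins at that offset and the next entry follows the data:

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    int      type_id;        /* ordinal type (0 if the type is named) */
    char     type_name[128]; /* ASCIIZ type name ("" if ordinal) */
    int      name_id;        /* ordinal ID (0 if the resource is named) */
    char     name[128];      /* ASCIIZ resource name ("" if ordinal) */
    unsigned mem_flags;      /* e.g., 0x1030 = MOVEABLE PURE DISCARDABLE */
    unsigned long size;      /* resource data length, header NOT included */
} res_hdr;

/* Read one type-or-name field: 0xFF means a 16-bit ordinal follows;
   anything else starts an ASCIIZ name. Returns bytes consumed. */
static size_t parse_item(const unsigned char *p, int *id, char *name)
{
    if (p[0] == 0xFF) {
        *id = p[1] | (p[2] << 8);   /* little-endian WORD */
        name[0] = 0;
        return 3;
    }
    strcpy(name, (const char *)p);
    *id = 0;
    return strlen(name) + 1;        /* include the 0 terminator */
}

/* Parse one full .RES entry header; return its total size in bytes. */
size_t parse_res_hdr(const unsigned char *p, res_hdr *h)
{
    size_t n = 0;
    n += parse_item(p + n, &h->type_id, h->type_name);
    n += parse_item(p + n, &h->name_id, h->name);
    h->mem_flags = p[n] | (p[n + 1] << 8);
    n += 2;
    h->size = (unsigned long)p[n]           | ((unsigned long)p[n+1] << 8)
            | ((unsigned long)p[n+2] << 16) | ((unsigned long)p[n+3] << 24);
    n += 4;
    return n;
}

/* First 12 bytes of TEST.RES from Figure 1: icon, type 3, ID #1 */
const unsigned char test_res_icon[12] =
    { 0xFF,0x03,0x00, 0xFF,0x01,0x00, 0x30,0x10, 0xE8,0x02,0x00,0x00 };
```

Fed the first entry of Figure 1, this yields a 12-byte header: type RT_ICON (3), ID #1, flags 0x1030, and a data length of 0x2E8 bytes.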
The .RES file is simply a collection of resources. There's no signature at the
start of the .RES file, so any kind of file can be fed to a resource
utility (and crash it). On the other hand, the absence of a .RES-file signature
means that you can simply concatenate .RES files with the COPY /B command.
While this allows you to combine a set of small .RES files into a single
larger one, a problem can occur: The resulting file may end up with more than
one resource of the same type and ID.
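Since COPY /B does no checking, a merge tool might want to verify that no two entries share a type and ID before concatenating. A toy sketch of such a check (the function name is ours; in practice the type/ID pairs would come from walking each file with header-parsing code like RESLIST's):

```c
/* Return 1 if any (type, ID) pair occurs twice among n ordinal-identified
   resources; 0 otherwise. Hypothetical helper, not from the article. */
int res_has_duplicate(const int *types, const int *ids, int n)
{
    int i, j;
    for (i = 0; i < n; i++)
        for (j = i + 1; j < n; j++)
            if (types[i] == types[j] && ids[i] == ids[j])
                return 1;   /* same type and ID appear twice */
    return 0;
}
```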
While a .RES file is simply a concatenation of individual resources, in
Windows 3.0 the resource compiler sometimes builds a "name table" at the end
of the file. However, this name table is just another resource type (obsolete
in 3.1), and resources can be included in an executable file without it.
To distinguish Win32 .RES files from Win16 .RES files, Microsoft starts off
all Win32 .RES files with an entry that is illegal in both Win16 and Win32.
For our purposes, we need only note that the first byte is 0: In Win16 terms,
the 0 (because it is something other than 0xFF) means that this is a named
(rather than numeric) type, but the same 0 then means that the ASCIIZ name has
a zero length, which is illegal. This strange method (necessary only because
.RES files have no header) is documented in Microsoft's RESFMT.TXT on the
Win32 CD-ROM.
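For a quick classification, then, one byte suffices. A sketch (the function name is ours; a thorough check would also verify the rest of the 32-byte Win32 header per RESFMT.TXT):

```c
/* Classify a .RES file by its first byte: in a Win32 .RES the file opens
   with an entry whose "type name" begins with a 0 byte -- a zero-length
   name, illegal in Win16. Any other first byte means Win16. */
int is_win32_res(const unsigned char *first_bytes)
{
    return first_bytes[0] == 0;
}
```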
The format of a .RES file entry cannot be easily wrapped up in the structure
of a programming language such as C or Pascal, because the optional presence
of an ASCIIZ string gives the resource-header format a variable length.
However, "close-enough" C pseudocode can be presented; see Figure 3. A C
"union" (like a variant record in Pascal) is used to indicate that the
resource type can be either an ASCIIZ string or a 3-byte 0xFF-prefixed ordinal
number.
RESLIST.CPP (Listing Two, page 153) is a short C++ program that walks through
a .RES file, displaying the type, name, size, and file offset of each
resource. (The size and offset are for the resource itself, not including the
header.) For the 16 predefined resource types, RESLIST uses a table of strings
indexed by the resource type. Note that RESLIST.CPP does not use the
RES_FILE_ENTRY structure from Figure 3; this is because that structure is
merely pseudocode. The variable-length structure means that knowledge of how
to get from one field to another, and from one .RES file entry to another, has
to be embedded in the code itself.
Sample RESLIST output for TEST.RES is shown in Figure 4; compare this with the
original TEST.RC in Figure 1. RESLIST just lists the resources in a .RES file,
one per line. Details on each individual resource are not given, as the
information necessary to decode these is already documented in the SDK. For
example, upon encountering a dialog box (type.id==RT_DIALOG), you'd need to
get the number of controls from the DLGHEADER, along with the dialog style and
dimensions; you would then walk through each control, using the CTRLHEADER.
The SDK documentation works here because the only significant difference
between resources in .RES files and resources within .EXE files is that each
resource in the .RES file has one of the headers shown in Figure 3, whereas
the .EXE file has a Resource Table (the format of which is documented in the
SDK; also see Microsoft Systems Journal, September 1991). The resource data
itself is identical between the two formats and is transferred from the .RES
to the .EXE without change. Another difference between .RES files and
resources in .EXE files is that the .RES file is an unordered collection of
resources, whereas RC reorders resources before attaching them to an
executable file. Resources in an .EXE file are ordered by type; there is a
field in the executable's Resource Table that tells how many resources exist
of each type.
There's another small difference: The .RES-file length field is a DWORD, and
the resource file offset is only implicit; the .EXE file, on the other hand,
has room for both the offset and length of the resource, but uses only a WORD
to store each. Because Windows executables can be very large, the resource
offset and length are stored shifted right by an alignment shift count; this
shift count is stored as the first word in the Resource Table.
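Recovering the byte values from the Resource Table is thus a single shift. A sketch (the function name is ours):

```c
/* Expand a WORD offset or length from an .EXE Resource Table into bytes,
   using the table's alignment shift count (its first word). */
unsigned long res_table_bytes(unsigned short stored, unsigned short shift)
{
    return (unsigned long)stored << shift;
}
```

With a shift count of 4, for example, a stored offset of 0x0123 denotes byte offset 0x1230 in the file.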
The information in this article, together with the information in the SDK,
could be used to decompile a .RES file back to the original .RC files. The
information in the SDK alone is sufficient to decompile resources in
executable files back to the original .RC file.
[Editor's Note: Stan Mitchell of Eclectic Software has written such a resource
decompiler. I've written a resource dumper (the RESDUMP program included with
my Windows Source tool, distributed by V Communications), but mine doesn't
produce an .RC file; Stan's is a genuine resource decompiler. Alex and Dmitry
also have such a tool which is available electronically.]


Accessing Resources



Usually, resources are stored in an executable file; the Windows kernel
provides us with a set of API functions to load them. But, using the .RES file
format presented here, it is possible to implement a set of Windows functions
which will instead allow you to load resources from a .RES file. We've created
such a set of functions and a test program showing how to use them. Some
Windows API functions already exist to load resources "indirectly"; these are
very helpful when loading resources from a .RES file. Where such functions do
not exist (for BMP, ICO, and CUR resources), we created the closest
equivalent.
Our library, READRES.CPP and READRES.H, is too large to include here, but it
is available electronically, as is the test program, CDI; see Figure 5. (Note
that CDI is examining the .RES file for the Cyrillic version of Program
Manager; this .RES file was extracted from PROGMAN.EXE using our resource
decompiler.)
First, we need to implement a function which will load a particular resource
from a .RES file. As seen in Listing Three (page 153), LoadRESResource() walks
through a .RES file, just like RESLIST.CPP. However, instead of printing out
information on each resource, it searches for the resource whose type and name
the caller specifies. If the requested resource is found, LoadRESResource()
calls GlobalAlloc() to allocate sufficient memory for it (the amount of memory
comes from the DWORD size found in the .RES-file entry header), GlobalLocks
the memory, and then reads in the resource data. It returns to the caller with
a handle to the raw resource data.
Okay, here we are with a requested resource loaded in some portion of memory
referenced through a global memory handle. What to do next? This depends on
the type of resource we've loaded. Some resource types are easy to use from a
.RES file, because Windows already provides "indirect" functions. For example,
take a look at LoadRESMenu() in Listing Four (page 153), which uses
LoadMenuIndirect().
Loading dialog boxes from a .RES file is also fairly easy, except that any
program that calls CreateRESDialog() must supply an additional parameter: the
address of a dialog proc. All other work is done through
CreateDialogIndirectParam(). Supplying a generic dialog proc for someone
else's dialog box can be quite difficult. Your dialog proc must handle at
least one message, WM_INITDIALOG.
Loading other resource types (such as bitmaps, cursors, icons, and
accelerators) from a .RES file is trickier; the code itself may be found in
READRES.CPP (available electronically).
As noted in the "Introduction," knowledge of resource formats can be applied
to Windows environments based on interpreting languages. You could also apply
this knowledge to security: You could scramble your resources with a password
or simply XOR them and add some unscrambling features into a FindResource
"engine." Another interesting use might be resource packing: You could
implement your own resource loading (borrowing from the code in READRES.CPP)
and include some unpacking features in it. For example, you can store packed
bitmaps to reduce the total size of your application. You don't need to use
user-defined RCData resources for that.
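The scrambling idea is simple to sketch. The routine below (ours, not from READRES.CPP) XORs resource data against a repeating key; because XOR is its own inverse, the same call unscrambles, so it could sit inside a custom FindResource-style "engine":

```c
#include <stddef.h>

/* XOR-scramble (or unscramble) a resource image in place with a
   repeating key. Applying the same call twice restores the original. */
void xor_scramble(unsigned char *data, size_t len,
                  const unsigned char *key, size_t keylen)
{
    size_t i;
    for (i = 0; i < len; i++)
        data[i] ^= key[i % keylen];
}
```

Note that XOR scrambling is obfuscation, not real encryption; it only keeps resources from being casually browsed or decompiled.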
Figure 1: Hex dump of TEST.RES.

0000 FF 03 00 FF 01 00 30 10 E8 02 00 00 28 00 00 00 ......0.....(...
0010 20 00 00 00 40 00 00 00 01 00 04 00 00 00 00 00 ...@...........
02f0 00 00 00 00 FF 0E 00 42 41 52 46 4F 4F 00 30 10 .......BARFOO.0.
0300 14 00 00 00 00 00 01 00 01 00 20 20 10 00 01 00 .......... ....
0310 04 00 E8 02 00 00 01 00 FF 05 00 46 4F 4F 42 41 ...........FOOBA
0320 52 00 30 10 4E 00 00 00 00 00 C0 00 02 0A 00 14 R.0.N...........
0330 00 1E 00 28 00 00 46 4F 4F 42 41 52 00 46 4F 4F ...(..FOOBAR.FOO
0340 42 41 52 00 01 00 02 00 03 00 04 00 FF FF 01 00 BAR.............
0350 02 50 82 54 68 69 73 20 69 73 20 61 20 74 65 73 .P.This is a tes
0360 74 00 00 05 00 06 00 07 00 08 00 01 00 01 00 01 t...............
0370 50 80 4F 4B 00 00 FF 04 00 42 41 52 42 41 52 00 P.OK.....BARBAR.
0380 30 10 54 00 00 00 00 00 00 00 10 00 26 54 65 73 0.T.........&Tes
0390 74 00 00 00 65 00 43 6D 64 20 26 31 00 80 00 66 t...e.Cmd &1...f
03a0 00 43 6D 64 20 26 32 00 10 00 26 41 6E 6F 74 68 .Cmd &2...&Anoth
03b0 65 72 20 74 65 73 74 00 00 00 6F 00 43 6D 64 20 er test...o.Cmd
03c0 26 31 31 00 80 00 70 00 43 6D 64 20 26 31 32 00 &11...p.Cmd &12.
03d0 80 00 C8 00 26 48 65 6C 70 00 FF 09 00 46 4F 4F ....&Help....FOO
03e0 46 4F 4F 00 30 00 05 00 00 00 81 70 00 C8 00 FF FOO.0......p....
03f0 0F 00 FF 01 00 30 10 3A 00 00 00 0E 00 0E 00 01 .....0.:........
0400 80 00 42 41 52 46 4F 4F 00 0E 00 04 00 01 80 00 ..BARFOO........
0410 42 41 52 42 41 52 00 0E 00 05 00 01 80 00 46 4F BARBAR........FO
0420 4F 42 41 52 00 0E 00 09 00 01 80 00 46 4F 4F 46 OBAR........FOOF
0430 4F 4F 00 00 00 OO...

Figure 2: Analysis of the hex dump of TEST.RES.

 +----------------------------------------------Tag1
 +-----------------------------------------Res Type (3 = RT_ICON)
 +-------------------------------------Tag2
 +--------------------------------Resource ID (#1)
 +--------------------------Resource Flags
 +---------+------------Resource Length (0x02E8)
 +-+-+ +-+-+ +-+-+ 
 Start of ICON data
0000 FF 03 00 FF 01 00 30 10 E8 02 00 00 28 00 00 00 ......0.....(...
0010 20 00 00 00 40 00 00 00 01 00 04 00 00 00 00 00 ...@...........

skip this hdr size (3+3+2+4=0x0C) + res length (0x02e8) =
next res at 0x02f4

 +----------------------------------Tag1
 +-----------------------------Res Type (14 =
 RT_GROUP_ICON)
 +----------------Res Name ("BARFOO")
 
 +-Resource Flags
 
 +-+-+ +--------+---------+ +-+-+
 

02f0 00 00 00 00 FF 0E 00 42 41 52 46 4F 4F 00 30 10 .......BARFOO.0.
0300 14 00 00 00 00 00 01 00 01 00 20 20 10 00 01 00 .......... ....
 
 +----+----+ +-------------------------------Resource data
 
 +--------------------------------------Resource Length (0x14)

 +--------------------------------------------More resource data
 
 +--------------------Tag1
 +---+-------------Res Type (5 = RT_DIALOG)
 +-----------Res Name ("FOOBAR")
 
0310 04 00 E8 02 00 00 01 00 FF 05 00 46 4F 4F 42 41 ...........FOOBA
0320 52 00 30 10 4E 00 00 00 00 00 C0 00 02 0A 00 14 R.0.N...........
 start of binary dlg data
 +---------+----------------------Res Length (0x4E)
 +---+----------------------------------Res Flags
 +---+----------------------------------------rest of Res Name

Figure 3: C pseudocode for .RES file structure.
typedef struct {
 BYTE ff; // 0xFF
 WORD id;
 } ID;
typedef struct {
 union {
 ID id; // 1 = RT_CURSOR, 2 = RT_BITMAP, etc.
 char name[variable_length]; // first byte *not* 0xFF
 } type;
 union {
 ID id; // ordinal number of resource
 char name[variable_length]; // first byte *not* 0xFF
 } name_or_id;
 WORD mem_flags; // 0x10=MOVEABLE, 0x20=PURE, 0x40=PRELOAD,
 // 0x1000=DISCARDABLE
 DWORD size; // *not* including this header
 BYTE res_data[size]; // the resource data: see SDK v. 4, ch. 7
 } RES_FILE_ENTRY; // NOT VALID C!
// RES file is just an accumulation of these entries
RES_FILE_ENTRY RES_FILE[num_resources];


Figure 4: RESLIST output for TEST.RES.
C:\DDJ>reslist test.res
Icon: #1 (744 bytes at ofs cH)
Group icon: BARFOO (20 bytes at ofs 304H)
Dialog box: FOOBAR (78 bytes at ofs 328H)
Menu: BARBAR (84 bytes at ofs 386H)
Accelerator table: FOOFOO (5 bytes at ofs 3eaH)
Name table (obsolete in 3.1): #1 (58 bytes at ofs 3fbH)


 Figure 5: CDI loads resources from .RES files, using the READRES library.
Here, CDI has loaded a dialog box from PROGMAN.RES, which in turn was
extracted from the Cyrillic version of Program Manager using a
resource-decompilation tool.
[LISTING ONE] (Text begins on page 133.)

// TEST.RC -- simple input file
// rc -r test.rc, then dump test.res (see Figure 1)
#include <windows.h>


BARFOO ICON some.ico
FOOBAR DIALOG 10,20,30,40
CAPTION "FOOBAR"
CLASS "FOOBAR"
STYLE WS_BORDER WS_CAPTION {
 CTEXT "This is a test", -1, 1, 2, 3, 4
 DEFPUSHBUTTON "OK", IDOK, 5, 6, 7, 8
 }
BARBAR MENU {
 POPUP "&Test" {
 MENUITEM "Cmd &1", 101
 MENUITEM "Cmd &2", 102
 }
 POPUP "&Another test" {
 MENUITEM "Cmd &11", 111
 MENUITEM "Cmd &12", 112
 }
 MENUITEM "&Help", 200
 }
FOOFOO ACCELERATORS {
 VK_F1, 200, VIRTKEY
 }

[LISTING TWO]

// RESLIST.CPP -- Reads Windows .RES files from "Undocumented Corner," DDJ,
// August 1993 -- Alex G. Fedorov and Dmitry M. Rogatkin
// alex@computerpress.msk.su & datasc@adonis@ias.msk.su

#include <iostream.h>
#include <fcntl.h>
#include <io.h>
#include <stdlib.h>
#include <string.h>

#define LastResNum 16

char *StandardResName[] = {
 /* 1 */ "Cursor",
 /* 2 */ "Bitmap",
 /* 3 */ "Icon",
 /* 4 */ "Menu",
 /* 5 */ "Dialog box",
 /* 6 */ "String table",
 /* 7 */ "Font directory",
 /* 8 */ "Font",
 /* 9 */ "Accelerator table",
 /* 10 */ "RCData (user-defined)",
 /* 11 */ "Not allowed",
 /* 12 */ "Group cursor",
 /* 13 */ "Not allowed",
 /* 14 */ "Group icon",
 /* 15 */ "Name table (obsolete in 3.1)",
 /* 16 */ "Version info"
 };
// read an ordinal number into aId, or an ASCIIZ string into aStr
int RdHeadItem(char *aStr, int &aId, int hFile)
{

 if (_read(hFile, &aStr[0], 1) != 1)
 return 1;
 if (aStr[0] == '\xFF') // ordinal
 {
 aStr[0] = 0; // number, not name
 _read(hFile, &aId, sizeof(int));
 }
 else // name
 {
 if (aStr[0] == 0) // invalid magic
 {
 cout << "This is a Win32 .RES file\n";
 exit(1);
 }
 aId = 0; // name, not number
 signed long pos = lseek(hFile, 0, SEEK_CUR); // where are we?
 _read(hFile, aStr+1, 126); // probably read too much
 lseek(hFile, pos + strlen(aStr), SEEK_SET); // back up
 }
 return 0;
}
void main(int argc, char *argv[])
{
 int hResFile, RId, Id;
 long ResLen;
 unsigned int Flgs;
 char st[128], sn[128];
 if (argc != 2)
 {
 cout << "Usage: Reslist resfile\n";
 exit(1);
 }
 if ((hResFile = _open(argv[1], O_RDONLY)) != -1)
 {
 // get type: ordinal or string
 while (! RdHeadItem(st, RId, hResFile))
 {
 if (st[0] == 0) // ordinal number, not name
 {
 if (RId <= LastResNum)
 strcpy(st, StandardResName[RId-1]);
 else
 {
 itoa(RId, st, 10);
 strcat(st, " (user defined)");
 }
 }
 else
 ; // already have type string in st
 // get ID: ordinal or string
 RdHeadItem(sn, Id, hResFile);
 if (sn[0] == 0) // ordinal number, not string
 {
 sn[0] = '#';
 itoa(Id, &sn[1], 10);
 }
 // get memory flags
 _read(hResFile, &Flgs, sizeof(unsigned int));

 // get length in bytes of following resource data
 _read(hResFile, &ResLen, sizeof(long));
 // where are we in the file?
 // so we can output file offset of actual res data
 long pos = lseek(hResFile, 0, SEEK_CUR);
 cout << st << ": " << sn << " (" << ResLen
 << " bytes at ofs " << hex << pos << dec << "H)\n";
 // in a genuine program, we would read in resource data
 // and switch on resource type in RId
 lseek(hResFile, ResLen, SEEK_CUR);
 };
 _close(hResFile);
 }
}

[LISTING THREE]

HGLOBAL LoadRESResource(LPCSTR ResFName, LPCSTR lpszName, LPCSTR lpszType) {
 char st[128], sn[128];
 void huge *AddrTemp;
 long ResLen;
 HFILE hResFile;
 HGLOBAL hMem=0;
 WORD Flags, RId, Id, fndtyp = 0;

 if (HIWORD(lpszType) == 0) fndtyp = LOWORD(lpszType);
 if ((hResFile = _lopen(ResFName, OF_READ)) == HFILE_ERROR)
 return 0;
 while(! RdHeadItem(st, RId, hResFile)) { // get type
 RdHeadItem(sn, Id, hResFile); // get name or ID
 _lread(hResFile, &Flags, sizeof(unsigned int)); // get flags
 _lread(hResFile, &ResLen, sizeof(long)); // get length
 if ((fndtyp != 0 && RId == fndtyp) ||
 (fndtyp == 0 && strcmp(lpszType, st) == 0)) { // match type
 if ((HIWORD(lpszName) != 0 && strcmp(lpszName, sn) == 0) ||
 (HIWORD(lpszName) == 0 && Id == LOWORD(lpszName))) { // match name
 if (! (hMem = GlobalAlloc(GMEM_FIXED, ResLen)))
 return 0;
 if (! (AddrTemp = GlobalLock(hMem))) {
 GlobalFree(hMem);
 return 0;
 }
 long count = 0;
 unsigned portion;
 while (count < ResLen) { // read in <= 64K chunks
 long len = ResLen - count;
 portion = (len <= 0xFFFF) ? (unsigned)len : 0xFFFF;
 if (_lread(hResFile, (char huge *)AddrTemp+count,
 portion) != portion) { // read error: give up
 GlobalUnlock(hMem);
 GlobalFree(hMem);
 hMem = 0;
 break;
 }
 count += portion;
 }
 if (hMem)
 GlobalUnlock(hMem);
 break; // we found it! done!
 }
 }

 _llseek(hResFile, ResLen, SEEK_CUR); // to next resource entry
 }
 _lclose(hResFile);
 return hMem;
 }

[LISTING FOUR]

// LoadRESMenu() -- excerpted from READRES.CPP
HMENU LoadRESMenu(LPCSTR ResFileName, LPCSTR lpszMenuName) {
 HGLOBAL hTempMem;
 void far* AddrTemp;
 HMENU retMnu = 0;
 if ((hTempMem = LoadRESResource(ResFileName, lpszMenuName,
 MAKEINTRESOURCE(RT_MENU))) != 0) {
 if ((AddrTemp = GlobalLock(hTempMem)) != 0) {
 retMnu = LoadMenuIndirect(AddrTemp);
 GlobalUnlock(hTempMem);
 }
 GlobalFree(hTempMem);
 }
 return retMnu;
 }
End Listings






































August, 1993
PROGRAMMER'S BOOKSHELF


Roaming the Internet, Part 3


Up until a year ago, I was only vaguely aware of the Internet's existence. In
fact, my only exposure to the Internet was those funny electronic-mail
addresses on some people's business cards: mickey@disney.com and so on. Then
by a strange quirk of fate, I became involved in a project with some genetics
researchers at my hospital, and found that nearly all of the databases they
needed access to were available only via the Internet. For better or worse, at
just about the same time, the trade-book publishers were also discovering the
Internet as a new market niche, so I had no dearth of reading material. And
fortunately for me, some of the books actually turned out to be helpful!
Becoming acquainted with the Internet is like falling through a manhole and
discovering a city of aliens with their own markets, libraries, and culture,
living out an unsuspected parallel existence only a few feet away. This new
world has its own code of conduct, its own heroes and villains, its own
pantheon of gods and sacred cows, and its own collection of legends and myths.
The players, the playing field, and for that matter the rules of the game are
totally unfamiliar, even allowing for some previous experience with
stand-alone UNIX systems or commercial online services such as CompuServe. To
me, after having floundered my way up the learning curve over the last year,
there are three particularly amazing aspects of the Internet:
The TCP/IP, e-mail, and other fundamental Internet protocols, although adopted
in a fairly ad hoc manner a decade ago, continue to work well, even though the
number of hosts on the net has been scaled up by at least three orders of
magnitude.
The government and the primary network users pour money into the network
backbone and increase its capacity year after year, but no attempt is made to
make individual hosts liable for the traffic generated by their users. (For
that matter, no one is exactly sure how many hosts are on the net, let alone
how many users.)
Many Internet users are totally oblivious to the incredible technology and
infrastructure they are exploiting, and squander network bandwidth with
pointless flame wars, trivial appeals for help, thinly disguised attempts to
cheat on take-home exams, and downloading of semi-useless shareware utilities
from archive servers all over the globe.
Those of you who don't work in a large corporation or educational institution
will probably have your first encounters with the Internet via dial-up to a
so-called "public provider." This gives you access to world-wide electronic
mail and USENET news, but the low bandwidths of a modem connection essentially
make you a second-class citizen on the net. To reap the full benefits of the
archive sites and tools such as gopher and WAIS, you really need to be
hardwired to the network. Making this happen is no trivial chore, and a job I
recommend you leave to the experts. Nevertheless, I wanted to wind up this
series of book reviews about the Internet (see also the December 1992 and
February 1993 issues of DDJ) by mentioning a few books that will help you
decipher the many layers of network arcana and (eventually) become somewhat
self-sufficient. In addition, I've included a list of other Internet-related
books, some previously reviewed.
DNS and Bind, by Paul Albitz and Cricket Liu, and TCP/IP Network
Administration, by Craig Hunt, are members of the "Nutshell Handbook" series
from O'Reilly and Associates. Although there is some degree of overlap between
the two books, I strongly recommend that you get both; the differing
perspectives and emphasis of the authors can be quite helpful when trying to
debug TCP/IP or mail problems. TCP/IP Network Administration starts with basic
TCP/IP protocol concepts, moves on to routing, domain name services, and
sendmail configuration, and finishes up with chapters on troubleshooting and
security. DNS and Bind, as you would expect from the title, focuses much more
intensely on setting up and maintaining domain name servers and resolvers.
Both books, like all the other O'Reilly books I've been exposed to,
demonstrate careful writing, tasteful editing, and painstaking production.
They're a pleasure to own and use.
The Internet Message, by Marshall T. Rose, is billed as "The Exciting Fourth
Book in MTR's Networking Trilogy." Rose is well known for his work on the
Internet mail system and OSI directory services over the last decade, and more
recently has been influential in the development of multimedia-mail protocols.
The Internet Message is basically an explanation of how the Internet name
services, mail protocols, mail-transport agents, and mail-user agents work and
interact, with a great deal of Rose's personal humor, philosophy, and
editorialization thrown in at no extra charge. For example:
There are still people in the world who think OSI is going to happen. I
suppose there are also people in the world who think that the moon is made of
cheese. However, I wouldn't necessarily trust the judgement of either kind of
optimist.
Although Rose has little patience with the ponderous, poorly thought-out OSI
standards and implementations, he's also ecumenical. The notorious weak points
of the UNIX-based Internet tools come in for their share of criticism:
Perhaps the most commonly used implementation of a mail transfer agent in the
Internet is sendmail. It is a tribute to the Internet mail system that it
works so well given that sendmail behaves so poorly... Clearly, sendmail is an
excellent example of how to do a lot of things wrong. But, since sendmail is
shipped with Berkeley UNIX, most sites just put up with it... People just
stumble along with a canned sendmail configuration, poking at it from time to
time if problems arise.
After spending more than a few hours puzzling over sendmail configuration
files, I was relieved to find out that I wasn't the only person who considered
it brain-damaged. You needn't trouble yourself to read The Internet Message if
you're content to use your system's mail programs blindly. However, if you are
considering writing your own mail or news client, or even if you are just
curious about the underlying mechanisms of electronic mail, The Internet
Message is an excellent place to start.
The Internet System Handbook, edited by Daniel Lynch and Marshall Rose, is not
so much a book as a hardbound collection of technical essays by diverse
networking computer scientists, gurus, and engineers. There's a lot of
valuable information in this book, but you have to mine it for what you need;
the book is only loosely organized, and there is a significant amount of
overlap and redundancy. The book is also uneven in both style and technical
level and suffers from an obvious lack of copy editing; some articles are
stilted and opaque, while others are refreshingly direct and practical. One
can only regret that, presented with such a unique collection of raw material,
the publisher didn't invest a little more effort in processing that material
into a structured, coherent, approachable whole. Nevertheless, this book
should be on the shelf of every serious network programmer and administrator.
End-user books: The Whole Internet User's Guide and Catalog, by Ed Krol
(O'Reilly & Associates, 1992, ISBN 0-56592-025-2, $24.95). An excellent
mid-level introduction to the Internet. How to connect, why to connect, how to
use basic network tools, and how to troubleshoot networking problems. Highly
recommended; by far the best of all the end-user Internet books.
Internet: Getting Started, April Marine, Susan Kirkpatrick, Vivian Neou, and
Carol Ward, editors (Prentice Hall, 1993, ISBN 0-13-327933-2, $28.00). A lot
of useful reference material: index to RFCs, list of public providers,
overseas contact information, and so on. Not primarily a "how-to" book, so it
makes a good companion to the Krol book.
Zen and the Art of the Internet: A Beginner's Guide, second edition, by
Brendan P. Kehoe (Prentice Hall, 1993, ISBN 0-13-010778-6, $22.00). Brief
guide to Internet utilities and resources from a typical UNIX viewpoint. In
spite of the title, definitely not for the average PC user. A bit too technoid
and smug for my taste, and the editing and production values are dismal.
The Internet Companion: A Beginner's Guide to Global Networking, by Tracy
LaQuey and Jeanne C. Ryer (Addison-Wesley, 1993, ISBN 0-201-62224-6, $10.95).
Apparently directed at the technologically illiterate--could have been titled
Bill and Ted's Excellent Network Adventure.
Internet: Mailing Lists, Edward T.L. Hardie and Vivian Neou, editors (Prentice
Hall, 1992, ISBN 0-13-327941-3, $39.00). Comprehensive guide to mailing lists,
many of which are reflected to USENET (or vice versa). Especially valuable for
Internet users who have dial-up e-mail access only, or for Bitnet users.
Using UUCP and UseNet, by Grace Todino and Dale Dougherty (O'Reilly &
Associates, 1986, ISBN 0-937175-10-2, $21.95). UNIX-centric, but instructions
on the use and abuse of Internet "news" will be helpful to all.
Network administration: DNS and BIND, by Paul Albitz and Cricket Liu (O'Reilly
& Associates, 1992, ISBN 1-56592-010-4, $29.95). Very helpful explanations of
DNS, bind, sendmail configuration, and so on. Coverage of Sun OS peculiarities
is sometimes spotty.
TCP/IP Network Administration, by Craig Hunt (O'Reilly & Associates, 1992.
ISBN 0-937175-82-X. $29.95). Clearly written, extremely helpful overview of
TCP/IP from protocol basics to configuration of gateways, DNS, and sendmail.
Also includes nice discussions of network troubleshooting and security
considerations.
Internetworking: A Guide to Network Communications, by Mark A. Miller (M&T
Books, 1991, ISBN 1-55851-143-1, $34.95). A somewhat abstract overview of
internetworking and protocols, both LAN and WAN.
Practical UNIX Security, by Simson Garfinkel and Gene Spafford (O'Reilly &
Associates, 1991, ISBN 0-937175-72-2, $29.95). UNIX-centric, but includes
discussions of passwords, gateways, firewall machines, and the like that will
be valuable to any system administrator.
Managing UUCP and UseNet, tenth edition, by Tim O'Reilly and Grace Todino
(O'Reilly & Associates, 1992, ISBN 0-937175-93-5, $27.95). General discussion
of e-mail and news servers and clients.
!%@:: A Directory of Electronic Mail Addressing and Networks, by Donnalyn Frey
and Rick Williams (O'Reilly & Associates, 1990, ISBN 0-937175-15-3, $27.95). A
coffee-table book for Internet nerds.
Networking technology: TCP/IP: Architecture, Protocols, and Implementation, by
Sidnie Feit (McGraw-Hill, 1993, ISBN 0-07-020346-6, $45.00). Textbook
approach: thorough but not very friendly.
The Simple Book: An Introduction to Management of TCP/IP-based Internets, by
Marshall T. Rose (Prentice Hall, 1991, ISBN 0-13-812611-9, $54.00). A nice
explanation of SNMP by one of its inventors.
The Internet Message: Closing the Book with Electronic Mail, by Marshall T.
Rose (Prentice Hall, 1993, ISBN 0-13-092941-7, $44.00). Overview of Internet
mail protocols by one of the most famous Internet gurus.
Internet System Handbook, Daniel C. Lynch and Marshall T. Rose, editors
(Addison-Wesley 1993, ISBN 0-201-56741-5, $59.25). A massive collection of
technical essays and overviews, of varying levels of usefulness.
Stacks: Interoperability in Today's Computer Networks, by Carl Malamud
(Prentice Hall, 1992, ISBN 0-13-484080-1, $35.00). A succinct overview of
competing network protocols and transports: OSI, TCP/IP, ISDN, X.25, and the
like.
Exploring the Internet: A Technical Travelogue, by Carl Malamud (Prentice
Hall, 1992, ISBN 0-13-296898-3, $26.95). This book defies classification. The
author recounts his jaunts around the world to meet Internet wizards and taste
exotic foods.
DNS and BIND
Paul Albitz and Cricket Liu
O'Reilly & Associates, 1992
418 pp. $29.95
ISBN 1-56592-010-4
TCP/IP Network Administration
Craig Hunt
O'Reilly & Associates, 1992
502 pp. $29.95
ISBN 0-937175-82-X
The Internet Message: Closing the Book with Electronic Mail
Marshall T. Rose
Prentice Hall, 1993
370 pp. $44.00
ISBN 0-13-092941-7
The Internet System Handbook
Daniel C. Lynch and
Marshall T. Rose, editors
Addison-Wesley, 1993
700 pp. $59.25
ISBN 0-201-56741-5

August, 1993
OF INTEREST
Candela's Color Management System (CCMS) Library now supports the Macintosh
Think C 5.0 compiler, as well as NeXTstep 3.0, SunOS 4.1.3, DOS 5.0, and
Windows 3.1.
CCMS Library includes linearizing and characterizing devices, color-gamut
mapping, and operation of test targets. The CCMS Evaluator helps to link and
transform color among scanners, monitors, and output devices.
The technology is based on nonlinear, nonseparable math functions that provide
smooth color and gray tonal transitions while quickly processing large image
files containing millions of color pixels. CCMS costs $7500.00 and includes
the Evaluator and API documentation. Reader service no. 20.
Candela Ltd.
1676 E. Cliff Road
Burnsville, MN 55337-1300
612-894-8890
The WinClient library is a C++ object library for Windows-based relational
database management system (RDBMS) front-end application development. The
library unifies programming styles for different windows, including MDI frame,
MDI child, and ordinary windows. Windows initialization routines can be
encapsulated in the application base class. A single language--C++, for
example--can handle forms, reports, and large batch programs, and can share
program logic between them. Reader service no. 21.
WinClient Technologies
411 University Street, Suite 1200
Seattle, WA 98101
206-623-0171
MainWin, from Machine Independent Software, is a software-development kit for
porting Microsoft Windows applications (written in ANSI C) to UNIX (POSIX and
X Windows). MainWin supports Solaris, UnixWare, AIX for RS/6000, and HP-UX. On
the PC side, the first release supports Windows 3.0, with 3.1 support expected
later this year. The MainWin SDK is priced at $5000.00. Reader service no. 22.
MainSoft
185 Berry Street, Suite 5411
San Francisco, CA 94107
800-624-6946
The Public Windows Interface (PWI), a specification that would put the
Microsoft Windows API into the public domain, has been proposed by SunSelect
(a Sun Microsystems business unit) and endorsed by a number of third-party
companies, among them Borland, Corel, HP, IBM, Quarterdeck, SCO, USL, and
WordPerfect. As proposed, the PWI standard would enable the development of
applications and tools so that users of systems based on multiple operating
systems could run Windows apps.
As you might expect, however, Microsoft is not included on an endorsement
list. A Microsoft spokesperson told DDJ:
Sun's actions are basically irrelevant. There's a known and well-published
Windows interface today, certainly more broadly published and broadly
documented than anything that Sun or another UNIX vendor has to offer.
We have an Open Process whereby we review all the new proposed API
specs...with the industry well in advance before the APIs are done, solicit
feedback, and make changes if necessary to meet people's needs. The process
combine[s] the speed from leadership...with an open process that makes sure
that, in fact, we're doing the right thing.
We do not frankly believe that Sun has any intention of doing right by the
users of Windows or Windows itself. Clearly, the only economic benefit Sun
might hope to derive is to fragment the Windows interface the way the UNIX
interface and the UNIX API is fragmented in the hope they can continue to
participate in a market that seems to have already selected Windows.
We don't understand what the benefit would be of taking a Windows standard
that's now supported by 25 million people and countless thousands of software
developers and fragmenting it or turning it over to a body that essentially
consists of hostile UNIX competitors.
If they've finally acknowledged that you have to support the Windows interface
in order to be a player in the market, we're tickled.
For information on PWI, contact SunSelect. Reader service no. 23.
SunSelect
Two Elizabeth Dr.
Chelmsford, MA 01824-4195
508-442-0000
Also new from SunSelect is a software technology called "WABI" that makes it
possible for UNIX users to directly run Microsoft Windows apps on UNIX-based
PCs and workstations. Because the Windows apps are part of the UNIX desktop,
users can cut-and-paste between Windows and UNIX apps. WABI (presumably short
for "Windows Application Binary Interface") is not Window emulation; it
translates Windows function calls into X Windows calls. WABI, which was
originally developed by Praxsys Technologies, includes bitstream font
handling (including TrueType fonts).
WABI has so far been licensed by USL, SCO, and SunSoft. SunSelect has set up a
self-certification program, whereby developers can receive a free preview copy
of WABI for conducting compatibility testing during application development.
Reader service no. 24.
SunSelect
Two Elizabeth Dr.
Chelmsford, MA 01824-4195
508-442-0000
NeuroForecaster, a neuro-fuzzy network program from NIBS Pte, provides
investment analysis for stocks, options, fixed-income securities, foreign
exchange, interest rates, and so on. It uses only historical data for
training, so explicit expert rules aren't needed. The program provides
time-series forecasting, cross-sectional classification, and indicator
analysis. Forecasts can be built with any number of inputs and user-specified
complexity, limited only by the amount of computer memory. Twelve built-in
neural-network models include backpropagation, radial-basis function, and
neuro-fuzzy models. Rescaled range analysis is included to help the user
predetermine the predictability of the input data, and to reveal hidden cycles
in the time series. NeuroForecaster 2.1 for Windows or the Macintosh sells for
$350.00. Reader service no. 25.
NIBS Pte Ltd.
62 Fowlie Road
Republic of Singapore 1542
+65-344-2357
Visual Basic 3.0 has been released by Microsoft. The two major 3.0
enhancements are integration of the Microsoft Access 1.1 database engine and
support for OLE 2.0 Automation. The database engine provides direct access to
Access, dBase, Paradox, Btrieve, and ODBC. (ODBC drivers for Microsoft SQL
Server, Sybase SQL Server, and Oracle are provided.) The engine has multiuser
support, transaction processing, and support for rich data types (sound,
video, OLE objects, and pictures). OLE 2.0 automation lets you build custom
programs that take advantage of the capabilities of other OLE 2.0-compliant
applications. Other VB 3.0 enhancements include new controls, pop-up menus,
data-access objects, data-aware controls, and integration of Crystal Reports
2.0.
Visual Basic 3.0 sells for $199.00. Reader service no. 26.
Microsoft Corp.
One Microsoft Way
Redmond, WA 98052-6399
206-882-8080
The Plug-In Components for Windows from Access Softek are DLLs that developers
can snap into Windows applications. Among those recently released by Access
are: P.I. Edit Control, which replaces the Windows Edit Control for WYSIWYG
multiline edit boxes ($495.00); P.I. Document Editor, which provides
compound-document processing and reading and writing of RTF files ($995.00);
P.I. Text Import/Export Filters for importing and exporting DOS, Microsoft
Word, and WordPerfect files ($395.00); P.I. Bitmap Import Filters for
importing TIFF, PCX, GIF, BMP, and EPS files ($695.00); and P.I. Vector Import
Filters for importing DXF, CGM, WPG, WMF, GEM, Lotus PIC, HPGL, and DRW files
($695.00). Reader service no. 27.
Access Softek
2550 9th Street, #206
Berkeley, CA 94710
510-848-0606
A standard specification for ink as a data type has been proposed by pen-based
developers GO, General Magic, Slate, Microsoft, Lotus, and Apple. The
specification, called "Jot," defines a common format to be used for the
storage and interchange of electronic ink data. Jot, which is written in C but
which can be implemented in any language, is a record-based format that covers
properties such as: multiple ink strokes combined into single objects, bounds,
scales, offsets, color, pen tips, timing, height of pen over digitizer,
stylus-tip force, stylus buttons, and x and y angle of stylus. Forty-seven
additional record types have been reserved for future use. Reader service no.
28.
Slate Corporation
15035 North 73rd Street
Scottsdale, AZ 85260
602-443-7322

Logivolve combines neural networks and genetic algorithms into one package,
making it possible to develop neural-network systems without trial-by-trial
training. The package comes as a programmer's library with several functions
and is available for Visual Basic and C. Logivolve for C, with source, sells
for $459.00, and for Visual Basic it costs $259.00. Reader service no. 29.
Scientific Consultant Services
20 Stagecoach Road
Selden, NY 11784
516-696-3333
Tired of sitting through a 30- to 50-disk installation process--and sometimes
having to start over because of an error? This problem is solved by the LAN
Configuration Facility (LCF), an automated software-installation utility which
allows large installations to be performed over a network. LCF, which is
distributed by ForeFront Software, was developed by the Royal Bank of Canada,
where it was used for the installation and configuration of mission-critical
OS/2 applications; LCF is capable of distributing, installing, and
configuring DOS and Windows apps as well.
LCF stores software in a central library on a server, where it can either be
"pulled" by users or "pushed" by administrators from the server to unattended
workstations. Server software sells for $500.00 (usually one server per
network is required, although you can bridge from one LAN to another), while
client software sells for $50.00/workstation. Reader service no. 30.
ForeFront Software
2202 2 Ave. NW
Calgary, Alberta
Canada T2N 0G9
403-531-2160
The Development Tools Handbook: Support Solutions for the Intel MCS-51,
MCS-96, and 80C186, a new book for embedded-systems designers, has been
released by Market Works. The Handbook lists suppliers of development tools
and products that support Intel's microcontrollers. The 136-page book contains
product data sheets and descriptions of hardware and software-development
tools such as in-circuit emulators, compilers and assemblers, debuggers,
real-time operating systems, logic analyzers, and boards.
The Handbook also details individual Intel embedded microcontrollers in terms
of on-chip peripherals and memory, power consumption, packaging, and special
features. Copies of the Handbook sell for $24.00. Reader service no. 31.
Market Works
50 W. San Fernando, Suite 675
San Jose, CA 95113
408-286-4200
VBAssist 3.0, a Visual Basic 3.0 add-on tool from Sheridan Software, lets
developers design forms for database applications using drag-and-drop to link
table columns to bound controls. VBAssist's Data Assistant opens a window
showing all fields in the database associated with the bound data control.
With a mouse click on the desired field, the user drags and drops it into the
target control. It also allows developers to create and modify table
structures, and view/modify data for any table. VBAssist 3.0 sells for
$179.00; upgrades are $39.00.
Sheridan is also offering 3D Widgets, a Visual C++ add-on that provides custom
controls such as list boxes, command buttons, ribbon buttons, versatile
panels, and specialized file, directory, and drive list boxes. 3D Widgets costs
$109.00. Reader service no. 32.
Sheridan Software Systems
65 Maxess Road
Melville, NY 11747
516-753-0985
In further support of its pSOS+ real-time operating system for embedded
applications, Integrated Systems has extended the pSOSystem C++ development
environment by providing an "object register" garbage collector for more
efficient memory management, a source-code and class-library browser for
understanding complex programs, and support for 386/486 processors. Host
systems include Sun, HP, DEC, and IBM workstations as well as PCs. Target
processors include the Motorola 68x00 family, Intel i960, and Intel 386/486.
Reader service no. 33.
Integrated Systems
3260 Jay Street
Santa Clara, CA 95054-3309
408-980-1500
RockWare Scientific Software has released Z-CON, a mapping and contouring
application designed for Windows. Z-CON creates publication-quality maps from
randomly distributed data points. Features include automatic contour smoothing
and labeling, control-point plotting, and border annotation. Maps can be
exported to .DXF format or to the Windows clipboard for use within
spreadsheets, paint programs, or desktop-publishing applications. Z-CON sells
for $79.95. Reader service no. 34.
RockWare Scientific Software
4251 Kipling Street, Suite 595
Wheat Ridge, CO 80033
303-423-5645

August, 1993
SWAINE'S FLAMES


The Great Letter Shortage


The pool is finally done and summer is here. Unfortunately, the pool attracts
mosquitoes. And my cousin Corbett. The pool is attractively situated in a
madrone grove. Madrones are beautiful trees with three distinct seasons.
First, they drop berries, billions of berries that cover the ground like snow
and then rot and attract fruit flies. Then there's leaf-fall, which actually
comes twice a year in California. Finally there's the period when all the bark
peels off in ticket-stub-sized pieces that smell like salami.
This was berry season, and cousin Corbett had volunteered to sweep madrone
berries off the pool deck, but he was instead pacing furiously up and down the
deck, and the broom was nowhere in sight. I was sitting in a deck chair with
the portable, working on my column, my few square inches of exposed skin white
with sunscreen and Avon Skin-So-Soft.
Here's a tip: The best mosquito repellent in the world is Avon Skin-So-Soft. I
know an Avon lady who sells it to lumberjacks and deer hunters in northern
Wisconsin, and she told me so.
"Where's the broom?" I asked.
Corbett was fuming. "I don't know. I put it down somewhere. Do you know what
they did to me now?"
I didn't much care. "You're walking on the madrone berries."
"They ripped me off again. It's all this task-based computing stuff. Microsoft
has its OLE and Apple has its Amber and NeXT has its NeXTstep and Taligent has
its Taligent..."
"You're not claiming that you came up with that technology?"
"No, but I did try to trademark the word task'."
"Corbett, you can't trademark a common word like that. Listen to me: When you
walk on the berries, you grind them into the deck."
"Sure you can. I should have had a killer intellectual property infringement
suit, but they ripped me off." He picked up a rock and tossed it pensively
into the pool.
"Corbett," I snapped, "any suit over your intellectual property could be
settled in small claims court."
Instead of shutting him up, this set him off in a new direction.
"I wonder what the lower limit is on the size of an intellectual property?
There are deals where you can buy one square inch of land; could you, say,
sell your name?"
I perked up. A MacUser editor had recently published some anagrams of my name
in his column. Was there an intellectual property issue there?
"Or individual letters," Corbett went on, pacing faster. "I know Intel wanted
to trademark lowercase i, but I see AT&T using it in ads these days. And Zilog
wanted to trademark Z..."
"_which everybody knows is the mark of Zorro," I added. Then I recalled a
billboard in San Francisco that had puzzled me recently. The company's logo
looked so familiar, yet I couldn't place it. It finally hit me that a C in a
circle ought to look familiar to a writer. "Can you trademark the copyright
symbol?" I asked.
"How about the trademark symbol? Ask the transcendental meditation people. But
listen, it just occurred to me that this minimal intellectual property
business is a crisis in the making for the entire world economy."
"What the devil are you talking about? And if you have to stamp your foot like
that, could you find a spot with fewer berries?"
He pointed an accusatory finger at me. "Are you aware that every stock ever
listed on the New York Stock Exchange is given a unique four-letter code? That
code is a kind of intellectual property, and it's worth a lot."
"So what?"
"So the number of possible codes is a finite limit on the number of possible
stocks. We're talking about a few million codes here, a meaningful limit. And
you know that a limit doesn't actually have to be reached to exert an
inhibiting effect. The mere existence of a limit can be inhibiting, in this
case inhibiting the growth of the economy." He picked up the portable. "Do you
have the White House's CompuServe ID?"
I walked over to the pool and stood looking down into the deep end. At least
now I knew where he put the broom.
Michael Swaine
editor-at-large

September, 1993
EDITORIAL


Through the Past, Digitally


There are some common misperceptions concerning those of us in the magazine
biz: that we keep odd hours, don't return phone calls, take long lunches
(usually paid for by somebody else), and hightail it to the woods as soon as
the current issue's out the front door. I'm here to set the record straight.
At Dr. Dobb's, we return phone calls as promptly as possible.
Leisurely, granola-laced, yogurt-laden lunches notwithstanding (hey dude,
this is California), we spend the interlude between polishing off one issue
and rolling up our sleeves for the next by thinking up special projects and
making long-range plans. Here's a taste of what we've been up to.
The 1994 Dr. Dobb's Journal Editorial Calendar
It sure sounds strange saying "1994," but then it sounded even more Orwellian
saying "1984" a decade ago. Nevertheless, we're planning for next year, where,
as the following editorial calendar shows, you'll be exploring both familiar
and new topics with us.
 January PC Supercomputing
 February Software Design, Testing, and Optimization
 March Portability and Cross-Platform Development
 April Algorithms
 May Operating Systems and Microkernels
 June User Interfaces
 July Graphics Programming
 August C/C++ Programming
 September Data Structures and File Formats
 October Object-oriented Programming
 November Database Programming
 December Communications
If you've followed DDJ over the years, you know we'll also be covering
embedded systems, network computing, programming toolkits, cognitive
computing, distributed computing, encryption, books, and a raft of other
useful and fascinating programming-related issues.
We'd love to hear your ideas for articles on these or other real-world
programming techniques. If you've come up with a unique algorithm, or a new
twist to a tried-and-tested one, you might want to contact contributing editor
Tom Swan who writes our "Algorithm Alley" column. Or if you've run across an
undocumented feature in DOS, Windows, Netware, or other programming interface,
contributing editor Andrew Schulman will work with you in his "Undocumented
Corner." In any event, Mike Floyd, Ray Valdes, and I will be glad to talk with
you about your article ideas and send you copies of DDJ's author guidelines.
Dr. Dobb's Sourcebook of Windows Programming
Every year, we publish a bonus issue or two to supplement your regular Dr.
Dobb's. In the last few months alone, we've published a special issue devoted
to database development, another covering C++, and a special section on
scientific and engineering computing.
This month's special project is Dr. Dobb's Sourcebook of Windows Programming,
an issue packed with programming pearls for Windows 3 and Windows NT
developers. Among the articles included in it are those examining:
multivendor Windows database development
memory-mapped file I/O in NT
multitasking and multithreading under NT
custom controls
horizontally scrollable listboxes
TrueType fonts
We're taking a different approach with this Sourcebook. Unlike previous
supplemental issues, you won't automatically receive it with your regular
issue. Instead, you can buy the $4.95 issue on the newsstand or order it by
calling 800-433-0700. This tack makes it possible for us to provide you with
much more detailed information than we would have been able to otherwise. We
hope you enjoy the issue and find it to be a valuable and lasting resource. As
always, let us know how you put the articles in the Sourcebook to good use.
Dr. Dobb's/CD
Another special project we're really excited about is Dr. Dobb's/CD, the first
electronic version of the magazine on CD-ROM. Dr. Dobb's/CD pulls together
more than five years worth of DDJ--from January 1988 to June 1993--and
includes hyperlinked text, listings, figures, tables, and examples for all
regular and special issues. The CD lets you flip through individual articles
just as with the paper-and-ink version now in your hands, or you can search
across the entire CD for a specific topic. You can copy information to your
PC, print to a printer, or whatever. Dr. Dobb's/CD, which initially is
available for DOS and Windows PCs, sells for $79.95; call 800-228-3141 for
more information.
Jonathan Erickson
editor-in-chief

September, 1993
LETTERS


Truly 3-D




Dear DDJ,


The article "Algorithms for Stereoscopic Imaging" (DDJ, April 1993) was
well-written and informative. And I'm glad to see Lenny Lipton and
Stereographics' "Crystal Eyes" get some long-overdue coverage. I don't believe
Tektronix manufactures the polarizing panels. I understand
Lipton/Stereographics holds the patents and Tektronix buys them from him.
Tektronix does manufacture a stereo monitor that uses the panel.
Contrary to popular opinion, you can show stereo on your 8088 XT without
special hardware, except for a viewer you can easily make.
At Visonics Labs, we've been using stereoscopic imaging of 3-D mathematical
curves and surfaces for some years. I've included some Basic code snippets
(see Example 1) to show how. The two images are presented side-by-side. For
your viewer, purchase a large Fresnel lens from Edmund Scientific and cut
2x2-inch squares from it from the outside edge (you're looking for a prism
effect) and mount it in a frame.
For information on how to build a display to show live, real-time, electronic
holoform images that you can see in stereo without any viewer and look around
simply by moving, see my book The 3D Oscilloscope (Prentice-Hall, 1987). The
book also gives a history of 3-D displays.
Homer B. Tilton
Tucson, Arizona


Fortran and Direct Memory Access Update




Dear DDJ,


Ken Hamilton's article "Direct Memory Access From PC Fortrans" (DDJ, May
1993) solves a problem I had with Microsoft Fortran Powerstation. However, in
the case of SVS Fortran (Version 2.8.2) the functions PEEK86 and POKE86
(included in the "Interactive Library" supplied with the compiler) provide
essentially the same functionality as his subroutine CRTPUT.
The PEEK86 and POKE86 functions seem to be relatively fast. Transferring a
320x200 byte array from host memory to display memory as a block (Example 2)
provides a frame rate of about 42.8 frames per second on a Tri-Star 486 33MHz
EISA with a Diamond SpeedStar HC graphics adapter. Curiously, passing the same
array by column ("commented out" code in Example 3) provides a slightly
greater frame rate (43.3 fps).
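For context (our arithmetic, not part of the letter), the bus throughput those frame rates imply is easy to work out from the 320x200 byte framebuffer:

```python
# Back-of-the-envelope check of the frame rates reported above:
# a 320x200 one-byte-per-pixel framebuffer moved at the quoted rates.
frame_bytes = 320 * 200  # 64,000 bytes per frame

for label, fps in (("block transfer", 42.8), ("by column", 43.3)):
    mb_per_sec = frame_bytes * fps / 1e6
    print(f"{label}: {mb_per_sec:.2f} MB/s")
```

Under 3 MB/s either way, and the two methods differ by only about one percent, which is consistent with the transfer being limited by the adapter rather than by how the array is traversed.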
Terry Hendricks
Encinitas, California


Searching for Mr. Big Number




Dear DDJ,


I've recently taken an interest in high-precision arithmetic and have been
unable to find a good reference. I believe work has been done in this area
before, under the name "arbitrary precision," "variable precision," and
"unlimited precision." However, I haven't run across any code libraries
addressing this issue. What's a fellow to do when 40 decimals of accuracy
isn't enough? I'd appreciate hearing from any other readers who might have
relevant information.
Mike Neighbors
Huntsville, Alabama
DDJ responds: For starters, take a look at Fred Motteler's article "Arbitrary
Precision Floating-point Arithmetic" in this issue. Also, you might refer to
"Multiple Precision Arithmetic in C" by Burt Kaliski (DDJ, August 1992) as
well as Burt's "The Z80180 and Big Number Arithmetic" in this issue. If any
readers have further recommendations, we'll publish the references.


It's a Dog-Eat-Dog World




Dear DDJ,


In his March 1993, "Programming Paradigms" Michael Swaine discusses a product
called Serius Workshop. He also mentions that Serius is the name of the dog in
a classic novel by the British philosopher/author Olaf Stapledon. This is an
error. The dog in question is named Sirius, after the star, in the
constellation Canis Major, also known as the Dog Star, one of the brighter
stars in the sky.
Olaf Stapledon is not widely read in this country; I was exposed to his
works as a child growing up in England. He also wrote an excellent story
called "Last and First Men," the history of Mankind as written by the last
remaining human at some very distant time in the future.

Robert Hutchins
Rancho Palos Verdes, California


Big Wheels Keep Turning




Dear DDJ,


The article "Object-oriented Finite Element Software" (DDJ, June 1993), Al
Vermeulen states that a bicycle wheel will not even begin to undergo buckling
failure until over 25,000 pounds of force have been applied. A bicycle wheel
will not sustain a load of this magnitude. It will fail catastrophically long
before the load reaches 25,000 pounds.
Assume for the moment that bicycle spokes are fabricated from SAE 1095 steel,
or a steel of similar strength, which has a yield strength of 97,000 pounds
per square inch (97 ksi) and an ultimate strength of 109 ksi. The area of the
spoke is given in this article as 0.0031 square inches. Simple tensile stress
can be obtained from the equation: S= P/A, where S is stress in pounds per
square inch, P is the tension load in pounds, and A is the cross-sectional
area. If we rearrange this expression to solve for the load that will result
in a stress of 97 ksi in our bicycle spoke, we get a load of 300.7 pounds. If,
for the sake of simplicity, we assume that two-thirds of the spokes are
uniformly loaded in tension (an extremely optimistic loading pattern), we get
a total load of 7217 pounds. That's well below Al's buckling failure load.
Al also states that the initial strain in the spokes is 0.003 in/in. If we
solve for the equivalent stress using Young's Modulus, we get a stress of 90
ksi. The spokes are under an initial tension of 279 pounds! The spokes will be
at the yield point after an additional load of 521 pounds has been added using
the aforementioned loading pattern.
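The letter's figures are easy to reproduce. In this sketch, the spoke area, yield strength, and strain come from the letter; the 30,000,000 psi Young's modulus for steel is our assumption, though it is what the quoted 0.003 in/in to 90 ksi conversion implies:

```python
# Reproducing the spoke arithmetic from the letter above.
area = 0.0031               # spoke cross-section, square inches (from the article)
yield_psi = 97_000          # SAE 1095 yield strength, psi
E = 30_000_000              # Young's modulus for steel, psi (assumed)
spokes_loaded = 2 * 36 // 3 # two-thirds of 36 spokes carrying tension

yield_load = yield_psi * area                 # load per spoke at yield
total_at_yield = yield_load * spokes_loaded   # optimistic wheel capacity
pretension = 0.003 * E * area                 # initial tension from 0.003 in/in strain
extra_to_yield = (yield_psi - 0.003 * E) * area * spokes_loaded

print(f"per-spoke yield load: {yield_load:.1f} lb")      # 300.7
print(f"total at yield:       {total_at_yield:.0f} lb")  # 7217
print(f"initial tension:      {pretension:.0f} lb")      # 279
print(f"additional to yield:  {extra_to_yield:.0f} lb")  # 521
```

Every figure lands far short of 25,000 pounds, which is the letter's point.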
The more likely failure modes are the spoke retainer pulling through the
aluminum rim when the rim suffers localized shear failure or the edge of the
rim crushing due to bearing stress.
Also, there aren't enough spokes on the wheel. Figure 7 in his article states
that the rim radius is 27 inches. The spoke length is given as 12 inches. I
think it's more likely that it's the rim diameter that has a value of 27
inches.
Thirty-six spokes have a spacing of 2.356 inches on a wheel with a 13.5-inch
radius. This number of spokes simplifies the model, but perhaps he should have
used an equivalent spoke that would approximate two spokes with twice the area
of a real spoke. Seventy-two spokes would give a spacing of about 1.18 inches,
which seems about right. The equivalent spoke would have an initial loading of
558 pounds (same strain) and the wheel would need a load of 1042 pounds, using
my assumed loading pattern, to take the spokes to the yield point. Even this
is well below 25,000 pounds.
Nor are spokes pinned at the rim once the deflection of the wheel has reduced
the tension in the spokes near the point where the external load is applied to
zero. Once this has happened, the rim end of the spoke is free and it is no
longer contributing to the structural integrity of the wheel.
Just for fun, I calculated what load would have to be applied to a thin
aluminum ring subjected to diametrically opposed point loads to give a maximum
deflection of --1.4 inches. Since the ring is free to deflect any way it wants
to using this case, the calculated bending moment will be a minimum for the
applied deflection. I obtained a value of 1288 pounds.
Then I used this load to calculate the maximum bending moment. This turned out
to be 3252 inch-pounds. Finally, I calculated the maximum bending stress in
the ring using the classic equation: S= Mz/I, where S is the stress, M is the
bending moment, I is the moment of inertia of the section of the ring with
respect to its neutral axis and z is the distance from the neutral axis to the
point where the stress is to be calculated. Since I have no idea what the
maximum value for z is for the rim in question, I assumed that the rim is made
of a high strength aluminum alloy with a yield strength of 72 ksi and solved
for the distance which turned out to be 0.055 inches. I don't believe that
bicycle rims are that thin.
Buckling failures occur at elastic stresses, and I have proved to my
satisfaction, using simplistic approximations of the actual situation, that
buckling would have to occur for the rim to behave in the manner described.
I have been a computer programmer for 15 years; before that, I spent ten years
in the aerospace industry working as a structural engineer.
Daniel L. Curtis
Cincinnati, Ohio
Al responds: I agree with you Dan that the wheel model I used is physically
unrealistic for moderate to large deflections. The most crucial flaw is that
it is a two-dimensional model: An actual wheel will deform in three dimensions
once the spokes begin to unload. I deliberately chose to use a vastly
oversimplified model in order to draw attention to the point of the article:
the application of object-oriented methodology to the spline finite element
method. I think the model served this purpose well.
Example 1
10 'This is PLOT_XYZ.BAS by Homer Tilton, June 1990.
20 'Use in conjunction with PLOT_XYZ.BAT
30 'X,Y,Z are spaceform coords. X is horiz, Y is vert, Z is depth.
40 'SX,SY are screen coords. SX is horiz, SY is vert.
...
250 '-------------------Your parameters-------------------------
260 INPUT "Enter max x,y,z value (default is 3.14):",E
270 IF E=0 THEN E=3.14
280 PRINT
290 INPUT "Enter max t value (default is same as x,y,z value):",H
300 IF H=0 THEN H=E
310 GOSUB 1220 'Load defined functions
320 S1=0 'Roll (Euler angles in degrees)
330 S2=-30 'Yaw
340 S3=25 'Pitch
350 M=3.5 'Magnification
360 DP=-1 'Depth position in inches
370 '-------------------- Definitions ------------------------------
380 R1=S1*PI/180:R2=S2*PI/180:R3=S3*PI/180 'Angles in radians
390 D=M*50 'Scale of data
400 ZM=100*DP 'Depth position in hundredths of an inch
410 A=1200 'Observer distance in hundredths of an inch
...
970 '----------------------------
980 '---------------------- SUBROUTINES AND DATA -----------------
990 '--- Stereo-scenographic linear transformations ---
1000 X=D*X:Y=D*Y:Z=D*Z
1010 XA=CR1*X-SR1*Y ' Rotation 1
1020 YA=SR1*X+CR1*Y
1030 XB=CR2*XA+SR2*Z ' Rotation 2
1040 ZA=CR2*Z-SR2*XA
1050 YB=CR3*YA-SR3*ZA
1060 ZB=SR3*YA+CR3*ZA+ZM ' Rotation 3

1070 IF TOG=0 THEN XB=XB+ZB/20 'LH view
1080 IF TOG=1 THEN XB=XB-ZB/20 'RH view
1090 P=1/(1-ZB/A) 'Perspective transformation
1100 SX=XB*P:SY=YB*P 'Screen image
1110 RETURN
....

 Example 2






















































September, 1993
Recursive Worlds


Repeatedly replacing replicas




Clifford A. Pickover


Cliff is the author of numerous books including Computers, Pattern, Chaos, and
Beauty (1990), Computers and the Imagination (1991), and Mazes for the Mind
(1992), all published by St. Martin's Press. He is also a researcher at IBM's
Thomas J. Watson Research Center. Cliff can be contacted at
cliff@watson.ibm.com.


He watched her for a long time and she knew that he was watching her and he
knew that she knew he was watching her, and he knew that she knew that he
knew; in a kind of regression of images that you get when two mirrors face
each other and the images go on and on and on in some kind of infinity.
--Robert Pirsig, Lila
"Whatever can be done once can always be repeated," begins Louise B. Young in
The Mystery of Matter when describing the shapes and structures of nature.
From the branching of rivers and blood vessels, to the highly convoluted
surface of brains and bark, the physical world contains intricate patterns
formed from simple shapes through the recursive application of dynamic
procedures. Questions about the fundamental rules underlying the variety of
nature have led to the search to identify, measure, and define these patterns
in precise scientific terms.
Recursion is a fundamental concept in computer science, mathematics, biology,
art, and even linguistics. Imagine, for instance, you're paging through a
dictionary to find the definition of "recursion," finding it just below
"recuperate:"
recumbent - lying down
recuperate - to recover health
recursion - look up the definition of recursion
Here, "recursion" is defined in terms of itself--a recursive definition, in
other words. ("Recursion" is distinct from "iteration," which is exemplified
by turning the pages of the dictionary to find a particular word, expressed in
a program as: NextPage=CurrentPage+1.)
Other linguistic recursive definitions include "a wolf pack is two wolves or a
wolf pack together with a wolf," or more simple constructs such as "art is
art," "a dog is a dog," or even "a dog is not a dog." Sometimes the term
"self-referential" is used when referring to these constructs.
The related concept of "quining," discussed in Douglas Hofstadter's Godel,
Escher, Bach (Vintage, 1980), is a process of taking a group of words, and
forming a self-referential sentence by preceding the original group with the
same group enclosed in quotes. For example: "is a sentence fragment of seven
words" is a sentence fragment of seven words.
Perhaps the most striking application of recursion occurs in the biological
world where growth starts with a bud, which grows into a pipe, which then
branches into two buds, each of these two buds branching in a recursive growth
process. Iteration, or repeated application of these simple rules, results in
a self-similar oak tree, arterial blood system, the bronchial system of lungs,
or the like. The branching patterns are thought to be the result of the
simplest of growth algorithms: The steps repeat the previous ones on smaller
and smaller scales.
Figure 1 shows a coral-like form I computed using simple recursive branching
rules at smaller and smaller size scales. In geometry, one of the most
interesting consequences of recursive processes is "self-similarity." A
self-similar object appears roughly the same after increasing or shrinking in
size. Like a nested collection of Russian dolls within dolls, self-similar
objects contain within themselves miniature copies of themselves. Look inside
a turbulent stream of water: The largest eddies contain smaller ones, and
these contain smaller ones still. The beautiful consequences of
self-similarity are intricate, fine-grained patterns, now generally called
"fractals." The term fractals was coined in The Fractal Geometry of Nature by
Benoit Mandelbrot (Freeman, 1982) to encompass many of the detailed and
convoluted shapes found in nature and produced by recursion in both the
mathematical and natural worlds.
One of my earliest (and still most interesting) introductions to recursion and
self-similarity was a Don Martin cartoon in Mad magazine. In the first frame,
a man lies anesthetized on a hospital operating table. In the second, a
surgeon uses a small circular saw to cut along the circumference of the man's
head. Then, he removes the patient's skull cap.
Inside the head, the surgeon doesn't find a brain, but rather, another fully
formed head, identical to the original, but slightly smaller. The surgeon
removes this smaller head and opens its skull cap, finding yet another head.
He continues to open smaller and smaller heads until the last head sits in the
palm of his hand. He opens this one and finds a slip of paper that reads,
"Inspected by number 47."
In computer programming, recursion often refers to the structure and
functioning of programs. The common definition of a recursive program is one
that calls itself, and a recursive function is one that is defined in terms of
itself. (Interestingly, recursion can be removed from any recursive program
using iteration.)
Perhaps the most common example of recursion in programming and in mathematics
(where recursion is called a "recurrence relation") is the factorial function:
X! = X*(X-1)! for X >= 1, with 0! = 1, where X is an integer. This can be
accomplished with a simple recursive program; in Example 1(a), the program
calls itself in line 3. Another common example of a recurrence relation is one
that defines the Fibonacci numbers. This sequence of numbers, called the
Fibonacci sequence, plays important roles in mathematics and nature. These
numbers are such that, after the first two, every number in the sequence
equals the sum of the two previous numbers: FN = FN-1 + FN-2 for N >= 2, with
F0 = F1 = 1. This defines the sequence: 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, ...
Like the factorial function, a simple recursive program can be written to
generate the Fibonacci sequence such as in Example 1(b).
In actuality, as with many recurrence relations, it's easy to compute FN using
arrays in a non-recursive program like Example 1(c). This program computes the
first 30 Fibonacci numbers using an array size of 30. This method of using
arrays to store previous results is usually the preferred method for
evaluating recurrence relations, because it allows even complex expressions to
be processed in a uniform and efficient manner. Of course, in this example,
you can avoid the array by retaining the last two values in two variables.


Recursive Lattices


I'll now turn to a particular class of self-similar objects I call recursive
lattices because they can easily be constructed using checkerboards of
different sizes. The concept of repeatedly replacing copies of a pattern at
different-size scales to produce interesting patterns dates back many decades,
including the work of mathematicians Koch, Hilbert, and Peano. More recent
work has been done by Mandelbrot and Lindenmeyer. Artists such as Escher,
Vasarely, Shepard, and Kim have also experimented with recursive patterns.
I'll provide computational recipes for some intriguing, yet simple to compute,
designs.
To create the intricate forms, start with a collection of squares called the
initiator lattice or array. You can see what these look like in the upper-left
corners in Figures 2 and 3, for instance. The initial collection of squares
represents one size scale. At each filled (black) square in the initial array
I place a small copy of the filled array. This is the second-size scale. At
each point in this new array, I place another copy of the initial pattern.
This is the third-size scale. In practice, I only use three size scales for
computational speed, and because an additional-size scale does not add much to
the beauty of the final pattern.
In mathematical terms, begin with an SxS square array, (A), containing all 0s
to which 1s, representing filled squares or sites, are added at random
locations. For example:
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 1 1 1 0 0
0 0 0 1 0 0 0 0
0 0 0 1 0 0 0 0
0 1 1 1 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
Just how many patterns can you create by randomly selecting array locations
and filling them with 1s? Think of the process of filling array locations in
terms of cherries and wineglasses. Consider an SxS grid of beautiful crystal
wineglasses. Throw M cherries at the grid. A glass is considered occupied if
it contains one or more cherries. With every throw each cherry goes into one
of the glasses. How many different patterns of occupied glasses can you make?
(A glass with more than one cherry is considered the same as a glass with one
cherry in the pattern.)
It turns out that for an SxS array and M cherries, the number of different
patterns is shown in Figure 4(a). As an example of how large the number of
potential patterns is, consider that 32 cherries thrown at a 9x9 grid creates
more than 10^22 different patterns. This is far greater than the number of
stars in the Milky Way galaxy (10^12) and greater than the number of atoms in
a person's breath (10x10^21). In fact, it is about equal to the estimated
number of stars in the universe (10^22).
For patterns like Figures 2 and 3, I use S=7. Smaller arrays would lead to
fewer potential patterns (particularly with the added induced symmetry,
discussed later), and greater values of S lead to diffuse patterns with the
scaling used. In C, the process of filling the initial array can be coded as
in Example 2(a). Notice that lines 6--9 introduce a fourfold-symmetrical
pattern, which leads to an overall symmetry in the design at several size
scales. Since each of the "cherries" is symmetrically placed in each of the
four quadrants of the initial array, each pattern is really defined by the
smaller quadrant subpattern. This symmetrization decreases the number
of possible patterns to that in Figure 4(b).
For even values of S, use Floor((S+1)/2), where "Floor(x)" returns the largest
integer value not greater than the x parameter. I find that inducing the
symmetry of the original pattern produces more aesthetically appealing designs
than those produced by a random initial array.
For convenience, a program may store the values of the A array in two 1-D
arrays, x and y, whose values are centered at the origin. The final value of N
is the number of filled points in the symmetrized array in Example 2(b). The
major computational step begins with the generation of the three size scales
from these x and y arrays, the values of which store the positions of the
filled (one) sites in the original structural motif. To determine the final
positions of all the black squares in the final design, a scale factor a is
applied to each position and the resulting terms summed; see Figure 4(c). For
the patterns in Figures 2 and 3, a1=1, a2=S, a3=S^2. To do this in a program, see
Example 2(c).
Several numerical measures of the patterns can be computed and displayed in an
effort to quantify the structure of the patterns (note the graphs in the
upper-right of Figures 2 and 3). If these parameters are found to correlate
with the "beauty" of certain patterns, computer programs can then examine
these parameters and automatically generate classes of patterns of aesthetic
value.
Perhaps the most obvious parameter to compute is the fractal dimension D,
which characterizes the size-scaling behavior of the pattern. This value gives
an indication of the degree to which the pattern fills the plane--how the
pattern "behaves" through different magnifications. Luckily, the D-value for
these recursive lattice patterns is easy to compute; in general, N=S^D (where N
is the number of filled sites in the original symmetrical array of squares,
and S is the magnification factor used, in this case the same as the size of
an edge of the original array, 7). Notice as the number of filled (black)
sites (N) in the initial array increases, the dimension (D) increases. In one
sense, D quantifies the degree to which the porous patterns fill the plane in
which they reside. If all the sites in the initial array are filled, N is 49
and therefore D=2. This makes sense: If the entire plane is filled, the
dimension of the object should be 2. In a program, D is calculated from
D=log(N)/log(7). In recursive words, dimensions are tangled up like a ball of
twine, and all the patterns are neither one nor two dimensions, but somewhere
in between.
Another quantitative measure of the patterns' spatial characteristics is p(r),
the length-distribution function. The function indicates the distribution of
all the interpoint distances, rij, in the pattern. Mathematically speaking,
this new function p(r) is defined by Figure 4(d) where N is the number of
points in the pattern. The sums are over i and j, and what's being summed is
d(rij,r)=1 if rij=r and 0 if rij != r. In simple English, to compute p(r),
select a point in the pattern and compute the distances from that point to
every other point in the pattern. Do this for each point in the pattern, and
create a graph showing the number of different lengths as a function of each
length. To interpret these graphs, the right-most point in the p(r) graph is
the maximum distance found in the structure, and the left-most point is the
minimum. The most common distance is the highest point of the curve.

In a program, I use a Monte Carlo approach to compute p(r), because actually
computing all the interpoint distances is computationally expensive. Instead,
you can randomly select pairs of points in the structure and produce the final
p(r) curve after 500,000 pairs are cataloged. In C, the Monte Carlo process
looks like Example 3(a). Rmax, the longest vector in the structure, or maximum
linear dimension, can also be estimated by examining the last non-zero value
of p(r). The variable Npts is the number of points in the final recursive
lattice pattern. Obviously there is some noise in the Monte Carlo process, and
I use a simple nearest-neighbor smoother to reduce the noise so the eye can
concentrate on the global structures of the curve; see Example 3(b). If you
run the Monte Carlo process twice using different random numbers, the graphics
look almost identical.
Another parameter, the radius of gyration, Rg, can be computed from the
unsmoothed length-distribution function p(r) in Figure 5. The radius of
gyration quantifies the spatial extent of the structure in the plane. Small,
compact patterns have small values of Rg. Large, extended patterns have large
values of Rg. To get an intuitive feeling for this parameter, if all of the
squares in the final pattern were to lie on the edge of a circle of radius R,
then Rg=R. If the circle were stretched in one direction to form an ellipse,
the Rg value would increase because the "mass" of the pattern is further from
the center of mass of the entire pattern. In C, the Rg computation looks like
Example 4(a).
After examining hundreds of recursive lattice patterns, I find that many
people prefer high values of the fractal dimension of around 1.8. The p(r)
curves for the preferred structures usually do not exhibit global features
(bumps and valleys). To date, I haven't been able to determine a correlation
between perceived beauty and the Rg parameter. You may wish to compute the
lattice patterns and also look for correlations between perceived beauty and
the fractal dimension.
As indicated earlier, most people seem to prefer patterns with a symmetrical
structure over those with purely random structures. I prefer patterns with
fourfold symmetry; however, I've also experimented with patterns with
inversion symmetry, bilateral symmetry, and random-walk symmetry. Figure 3,
for example, is a bilaterally symmetric pattern produced by making the left
and right sides contain the same patterns: if Ai,j=1, then AS-1-i,j=1. Example
4(b) shows this in C. Inversion symmetry can be computed by: if Ai,j=1, then
AS-1-i,S-1-j=1. Example 4(c) shows this in C.
Random walks can be used to force greater correlation between points in the
initial array. Rather than selecting each site randomly with random-walk
symmetry, the position of each new site is related to the previous. For
example, select a point and continually subtract 1 from, or add 1 to, the
initial site's x and y coordinates. Other symmetries, such as the sixfold
symmetry of a snowflake, can also be used.
Example 1:
(a)
1 Factorial(X);
2 IF X=0 then Factorial=1
3 ELSE Factorial=X*Factorial(X-1)
4 END
(b)
1 Fibonacci(N)
2 IF N <= 1 then Fibonacci=1
3 ELSE Fibonacci=Fibonacci(N-1)+Fibonacci(N-2)
4 END
(c)
Fibonacci
 F[0]=1;F[1]=1;
 For i = 2 to 30
 F[i]=F[i-1]+F[i-2]
 END
END



Example 2:
(a)
1 S = 7; Sz = S-1; M = 20;
2 for (h=1; h<=M; h++) {
3 /* rand returns a value between 0 and 32767 */
4 v = ((float) rand()/32767.)*S;
5 w = ((float) rand()/32767.)*S;
6 a[v][w]=1;
7 a[Sz-v][w]=1;
8 a[Sz-v][Sz-w]=1;
9 a[v][Sz-w]=1;
10 }
(b)
 N=0; Sz = S - 1;
 for (i=0; i<S; i++)
 for (j=0; j<S; j++)
 if (a[i][j]==1)
 {N++; x[N]=i-Sz/2; y[N]=j-Sz/2;}
(c)
 for (i=1; i<=N; i++) {
 for (j=1; j<=N; j++) {
 for (k=1; k<=N; k++) {
 X = alpha[1]*x[i]+alpha[2]*x[j]+alpha[3]*x[k];
 Y = alpha[1]*y[i]+alpha[2]*y[j]+alpha[3]*y[k];
 PlotSquareAt(X,Y);
 }
 }
 }



Example 3:
(a)
 /* initialize p array to zero */

 for(i=0; i<900; i++) p[i]=0;
 total=500000;
 /* catalog 500000 vector lengths */
 for(i=0; i<total; i++) {
 rnd1 = ((float) rand()/32767.)*Npts;
 rnd2 = ((float) rand()/32767.)*Npts;
 xterm = corx[rnd1]-corx[rnd2];
 yterm = cory[rnd1]-cory[rnd2];
 dist = sqrt (xterm*xterm+yterm*yterm);
 dist = dist+.5;
 p[(int)dist]++;
 }
(b)
 for(i=1; i<900;i++) {
 ps[i]=.25*(float)p[i-1]+.5*(float)p[i]+.25*(float)p[i+1];
 }
Example 4:
(a)
 sum=0; summ=0;
 for(i=0; i<900;i++) {
 summ=summ+p[i];
 sum=sum+p[i]*i*i;
 }
 Rg=sqrt(sum/((float) summ*2)) ;
(b)
 Sz = S - 1;
 for (h=1; h<=40; h++) {
 v = ((float) rand()/32767.)*S;
 w = ((float) rand()/32767.)*S;
 a[v][w]=1;
 a[Sz-v][w]=1;
 }
(c)
 Sz = S - 1;
 for (h=1; h<=30; h++) {
 v = ((float) rand()/32767.)*S;
 w = ((float) rand()/32767.)*S;
 a[v][w]=1;
 a[Sz-v][Sz-w]=1;
 }

 Figure 1: Coral-like form generated from a recursive program
 Figure 2: Recursive lattice design using a fourfold-symmetrical pattern
 Figure 3: Recursive lattice design using a bilaterally symmetric pattern
 Figure 4: (a) The number of different patterns for an SXS array; (b)
symmetrization decreases the number of possible patterns; (c) determining the
final positions of all the black squares in the final design; (d) p(r), the
length-distribution function, is a quantitative measure of the patterns'
spatial characteristics
 Figure 5: The radius of gyration R sub g, can be computed from the unsmoothed
length-distribution function p(r).
















September, 1993
Arbitrary Precision Floating-point Arithmetic


Here's a library that's portable and IEEE-754 binary format compatible




Frederick C. Motteler


Fred is a senior engineer at Zetron and has worked on a variety of portable
native applications and embedded applications. He can be contacted at either
fred@derfdom.UUCP or Zetron Inc., 12335 134th Ct. N.E., Redmond, WA
98052-2433.


Floating-point arithmetic is something most programmers take for granted
because virtually all compilers support single- and double-precision
for applications that require it. Many compilers even support double-extended
precision.
However, there are exceptions: Your application might need a few more bits of
precision than what your compiler or hardware provides, or you may have just
spent the last three days (and nights) trying to optimize your algorithm,
attempting to squeeze it into doubles without success. Perhaps you have a ton
of data in IEEE-754 format and your compiler doesn't know what an IEEE number
is. Alternately, your application may need to run on a processor that's either
so old or so new that the only compiler you can get for it doesn't support
floating point. Possibly you're looking for something that will run on a DEC
Alpha and on a CP/M machine and produce the same result.
Whatever the reason, there are times when extended precision, IEEE-754
compatibility, and/or portability are important constraints. The C library
presented here was developed with IEEE-754 compatibility and portability as
its primary goals. Its first application was as part of a portable cross
compiler and I assumed that both host and target environments did not have
IEEE compatible single- and double-precision floating-point support.
Later applications required extended precision and the ability to convert to
and from IEEE-754 double- and single-precision formats. The result is a
general-purpose library that supports single, double, double-extended, and
longer IEEE-754-like formats. The library has been ported to and tested on a
variety of systems including CP/M, PCs running MS-DOS (Zortech, Microsoft,
Lattice, Mark Williams compilers), PCs running Coherent, Sun 3s (Sun's C
compiler and the GNU compiler), Sparcstations (Sun's C compiler), and the IBM
RS/6000 under AIX. The package is K&R, ANSI C, and C++ compatible. A
table-driven tester included with the library checks if it has compiled
properly. The tester also gives you an idea of what the package is capable of
doing.
Sound too good to be true? The only known drawback is that slugs and snails
probably run faster. (All right, you DEC Alpha users, make that a really fast
slug!) This is the result of numerous tradeoffs required to achieve
portability, extensibility, and IEEE compatibility. Also, if execution speed
is important, the integer portion of the package can be significantly
optimized. Most of the time required is actually spent doing integer
operations (see my article "Statistical Performance Analysis," DDJ, December
1991). Operations such as shifting multibyte integers are not efficient when
coded in C.
The package requires dynamic memory allocation, malloc(), and free(). Included
is support for verifying that all memory malloced during an operation has been
properly freed when the operation is complete. Also required are unsigned char
(8 bits or more), unsigned int (16 bits or more), and unsigned long (32 bits
or more) data types that are compatible with the ANSI C specification.
The package is quite big, so rather than examine the nitty gritty details of
how to do arithmetic and write portable software, I'll highlight interesting
features and aspects of it. For implementation details, source code comments
are the best guide. Listing One, page 88, (fmlib.h) gives function prototypes
for all user-interface functions. The complete package is provided
electronically; see "Availability," page 3.


Dynamic Creation of a Numerical Representation


All numbers are represented as unsigned character (byte) arrays. The length of
a specific array depends on the length of the numerical representation that it
contains. For example, a standard IEEE double requires eight bytes.
The advantage of byte arrays is the ease with which new representations may be
created and manipulated. For example, a representation with 100 mantissa bits
and 11 exponent bits needs a total of 112 bits, or 14 bytes. Creating a value
with this representation is a two-step process. First, 14 bytes of memory must
be allocated for each specific instance of the representation. The memory may
be either statically allocated (at compile time) or dynamically allocated (at
run time) on the stack, or by using malloc() or calloc().
Once the memory is allocated, one of the functions fltoflt(), intoflt(), or
strtoflt() performs the initial conversion operation. Example 1(a), for
instance, converts the double (pointed to by doubleptr) to the 14-byte
representation (pointed to by destptr). With similar ease, Example 1(b)
converts the long pointed to by longptr to the 14-byte representation. And for
data represented in a string format, Example 1(c) converts a string (up to 128
bytes of it) to the 14-byte representation. A relatively free format string
representation with no internal white space is expected by the strtoflt()
function. Examples of valid strings are 3.501, 5, --.34e-12, and +123.456E+19.
The only caveat to these operations is that the byte ordering of native
integer and float representations may not match the byte order and byte
packing assumed in this library. The representations used here are all
big-endian format. For integers, this means that the most significant byte is
in low memory. For floats, the exponent is in low memory. These
representations are compatible with Motorola 68xxx formats, but backwards for
Intel 80x8x formats (for more information, refer to Motorola's MC68881
Floating-Point Coprocessor User's Manual and Intel's Microsystem Components
Handbook). For PC users, this means that the native values must be byte-order
flipped before being passed to a conversion function.


Arithmetic With a Numerical Representation


Once a number has been converted to the desired representation, the number may
be part of an arithmetic operation. The functions, faddm(), fsubm(), fmultm(),
and fdivm() perform the basic +, --, *, and / operations.
For example, if dividend_ptr and divisor_ptr point to two values in our 14
byte representation, then Example 2 divides the value pointed to by
dividend_ptr by the value pointed to by divisor_ptr. The resulting quotient is
returned in the 14 bytes pointed to by dividend_ptr. This overwrites the
original dividend value.
All interface functions in the package return an unsigned byte condition code.
Functions that return a floating-point value use the following floating-point
condition code format:
msb 0 0 0 0 N Z I n lsb
where N=sign (1 if result negative), Z=zero (1 if result 0), I=infinite (1 if
result infinite), and n=Not A Number (1 if result is Not A Number). After each
operation, the value of the returned condition code must be examined to
determine the integrity of the result. For example, a divide-by-zero operation
returns a condition code with the I (infinite) bit set.


Getting Results


Conversion of results back to a native representation is just as important as
the initial conversion to a custom format. The fltoflt(), fltoint(), and
fltostr() functions perform the inverse conversion operations. Example 3(a)
converts the 14-byte representation pointed to by srcptr to a double in the 8
bytes pointed to by doubleptr. The fltoflt() function is its own inverse. As a
conversion function, it is very powerful since it can convert between any of
the possible floating-point representations supported by the package.
Correspondingly, Example 3(b) converts the 14-byte representation to a long in
the 4 bytes pointed to by longptr. For data represented in a string format,
Example 3(c) converts the 14-byte representation to a string (up to 128 bytes
long) pointed to by stringptr. The output format is similar to the printf()
"e" format. All significant figures are included. The strings produced by
fltostr() are compatible with strings expected by strtoflt().
Functions that return an integer value, such as fltoint(), use the integer
condition code format:
msb 0 0 0 0 Z V S C lsb
where Z=zero (1 if result 0), V=overflow (1 if overflow result), S=sign (1 if
result negative), and C=carry. As mentioned earlier, the value of the returned
condition code must be examined to determine the integrity of the result. For
example, an overflow during conversion of a large float value to an integer
returns a condition code with the V (overflow) bit set.


An Example: The Square Root of 1/2 to 100 Places


Listing Two, page 88, (root2ext.c) was used to calculate the square root of
1/2 presented in Figure 1. The command line root2ext .5 1e-105 372 11 1
guarantees accuracy to about 105 places. The first 100 digits are in Figure 1.
A simple power series expansion is used. Calculation of successive terms
involves a simple recursion relation that requires six floating operations per
iteration. When calculating the square root of 1/2, each successive term
results in about one bit of additional accuracy. The series converges for
values between 0 and 2. Most rapid convergence is for values near 1.
All numerical input values are accepted as strings. For the sake of
convenience, even constant float values are initially represented as strings
and converted to the desired representation. The root2ext program was run on a
Sun 3/60 (big-endian byte order) and a PC (little-endian byte order). When
compiling for a PC, the symbol IBM_PC should be defined on the compile-command
line to ensure proper conversion of PC native integers to float values.

The result is presented as a string with all significant digits represented.
The root2ext program was also used to determine approximate flops
(floating-point operations per second) values for various representations.
Table 1 summarizes the results. As expected, the time required for each
floating-point operation increases rapidly with increasing precision.


Float-to-String Conversion


One of the more useful aspects of this package is that it includes functions
to convert binary representations to and from string representations (see
Listing Three, page 88). This is one operation that most programmers leave to
printf() and scanf()--probably for good reason. Efficient conversion of binary
floating-point values is much more difficult than binary integer values.
Conversion from a binary float representation to a string representation first
requires that the binary exponent be converted into a decimal exponent. This
is determined by using:
decimal exponent
 = binary exponent * log10(2)
 = binary exponent * 0.30103000
It's relatively easy to convert the binary exponent to a float, multiply it by
0.30103000, and convert the result back to an integer. Use of a
single-precision float restricts the accuracy of the conversion to 24 bits,
which limits the maximum exponent-field width to 24 bits.
Once the approximate decimal exponent is known, the actual floating-point
value of ten raised to the decimal exponent must be calculated. A common
approach to this is to have a table of floating-point powers of ten. For the
standard IEEE single-precision float representation, a table with 12 entries
(10^1, 10^-1, and so on, for powers of 10^±2, ±4, ±8, ±16, and ±32) is
required. For the standard double representation, a table with six more
entries (for powers of 10^±64, ±128, and ±256) is required. To calculate a
specific power of 10, say 10^27, the values for 10^16, 10^8, 10^2, and 10^1
are multiplied together (27 = 16 + 8 + 2 + 1). This approach is a reasonable
compromise between accuracy, table size, and computation time.
The disadvantage of this approach is that the table values must be known in
advance. This is fine for supporting a limited number of representations such
as IEEE float and double. For arbitrary representations that are defined at
run time, the table values must effectively be calculated on the fly. Starting
with a conversion of the integer 10 to the desired representation, subsequent
binary powers of 10 in the representation are calculated by squaring the
previous value. For the example above (10^27), this requires that 10^1, 10^2,
10^4, 10^8, and 10^16 be calculated. In the library, these operations are
combined in the intopten() function.
The next step is to determine the decimal mantissa value. The decimal mantissa
must be between 0.1 and 1. To a good approximation, the decimal mantissa is
given by dividing the original float value by 10 raised to the decimal
exponent:
mantissa value = original float / 10^E
where E is the decimal exponent.
Due to truncation errors, the resulting mantissa value may not be between 0.1
and 1. If it is less than 0.1, then the mantissa value is multiplied by 10 and
the exponent value is decremented. If it is greater than or equal to 1, then
the mantissa value is divided by 10 and the exponent value is incremented. A
mantissa value equal to 0.1 is a special case and is explicitly handled.
At this point, the mantissa value is just a floating point value between 0.1
and 1. It must be converted to decimal digits. The number of decimal digits is
determined by the bit length of the mantissa:
decimal mantissa digits
 = binary mantissa digits * log10(2)
 = binary mantissa digits * 0.30103000
The binary mantissa value is then multiplied by 10^D, where D equals the number
of decimal mantissa digits. When converted back to an integer, this creates a
multi-byte integer value that has all of the significant figures of the
original binary mantissa. The integer value is then converted to decimal by
repeated integer division by 10. The remainder of each division gives
successively more significant decimal digits.
The resulting decimal mantissa digits and decimal exponent are then
concatenated into a string representation.


String-to-Float Conversion


Conversion of a string representation back to a binary representation is
handled in somewhat of a reverse order from float-to-string conversion. The
mantissa portion is converted into a float value by successively adding in
each digit after multiplying the previous digits' value by 10. While the
mantissa is being converted, the number of digits behind the decimal point (if
any) is counted.
Next, the number of digits behind the decimal point (D) is subtracted from the
decimal exponent (E). The value 10^(E-D) is then calculated using intopten().
This value is multiplied by the mantissa value to obtain the converted binary
float value. Special precautions are required to handle cases where partial
results may underflow or overflow, but the final result is within range of the
representation.


Verification


An arithmetic package is useful only if it reliably produces accurate results
over the entire domain (all possible input values) and range (all possible
output values) of all functions. This places stringent, but well-defined
analytical requirements on the arithmetic package.
A comprehensive table-driven tester is included with the package and is
provided electronically; see "Availability," page 3. The tables give string
representations of input values and expected output values. This has the
advantage of testing the string conversion functions as well as the specific
arithmetic operations. In addition to providing comprehensive regression
testing, new test cases are easy to set up and verify. In all cases, the
returned condition codes are checked.
The current version tests a variety of input and output conditions. These
include overflow, underflow, almost overflow, barely overflow, almost
underflow, barely underflow, divide-by-zero, multiply-by-zero, infinite input
value, and so on.


Portability


As mentioned earlier, the library has been ported to, and tested on, a variety
of systems and is K&R, ANSI C, and C++ compatible. The table-driven tester
verifies that the library has compiled properly. For ANSI and C++
environments, the symbol PROTOTYPES must be defined on the compile command
line. For environments without <stdlib.h>, the symbol MWC must be defined on
the compile command line. To verify that all dynamically allocated memory is
properly released, the symbol MEMTEST must be defined on the compile command
line.
When building the package, I recommend that imlib.c be compiled with the TEST
symbol defined. This produces a stand-alone tester of the integer portion of
the package. The integer portion of the package must work reliably for the
rest of the package to work.


What About C++?


The next step for this package is to encapsulate it into a C++ class. The
package already supports static and dynamic creation and manipulation of
floating-point data representations. The ability to overload operators in C++
allows arbitrary precision representations to be used with the same ease as
native representations. This would greatly facilitate porting application
software to take advantage of arbitrary-precision floating point.
Example 1:
(a)
 unsigned char *destptr;
 double *doubleptr;
 destptr = malloc(14);
 fltoflt(doubleptr, 52, 11, destptr, 100, 11);
(b)

 long *longptr;
 intoflt(longptr, 4, destptr, 100, 11);
(c)
 char *stringptr;
 strtoflt(stringptr, 128, destptr, 100, 11);
Example 2:
 unsigned char *dividend_ptr, *divisor_ptr;
 fdivm(dividend_ptr, divisor_ptr, 11, 100);
Example 3:
(a)
 unsigned char *srcptr;
 double *doubleptr;
 fltoflt(srcptr, 100, 11, doubleptr, 52, 11);
(b)
 long *longptr;
 fltoint(srcptr, 100, 11, longptr, 4);
(c)
 char *stringptr;
 fltostr(srcptr, 100, 11, stringptr, 128);
Figure 1:
 sqrt(1/2) = 0.70710 67811 86547 52440 08443
 62104 84903 92848 35937 68847
 40365 88339 86899 53662 39231
 05351 94251 93767 16382 07864
Table 1: Results of the root2ext program. Each iteration requires six
floating-point operations. (Flops = 6 * Iterations * Trials/Time.)
==============================================================================
 Sun 3/60 40MHz/386DX
 Mantissa Exponent Iterations Trials Time Flops Time Flops
 bits bits (sec) (sec)
==============================================================================
"float" 23 8 17 60 54 110 9 680
"double" 52 11 42 15 69 55 14 270
 104 11 90 4 108 20 22 98
 212 11 188 1 189 6 40 28
 372 11 319 1 965 2 195 10
==============================================================================
_ARBITRARY PRECISION FLOATING-POINT ARITHMETIC_
by Frederick Motteler

[LISTING ONE]

/* Extended IEEE Compatible Floating Point Arithmetic Library
** Version 1.1 Copyright 1990, 1992 by Fred Motteler, All Rights Reserved */

/* Bit masks for floating point condition code values */
#define FFNAN 1
#define FFINF 2
#define FFZERO 4
#define FFNEG 8

/* Floating point functions */
#ifdef PROTOTYPES
unsigned char fmultm(unsigned char *prodPB, unsigned char *termPB,
 int expbitN, int fracbitN);
unsigned char fdivm(unsigned char *dividPB, unsigned char *termPB,
 int expbitN, int fracbitN);
int fcmpm(unsigned char *flt1PB, unsigned char *flt2PB, int expbitN,
 int fracbitN);
unsigned char fsubm(unsigned char *diffPB, unsigned char *termPB,

 int expbitN, int fracbitN);
unsigned char faddm(unsigned char *sumPB, unsigned char *termPB,
 int expbitN, int fracbitN);
#else
unsigned char fmultm(); /* Floating point multiply */
unsigned char fdivm(); /* Floating point dividend */
unsigned char faddm(); /* Floating point addition */
unsigned char fsubm(); /* Floating point subtraction */
int fcmpm(); /* Floating point comparison */
#endif
/* Numeric conversion functions */
#ifdef PROTOTYPES
unsigned char intoflt(unsigned char *intvalBP, int intlenN,
 unsigned char *fltvalBP, int fracbitN, int expbitN);
unsigned char fltoint(unsigned char *fltvalBP, int fracbitN, int expbitN,
 unsigned char *intvalBP, int intlenN);
unsigned char fltoflt(unsigned char *fltinBP, int mantinN, int expinN,
 unsigned char *fltoutBP, int mantoutN, int expoutN);
#else
unsigned char intoflt(); /* Integer to float */
unsigned char fltoint(); /* Float to integer */
unsigned char fltoflt(); /* Float to float */
#endif

/* Numeric / string conversion functions */
#ifdef PROTOTYPES
unsigned char intostr(unsigned char *intvalBP, int intlenN,
 char *strBP, int slenN, int radixN);

unsigned char fltostr(unsigned char *fltvalBP, int fracbitN, int expbitN,
 char *strBP, int slenN);
unsigned char strtoint(char *strBP, int slenN, unsigned char *intvalBP,
 int intlenN, int radixN);
unsigned char strtoflt(char *strBP, int slenN, unsigned char *fltvalBP,
 int fracbitN, int expbitN);
unsigned char intopten(unsigned char *fltexpBP, int expbyteN,
 unsigned char *ptenfltBP, int fracbitN, int expbitN);
#else
unsigned char intostr(); /* Integer to ASCII string */
unsigned char fltostr(); /* Float to decimal ASCII string */
unsigned char strtoint(); /* ASCII string to integer */
unsigned char strtoflt(); /* ASCII string to float */
unsigned char intopten(); /* Generate float integral powers of 10 */
#endif



[LISTING TWO]

/* This is a simple program to find the square root of a number
** using a power series expansion.
** sqrt(1 - x) = 1 - (1/2) * x - (1/8) * x ^ 2 - ... + f[n](0)/n! * x ^ n
** Recursion relation:
** f[n](0)/n! = ((2n - 3) / (2 * n)) * f[n-1](0) n > 1
*/
#include <stdio.h>
#include "fmlib.h"

#define LENGTH 48 /* Up to 44 byte mantissa, 4 byte exponent */

#define TRIALS 10

void
usage()
{
 printf("Usage: root2ext value accuracy mant exp trials\n");
 printf("Where: value = value to find square root of\n");
 printf(" accuracy = desired accuracy of result\n");
 printf(" mant = number of mantissa bits (23 to 384)\n");
 printf(" exp = number of exponent bits (8 to 31)\n");
 printf(" trials = number of repeat calculations\n");
 exit(-1);
}
void
main(argc, argv)
int argc;
char *argv[];
{
 unsigned char y[LENGTH]; /* Value to find square root of, this
 * should be between 0 and 2. */
 unsigned char x[LENGTH]; /* Value used in series expansion, this
 * should between -1 and 1 */
 unsigned char term[LENGTH]; /* Value of specific term */
 unsigned char sum[LENGTH]; /* Sum of terms */
 unsigned char limit[LENGTH]; /* Accuracy limit */
 unsigned char two_n[LENGTH]; /* Temporary variable for (2 * n) */
 unsigned char three[LENGTH]; /* Value of 3 */
 unsigned char one[LENGTH]; /* Value of 1 */
 unsigned char temp[LENGTH]; /* Temporary variable */
 unsigned char cc; /* Condition codes */
 char string_array[128]; /* String conversion area */
 int mant_bits; /* Number of bits in the mantissa */
 int exp_bits; /* Number of bits in the exponent */
 int trials; /* Number of trials */
 int n; /* Current term number */
 int twon;
 int i; /* Copy index */
 int j; /* Trial index */

 /* Check input arguments for acceptable values */
 if (argc != 6)
 usage();
 if (sscanf(argv[3], "%d", &mant_bits) != 1)
 usage();
 if (sscanf(argv[4], "%d", &exp_bits) != 1)
 usage();
 if (sscanf(argv[5], "%d", &trials) != 1)
 usage();
 if ((mant_bits < 23) || (mant_bits > 384) ||
 (exp_bits < 8) || (exp_bits > 31))
 usage();
 for (j = 0; j < trials; j++)
 {
 /* Convert value to find square root of and accuracy */
 strtoflt(argv[1], 128, y, mant_bits, exp_bits);
 strtoflt(argv[2], 128, limit, mant_bits, exp_bits);
 /* Calculate constants, constant 3 for recursion relation */
 strtoflt("3", 2, three, mant_bits, exp_bits);
 /* Sum = 1.0, initial sum */

 strtoflt("1.0", 12, sum, mant_bits, exp_bits);
 /* x = 1 - y, Calculate x value to use in series */
 for (i = 0; i < LENGTH; i++)
 x[i] = sum[i];
 fsubm(x, y, exp_bits, mant_bits);
 /* Calculate initial term = 0.5 * x, n = 1 term */
 strtoflt("0.5", 4, term, mant_bits, exp_bits);
 cc = fmultm(term, x, exp_bits, mant_bits);
 n = 2; /* Next term to calculate */
 /* Loop until term is less than limiting value. Note that for
 * an alternating series, only the positive values are tested. */
 while ((fcmpm(term, limit, exp_bits, mant_bits) > 0) ||
 ((cc & FFNEG) != 0))
 {
 /* sum -= term */
 fsubm(sum, term, exp_bits, mant_bits);
 /* term(n) = term(n-1) * (x * ((2 * n) - 3) / (2 * n)) */
 twon = 2 * n;
 intoflt(&twon, 4, two_n, mant_bits, exp_bits);
 for (i = 0; i < LENGTH; i++)
 temp[i] = two_n[i];
 fsubm(temp, three, exp_bits, mant_bits);
 fdivm(temp, two_n, exp_bits, mant_bits);
 fmultm(temp, x, exp_bits, mant_bits);
 cc = fmultm(term, temp, exp_bits, mant_bits);
 n++;
 }
 }
 printf("Done...\n");
 /* Print out the result */
 fltostr(sum, mant_bits, exp_bits, string_array, 128);
 printf("Iteration %d is: %s\n", n, string_array);
 return;
}



[LISTING THREE]

/* Extended IEEE Compatible Floating Point Arithmetic Library
** Version 1.1 Copyright 1990, by Fred Motteler, All Rights Reserved */
#include <stdio.h>
#ifndef MWC
#include <stdlib.h>
#endif
#include "imlib.h"
#include "ffmlib.h"
#include "fmlib.h"

#define SINGLEXP 8
#define SINGLEFRAC 23
#define SINGLETOT 4

static unsigned char log10of2AB[4] = {0x3e, 0x9a, 0x20, 0x9b};

#ifdef TEST
#define DOUBLEXP 11
#define DOUBLEFRAC 52
#define DOUBLETOT 8

#define EXTENDEXP 15
#define EXTENDFRAC 63
#define EXTENDTOT 10
unsigned char intval1AB[4] = {0x0, 0x0, 0x12, 0x34};
unsigned char intval2AB[4] = {0x12, 0x34, 0x56, 0x78};
unsigned char fltval1AB[4] = {0x3e, 0x9a, 0x20, 0x9b};
unsigned char dblval1AB[8] = {0x52, 0x34, 0x56, 0x78,
 0x9a, 0xbc, 0xde, 0xf0};

/* Function: unsigned char fltostr(unsigned char *fltvalBP, int fracbitN,
** int expbitN, char *strBP, int slenN)
** Converts the floating point value pointed to by fltvalBP to a decimal ASCII
** string representation pointed to by strBP. fracbitN is the length of the
** floating point mantissa in bits. expbitN is the length of the floating
** point exponent in bits. slenN is the length of the string buffer in bytes. */
unsigned char
#ifdef PROTOTYPES
fltostr(unsigned char *fltvalBP, int fracbitN, int expbitN, char *strBP, int slenN)
#else
fltostr(fltvalBP, fracbitN, expbitN, strBP, slenN)
unsigned char *fltvalBP;
int fracbitN;
int expbitN;
char *strBP;
int slenN;
#endif
{
 int expbyteN, fracbyteN;
 unsigned char *expbiasBP;
 unsigned char *exponeBP;
 unsigned char *fltfracBP, *fltexpBP;
 unsigned char *tempfltBP, *tempexpBP;
 unsigned char *ptenfltBP, *onetenthBP, *oneBP;
 unsigned char ptenccB;
 unsigned char fltsignB;
 unsigned char condcodeB;
 int i, totalenN;
 unsigned char minusoneB, zeroB;
 unsigned int mantlenN;
 unsigned char *mantintBP;
 unsigned char *mantlenBP;

 minusoneB = 0xff;
 zeroB = 0;
 /* Initialize the condition code byte to zero */
 condcodeB = 0;
 /* Determine the total byte length of the floating point number */
 totalenN = fftotlen(expbitN, fracbitN);
 /* Determine number of bytes required to hold mantissa and exponent. */
 expbyteN = ffexplen(expbitN);
 fracbyteN = ffraclen(fracbitN);
 fltfracBP = (unsigned char *) FCALLOC(fracbyteN, 1, "FLTOSTR1");
 fltexpBP = (unsigned char *) FMALLOC(expbyteN, "FLTOSTR2");
 expbiasBP = (unsigned char *) FCALLOC(expbyteN, 1, "FLTOSTR3");
 exponeBP = (unsigned char *) FCALLOC(expbyteN, 1, "FLTOSTR4");
 /* Isolate the mantissas, exponents, and signs. */
 ffextall(fltvalBP, totalenN, fracbitN, fracbyteN, expbitN, expbyteN,
 fltfracBP, fltexpBP, &fltsignB);
 /* Write sign bit and decimal point to the output string. */

 if (fltsignB == 0)
 {
 *strBP++ = '+';
 }
 else
 {
 *strBP++ = '-';
 condcodeB = FFNEG;
 ffbitclr(fltvalBP, totalenN, (fracbitN + expbitN));
 }
 *strBP++ = '.';
 *(exponeBP + (expbyteN - 1)) = 1;
 /* Check the type of floating point number that we have: zero, infinity,
 * or Not-A-Number. First check for zero value. */
 if (ffchkzero(fltexpBP, expbyteN) == 0)
 {
 /* The exponent value is zero. Check if the mantissa is also zero.
 * First clear the implied most significant mantissa bit. */
 ffbitclr(fltfracBP, fracbyteN, fracbitN);
 if (ffchkzero(fltfracBP, fracbyteN) == 0)
 {
 /* The mantissa value is also zero, we have a zero value. */
 strcpy((char *) strBP, "0e+0");
 FFREE(fltfracBP);
 FFREE(fltexpBP);
 FFREE(expbiasBP);
 FFREE(exponeBP);
 return((unsigned char) (condcodeB | FFZERO));
 }
 /* Set the implied most significant mantissa bit to 1 */
 ffbitset(fltfracBP, fracbyteN, fracbitN);
 }
 /* Having the exponent bias is useful for the next step. */
 ffgenbias(expbiasBP, expbyteN, expbitN);
 /* Check if exponent value is set to maximum possible value. This is done
 * by making a copy of exponent bias, shifting it left once, then setting
 * LSB to one. The result is compared with exponent value. */
 tempexpBP = (unsigned char *) FMALLOC(expbyteN, "FLTOSTR5");
 for (i = 0; i < expbyteN; i++)
 *(tempexpBP + i) = *(expbiasBP + i);
 ushftlm(tempexpBP, expbyteN);
 *(tempexpBP + expbyteN - 1) = 1;
 if (ucmpm(tempexpBP, fltexpBP, expbyteN) == 0)
 {
 /* The exponent value is set to its maximum value.
 * First clear the implied most significant mantissa bit. */
 ffbitclr(fltfracBP, fracbyteN, fracbitN);
 if (ffchkzero(fltfracBP, fracbyteN) == 0)
 {
 /* The mantissa value is zero, we have an Infinite value. */
 strcpy((char *) (--strBP), "Infinity");
 condcodeB = FFINF;
 }
 else
 {
 /* The mantissa value is non-zero, we have a Not-A-Number. */
 strcpy((char *) (--strBP), "Not-A-Number");
 condcodeB = FFNAN;
 }

 FFREE(fltfracBP);
 FFREE(fltexpBP);
 FFREE(expbiasBP);
 FFREE(exponeBP);
 FFREE(tempexpBP);
 return(condcodeB);
 }
 /* Ok floating point value. */
 FFREE(tempexpBP);
 /* Back to working on exponent... Subtract exponent bias, note if result
 * is negative, sign is properly extended. Important since it allows
 * exponent overflow and underflow to be detected much more easily. */
 isubm(fltexpBP, expbiasBP, expbyteN);
 /* Convert power of 2 exponent into an approximate power of 10 exponent.
 * This is done by converting exponent value into a float, then
 * multiplying it by log10(2). Result is converted back to a integer. */
 tempfltBP = (unsigned char *) FMALLOC(SINGLETOT, "FLTOSTR6");
 intoflt(fltexpBP, expbyteN, tempfltBP, SINGLEFRAC, SINGLEXP);
 fmultm(tempfltBP, log10of2AB, SINGLEXP, SINGLEFRAC);
 fltoint(tempfltBP, SINGLEFRAC, SINGLEXP, fltexpBP, expbyteN);
 /* Add one to the power of 10 exponent */
 iaddm(fltexpBP, exponeBP, expbyteN);
 /* Convert the exponent power of ten into a float value */
 ptenfltBP = (unsigned char *) FMALLOC(totalenN, "FLTOSTR7");
 tempexpBP = (unsigned char *) FMALLOC(expbyteN, "FLTOSTR8");
 for (i = 0; i < expbyteN; i++)
 *(tempexpBP + i) = *(fltexpBP + i);
 ptenccB = intopten(tempexpBP, expbyteN, ptenfltBP, fracbitN, expbitN);
 /* Check if either an overflow or underflow occurred. */
 if (((ptenccB & FFINF) == FFINF) || ((ptenccB & FFZERO) == FFZERO))
 {
 if ((ptenccB & FFINF) == FFINF)
 {
 /* The power of ten is a bit too big... */
 isubm(fltexpBP, exponeBP, expbyteN);
 }
 else
 {
 /* The power of ten is a bit too small... */
 iaddm(fltexpBP, exponeBP, expbyteN);
 }
 /* Re-initialize the temporary copy of the exponent value, and recalculate
 * the modified power of ten. */
 for (i = 0; i < expbyteN; i++)
 *(tempexpBP + i) = *(fltexpBP + i);
 ptenccB = intopten(tempexpBP, expbyteN, ptenfltBP, fracbitN, expbitN);
 }
 FFREE(tempexpBP);
 /* Divide the original float by the power of 10. */
 fdivm(fltvalBP, ptenfltBP, expbitN, fracbitN);
 /* Check if the result is less than 1/10. First generate
 * 1/10 in the desired floating point format. */
 onetenthBP = (unsigned char *) FMALLOC(totalenN, "FLTOSTR9");
 intopten(&minusoneB, 1, onetenthBP, fracbitN, expbitN);
 if ((i = fcmpm(fltvalBP, onetenthBP, expbitN, fracbitN)) < 0)
 {
 /* If so, divide the result by 1/10 (same as multiplying it by 10)
 ** and subtract 1 from the power of ten. */
 fdivm(fltvalBP, onetenthBP, expbitN, fracbitN);

 /* Subtract one from the power of 10 exponent */
 isubm(fltexpBP, exponeBP, expbyteN);
 }
 else if (i == 0)
 {
 /* The result is equal to 1/10... This is a special boundary
 * case that needs to be handled by brute force... */
 strcpy((char *) strBP, "1e+1");
 FFREE(fltfracBP);
 FFREE(fltexpBP);
 FFREE(expbiasBP);
 FFREE(exponeBP);
 FFREE(tempfltBP);
 FFREE(ptenfltBP);
 FFREE(onetenthBP);
 return(condcodeB);
 }
 /* Check if the result is greater than 1. First generate 1 in the
 * desired floating point format. */
 oneBP = (unsigned char *) FMALLOC(totalenN, "FLTOSTR10");
 intopten(&zeroB, 1, oneBP, fracbitN, expbitN);
 if (fcmpm(fltvalBP, oneBP, expbitN, fracbitN) > 0)
 {
 /* If so, multiply the result by 1/10 and add 1 to the power
 * of ten. */
 fmultm(fltvalBP, onetenthBP, expbitN, fracbitN);
 /* Add one to the power of 10 exponent */
 iaddm(fltexpBP, exponeBP, expbyteN);
 }
 /* fltvalBP points to a floating point number between 1/10 and 1. */
 /* Convert binary length of the mantissa into decimal digit length.
 * This gives the number of significant digits and determines the power
 * of ten to multiply the mantissa by to convert it to an integer. */
 mantlenN = (fracbitN + 1);
 /* Convert mantissa bit length from an int to a format compatible with
 * two byte integers. */
 mantlenBP = (unsigned char *) FMALLOC(2, "FLTOSTR11");
 *(mantlenBP+1) = (unsigned char) mantlenN;
 *mantlenBP = (unsigned char) (mantlenN >> 8);
 /* Now do conversion from binary length to decimal digit length. */
 intoflt(mantlenBP, 2, tempfltBP, SINGLEFRAC, SINGLEXP);
 fmultm(tempfltBP, log10of2AB, SINGLEXP, SINGLEFRAC);
 fltoint(tempfltBP, SINGLEFRAC, SINGLEXP, mantlenBP, 2);
 FFREE(tempfltBP);
 /* Convert result back to int */
 mantlenN = (((unsigned int) (*mantlenBP)) << 8) +
 (unsigned int) (*(mantlenBP+1));
 mantlenN++;
 /* And then back to a two byte integer */
 *(mantlenBP+1) = (unsigned char) mantlenN;
 *mantlenBP = (unsigned char) (mantlenN >> 8);
 /* Calculate appropriate power of 10 to multiply mantissa by, then multiply
 * it. Convert result to an int, then to an ASCII string. */
 tempfltBP = (unsigned char *) FMALLOC(totalenN, "FLTOSTR12");
 intopten(mantlenBP, 2, tempfltBP, fracbitN, expbitN);
 fmultm(fltvalBP, tempfltBP, expbitN, fracbitN);
 mantlenN = mantlenN >> 1;
 mantintBP = (unsigned char *) FMALLOC(mantlenN, "FLTOSTR13");
 fltoint(fltvalBP, fracbitN, expbitN, mantintBP, mantlenN);

 condcodeB = intostr(mantintBP, mantlenN, strBP, (slenN - 2), 10);
 /* Free the temporary buffer for the two byte integer mantissa length */
 FFREE(mantlenBP);
 /* Add on exponent part */
 strBP += strlen((char *) strBP);
 slenN -= (2 + strlen((char *) strBP));
 *strBP++ = 'e';
 /* Determine the exponent sign. */
 if (((*(fltexpBP + expbyteN - 1)) & 0x80) != 0x80)
 *strBP++ = '+';
 condcodeB = intostr(fltexpBP, expbyteN, strBP, slenN, 10);
 FFREE(fltfracBP);
 FFREE(fltexpBP);
 FFREE(expbiasBP);
 FFREE(exponeBP);
 FFREE(tempfltBP);
 FFREE(ptenfltBP);
 FFREE(onetenthBP);
 FFREE(oneBP);
 FFREE(mantintBP);
 return(condcodeB);
}
/* Function: unsigned char *intopten(unsigned char *fltexpBP, int expbyteN,
** unsigned char *ptenfltBP, int fracbitN, int expbitN)
** This function calculates 10 to the integer value pointed to by fltexpBP.
** fltexpBP points to an integer value that is expbyteN bytes long. The float
** result is written to memory buffer pointed to by ptenfltBP. fracbitN and
** expbitN give number of bits in floating point's mantissa and exponent. */
unsigned char
#ifdef PROTOTYPES
intopten(unsigned char *fltexpBP, int expbyteN, unsigned char *ptenfltBP,
 int fracbitN, int expbitN)
#else
intopten(fltexpBP, expbyteN, ptenfltBP, fracbitN, expbitN)
unsigned char *fltexpBP;
int expbyteN;
unsigned char *ptenfltBP;
int fracbitN;
int expbitN;
#endif
{
 unsigned char condcodeB;
 unsigned char *ptenBP;
 unsigned char tenB;
 unsigned char oneB;
 int totalenN;
 /* Initialize one and ten values */
 oneB = 1;
 tenB = 10;
 /* Initialize the condition code byte to zero */
 condcodeB = 0;
 /* Determine the total byte length of the floating point number */
 totalenN = fftotlen(expbitN, fracbitN);
 /* Allocate space for the power of ten multiplier/divisor and initialize
 * it to 10. */
 ptenBP = (unsigned char *) FMALLOC(totalenN, "INTOPTEN");
 intoflt(&tenB, 1, ptenBP, fracbitN, expbitN);
 /* Initialize the result to 1. */
 intoflt(&oneB, 1, ptenfltBP, fracbitN, expbitN);

 /* Check the sign bit of the exponent */
 if (((*fltexpBP) & 0x80) == 0)
 {
 /* Exponent is positive. Multiply result value by binary power of 10
 * for each corresponding bit set in the exponent. Loop until the
 * exponent value is zero. */
 while (ucheckm(fltexpBP, expbyteN) != 0)
 {
 /* Check if the least significant bit is one. */
 if (((*(fltexpBP + expbyteN - 1)) & 1) != 0)
 {
 /* If so, multiply in the power of ten. */
 condcodeB = fmultm(ptenfltBP, ptenBP, expbitN, fracbitN);
 /* Check if an overflow has occurred */
 if ((condcodeB & FFINF) == FFINF)
 {
 FFREE(ptenBP);
 return(condcodeB);
 }
 }
 ushftrm(fltexpBP, expbyteN);
 /* Generate next binary power of 10 by squaring previous result. */
 fmultm(ptenBP, ptenBP, expbitN, fracbitN);
 }
 }
 else
 {
 /* Exponent is negative. Change sign of exponent value, then divide
 * result value by binary power of 10 for each corresponding bit set
 * in exponent. Loop until the exponent value is zero. */
 inegm(fltexpBP, expbyteN);
 while (ucheckm(fltexpBP, expbyteN) != 0)
 {
 /* Check if the least significant bit is one. */
 if (((*(fltexpBP + expbyteN - 1)) & 1) != 0)
 {
 /* If so, then divide the previous result by power of ten. */
 fdivm(ptenfltBP, ptenBP, expbitN, fracbitN);
 /* Check if an underflow has occurred */
 if ((condcodeB & FFZERO) == FFZERO)
 {
 FFREE(ptenBP);
 return(condcodeB);
 }
 }
 ushftrm(fltexpBP, expbyteN);
 /* Generate the next binary power of 10 by squaring the previous
 * result. */
 fmultm(ptenBP, ptenBP, expbitN, fracbitN);
 }
 }
 /* Clean up and return. */
 FFREE(ptenBP);
 return(condcodeB);
}







September, 1993
Algebra and the Lambda Calculus


The Jacal's tale




Aubrey Jaffer


Aubrey is a programmer and electronic engineer who also serves as board member
of the League for Programming Freedom. He can be reached at 84 Pleasant
Street, Wakefield, MA 01880.


Some years ago, in order to facilitate the design of constant impedance
electrical filters (diplexers), I wrote a symbolic circuit-analysis program.
The initial version of the program was written in Lisp over a two-week period.
It implemented canonical, rational expressions and was not particularly
sophisticated (it used, for example, Euclidean GCD); it worked directly with
the small-signal Laplace Transform of currents, voltages, and impedances.
After reading further about symbolic manipulation, I became fascinated with
the problem of canonical forms. This interest led to Jacal.
Jacal is a symbolic mathematics system for the simplification and manipulation
of equations and single- and multiple-valued algebraic expressions constructed
of numbers, variables, radicals, and algebraic, differential, and holonomic
functions. In addition, Jacal can work with vectors and matrices of the
aforementioned mathematical objects.
I wrote Jacal using SCM, my implementation of the Scheme language. Jacal and
SCM are both copylefted programs (distributed under the Gnu software license)
and are available from various Internet ftp sites. (You can use the Internet
"archie" utility to find out which sites; you can also purchase the package
from me.) My implementation of Scheme runs on Amiga, Atari-ST, MacOS, MS-DOS,
OS/2, NOS/VE, VMS, UNIX, and similar systems, and complies with the IEEE P1178
and R4RS specifications.
This article stems from my work with Jacal. It does not focus on details of
Jacal's implementation, but rather, describes an approach--to my knowledge,
unique--to implementing the lambda calculus in an algebraic system such as
Jacal. This article deals with an issue many programmers face, that of naming
conflicts, and shows how a symbolic-mathematics program addresses this
problem. In the study of mathematical logic, the usual connection made between
the lambda calculus and algebra is to construct the integers using the lambda
calculus and then construct algebraic (and other) formulas by encoding them by
numbers (known as "Gödelizing" them, or assigning Gödel numbers). Although the
notion of Gödelized formulas may have been nonintuitive when it was invented
60 years ago, the concept of encoding formulas by integers is stock-in-trade
for programmers today.
My approach reverses this connection by implementing the lambda calculus on
top of an algebraic system. This approach gives Jacal the ability to represent
functions as members of an algebraic system. The resulting benefit is that all
of the system's operations (including simplification) can then be applied to
functions as easily as to expressions. The system can handle both function
application, as well as currying (partial application), and closures, all
using variable elimination from polynomials. Listing One, page 92, presents
the Scheme code that implements variable elimination. Before discussing how
this is done, let's briefly review some basic concepts.


Lambda Calculus and Currying


The lambda calculus, created by logician Alonzo Church in the 1930s, is
familiar to most programmers in the form of macros--for C programmers, this
means preprocessor #define directives. A lambda expression is similar to a
macro definition or a #define directive--all contain a list of symbols which
are bound to the arguments when the macro or lambda expression is applied.
Those symbols not bound by the macro are called free.
Currying (attributed to logician Haskell B. Curry), which may not be as
familiar, is the process of partially applying a function. For instance, if
the C macro preprocessor supported currying, then the directives
#define F(x,y,z) x+y*z
#define G F(a)
would be equivalent to
#define F(x,y,z) x+y*z
#define G(y,z) a+y*z


Algebraic Representation


Given an underlying representation for multivariate polynomials, we can
represent an equation as a multivariate polynomial with the understanding that
the polynomial is equal to 0. We can convert typical equations to this form by
multiplying both sides by the denominators and then subtracting the left side
from both sides. For instance, consider the following equation:
f/(c+d)=(a-b)/g. We can trivially convert this equation to the following
equivalent: 0=(a-b)c+(a-b)d-fg.
But how to represent expressions? The usual approach is to use a polynomial or
a ratio of polynomials. Instead, we'll introduce a special variable @, calling
it the "value variable." The value of a polynomial involving @ is that
polynomial "solved" for @. For instance, the expression (-1+x)/(1+x) can be
represented internally as: 0=-1+x+(-1-x)@.
This technique allows us to represent irrational expressions as well. For
example, the following expression is the root of a fifth-degree polynomial:
0=-1-y-@-@^5.
For the rest of this article, I won't show @ in the examples if the values can
be represented in usual mathematical notation without it (as Jacal does).


Substitution


At this level of algebraic abstraction we can accomplish all operations by
using variable elimination. Variable elimination is the process of combining n
polynomial equations so that m variables do not appear in the (n-m) resulting
equations (where n>m). Common techniques used include resultants (see Bareiss,
Uspensky, Hoffmann, and Geddes et al.) and Groebner Bases (refer to Dube,
Hoffmann, and Geddes et al.).
For example, given the following Jacal statement:
eliminate([a+c^2=b,b+c^2=2],[c]). After variable elimination, the statement
yields: 0=2+a-2b.
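The eliminated equation can be verified numerically: for any c, values of a and b satisfying the two input equations must also satisfy 0=2+a-2b. A short Python check (mine, not Jacal's):

```python
# From b + c^2 = 2 and a + c^2 = b, derive a and b for several values
# of c, and confirm the eliminated equation 0 = 2 + a - 2b each time.
for c in (-2.0, 0.0, 0.5, 3.0):
    b = 2.0 - c * c      # from b + c^2 = 2
    a = b - c * c        # from a + c^2 = b
    assert abs(2.0 + a - 2.0 * b) < 1e-12
```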
Common symbolic transformations can be accomplished by constructing auxiliary
polynomial equations and eliminating variables between them and the original
polynomial equations.
The operation we are interested in for the next section is substitution. We
can substitute an expression for a variable by constructing an auxiliary
equation of the variable and then eliminating that variable. Suppose we want
to substitute (a*x+b)^2 for g in g+1/g. We construct the equation g=(a*x+b)^2
and then the statement: eliminate([g=(a*x+b)^2, g+1/g],[g]). After substitution,
the statement yields the results in Example 1.
Eliminate deals only with polynomial equations, so remember that g+1/g
internally is: 0=1+g^2-g@. This is also true for the result.
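A numeric spot-check is again easy: with the value variable @ standing for g+1/g, the internal form 0=1+g^2-g@ must vanish. A Python sketch (the numeric values are arbitrary choices of mine):

```python
a, b, x = 1.5, -0.5, 2.0
g = (a * x + b) ** 2       # the expression substituted for g
value = g + 1.0 / g        # the "@" value of g + 1/g

# Internal form of g + 1/g is 0 = 1 + g^2 - g*@; check it vanishes.
assert abs(1.0 + g * g - g * value) < 1e-9
```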


Functions


Similarly, @ can be used for arguments as well. We'll name these arguments @1,
@2, and so on.
Functions do not need to use all of their arguments. However, in this system,
a function must use at least one of its arguments. With this constraint, the
only difference between a function and an expression is the presence of @n
variables. Our functions can return either equations or expressions; those
containing @ are expressions and those without, equations.

An example of a function which ignores its first two arguments is:
lambda([x,y,z],1/z-z). This function can be represented using @ notation as:
(1-@3^2)/@3.
Using this scheme, functions can freely mix bound and free variables. Consider
the following expression, with free c and bound x and y: f :
lambda([x,y],c*(y+x)/(y-x)). This expression can be represented simply as: (c
@1+c @2)/(-@1+@2).
We can now apply this function. We don't always have to apply it to two
arguments; we can also apply it to just one. (This is currying an argument.)
The application g : f(x); substitutes x for @1 in the polynomial equation
for f. It also "bumps" @2 down to @1 (also by substitution). This results in a
new function of one argument: (cx+c@1)/(-x+@1).
If this function is applied to one argument (which is a nonfunction), the
result will be a nonfunction. For example g(a+b) yields:
((-a-b)c-cx)/(-a-b+x). This result is exactly the same as the result of
applying f(x,a+b).
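Currying's defining property, that applying arguments one at a time agrees with applying them all at once, can be stated with ordinary closures. A Python sketch (names and numeric values are mine; c is left free):

```python
c = 2.0                                    # free variable
f = lambda x, y: c * (y + x) / (y - x)     # lambda([x,y], c*(y+x)/(y-x))
g = lambda x: (lambda y: f(x, y))          # curried form

x0, arg = 1.0, 4.0                         # arg stands in for a+b
assert abs(g(x0)(arg) - f(x0, arg)) < 1e-12
```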


Alpha-conversion


The above method is sufficient when the arguments to functions are not
themselves functions. But when applying functions to functions, differences in
the order of elimination produce different results. Consider applying @1-@2 to
the arguments (@2, @1). The result should be @2-@1. But if we curry an
argument, substituting @2 for @1 first, we get @2-@2 (that is, 0) applied
to @1.
This problem is similar to the inadvertent capture of free identifiers by
macros in languages like C and Lisp. For example, given the directives:
#define F(z) 1+G(z)
#define G(y) y*z
then F(x) expands to (1+z*z) rather than (1+x*z).
The solution to this problem is called "alpha-conversion" in the lambda
calculus. It is also termed "Hygienic Macro Expansion" in the paper of that
title by Kohlbecker, Friedman, et al.
Since the names of bound identifiers are unimportant we will substitute new
names for those lambda variables for which we will later substitute arguments.
This eliminates possible conflicts between the variables bound in the current
function and variables in its arguments (which are free, relative to this
function).
In the above example, we substitute :@1 for @1 and :@2 for @2 in @1-@2,
yielding :@1-:@2. Now substituting @2 for :@1 and @1 for :@2 yields the
desired result, @2-@1.
Currying an argument would substitute only @2 for :@1 in :@1-:@2, producing
@2-:@2, a function of one remaining argument. When this function is applied
to the remaining argument, @1, the result is @2-@1, as before.
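The renaming step can be demonstrated in a few lines of Python over tuple-encoded expressions (a toy encoding of mine, not Jacal's internal form). Sequential substitution without renaming conflates the two arguments; renaming to shadowed variables first gives the right answer:

```python
def subst(expr, env):
    # Replace leaf symbols according to env; recurse through tuples.
    if isinstance(expr, tuple):
        return tuple(subst(e, env) for e in expr)
    return env.get(expr, expr)

body = ('-', '@1', '@2')                       # the function @1 - @2

# Naive sequential substitution: @1 -> @2 first, then @2 -> @1.
naive = subst(subst(body, {'@1': '@2'}), {'@2': '@1'})
assert naive == ('-', '@1', '@1')              # arguments conflated!

# Hygienic: rename bound variables to shadowed names, then substitute.
renamed = subst(body, {'@1': ':@1', '@2': ':@2'})
result = subst(renamed, {':@1': '@2', ':@2': '@1'})
assert result == ('-', '@2', '@1')             # the desired answer
```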


Lambda


A symmetrical situation to currying of arguments is binding a variable over a
function. lambda([y],lambda([x],x-y)) should yield the same function as
lambda([y,x],x-y). The trick here is to "bump up" any lambda variables in an
expression when binding additional variables. To execute lambda([y],@1-y) we
substitute @2 for @1 and then @1 for y.


Vectors and Matrices


Vector and matrix-valued functions can be represented by vectors and matrices,
of which some entries are lambda expressions. Clearly, a vector or matrix
function applied to scalar arguments should return a vector or matrix with the
same shape as the function.
The case of a scalar function applied to vector arguments can work if the
multiplication used by the function is of a type compatible with the
arguments. Inner product is commutative while matrix multiplication is not.
Another possibility here is to have a mechanism for allowing lambda
expressions to reference elements of vector and matrix arguments.
The case of vector or matrix functions applied to vector or matrix arguments
is stickier. Allowing only elements of the arguments to be operated on is one
solution; another is to incorporate the structure of the arguments inside the
structure of the function or vice versa.


Differential Algebra


These techniques can be extended to differential algebra as well. In
differential algebra, the derivative of a variable, written as v', can act as
a variable in polynomials. Derivatives of derivatives are also allowed and
can be written as v'' and so on.
When applying a function, each distinct derivative of a lambda variable with a
corresponding argument requires that an equation be generated and that
derivative variable be eliminated. We equate the nth derivative variable with
the nth total derivative of its corresponding argument. For instance, to apply
the function @1'/@2' to (x^3,x) we use the following statement:
eliminate([@1'/@2',@1'=(x^3)',@2'=(x)'],[@1',@2']). This statement yields:
3x^2.
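The application (@1'/@2')(x^3,x) can be sanity-checked with numeric differentiation. A Python sketch (central differences; the sample point and tolerance are mine):

```python
h, x = 1e-6, 1.3

def d(f):
    # Numeric derivative of f at x (central difference).
    return (f(x + h) - f(x - h)) / (2.0 * h)

# d(x^3)/d(x) should equal 3x^2.
ratio = d(lambda t: t ** 3) / d(lambda t: t)
assert abs(ratio - 3.0 * x * x) < 1e-4
```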
As illustrated by this example, differential operators are now as easily
expressed as functions. This worked well for the univariate case; what about
multiple variables? Given the function (@1'/@2')((x+y)^2,x), the desired result
is shown in Example 2(a).
We need to set y' to 0 to get the expected answer. But in currying, we need to
have the differentials remain until all the lambda variables are consumed.
(@1'/@2')((x+y)^2,@1) gives the result in Example 2(b).
The solution here, I suspect, is this: once an expression no longer contains
lambda-variable (@) differentials, remove the differentials that appear in
its denominator. However, at this writing, I do not have experience with this
solution.


References


Bareiss, E.H. "Sylvester's Identity and Multistep Integer-Preserving Gaussian
Elimination." Mathematics of Computation 22, 1968.
Uspensky, J.V. Theory of Equations. New York, N.Y.: McGraw-Hill, 1948.
Dube, T.W. "The Structure of Polynomial Ideals and Groebner Bases." SIAM
Journal on Computing, August, 1990.
Hoffmann, C.M. Geometric and Solid Modeling: An Introduction. San Mateo, CA:
Morgan Kaufmann Publishers, 1989.
Geddes, K.O., S.R. Czapor, and G. Labahn. Algorithms for Computer Algebra.
Boston, MA: Kluwer Academic Publishers, 1992.
Kohlbecker, E.E., D.P. Friedman, M. Fellinson, and B. Duba. "Hygienic Macro
Expansion." ACM Conference on Lisp and Functional Programming, 1986.
Example 1: Output from Jacal showing a result of variable elimination and
substitution.
Example 2: (a) The desired result of a sample differential function
application; (b) the actual result.
_ALGEBRA AND THE LAMBDA CALCULUS_
by Aubrey Jaffer



[LISTING ONE]

;;; Excerpt from Jacal: Symbolic Mathematics System, written in Scheme.
;;; Copyright 1989-1993 Aubrey Jaffer. See the file "COPYING" in
;;; the Jacal distribution for terms applying to this program.

;;;; Variable elimination
(define (poly:elim poleqns vars)
  (cond (math:trace
         (display-diag "eliminating:")
         (newline-diag)
         (write-sexp (math->sexp (map var->expl vars)) *output-grammar*)
         (display-diag " from:")
         (newline-diag)
         (write-sexp (math->sexp (poleqns->licits poleqns)) *output-grammar*)))
  (do ((vs vars (cdr vs)) (polys poleqns) (poly #f))
      ((null? vs)
       (cond (math:trace
              (display-diag "yielding:")
              (newline-diag)
              (write-sexp (math->sexp polys) *output-grammar*)))
       polys)
    (do ((var (car vs))
         (pl polys (if (null? pl)
                       (math-error "not enough equations" poleqns vars)
                       (cdr pl)))
         (npl '() (cons (car pl) npl)))
        ((poly:find-var? (car pl) var)
         (set! poly (promote var (car pl)))
         (do ((pls (cdr pl) (cdr pls)))
             ((null? pls) (set! polys npl))
           (if (bunch? (car pls)) (math-error "elim bunch?" (car pls)))
           (set! npl (cons (poly:resultant poly (car pls) var)
                           npl))))
      (if (bunch? (car pl)) (math-error "elim bunch?" (car pl))))))

(define (infinite-list-of . elts)
  (let ((lst (copy-list elts)))
    (nconc lst lst)))

;;; This tries to solve the equations no matter what is involved.

;;; It will eliminate variables in vectors of equations.
(define (eliminate eqns vars)
  (bunch:norm
   (if (some bunch? eqns)
       (apply map
              (lambda arglist (eliminate arglist vars))
              (map (lambda (x)
                     (if (bunch? x) x (infinite-list-of x)))
                   eqns))
       (poly:elim eqns vars))))

(define (elim:test)
  (define a (sexp->var 'A))
  (define x (sexp->var 'X))
  (define y (sexp->var 'Y))
  (test (list (list a 0 0 124 81 11 3 45))
        poly:elim
        (list (list y (list x (list a 0 0 2) (list a 0 1)) 1)
              (list y (list x (list a 5 1) 0 -1) 0 1)
              (list y (list x (list a -1 3) 5) -1))
        (list x y)))

(define (bunch:map proc b)
  (cond ((bunch? b) (map (lambda (x) (bunch:map proc x)) b))
        (else (proc b))))

(define (licits:for-each proc b)
  (cond ((bunch? b) (for-each (lambda (x) (licits:for-each proc x)) b))
        ((eqn? b) (proc (eqn->poly b)))
        (else (proc b))))

(define (licits:map proc b)
  (cond ((bunch? b) (map (lambda (x) (licits:map proc x)) b))
        ((eqn? b) (poleqn->licit (proc (eqn->poly b))))
        (else (proc b))))

(define (implicits:map proc b)
  (cond ((bunch? b) (map (lambda (x) (implicits:map proc x)) b))
        ((eqn? b) (poleqn->licit (proc (eqn->poly b))))
        ((expl? b) (proc (expl->impl b)))
        (else (proc b))))

;;; replaces each var in poly with (proc var).
;;; Used for substitutions in clambda and capply.
(define (poly:do-vars proc poly)
  (if (number? poly) poly
      (univ:demote (cons (proc (car poly))
                         (map (lambda (b) (poly:do-vars proc b))
                              (cdr poly))))))

(define (licits:do-vars proc licit)
  (licits:map (lambda (poly) (poly:do-vars proc poly))
              licit))

;;;; Canonical Lambda
;;;; This needs to handle algebraic extensions as well.
(define (clambda symlist body)
  (let ((num-new-vars (length (remove-if lambdavar? symlist))))
    (licits:do-vars
     (lambda (var)
       (let ((pos (position (var:nodiffs var) symlist)))
         (cond (pos (lambda-var (+ 1 pos) (var:diff-depth var)))
               ((lambdavar? var) (bump-lambda-var var num-new-vars))
               ((lambdavarext? var) (bump-lambda-ext))
               (else var))))
     body)))

(define (clambda? cexp)
  (cond ((number? cexp) #f)
        ((matrix? cexp) (some (lambda (row) (some clambda? row)) cexp))
        ((expr? cexp) (poly:find-var-if? cexp lambdavardep?))
        ((eqn? cexp) (poly:find-var-if? (eqn->poly cexp) lambdavardep?))
        (else #f)))

;;; In order to keep the lambda application hygienic (in case a function
;;; of a function is called), we need to substitute occurrences of
;;; lambda variables in the body with shadowed versions of the
;;; variables before we eliminate them. See:
;;;   Technical Report No. 194
;;;   Hygienic Macro Expansion
;;;   E.E. Kohlbecker, D.P. Friedman, M. Fellinson, and B. Duba
;;;   Indiana University
;;;   May, 1986

;;; Currently capply puts the structure of the clambda inside the
;;; structure of the arguments.
(define (capply body larglist)
  (let* ((arglist (licits->poleqns larglist))
         (arglist-length (length arglist))
         (svlist '()) (dargs '())
         (sbody
          (licits:do-vars
           (lambda (var)
             (cond
              ((lambdavar? var)
               (let ((lshf (- (min-lambda-position var) arglist-length)))
                 (cond ((< 0 lshf) (bump-lambda-var var (- arglist-length)))
                       (else (set! var (var:shadow var))
                             (set! svlist (adjoin var svlist))
                             var))))
              ((not (lambdavarext? var)) var)
              ;; must be some sort of extension
              ((radicalvar? var) var)))
           body)))
    (set! dargs (diffargs svlist arglist))
    (implicits:map (lambda (p) (eliminate (cons p dargs) svlist)) sbody)))
(define (bump-lambda-var var delta)
  (lambda-var (+ (lambda-position var) delta) (var:diff-depth var)))

(define (diffargs vlist args)
  (map (lambda (var)
         (bunch:map (lambda (e)
                      (univ:demote (cons var (cdr (licit->poleqn e)))))
                    (diffarg var args)))
       vlist))

(define (diffarg var args)
  (cond ((var:differential? var)
         (total-differential (diffarg (var:undiff var) args)))
        (else (list-ref args (- (lambda-position var) 1)))))




September, 1993
Examining the Windows AARD Detection Code


A serious message--and the code that produced it




Andrew Schulman


Andrew is a contributing editor to DDJ, and coauthor of the books Undocumented
DOS and Undocumented Windows. Portions of this article are excerpted from
Undocumented DOS, Second Edition (Addison-Wesley, 1993).


If you were one of the thousands of Windows 3.1 beta testers, and if you
happened to be using DR DOS rather than MS-DOS, you probably butted heads with
a seemingly innocuous, yet odd, error message like that in Figure 1. As you'll
see, this message is a visible manifestation of a chunk of code whose
implementation is technically slippery and evasive.
While it's impossible to gauge intent, the apparent purpose of this code is to
lay down arbitrary technical obstacles for DOS-workalike programs. The message
appears with the release labeled "final beta release (build 61)" (dated
December 20, 1991), and with "pre-release build 3.10.068" (January 21, 1992).
Similar messages (with different error numbers) are produced in builds 61 and
68 by WIN.COM, SETUP.EXE, and by the versions of HIMEM.SYS, SMARTDRV.EXE, and
MSD.EXE (Microsoft diagnostics) packaged with Windows.
Although the error is non-fatal--that is, the program can continue
running--WIN.COM's default behavior is to terminate the program, rather than
continue.
The message first appeared in build 61, a late-stage beta, and seemed to
disappear in the final retail release of Windows 3.1. However, the code that
generates the message is present in the retail release, albeit in quiescent
form, and executes every time you run Windows 3.1.
It's significant that the message, which appeared when running on DR DOS
(including Novell's "Novell DOS 7" beta), did not appear when running on
MS-DOS or PC-DOS. This raises the question: What causes the error
message? As it turns out, finding the answer required substantial system-level
sleuthing, an interesting challenge in its own right.
In this article, I'll summarize the results of chasing this chunk of source
which I call the "AARD code" (after a plain-text signature that's buried
within the otherwise-encrypted code). The full technical details of the
chase--the run-time disassembly and decryption of the code--are the subject of
a subsequent DDJ article. The raw information is available now in electronic
form; see "Availability," page 3. Here, I'll present a pseudocode summary of
the AARD code, then focus on the code's effects and implications rather than
on precise details of its implementation.


Maybe It's a Bug?


Whether in spite or because of the books Undocumented DOS and Undocumented
Windows, I've often had to publicly defend Microsoft against what I felt were
acts of scapegoating from whining competitors (including Novell, Borland,
Lotus, and Wordperfect), complaints which remind me of the way some Americans
like to blame Japan for what are ultimately our own domestic problems.
In fact, much of Microsoft's practice, far from targeting competitors'
applications, points in the opposite direction: Microsoft sometimes goes to
extremes to maintain compatibility, even with competitors' mistakes (see, for
example, the crazy GetAppCompatFlags() function discussed in Chapter 5 of
Undocumented Windows).
Certainly, it's true that DOS workalikes such as DR DOS have to pretend to be
an older version of DOS (DOS 3.31, for instance) if they want to run Windows
Enhanced mode. This is because of an undocumented interface shared by the
Windows Enhanced mode DOSMGR virtual device driver (VxD) inside WIN386.EXE and
MS-DOS 5 and 6. To appear as more recent versions of DOS, would-be clones must
reverse-engineer and implement this undocumented protocol.
So whenever I've heard accusations that Microsoft practices so-called "cruel
coding" to keep Windows from running on DR DOS, I look at the facts: Windows
3.1 Enhanced mode does run on DR DOS. Standard mode does not run, but that's
because of a DR DOS bug acknowledged by Novell (see Undocumented DOS, Second
Edition).
Consequently, if you didn't know how the error message in Figure 1 was
generated, it's reasonable to think that it's the manifestation of yet another
bug in Novell DOS. (It wouldn't be the first time company N's bug has been
misinterpreted as company M's "deliberate incompatibility.")


Defeating a Debugger


The first step in discovering why the error message appeared under DR DOS but
not MS-DOS was to examine the relevant WIN.COM code. However, the WIN.COM code
that produced this message turned out to be XOR encrypted, self-modifying, and
deliberately obfuscated--all in an apparent attempt to thwart disassembly.
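XOR encryption of this sort is trivially reversible, which is what makes run-time decryption cheap: applying the same XOR a second time restores the original bytes. A minimal Python sketch (the key byte here is invented for illustration, not the one actually used in WIN.COM):

```python
def xor_crypt(data, key):
    # XOR every byte with a one-byte key; the operation is its own inverse.
    return bytes(b ^ key for b in data)

cipher = xor_crypt(b"AARD", 0x5A)        # hypothetical key byte
assert cipher != b"AARD"                 # bytes are scrambled...
assert xor_crypt(cipher, 0x5A) == b"AARD"  # ...and XORing again restores them
```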
The code also tries to defeat attempts to step through it with a debugger. For
example, Figure 2 shows a code fragment in which the INT 1 single-step
interrupt is pointed at invalid code (the two bytes FFh FFh), which disables
DEBUG. The same is done with INT 2 (nonmaskable interrupt) and INT 3 (debug
breakpoint). However, since modern debuggers (I used Nu-Mega's Soft-ICE) run
the debugger and debuggee in separate address spaces, the AARD code's
revectoring of INTs 1-3 has no effect on the Soft-ICE debugger. In any case,
these attempts to throw examination off-track are in themselves revealing.
For whatever reason, while much of it is XOR encrypted, the code contains, as
plain text, a Microsoft copyright notice and the strings "AARD" and "RSAA,"
perhaps the programmer's initials.


A Gauntlet of Tests


Figure 3 shows a pseudocode summary of the disassembled code. In essence, this
code (which, remember, is part of Windows, a product sold separately from
MS-DOS) checks for genuine MS-DOS or PC-DOS. As seen in Figure 3, the AARD
code relies heavily on undocumented DOS functions and data structures. The
undocumented INT 21h Function 52h is called to get a pointer to the DOS
internal SysVars structure, popularly known as the "List of Lists." SysVars
contains pointers to other DOS internal data structures, such as the current
directory structure (CDS) and system file table (SFT). The AARD code checks a
number of these pointers in SysVars, ensuring that none are null.
Any moderately self-respecting DOS workalike should pass unscathed through
this gauntlet of tests. Interestingly, however, when this code is incorporated
in a device driver such as HIMEM.SYS, it fails under DR DOS 5 and 6. These
versions of DR DOS do not contain a genuine CDS, and the simulated CDS is
apparently not set up until after device-driver initialization time. Thus, the
Windows 3.1 beta HIMEM.SYS produces a non-fatal error message under DR DOS 5
and 6. Similarly, the AARD code fails under the Windows NT beta, where the DPB
pointer in SysVars is null. Finally, the code fails in an OS/2 DOS box, where
the DOS version number is 10.0 or greater (for example, OS/2 2.1 masquerades
as DOS 20.10).
The crucial and, appropriately, most obfuscated test, however, appears at the
end of the AARD test gauntlet. This test, which was unraveled by Geoff
Chappell (geoffc@cix.compulink.co.uk), first checks to see whether a network
redirector (such as MSCDEX) is running. If a redirector is running, the AARD
code checks that DOS's default upper case-map is located in the DOS data
segment. If a redirector is not running, the code checks that the pointer to
the first simulated-file control block (FCB-SFT) is located on a paragraph
boundary; that is, it has a 0 offset. For ease of reference, this code is
repeated in Figure 4.
All versions of MS-DOS pass this test; no version of DR DOS does.
To test whether this interpretation of the encrypted and heavily-obfuscated
code is correct, I wrote MSDETECT.C (Listing One, page 89). This program
(compiled with Microsoft C) performs the same tests as the original AARD code,
but without the obfuscations, and with more informative "error" messages. My
MSDETECT program succeeds under all versions of MS-DOS I tested (Compaq DOS
3.31, MS-DOS 5.0, MS-DOS 6.0), yet fails under all versions of DR DOS tested
(DR DOS 5.0, DR DOS 6.0, beta Novell DOS 7). If running under DR DOS with a
redirector, MSDETECT fails with the message "Default case map isn't in DOS
data segment!". Otherwise it fails under DR DOS with the message "First
FCB-SFT not located on paragraph boundary!".


A Gratuitous Gatekeeper


But what does "country information" like the DOS default upper case-map have
to do with a network redirector? Why does a piece of Windows care whether this
mapper is located in the DOS data segment? And why should it care whether the
first FCB-SFT is located on a paragraph boundary? What kind of "errors" are
these, anyway?
These are all reasonable questions. In fact, the address of the default upper
case-map has nothing to do with the network redirector, and no other part of
Windows cares about what particular form is taken by DOS's default case-map or
first FCB-SFT pointers. The AARD code has no relation to the actual purpose of
the five otherwise-unrelated programs into which it has been dropped. It
appears to be a wholly arbitrary test, a gratuitous gatekeeper seemingly with
no purpose other than to smoke out non-Microsoft versions of DOS, tagging them
with an appropriately vague "error" message.
Suitably, the section of the AARD code that performs this crucial test
(highlighted in Figure 4) is the most heavily XOR encrypted and obfuscated.
The test in Figure 4 is the critical piece of information used by Windows to
determine if it is running on MS-DOS, or on a DOS "emulator." But this code
seems to have no technically-valid purpose, checking instead some rather
unimportant aspects of DOS. In short, you can have an otherwise perfectly
workable DOS, capable of running Windows, and yet not pass this test.
To see if the case-map and FCB-SFT tests serve a technically useful purpose, I
used Microsoft's SYMDEB debugger to slightly alter ("denormalize") DOS's
pointers to the default case-map and the FCB-SFT. As you may recall, it's
possible to change a real-mode segment:offset pointer without necessarily
changing what location it points to. In real mode, a single memory location
can be addressed by different pointers; there are many combinations of
different segment and offset values that all resolve to the same physical
address and are therefore equivalent. Windows (and all other software I ran)
was unaffected by my change to these pointers. As Figure 5 shows, the only
software that noticed was my MSDETECT and the AARD code in WIN.COM.
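The arithmetic behind "denormalizing" is simple: a real-mode physical address is 16*segment+offset, so many different segment:offset pairs name the same byte. The sketch below (Python; the example pair is my own) shows a denormalized pointer that resolves to the identical physical address, which is exactly the kind of change that only the AARD code noticed:

```python
def phys(seg, off):
    # Real-mode physical address: 16 * segment + offset (20-bit wrap).
    return ((seg << 4) + off) & 0xFFFFF

# A paragraph-aligned pointer (offset 0) and a denormalized equivalent:
# both resolve to physical address 0x01160, yet only the first would
# pass an AARD-style "offset must be zero" check.
assert phys(0x0116, 0x0000) == phys(0x0115, 0x0010) == 0x01160
```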

In other contexts (such as MSD's need to identify the operating system), it
would be perfectly legitimate to walk internal DOS data structures to see that
they were the same as would be expected under genuine MS-DOS. However, the
fact that WIN.COM and the other programs incorporating the AARD code make no
use of the information gained in this way, other than to print the non-fatal
error message, suggests a deliberate incompatibility rather than a legitimate
need to know some information about the underlying DOS.
The very non-fatality of the "error" further underscores the fact that it
isn't Windows's legitimate business to care whether it's running on genuine
MS-DOS. If the program can continue running despite the detected "error," then
how much of an error was it to begin with? It seems that the only "error" is
that the user is running Windows on someone else's version of DOS.


Does Beta Code Really Matter?


The non-fatal error message appeared only in two widely-distributed beta
builds of Windows. But since the retail version of Windows 3.1 doesn't produce
it, this is just dead history, right?
Not quite. Anyone with a copy of Windows 3.1 can hex dump WIN.COM (or WIN.CNF,
from which WIN.COM is built during Windows setup) and see the error message
(including the mention of "beta support") and the AARD and RSAA signatures.
Using DEBUG, you can try your hand at unassembling the AARD code at offset
3CE2h in WIN.COM. In other words, the crazy-looking AARD code paraphrased in
Figure 3 executes every time you run Windows. The AARD code remains in Windows
SETUP and in the Windows version of SMARTDRV.EXE (it appears to have been
removed from HIMEM and MSD).
It's perfectly natural for software to contain vestigial remnants of past
implementations. For example, WIN.COM also refers to the short-lived MSDPMI
utility from the Microsoft C 7.0 beta. But in the case of the AARD code, new
instructions were added to the AARD portion of Windows 3.1 retail
WIN.COM--instructions that weren't present in the beta.
In the retail version of WIN.COM, the AARD code contains additional
instructions as well as a control byte. The control byte determines whether or
not the error message appears; this byte is currently 0. As shown in Figure 6,
when running the retail WIN.COM under DR DOS, you can easily use DEBUG to turn
on the control byte, and the message is issued just as under the beta
versions. Changing the single byte at offset 16D4h in WIN.COM triggers the
printing of the message when running on DR DOS, or on an MS-DOS in which the FCB
and/or case-map pointers have been suitably denormalized.
This opens the door for Microsoft to reenable this byte in the retail shipping
version in the future, if it chooses. There's no indication that Microsoft
plans to do so, but the fact remains that neither the code nor the warning
message was removed--and, in fact, code was added. (I wonder to what extent
you can dismiss something because it's only present in a beta and not in the
retail version. Microsoft's beta test programs are so large that they amount
to significant product releases in themselves. I'm not sure of the number of
Windows 3.1 beta
sites, but a Microsoft article on the earlier DOS 5.0 beta claimed over 7000
beta sites. The Windows NT beta program reportedly shipped 70,000 units to
influential developers and corporate beta testers. The size of the Windows 3.1
beta program was likely somewhere in between.)


So What?


A non-fatal error message in a beta version--that's it? If you have an axe to
grind with Microsoft, you may have expected some more nakedly
robber-baronesque behavior. If this is the worst that can be found, perhaps
things aren't so bad after all. However, other examples of similar behavior
have surfaced, including a warranty-related error message in QuickC and
Microsoft C 6.0 (discussed in Chapter 4 of Undocumented DOS, Second Edition).
While it's difficult to second-guess the precise goal of the encrypted and
obfuscated AARD code, its results are clear enough. Windows beta sites that
used DR DOS rather than MS-DOS might have been scared into not using DR DOS.
("Doctor, every time I do this I get a non-fatal warning." "Then stop doing
it.")
The effect of the AARD code is to create a new and highly artificial test of
DOS compatibility. The obfuscations and encryptions make it difficult to even
determine what is being tested. An indication that the AARD code's obfuscation
is successful is the fact that Novell's most recent version of DR DOS (that
is, Novell DOS 7) fails the test, even though it is otherwise far more
compatible with MS-DOS than previous versions.


Microsoft's Initial Response


I've presented the substance of these findings to Microsoft, at both
engineering and management levels. At press time, a detailed response was not
forthcoming, perhaps due to the ongoing FTC investigation. It's likely that a
subsequent issue of DDJ will contain a more specific response. However, a
high-level manager at Microsoft repeatedly told me that the company is
"agnostic" regarding DR DOS. He added, "They [Novell] claim 100 percent
compatibility, but DR DOS is full of bugs. If DR DOS has problems running
Windows, Novell should fix them."
The implication is that if a Windows/DR DOS user gets an error message that a
Windows/MS-DOS user doesn't, then by definition it is Novell's fault and proof
that DR DOS isn't "100 percent DOS compatible." The problem with this is that,
as Figure 5 shows, the AARD code's test for DOS compatibility is 100 percent
artificial. By Microsoft's definition, only MS-DOS or something byte-for-byte
identical with MS-DOS (and therefore in violation of copyright) is "100
percent DOS compatible."
As for "agnostic," this seems unlikely given the effort required to write this
tricky code. Its presence in five otherwise-unrelated programs also suggests a
fairly concerted effort, as it is unlikely that five such different programs are
all maintained by the same person. In fact, the programs probably fall under
the domain of several different product managers or divisions.


Undocumented Interfaces and the Industry


The AARD code once again raises the issue of undocumented interfaces in the
software industry. Because it is relatively easy for competitors to be
compatible with a documented interface, companies try to create artificial
kingdoms by selectively documenting only parts of their product interfaces.
You have to wonder if that's the case here. Whenever an application calls on
undocumented DOS services and uses data structures internal to DOS, as many
successful applications now do, it ties itself more closely to the MS-DOS
binary--to a particular sequence of bytes--rather than to the more-general DOS
standard.
Furthermore, Microsoft apparently encourages this reliance on undocumented
interfaces through "selectively documented" interfaces whereby Microsoft
selectively allows some of its competitors and/or customers access to an
interface, while denying similar access to other companies and to the rest of
the developer community. There are numerous instances of this, including the
XMS 3.0 specification, the Global EMM Import specification, and the LoadHi
code for Windows 3.1.
Much of the discussion revolving around the need for a "Chinese Wall" at
Microsoft has focused on the apparent absence of any genuine wall between the
applications and operating-systems groups. The AARD code suggests that, if
there's a need for an application/operating-systems wall, there may be the
need for one between MS-DOS and Windows too. No one would dispute that DOS and
Windows have to work together smoothly, but that togetherness needs to be
open.
"Chinese Walls" are good engineering practice. They're what software
engineering calls "firewalls": narrow and well-documented interfaces. But the
standard software-engineering texts fail to mention that to properly document
interfaces is also to throw them open to potential competition, and that,
conversely, undocumented interfaces are a way of creating and reinforcing an
enviable monopoly position. Hopefully, Microsoft's AARD code, which is not
only an undocumented interface but--something new--an encrypted one, is not
intended to stifle competition unfairly.


Systems Rivalry and the Courts


While I'm (mercifully) not an attorney, I have found some of the legal
literature surrounding the issue of "deliberate incompatibilities" to be
fascinating reading. PC-centric readers may well be surprised that many of the
issues surrounding Microsoft's travails with the U.S. Federal Trade Commission
have been dealt with before in cases involving companies such as Eastman Kodak
and IBM. "Deliberate incompatibilities" forms a fairly well-established part
of antitrust law, going under monikers such as "non-price predation" and
"predatory innovation."
One issue in the FTC's investigation of Microsoft was the relation between two
of Microsoft's operating-systems products, Windows and MS-DOS. In particular,
the FTC's Bureau of Competition tried to determine whether Microsoft had "done
something" to Windows to deliberately keep it from running with Novell's DR
DOS, which competes with MS-DOS.
Despite the relative insignificance of DR DOS in the market, this is an
important question. MS-DOS and Windows are sold as separate products. While
Microsoft wants to make MS-DOS a better platform for Windows, creating an
artificial tie between Windows and MS-DOS for the sole purpose of hurting
Novell would constitute unfair competition.
The two crucial words here are "sole" and "artificial." Surely, Microsoft
should be allowed to improve Windows, even in ways that might ultimately hurt
DR DOS. This is a legitimate part of the competitive process, and whining
about "predatory innovation" has rightly been rejected by the courts. For
example, many manufacturers of so-called "plug compatibles" tried in the '70s
to have the courts characterize IBM's System/360 as "predatory innovation."
The courts rejected these claims (along eventually with all of U.S. v. IBM,
1969-82), thereby "effectively requiring plaintiffs to prove that the
defendant's design had no redeeming virtue for consumers" (Stephen F. Ross,
Principles of Antitrust Law, Foundation Press, 1993). A detailed analysis from
IBM's perspective, Folded, Spindled, and Mutilated: Economic Analysis and U.S.
vs. IBM by Franklin Fisher et al. (MIT Press, 1983) makes sobering reading for
anyone who might think there is some kind of open-and-shut case against
Microsoft. Good luck, Novell.
Cases involving Eastman Kodak also have many parallels to Microsoft. In
general the courts have sided with Kodak against competitors, although a
recent Supreme Court case (Kodak v. Image Technical) went the other way. For a
look at how Kodak has possibly created deliberate incompatibilities, see
"Structural Monopoly, Technological Performance, and Predatory Innovation:
Relevant Standards under Section 2 of the Sherman Act" by James W. Brock
(American Business Law Journal, Fall 1983). Likewise, Antitrust Economics on
Trial: A Dialogue on the New Laissez-Faire by Brock and Walter Adams
(Princeton University Press, 1991) contains a useful section on "the Predation
'Problem'."
An apparently important article for the FTC is "Anticompetitive Exclusion:
Raising Rivals' Costs to Achieve Power over Price" by Thomas Krattenmaker and
Steve Salop (Yale Law Journal, December 1986). Merely the phrase "raising
rivals' costs" is a useful handle for anyone trying to ponder Microsoft's
current role in the software industry.
A far more interesting article is "Predatory Systems Rivalry: A Reply" by
Ordover, Sykes, and Willig (Columbia Law Review, June 1983), which describes
"systems rivalry" as follows:
Suppose that company A manufactures a product system with two components, A1
and A2, each sold separately. Company A has monopoly power over A1, but
company B competes in the market for the second component with its compatible
offering, B2. Thus, consumers initially can use a product system comprised of
either A1 and A2 or A1 and B2. Company A now introduces a new product system,
A1' and A2', which serves roughly the same function for consumers as the old
product system. Component B2, however, is incompatible with A1'. Furthermore,
company A discontinues the sale of A1 or else reprices A1 substantially higher
than before. As a consequence, consumers switch to the new product system and
company B is driven from the market for component two.
When, if ever, should the antitrust laws sanction company A for driving B out
of the market?
There's a clear comparison here to Windows (A1), MS-DOS (A2), and DR DOS (B2).
The scenario discussed in the article, which has nothing to do with
operating-system software, underscores the fact that there's really nothing
new in the questions surrounding Microsoft.
Normally, A driving B out of the market is what competition is all about.
That's the goal of competition, and should be protected. So how can you tell when
this ceases to be honest competition, and becomes predation? "Predatory
Systems Rivalry" provides a good summary:
...the plaintiff must bear the burden of proof on this issue. To establish the
illegitimacy of R&D expenses by a preponderance of the evidence, the plaintiff
would most likely need a "smoking gun"--a document or oral admission that
clearly reveals the innovator's culpable state of mind at the time of the R&D
decision. Alternatively, the plaintiff could prevail if the innovation
involves such trivial design changes that no reasonable man could believe that
it had anything but an anticompetitive purpose.
 Figure 1: Typical message generated by the AARD code, produced, in this case,
from SETUP.EXE.
--A.S.
Figure 2: The AARD code attempts to disable a debugger by pointing INT 1
(single step) at invalid code (the two bytes FFh FFh). The same operation is
performed with INT 2 (nonmaskable interrupt) and INT 3 (breakpoint). This
disassembly is from the Windows 3.1 retail version of WIN.COM.
C:\DDJ\AARD>debug \win31\win.com
-u 3d0a

;;; Note that setting DS to 0; going to fiddle with intr vect table
7055:3D0A 33C0 XOR AX,AX
7055:3D0C 8ED8 MOV DS,AX
;;; ...
7055:3D12 A10400 MOV AX,[0004] ; get INT 1 offset
7055:3D15 2EA3D034 MOV CS:[34D0],AX ; save away
7055:3D19 A10600 MOV AX,[0006] ; get INT 1 segment
7055:3D1C 2EA3D234 MOV CS:[34D2],AX ; save away
7055:3D20 BBAC3F MOV BX,3FAC ; set new intr handler offset
7055:3D23 891E0400 MOV [0004],BX
7055:3D27 8C0E0600 MOV [0006],CS ; set new intr handler segment
-u 3fac
6B30:3FAC FFFF ??? DI ; the new intr handler
6B30:3FAE CF IRET ; is invalid code!
Figure 3: Pseudocode of AARD code, as found in WIN.COM
move (and fixup) code from 2D19h to 4E0h
call code at 4E0h
 call AARD code at 39B2h:
 -- see below
 IF (AX doesn't match 2000h)
 AND IF (control_byte is non-zero) ;; added in retail
 THEN overwrite BYTE at 4E0h to a RET instruction
; ...
IF (byte at 4E0h is a RET instruction)
 THEN issue non-fatal error message
 call AARD code at 39B2h:
 point INT 1, 2, and 3 at invalid code to confuse debuggers
 call undocumented INT 21h AH=52h to get SysVars ("List of Lists")
 copy 30h bytes of SysVars to stack
 copy first 4 bytes (DPB ptr) of copy of SysVars to stack
 IF DOS version >= 10.0 (i.e., OS/2)
 THEN don't set [bp+196h], so eventually OR AX, 2000h fails
 ELSE
 check fields in SysVars to ensure non-zero:
 SysVars[0] -- Disk Parameter Block (DPB)
 SysVars[4] -- System File Table (SFT)
 SysVars[8] -- Clock device
 SysVars[12h] -- Buffers header
 SysVars[16h] -- Current Directory Structure (CDS)
 SysVars[0Ch] -- CON device
 SysVars[22h] -- Device driver chain (NUL device next ptr)
 IF no SysVars fields are zero (MS-DOS, or WIN.COM in DR DOS)
 THEN set [bp+196h] so that eventually OR AX, 2000h succeeds
 ELSE some are zero (e.g., HIMEM.SYS in DR DOS)
 THEN don't set [bp+196h], so eventually OR AX, 2000h fails
 copy code
 jump to copied code
 copy and XOR code
 jump to copied and XORed code
 ;; the following crucial part was figured out by Geoff Chappell:
 IF a redirector is running (INT 2Fh AX=1100h)
 AND IF default upper-case map (INT 21h AH=38h) in DOS
 data segment (undocumented INT 2Fh AX=1203h)
 OR IF no redirector
 AND IF FCB-SFT header (SysVars[1Ah]) offset == 0
 THEN DOS is considered okay
 ELSE (e.g., WIN.COM, SMARTDRV.EXE, etc. in DR DOS)
 THEN clear part of [bp+196h] so eventually OR AX, 2000h fails
 restore previous INT 1, 2, 3

 jump back to saved return address
Figure 4: The crucial AARD test for DOS legitimacy.
IF redirector running (INT 2Fh AX=1100h)
 AND IF default upper-case map (INT 21h AH=38h) in DOS
 data segment (INT 2Fh AX=1203h)
OR IF no redirector
 AND IF FCB-SFT header at paragraph boundary (offset == 0)
THEN DOS is considered okay
Figure 5: The AARD test can be made to fail simply by changing the
outward form of the pointers it examines.
C:\UNDOC2\CHAP1>symdeb
Microsoft Symbolic Debug Utility
Windows Version 3.00
(C) Copyright Microsoft Corp 1984-1990
Processor is [80386]
;;; The first FCB-SFT is stored in this configuration at 0116:0040,
;;; so "denormalize" the pointer at that location, changing it from
;;; 05E4:0000 to 05E0:0040. This points to the same exact location,
;;; but since the offset isn't zero the AARD test fails.
-dd 0116:0040 0040
0116:0040 05E4:0000
-ed 0116:0040 05E0:0040
;;; Now normalize the pointer for the default case map. I had to
;;; disassemble the code for INT 21h AH=38h to find where this is
;;; located. The pointer is stored here at 0116:12A8. Below, the
;;; pointer is changed from 0116:0CF5 to 01E5:0005. This points
;;; to the same exact location, but the segment isn't 0116 (DOS data
;;; segment) anymore, so the AARD test fails.
-dd 0116:12a8 12a8
0116:12A8 0116:0CF5
-ed 0116:12A8 01E5:0005
-q
C:\WINB61>win
Non-Fatal error detected: error #2726
Please contact Windows 3.1 beta support
Press ENTER to exit or C to continue
C:\UNDOC2\CHAP1>msdetect
Default case map isn't in DOS data segment!
Figure 6: Enabling a single byte in the Windows 3.1 retail version
of WIN.COM resurrects the AARD code's non-fatal error message under
DR DOS.
C:\DRDOS6>debug win.com
DEBUG v1.40 Program Debugger.
Copyright (c) 1985,1992 Digital Research Inc. All rights reserved
CPU type is [i486 in virtual 8086 mode]
-d 16d4 16d4
2271:16D0 00
-e 16d4 1
-g
Non-Fatal error detected: error #2726
Please contact Windows 3.1 beta support
Press ENTER to exit or C to continue
Program terminated.
-q
_EXAMINING THE WINDOWS AARD DETECTION CODE_
by Andrew Schulman


[LISTING ONE]


/* MSDETECT.C -- Build program with Microsoft C: cl msdetect.c. A replication
of Microsoft's MS-DOS detection code from Windows 3.1 WIN.COM, SMARTDRV.EXE,
HIMEM.SYS, SETUP.EXE. The original Microsoft code (with the initials "AARD")
is heavily XOR encrypted and obfuscated. Here the encryptions and obfuscations
have been removed. Andrew Schulman, May 1993, 617-868-9699,
76320.302@compuserve.com. Geoff Chappell (geoffc@cix.compulink.co.uk)
deciphered the original code's tests (upper-case map segment, FCB-SFT) in the
case where the preliminary SysVars tests fail. Some of this material is
discussed in Geoff's forthcoming book, "DOS Internals" (Addison-Wesley, 1993).
The page numbers below are for the first edition of "Undocumented DOS"
(Addison-Wesley, 1990). The second edition will be out in September 1993. */

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <dos.h>

typedef int BOOL;
typedef unsigned char BYTE;
typedef unsigned short WORD;
typedef unsigned long DWORD;
typedef void far *FP;

BYTE far *_dos_getsysvars(void);
FP _dos_getcasemap(void);
WORD _dos_getdataseg(void);
BOOL _dos_isredirector(void);

void fail(const char *s) { puts(s); exit(1); }
main()
{
 BYTE far *sysvars;
 if ((sysvars = _dos_getsysvars()) == 0)
 fail("INT 21h AX=5200h returns 0!");
 if (_osmajor >= 0x0a)
 fail("DOS version >= 10; this is OS/2 (or early NT beta!)");
 #define SYSVARS(ofs) (*((FP far *) &sysvars[ofs]))
 #define SYSVARS_TEST(ofs, msg) if (! SYSVARS(ofs)) fail(msg)

 // these tests will pass under almost any DOS clone
 SYSVARS_TEST(0, "Disk Parameter Block (DPB) pointer in SysVars is 0!");
 SYSVARS_TEST(4, "System File Table (SFT) pointer in SysVars is 0!");
 SYSVARS_TEST(8, "CLOCK$ device pointer in SysVars is 0!");
 SYSVARS_TEST(0x12, "buffers header pointer in SysVars is 0!");
 SYSVARS_TEST(0x16, "Curr Directory Struct (CDS) ptr in SysVars is 0!");
 SYSVARS_TEST(0x0C, "CON device pointer in SysVars is 0!");
 SYSVARS_TEST(0x22, "Device chain pointer (from NUL) in SysVars is 0!");

 // the following tests fail under DR DOS 5 and 6 (and beta of Novell DOS 7)
 if (_dos_isredirector())
 {
 FP casemap = _dos_getcasemap();
 if (FP_SEG(casemap) != _dos_getdataseg())
 fail("Default case map isn't in DOS data segment!");
 printf("case map @ %Fp\n", casemap);
 }
 else
 {

 if (FP_OFF(SYSVARS(0x1A)) != 0) // see Undocumented DOS, p. 519
 fail("First FCB-SFT not located on paragraph boundary!");
 printf("FCB-SFT ptr @ %Fp -> %Fp\n", sysvars+0x1a, SYSVARS(0x1A));
 }
 // if get here, everything checks out
 puts("All tests check out: must be MS-DOS");
 return 0;
}
// undocumented function: see "Undocumented DOS", pp. 518-541
BYTE far *_dos_getsysvars(void)
{
 // could initialize ES:BX to 0:0 but the MS code doesn't do this
 _asm mov ax, 5200h
 _asm int 21h
 _asm mov dx, es
 _asm mov ax, bx
 // ES:BX retval moved into DX:AX
}
// see "Microsoft MS-DOS Programmer's Reference", p. 143
// formerly undocumented: see "Undocumented DOS", p. 599
BOOL _dos_isredirector(void)
{
 BYTE retval;
 _asm mov ax, 1100h
 _asm int 2fh
 _asm mov retval, al
 return (retval == 0xFF);
}
// undocumented function: see "Undocumented DOS", p. 627
WORD _dos_getdataseg(void)
{
 _asm push ds
 _asm mov ax, 1203h
 _asm int 2fh
 _asm mov ax, ds
 _asm pop ds
 // retval in AX
}
// get a far pointer to the default case map
// see "Microsoft MS-DOS Programmer's Reference", pp. 272-3
FP _dos_getcasemap(void)
{
 BYTE country_info[34];
 FP fp = (FP) country_info;
 _asm push ds
 _asm mov ax, 3800h
 _asm lds dx, dword ptr fp
 _asm int 21h
 _asm pop ds
 return *((FP far *) &country_info[18]);
}


September, 1993
The Z80180 and Big-number Arithmetic


Squeezing 512-bit operations out of 8-bit microcontrollers




Burton S. Kaliski, Jr.


Burt is chief scientist of RSA Laboratories, a division of RSA Data Security,
Inc. He received a PhD in computer science from MIT in 1988 and is interested
in cryptography and fast-arithmetic techniques. You can contact him at
burt@rsa.com.


The growing importance of data security, combined with the increased power of
PCs, makes cryptography both a practical reality and a critical necessity in
today's computer systems. But as cryptography becomes more widespread, it
touches tools that aren't as powerful as, say, processors like the 80486 or
Pentium. For instance, in one recent project, our challenge was to implement
512-bit RSA private-key operations in less than 10 seconds on Zilog's 8-bit
Z80180 microcontroller running at 10 million cycles/second.
In this article, I'll share what we've learned about implementing
mathematically-intensive systems like RSA on the Z80180.


Z80180 Overview


Zilog's Z80180 (and its counterpart, Hitachi's 64180) is an 8-bit embedded
controller that's commonly used in fax machines, modems, printers, and similar
applications. It extends the old Z80 processor with a multiply instruction
(critical to RSA performance), extra test and I/O instructions, and a "sleep"
mode. An internal memory management unit (MMU) maps 16-bit logical addresses
to 20-bit physical addresses.
The Z80180, which has a 1-Mbyte address space, has seven 8-bit registers A-E,
H, and L, and two index registers, IX and IY. Most arithmetic and logical
operations are performed with register A as an accumulator. The 16-bit
register pair HL is an accumulator for several 16-bit arithmetic operations,
with register pairs BC and DE as operands. The index registers also can
accumulate a 16-bit addition.
A multiply instruction multiplies a register pair, overwriting the pair with
the 16-bit product. Addressing modes include immediate, direct, register
indirect, and indexed based on IX and IY, among others. The usual branch
instructions are supported. A special instruction decrements register B,
branching if, and only if, the result is nonzero.
Z80180 parts are available that run at 10 million cycles/second; a static
version runs at up to 18 million cycles/second. In short, the Z80180 balances
performance, cost, and power.


RSA Cryptography Backgrounder


RSA cryptography is based on repeated multiplication, or exponentiation,
modulo the product n of two large prime numbers p and q. Each user has two
keys, a public key and a private key, and each user's modulus n is different.
Users publish the public key, and keep the private key private.
The public key consists of the modulus and a public exponent e. The private
key consists of the modulus and a private exponent d. The exponents are
related by the equation ed ≡ 1 mod (p-1)(q-1).
Either key can encrypt or decrypt. For instance, Alice can encrypt a message m
for Bob with Bob's public key (nB,eB) by computing c = m^eB mod nB. Bob
recovers the message from the ciphertext c with his private key (nB,dB) by
computing m = c^dB mod nB. Alice can also encrypt a message with her private
key, so that
anyone can recover the message with Alice's public key. Only Alice knows the
private key, so if the recovered message makes any sense, it must be from
Alice; the encrypted message is effectively her "digital signature."
Without the private key, decrypting ciphertext or "signing" a message is
generally believed to be as hard as factoring the modulus n into p and q.
Factoring, even for modest-sized RSA integers n (say, 512 to 1024 bits), is
generally considered a difficult problem. For further information on
public-key cryptography and RSA, see "Untangling Public-key Cryptography" by
Bruce Schneier (DDJ, May 1992), and my article "Multiple-precision Arithmetic
in C" (DDJ, August 1992).


Modular Exponentiation


How much work is involved in computing a modular exponentiation of the form
a = b^c mod d?
The number of modular multiplications to compute b^c mod d is at most c-1, and
it can be much less. Although no method is known for computing the optimal
sequence of multiplications for an arbitrary exponent c--that's an NP-complete
problem--heuristic methods get pretty good results.
The "binary" method takes l(c)+u(c) modular multiplications (see Knuth's The
Art of Computer Programming, Volume 2), where l(c) is one less than the length
of the binary representation of c, and u(c) is the number of 1-bits in the
binary representation. The method squares for each bit of the exponent except
the first, and multiplies for each 1-bit. Let c_l(c), ..., c_0 be the bits of
c from most to least significant. This algorithm is shown in Figure 1.
This is pretty good, since no method can do better than log2 c on average. The
"quaternary" method, which considers the exponent c as radix-4 digits,
averages about 1.375l(c) modular multiplications; radix-m for larger m may do
better. Exponent "recoding" may also help (refer to C.K. Koc's "High-radix and
Bit Recoding Techniques for Modular Exponentiation," International Journal of
Computer Mathematics, 1987).
The "Chinese Remainder Theorem" (CRT) approach speeds up RSA private-key
exponentiation by almost a factor of four by taking advantage of simpler
operations modulo the prime factors p and q. Instead of a 512-bit modular
exponentiation, we have two 256-bit modular exponentiations, each of which is
eight times faster with conventional methods than a 512-bit modular
exponentiation.
With CRT and quaternary, a 512-bit RSA private-key operation takes on average
about 2x1.375x256=704 256-bit modular multiplications, plus a small overhead
due to the CRT. Let's call this 750 256-bit modular multiplications, just to
be safe.


Representation


Since our operands are several hundred bits long, we'll represent them as
multiple-precision arrays. Let r be the digit radix of the machine, so r=256
for the 8-bit Z80180. (Our discussion carries over to any digit radix.) We
represent an integer x as an n-digit array x[0], ..., x[n-1], according to the
sum in Figure 2. Thus x[0] is the least significant digit and x[n-1] is the
most significant.


Multiplication on the Z80180


We wish to compute a=bc, where a has 2n digits, and b and c have n digits. An
easy algebraic exercise gives the expression in Figure 3(a). Such an
expression leads to multiplication by "operand scanning," along the lines of
the grade-school approach. We multiply by digits of the operand b from least
to most significant, following the weights r^i.

A somewhat more difficult exercise gives a different expression as shown in
Figure 3(b) which leads to multiplication by "product scanning," which is like
convolution in signal processing. We compute digits of the product a from
least to most significant, following the weights r^k.
If you're unfamiliar with product scanning, see Figure 4 in which the
multiplication tableau consists of cross products between pairs of operand
digits. Their columnwise sum, with carry propagation right to left, gives the
product.
The operand-scanning method computes the product by accumulating partial
products b[i]·c·r^i for each i. As Figure 5 illustrates, there are n
iterations.
The variable x carries between iterations of the j loop. There is no carry
between iterations of the i loop.
Most of the work in this method is in the j loop, so we can estimate the
performance by implementing that loop; see Figure 6. Each iteration computes
x ← x + a[i+j] + b[i]·c[j], then a[i+j] ← x mod r and x ← ⌊x/r⌋, where j
varies. Register C
contains the digit b[i], index register IX points to c[j], and index register
IY points to a[i+j]. Register B counts down the number of iterations. Register
E carries the low half of x from one iteration to the next, and register D
contains the value 0. An iteration takes 105 cycles.
By hardcoding j as an index to the LD instructions, we can unroll the loop and
avoid the last three instructions. The result takes 82 cycles.
It's easy to see that there are n^2 iterations of the j loop, so
multiplication with this method takes about 82n^2 cycles, plus whatever
overhead there is in the n iterations of the i loop.
Product-scanning, on the other hand, computes the product by accumulating
partial products for each k. (See Figure 7.) As Figure 8 shows, there are 2n
iterations. The variable x accumulates within an iteration of the i loop, and
between iterations of the k loop. Its value is always less than nr^2. If
n ≤ r, then x needs at most three digits.
The index k-i, like i, is always between 0 and n-1. On the last iteration of
the k loop, i ranges from n to n-1, so the i loop has no iterations.
The implementation of this method's i loop (Figure 9) takes 79 cycles per
iteration. Each iteration computes x ← x + b[i]·c[k-i], where i varies.
Register C contains the value 0, index register IX points to b[i], and index
register IY points to c[k-i]. Register B counts down the number of
iterations. Registers
A, H, and L carry the value of x from one iteration to the next. (You can save
a few more cycles by exchanging the roles of HL and IY. LD E,(HL) costs only
six cycles and INC HL only four, while ADD IY,DE costs ten, for a net savings
of eight cycles. But "shifting" IY after the i loop is a little harder. Also,
the Z80180 doesn't have an indexed addressing mode for HL. If we're going to
unroll the loop, we're best staying with IY as an index register.)
This method is simpler than the operand-scanning method on the Z80180, and
it's also 25 percent faster per iteration!
Unrolling, as expected, saves 23 cycles; we hardcode i as an index to the LD
instructions. The result takes just 56 cycles per iteration.
Multiplication with this method takes about 56n^2 cycles, plus whatever
overhead there is in the 2n iterations of the k loop. Although there are more
iterations of the k loop here than iterations of the i loop in the
operand-scanning method, the additional overhead is not significant compared
to the 26n^2-cycle savings.
We have our first estimate of RSA performance: 256-bit multiplication, with
n=32 and 10 percent overhead, takes 56 x 32^2 x 1.1 ≈ 63,000 cycles, or 6.3
milliseconds at 10 million cycles/second.


Comparison


Why (and when) is product scanning faster than operand scanning? Product
scanning stores each product digit once, not after every multiply. Of course,
it fetches operand digits before every multiply. But it doesn't fetch product
digits at all! In all, product scanning has about one-third fewer memory
references than operand scanning. It also has many fewer "shifts."
Product scanning needs more register storage than operand scanning. The
variable x in product scanning needs at least three digits, whereas in the
operand-scanning method it needs only one digit. With two pointers, two
multiplier inputs, and perhaps a counter already in registers, some processors
may not have enough register storage left. The Z80180 does have enough left, a
benefit of its 16-bit register pairs. (Other processors may have enough
registers for an r=256 implementation, but not an r=2^16 implementation;
operand scanning with r=2^16, if possible, would be preferable to product
scanning with r=256.)


BigMult Subroutine


BigMult is a subroutine that computes the product of two multiple-precision
integers. Listing One (page 90) gives a non-optimized C version, Listing Two
(page 90) is its header file, and Listing Three (page 90) is an optimized
assembly-language version in Z80180 assembler (written with Microtec
Research's Z80180 Assembler 6.0; it interfaces to Microtec's C 6.0). In each
version, global variables BIG_MULT_A, BIG_MULT_B, BIG_MULT_C, and BIG_MULT_N
correspond to variables a, b, c, and n discussed earlier. BigMult implements
product-scanning, and consists of two parts--one for k from 0 to n-1, the
other for k from n to 2n-1. Each part calls an "inner loop" that computes the
sum in Figure 7. (This is the same as the i loop in Figure 9.) For
performance, the inner loop is unrolled, and that's where efficient RSA
methods give way to Z80180 programming techniques.
The unrolled inner loop presents a challenge because we have to jump to a
different point in the loop each time--the number of iterations varies from 0
to n-1. We must therefore jump to an indirect address. The only way to do
this seems to be to jump indirectly through one of the registers HL, IX, or
IY. But each of them already has a role in the unrolled loop.
The solution is a familiar one in Z80180 programming: Don't jump to the loop,
return to it. The sequence:
LD DE,(address)
PUSH DE
RET
loads an address into register pair DE, pushes it onto the stack, and returns,
thereby jumping to the address.
Since the inner loop is called from more than one place, the caller pushes the
return address on the stack before pushing the inner loop address. The inner
loop then returns to the right caller.
Index registers IX and IY point to operands b and c, respectively. During the
first half, IX moves ahead while IY stays fixed; during the second half, IY
moves ahead while IX stays fixed. The jump address changes each time by the
length of the inner iteration, so we just add or subtract that length before
jumping. The pointer to the product a is on the stack.
The REPT directive unrolls the inner loop, with the variable iteration ranging
from the number of iterations down to 1. The inner iteration hardcodes the
counter as an index to the load instructions that fetch b[i] and c[j].
BigMult can handle up to 128-byte numbers, being limited by the range of
offsets in the indexed load instruction. With a slight variation to the outer
loop, BigMult could handle larger numbers.


Modular Multiplication


Multiplication is half of modular multiplication; the other half is division.
Division takes only a little longer than multiplication, though implementing
it is generally more difficult.
An alternative is Montgomery multiplication, which has the advantage that it
can be implemented by either product or operand scanning--and its
implementation is almost identical to multiplication. Division seems
inherently to involve operand scanning, and we prefer product scanning for the
Z80180. (See the accompanying textbox "Montgomery Multiplication.")
A disadvantage of Montgomery multiplication is that you need to convert in and
out of the Montgomery "representation." But once you're in that
representation, Montgomery multiplication keeps you there. You need only to
convert at the start of modular exponentiation, and convert out at the end.
Montgomery multiplication code takes just about twice as many cycles as
ordinary multiplication, which brings us to around 12.6 milliseconds for
256-bit Montgomery multiplication.
It takes 750 times as long, or approximately 9.5 seconds, for a 512-bit RSA
private-key operation--comfortably within our 10-second goal.


Conclusions


Architectural features such as the Z80180's 16-bit HL register-pair open the
door to efficient methods such as multiplication by product scanning.
Programming techniques such as jumping with a return instruction also play an
important role.
Further work we're considering includes special code for modular squaring
which is 25 percent faster than modular multiplication. Since most of the
estimated 750 modular multiplications are squarings, we can expect to save
perhaps another two seconds.
Many of the methods and techniques can be extended to other processors, and
the area of cryptography implementation remains a critical and practical
challenge.


Acknowledgments



Zilog's Mark van Zanten, Alan Chan, and Adam Tucholski provided computing
support for the code described in this article. RSA's Jim Bidzos encouraged me
to publish the article, and RSA's Matt Robshaw offered comments and
suggestions.


Montgomery Multiplication


Most methods for modular multiplication involved division or an approximation
to it until Montgomery published a brief note in 1985 renewing earlier ideas
of Hensel (see "Modular Multiplication without Trial Division" by P.L.
Montgomery, Mathematics of Computation, 1985).
Let d be an odd integer, n be its length in digits, and r be the digit radix.
Let v be the least positive integer such that vd+1 is divisible by r^n. This
value depends only on d, r, and n, and can be computed by the extended
Euclidean algorithm (see Knuth).
Montgomery multiplication, which we write Md, is defined as
Md(b,c) = (bc + (vbc mod r^n)d)/r^n, and it has the following properties:
It involves only multiplications. The operation mod r^n is just truncation,
since r is the digit radix. The operation /r^n is just copying, since
bc + (vbc mod r^n)d is divisible by r^n.
If b and c are between 0 and d-1, then Md(b,c) is between 0 and 2d-1; it can
be reduced to between 0 and d-1 modulo d with, at most, one subtraction.
It obeys the usual multiplicative laws: Md(b,c) = Md(c,b); and
Md(Md(b,c),e) = Md(b,Md(c,e)). The identity is r^n mod d.
It relates to ordinary modular multiplication through the ratio r^n mod d:
Md((br^n mod d),(cr^n mod d)) = bcr^n mod d. For this reason, br^n mod d is
termed the "Montgomery representation" of b. The ordinary product of Montgomery
representations is the Montgomery representation of the ordinary product.
Montgomery multiplication can be implemented either with product scanning or
with operand scanning. With product scanning, the computation of terms bc and
vbc mod rn can be interleaved, as in Figure 10.
The first k loop computes in a the equivalent of vbc mod r^n, and adds the
(vbc mod r^n)d term and the bc term. Since the least significant digits of the
sum are known to be 0, the loop does not store them, although it does carry
between iterations in the variable x.
The second k loop continues to add the (vbc mod r^n)d and bc terms, storing
the sum in a. By starting at a[0] rather than a[n], the loop divides by r^n in
the process.
The interleaved product-scanning approach saves loop overhead, and it needs
only n digits for a, not 2n digits as in multiplication followed by division,
or in a noninterleaved approach.
--B.S.K.
 Figure 1: Algorithm for modular exponentiation.
 Figure 2: Summation used to represent an integer x as an n-digit array.
 Figure 3: (a) Expression for operand scanning; (b) expression for product
scanning.
 Figure 4: Example of multiplication by product scanning.
 Figure 5: Algorithm for operand scanning.
Figure 6: Operand-scanning iteration in Z80180 assembler.
==============================================================================
 Cycles Instruction
==============================================================================
 LOOP:
 ; Fetch c[j] and multiply by b[i].
 ; HL contains the 16-bit product.
 14 LD H,(IX)
  4 LD L,C
 17 MLT HL
 ; Add carry to HL.
  7 ADD HL,DE
 ; Fetch a[i+j], and add to HL.
 14 LD E,(IY)
  7 ADD HL,DE
 ; Store new a[i+j] and copy new carry.
 15 LD (IY),L
  4 LD E,H
 ; Next iteration.
  7 INC IX
  7 INC IY
  9 DJNZ LOOP
 --
 105
==============================================================================
 Figure 7: Partial products in the product-scanning method.
 Figure 8: Algorithm for product scanning.
Figure 9: Product-scanning iteration in Z80180 assembler.
==============================================================================
 Cycles Instruction
==============================================================================
 LOOP:
 ; Fetch b[i] and c[k-i], and multiply.
 ; DE contains the 16-bit product.
 14 LD D,(IX)
 14 LD E,(IY)
 17 MLT DE
 ; Add product to AHL.
 7 ADD HL,DE

 4 ADC A,C
 ; Next iteration.
 7 INC IX
 7 DEC IY
 9 DJNZ LOOP
 --
 79
==============================================================================
 Figure 10: Algorithm for Montgomery multiplication by product scanning.

_THE Z80180 AND BIG-NUMBER ARITHMETIC_
by Burton Kaliski, Jr.


[LISTING ONE]

/* Copyright (C) RSA Data Security, Inc. created 1993. All rights reserved. */

#include "bigmult.h"

unsigned char BIG_MULT_A[256];
unsigned char BIG_MULT_B[128];
unsigned char BIG_MULT_C[128];
unsigned int BIG_MULT_N;

/* Computes a = b*c. Lengths: a[2*n], b[n], c[n]. */
void BigMult (void)
{
 unsigned long x;
 unsigned int i, k;
 x = 0;
 for (k = 0; k < BIG_MULT_N; k++) {
 for (i = 0; i <= k; i++)
 x += ((unsigned long)BIG_MULT_B[i])*BIG_MULT_C[k-i];
 BIG_MULT_A[k] = (unsigned char)x;
 x >>= 8;
 }
 for (; k < (unsigned int)2*BIG_MULT_N; k++) {
 for (i = k-BIG_MULT_N+1; i < BIG_MULT_N; i++)
 x += ((unsigned long)BIG_MULT_B[i])*BIG_MULT_C[k-i];
 BIG_MULT_A[k] = (unsigned char)x;
 x >>= 8;
 }
}



[LISTING TWO]

/* Copyright (C) RSA Data Security, Inc. created 1993. All rights reserved. */

extern unsigned char BIG_MULT_A[256];
extern unsigned char BIG_MULT_B[128];
extern unsigned char BIG_MULT_C[128];
extern unsigned int BIG_MULT_N;
void BigMult (void);


[LISTING THREE]


; Copyright (C) RSA Data Security, Inc. created 1993. All rights reserved.

 NAME BIGMULT

 SEGMENT DSEG,BYTE,DATA

 PUBLIC _BIG_MULT_A
_BIG_MULT_A DEFS 256
 PUBLIC _BIG_MULT_B
_BIG_MULT_B DEFS 128
 PUBLIC _BIG_MULT_C
_BIG_MULT_C DEFS 128
 PUBLIC _BIG_MULT_N
_BIG_MULT_N DEFS 2

address DEFS 2

 SEGMENT CSEG,BYTE,CODE
 ; BigMult computes a = bc.
 ; Assumes 1 <= n <= 128.
 ; Lengths: a[2*n], b[n], c[n].

 PUBLIC _BigMult
_BigMult:
 ; Save registers.
 PUSH IX
 ; Point address to end of inner loop.
 LD HL,endInnerLoop
 LD (address),HL
 ; Push pointer to a.
 LD IX,_BIG_MULT_A
 PUSH IX
 ; Set x <- 0, point IX to b[0], and point IY to c[0]. HL is x.
 ; C = 0 is a constant.
 LD HL,0
 LD IX,_BIG_MULT_B
 LD IY,_BIG_MULT_C
 LD C,0
 ; n iterations.
 LD A,(_BIG_MULT_N)
 LD B,A

firstLoop:
 ; Clear carry and call inner loop. The call is implemented with a "return"
 ; since there aren't enough registers to hold the address. The inner loop
 ; computes the inner iteration for (i,j) from (0,k) to (k-1,1).
 LD DE,firstLoopReturn
 PUSH DE
 LD DE,(address)
 PUSH DE
 XOR A
 RET

firstLoopReturn:
 ; Compute x <- x + b[k]*c[0].
 LD D,(IX)
 LD E,(IY)
 MLT DE

 ADD HL,DE
 ADC A,C
 ; Compute a[k] <- x mod r, and increment k.
 EX (SP),IX
 LD (IX),L
 INC IX
 EX (SP),IX
 ; Compute x <- x / r, where r is the digit radix.
 LD L,H
 LD H,A
 ; Move IX forward, move address back, decrement loop counter, and repeat.
 INC IX
 PUSH HL
 LD HL,(address)
 LD DE,-innerIterationLen
 ADD HL,DE
 LD (address),HL
 POP HL
 DJNZ firstLoop

 ; Move address forward. (Number of inner iterations should be the same the
 ; last time through the first loop as first time through second loop: n-1.)
 PUSH HL
 LD HL,(address)
 LD DE,innerIterationLen
 ADD HL,DE
 LD (address),HL
 POP HL
 ; n iterations.
 LD A,(_BIG_MULT_N)
 LD B,A

secondLoop:
 ; Clear carry and call inner loop. The call is implemented with a "return"
 ; since there aren't enough registers to hold the address. The inner loop
 ; computes the inner iteration for (i,j) from (k-n+1,n-1) to (n-1,k-n+1).
 LD DE,secondLoopReturn
 PUSH DE
 LD DE,(address)
 PUSH DE
 XOR A
 RET

secondLoopReturn:
 ; Compute a[k] <- x mod r, and increment k.
 EX (SP),IX
 LD (IX),L
 INC IX
 EX (SP),IX
 ; Compute x <- x / r.
 LD L,H
 LD H,A
 ; Move IY forward, move address forward, decrement loop counter, repeat.
 INC IY
 PUSH HL
 LD HL,(address)
 LD DE,innerIterationLen
 ADD HL,DE
 LD (address),HL

 POP HL
 DJNZ secondLoop

 ; Pop pointer to a.
 POP IX
 ; Restore registers and return.
 POP IX
 RET
 ; Unrolled inner loop. Each iteration computes x <- x + b[i]c[j] for some
 ; (i,j). HL is x, A is carry, IX points to b[last i], IY points to c[last j].
InnerLoop:
iteration DEFL 127
 REPT 127
 LD D,(IX-iteration)
 LD E,(IY+iteration)
 MLT DE
 ADD HL,DE
 ADC A,C
iteration DEFL iteration-1
 ENDR
endInnerLoop:
 RET
innerLoopLen EQU endInnerLoop - InnerLoop
innerIterationLen EQU innerLoopLen / 127
 END





































September, 1993
Accessing NetWare SQL Files Without NetWare SQL


Duplicating NetWare SQL functionality with less overhead




Douglas Reilly


Doug owns Access Microsystems, a software development house specializing in
C/C++ software development. He is also the author of BTFILER and BTVIEWER
Btrieve file utilities. Doug can be contacted at 404 Midstreams Road, Brick,
NJ 08724, or on CompuServe at 74040,607.


Multiple ways of storing information have led us to a virtual DataTower of
Babel. Even something as "simple" as a string, for example, can be represented
by familiar programming languages in at least three ways: "normal" strings,
where a 20-character string is made up of just 20 characters (what you see is
what you get); Pascal-style Lstrings, where a 20-character string takes 21
characters, with the first byte being a length byte (thus limiting a normal
Pascal string to 255 bytes); and C-style Zstrings, where a 20-character string
also takes up 21 characters, with the extra character used for a Null
terminator signaling the end of the string.
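The three layouts can be sketched in C for a hypothetical 5-character field
(shortened from the 20-character example for readability):

```c
#include <string.h>

/* Illustrative layouts of the string "BRICK" in a 5-character field. */

/* "Normal" string: exactly the field width, no length byte, no null. */
char normal[5]  = {'B','R','I','C','K'};

/* Pascal-style Lstring: one length byte first, so the field takes n+1
   bytes; the length byte limits a normal Pascal string to 255 bytes. */
char lstring[6] = {5,'B','R','I','C','K'};

/* C-style Zstring: also n+1 bytes, with a trailing null terminator. */
char zstring[6] = "BRICK";
```

The same five characters occupy five, six, and six bytes, respectively, which
is exactly why a converter has to know which convention a field uses.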
One solution to the problem of getting to your data is the use of a
higher-level interface that understands a variety of formats. Novell's NetWare
SQL (and XQL, its DOS cousin) provides just such a solution.
NetWare SQL provides a convenient way to access information within a database
without regard to the physical characteristics of the data files. NetWare SQL
data files can contain data with field types ranging from Intel-format
integers to Decimal fields (also known as BCD, binary coded decimal). Table 1
lists the NetWare SQL-supported data types. However, while NetWare SQL
provides this ability, the memory required to use it (possibly more than 200K)
can prohibit its use.
Thankfully, there's another way. Since Btrieve is the file manager engine
NetWare SQL uses, and since NetWare SQL files are simply Btrieve files with
additional descriptions of the fields available, you can duplicate the
NetWare SQL functionality you want (that is, access to the data without
regard to the physical layout of the file) with considerably less
overhead. In effect, this gives you access to NetWare SQL files without
requiring your users to have NetWare SQL.


What is a NetWare SQL Database?


Btrieve files, unlike many other types of database files, don't contain field
information within the files. Btrieve files do contain information on record
length, key types and positions, and details about keys, such as whether the
key is modifiable and whether duplicate values are allowed. Keys can be any of
the types listed in Table 1.
To provide access to the Btrieve files that make up its database, NetWare SQL
uses external files to describe how data is stored within the files.
Unsurprisingly, these data dictionary files are Btrieve files and have .DDF
extensions; see Table 2.
FILE.DDF, Table 2(a), contains information on each file in the database, as
well as each of the .DDF files. The File ID is an integer assigned by SQL to
uniquely identify the file, and link it to fields and indices stored in the
other data dictionary files. The File Name is used within SQL queries to
access the file. It's the "human readable" name of the file, like "PO Header"
or "Item Master." The File Location is used to store the file name the
operating system should use. Often, this field contains only the file name and
extension. The final file name is then the result of the search of a path-like
environment variable. The only currently documented use of the File Flags
field is bit 4, which signals whether a file is a dictionary file (one of the
.DDF files) or a user-defined file.
FIELD.DDF, Table 2(b), contains information on each field in each file in the
database, as well as each field in the .DDF files. The Field ID, assigned by
SQL, identifies the field and links this file back to the FILE.DDF file. The
Field Name is used to identify the field within queries. The Data Type
describes the format of the data in this field. The Offset, Field Size, and
Decimals fields locate and help format the data in the field. The Flags field
uses bit 0 as a case flag for strings.
INDEX.DDF, Table 2(c), contains information on each index of each file in the
database. The File ID and Field ID fields correspond to the fields in FILE.DDF
and FIELD.DDF. The Index Number field identifies the index number. The Key
Segment field is a further qualifier. NetWare SQL supports multisegment
indices and this segment number is used to give the order of segments in a
multisegment key. Prior to NetWare SQL 3.0, the limit on index segments was
24. NetWare SQL 3.0 allows up to 119 key segments. The Key Flags field is a
bit-mapped integer used to describe the index more fully.
While there are other DDF files (VIEW.DDF, ATTRIB.DDF, and so on), I haven't
covered them here since most applications expect these three.


Accessing NetWare SQL Files with Class


A convenient way to access NetWare SQL files is with a C++ class. With some
minor changes (a different data dictionary reading routine, a different raw
file handling class, and the like) the techniques presented here are
applicable to any file format that handles a variety of data types.
Listings One and Two show the class definition for the ddfFile class and
supporting classes. Listing One (page 92) is the .H file, while Listing Two
(page 92) is the .CPP source. Class Value is used to contain the values of
each field. Value has private data members strValue (a character pointer),
and dValue and iValue (double and long types, respectively). Note that even
if a value
can be represented by a C/C++ data type of smaller size (a float or an int),
Value always stores values using the largest comparable C/C++ type. The len
and dec private data elements keep track of length and number of decimals used
for string formatting.
Constructors initialize the value from a character pointer, a double, or a
long; there is also a constructor that requires no initial value. The
operator= and casting operators are overloaded to provide a convenient syntax
for setting and getting values.
Next some structures that generally mirror the layout of the DDF files are
described. One difference is the inclusion of a Value member in the FIELDDDF
structure. An additional structure (BOOKMARK) allows you to set and return to
a "bookmark" in the file. Btrieve allows you to record the physical position
in a file (using a four-byte value), as well as keep track of the key number.
With this, you can look at a record, set a "bookmark," look around at other
records in the file, and then return to the records you first examined.
ddfFile has private data elements to keep track of the physical file name and
the file name used by NetWare SQL to refer to the file. A pointer to a bfile
object is used to do the actual manipulation of the raw Btrieve file (also
see my article "Wrap It Up" in Windows Tech Journal, May 1992). An array
of 256 pointers to the FIELDDDF structure allows you to get at all fields in a
NetWare SQL file (field limit is 255) as well as an additional field used to
point to somewhere harmless if an invalid field name is passed.
The constructor takes a single mandatory argument, the logical file name, and
a single optional argument, an owner name (used if security is enabled on the
underlying Btrieve file). The constructor gets the actual DOS file name from
the passed logical file name by reading through FILE.DDF, creates the bfile
object (which handles opening the file and allocating buffers, and so on), and
loads the field names and descriptions into the fields[] array of FIELDDDF
structures.
The destructor performs memory cleanup; as the bfile object goes out of
scope, its destructor is called, which frees the memory used for buffers and
closes the actual Btrieve file.
Several member functions (set_key_num(), get_status(), and to a lesser extent,
get_rec(), put_rec(), insert() and del_rec()) act as simple wrappers around
the bfile class. For getting and saving records, one additional step must be
taken. ddfFile supplies two member functions to convert from the raw data into
fields and back: dataToFields() and fieldsToData(), respectively.
To provide access to fields in the file, two getValue() member functions are
provided. One accepts as an argument a field number (to be used if you know
the relative field number in the file) and the other accepts the field name as
an argument. Using the field name to access the value removes dependence on
the physical details of the file's structure. Each function within ddfFile
that gets information from the file uses the dataToFields() member function
and each function that saves information to the file uses fieldsToData().
These functions use information from the FIELD.DDF file and pass that
information along to a btrvConvert() function to transform the raw data to
displayable values and back again.
ddfFile uses several conversion routines, only one of which (NUMERIC.CPP) is
presented here (due to space constraints). Other conversion routines
(BTRVCNVT.CPP, BCD.CPP, DBLE_DB.C, and so on) are provided electronically; see
"Availability," page 3. Note that each field type allowed by NetWare SQL has a
#defined name. Within btrvConvert(), this field type is used to signal the
type of conversion required. The field type plus 100 (decimal) signals a
conversion back from a displayable value to the format that exists in the
file.


String Conversions: The Easy Stuff First


When it comes to data conversions, strings are relatively easy. The only
tricky part is ensuring that your C/C++-centric mindset doesn't get you into
trouble. If a straight String type is defined, be sure that when you do
conversions to that type you don't use a null terminator. When the destination
of the string is a field in a data file, this null will become the first
character in the next field. For this reason, the string is first formatted
into a C-style string in a temporary buffer that actually takes up dlen bytes
plus 1 for the null terminator, then memcpy() is used rather than strcpy() to
copy only the bytes before the null terminator. For Pascal-style LSTRINGS,
read the length byte at the head of the string, rather than what NetWare SQL
tells you the destination length is--unless the length byte tells you that the
string is longer than the NetWare SQL field length. C-style ZSTRINGS are, of
course, not a problem for a C++ class.
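That copy might be sketched as follows (the function and its arguments are
invented for illustration; dlen is the destination field length):

```c
#include <stdio.h>
#include <string.h>

/* Sketch: format a string value into a fixed-width field of a record
   buffer without writing the null terminator into the next field. */
static void put_string_field(char *record, int offset, int dlen,
                             const char *value)
{
    char temp[256];                              /* dlen + 1 bytes used */
    sprintf(temp, "%-*.*s", dlen, dlen, value);  /* blank-padded C string */
    memcpy(record + offset, temp, dlen);         /* dlen bytes, no null */
}
```

The sprintf() produces a proper C string in the temporary buffer; the
memcpy() then moves only the field's dlen bytes, so the terminator never
lands in the record.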


ASCII Representations of Numbers


Two of the number types allowed by NetWare SQL use standard ASCII to represent
numbers. The most common of the two types is the Numeric type in which numbers
are represented by their ASCII code, and the decimal is implied. Thus, a
seven-character numeric field containing the value 123.45 would appear in the
actual data as 0012345. One twist to this encoding is handling negative
values. The solution is to embed the sign into the last digit. Table 3 lists
the digits and the associated character to use in the last digit if the value
is negative. The characters defined for positive values are hardly ever used,
but the conversion routine presented here does support them. See NUMERIC.CPP,
Listing Three, page 94.

Another approach to handling negative values is the NUMERICST type, added with
NetWare SQL 3.0. This field is made up of the absolute value in the first n-1
characters and the sign (+ or -) as the last character. Clarity for browsing
through the raw data is gained, at the loss of a single character per number
represented. 123.45 would be represented in a seven-character field as
012345+.


Binary Representations of Numbers


NetWare SQL supports seven binary representations of numbers. Combining
essentially identical types, you're still left with four distinct types of
representations. Two of the types are various sizes of native C/C++ types. INT
values can be either char (one byte wide), int (two bytes wide), or long (four
bytes wide). Add the unsigned modifier to the types discussed and the UNSIGNED
BINARY type is handled. An additional binary integer type is AUTOINC, which
allows you to insert a record with the AUTOINC field set to binary 0. After
the insert the field will contain a value one higher than the previous highest
value for the field.
The NetWare DECIMAL and MONEY types are very similar. MONEY fields are DECIMAL
fields with the number of decimals fixed at two. In addition, MONEY-type
fields traditionally are displayed beginning with a dollar sign. DECIMAL
numbers are equivalent to standard ANSI-74 Cobol's COMP-3 type. Each digit is
represented by a nibble (half a byte, or four bits). The trailing nibble
contains the sign: either an F or a C for positive numbers, or a D for
negative numbers. For instance, the number 123.45 set up to display seven
digits would be encoded as the hex digits 0012345F, taking up four bytes.
bcdtof() (available electronically) provides the details of the conversion.
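A decoding sketch for such packed-decimal fields (an illustrative stand-in
for bcdtof(), not the actual routine):

```c
/* Sketch: decode a packed-decimal (COMP-3-style) field. The trailing
   nibble holds the sign (D = negative); all earlier nibbles are digits. */
static double bcd_to_double(const unsigned char *buf, int nbytes,
                            int decimals)
{
    double v = 0.0;
    int i, nibbles = nbytes * 2;
    for (i = 0; i < nibbles - 1; i++) {      /* all but the sign nibble */
        int digit = (i % 2 == 0) ? (buf[i / 2] >> 4) : (buf[i / 2] & 0x0F);
        v = v * 10.0 + digit;
    }
    if ((buf[nbytes - 1] & 0x0F) == 0x0D)    /* D marks negative */
        v = -v;
    while (decimals-- > 0)                   /* apply the implied decimal */
        v /= 10.0;
    return v;
}
```

Decoding the bytes 00 12 34 5F with two decimals recovers 123.45, matching
the example above.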
Both types of floating-point data contain a sign bit and some number of bits
for the exponent and the mantissa. The secret, of course, is deciding where
each type of information is stored. The FLOAT type can be a C/C++ float type
(four bytes wide) or double (eight bytes). These values can simply be
reformatted using standard C conversion routines (atof(), sprintf(), and the
like).
BFLOAT values are Basic floating-point values. Before IEEE floating point took
over the world, Microsoft came up with its own way of storing floating-point
numbers. BFLOAT fields can be either four or eight bytes wide. REALCNVT.H,
DBLE_BD.C, and DBLE_BS.C (available electronically) provide details on the
conversion. Keep in mind, however, that some precision could be lost in the
conversion between eight-byte wide FLOAT and BFLOATs, since the number of
digits used for the mantissa and exponent vary. Another complication for the
conversion of any type of floating-point value is that the number of decimals
to display is not stored in the decimals field of the FIELD.DDF file.
Formatting information on floating-point values traditionally is derived from
the ATTRIB.DDF file, a file not discussed in detail here.
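A decoding sketch for the four-byte BFLOAT case, based on the commonly
documented Microsoft Binary Format single layout; the article's REALCNVT
routines are the authoritative version, and the layout here is an assumption:

```c
#include <math.h>

/* Sketch: convert a 4-byte Microsoft Binary Format (BFLOAT) value to a
   double. Assumed layout: bytes 0-2 are the mantissa, little-endian,
   with the top bit of byte 2 as the sign; byte 3 is the exponent,
   biased by 128, with 0 meaning the value is zero. */
static double bfloat4_to_double(const unsigned char b[4])
{
    long mant;
    int exp = b[3];
    if (exp == 0)
        return 0.0;
    /* replace the sign bit with the mantissa's implicit leading 1 */
    mant = 0x800000L | ((long)(b[2] & 0x7F) << 16) |
           ((long)b[1] << 8) | b[0];
    return (b[2] & 0x80 ? -1.0 : 1.0) *
           (mant / 16777216.0) * ldexp(1.0, exp - 128);
}
```

Under this layout the bytes 00 00 00 81 decode to 1.0, since the mantissa
0.5 is scaled by 2 to the power (129-128).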


Times and Dates


NetWare SQL, seemingly like every other file-handling system ever invented,
has its own internal representation for dates and times. TIME fields are
stored with one byte each for 100ths of a second, seconds, minutes, and hours.
DATE fields are represented by a two-byte integer year and single-byte integer
month and day. Dates are formatted as MM/DD/YY, or if the destination length
is sufficient, MM/DD/YYYY. (Remember, the year 2000, or the year 00 with only
two digits, is right around the corner!)
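The layouts described can be sketched as C structures; the ordering of the
members within each field is an assumption here, made only for illustration:

```c
#include <stdio.h>

/* Sketch of the 4-byte TIME and DATE fields (member order assumed). */
struct bt_time {
    unsigned char hundredths, second, minute, hour;  /* one byte each */
};
struct bt_date {
    unsigned char day, month;   /* single-byte day and month */
    unsigned short year;        /* two-byte year */
};

/* Format MM/DD/YYYY when the destination is wide enough, else MM/DD/YY. */
static void format_date(const struct bt_date *d, char *out, int dlen)
{
    if (dlen >= 10)
        sprintf(out, "%02u/%02u/%04u",
                (unsigned)d->month, (unsigned)d->day, (unsigned)d->year);
    else
        sprintf(out, "%02u/%02u/%02u",
                (unsigned)d->month, (unsigned)d->day,
                (unsigned)(d->year % 100));
}
```

The two-digit branch is exactly where the year-2000 concern mentioned above
bites: 1999 and 2099 both format as 99.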


Putting It All Together


Listing Four (page 96) presents an example main() that accesses two files
identical in content but different in format. (The data files and the DDF
files are also available electronically). The function showData() is called
twice, first to show the data in TEST 1, then the data in TEST 2; see Figure
1. The output from this sample program (see Figure 2) is identical for both
files, even though the files differ in structure and in length.
Note that the example does expect you to have some "magic" knowledge of the
key structure of the file. It wouldn't be a big step from here to have the
ddfFile class read INDEX.DDF to get information on the key fields and allow
the user of the class to set the index simply by passing by name the field or
fields in the index.
By isolating the details of the data from the substance of the data,
applications can continue to work even if additional fields are added or
existing fields are moved around. Using ddfFile, you can get one step closer
to the ideal of a program acting as the "engine" fueled by, but independent
from, the data.
 Figure 1: Sample File Layouts
 Figure 2: Sample Output
Table 1:
NetWare SQL field types. * Treated like a one-character string in the ddfFile
class. ** Not supported by the ddfFile class. *** New with NetWare SQL 3.0 and
Btrieve 6.0; supported by the ddfFile class.
==============================================================================
 Data Type Keyword Code Description
==============================================================================
 INT 1 1, 2 or 4 byte integer
 FLOAT 2 IEEE floating point
 DATE 3 4 byte encoded date
 TIME 4 4 byte encoded time
 DECIMAL 5 Like Cobol's COMP-3
 MONEY 6 DECIMAL with 2 decimal places
 LOGICAL 7 True/False*
 NUMERIC 8 ASCII number string, implied decimals
 BFLOAT 9 BASIC floating point
 LSTRING 10 Pascal-style strings
 ZSTRING 11 C-style strings
 NOTE 12 Variable length**
 LVAR 13 Variable length**
 AUTOINC 15 2 or 4 byte integer, gets next value
 BIT 16 Logical fields
 NUMERICST 17 NUMERIC with trailing sign***
==============================================================================
Table 2:
DDF file layouts: (a) FILE.DDF; (b) FIELD.DDF; (c) INDEX.DDF.
==============================================================================
 Field Name Type Size Description
==============================================================================
(a)
 File ID INTEGER 2 ID generated by NetWare SQL
 File Name STRING 20 Logical Name
 File Location STRING 64 Physical (DOS) file name
 Flags INTEGER 1 If bit 4=1, this is a dictionary file; else this is a
user-defined file
(b)

 Field ID INTEGER 2 ID generated by NetWare SQL
 File ID INTEGER 2 Links to File ID in FILE.DDF
 Field Name STRING 20 Logical Field Name
 Btrieve Data Type INTEGER 1 Value from Table 1
 Offset INTEGER 2 Offset in file
 Size INTEGER 2 Size of data in file
 Decimals INTEGER 1 Decimal places for DECIMAL, NUMERIC and MONEY types
 Flags INTEGER 2 Bit 0 is case flag for STRINGs
(c)
 File ID INTEGER 2 Links to File ID in FILE.DDF
 Field ID INTEGER 2 Links to Field ID in FIELD.DDF
 Key Number INTEGER 2 Key Number, 0 through 23, or through 118 in NetWare SQL
version 3.0
 Key Segment INTEGER 2 Segment number for multi-segment keys
 Flags INTEGER 2 Index flag attributes
==============================================================================
Table 3:
Sign bytes for NUMERIC types.
==============================================================================
 Digit Positive Negative
==============================================================================
 1 A J
 2 B K
 3 C L
 4 D M
 5 E N
 6 F O
 7 G P
 8 H Q
 9 I R
 0 { }
==============================================================================
_ACCESSING NETWARE SQL FILES WITHOUT NETWARE SQL_
by Douglas Reilly


[LISTING ONE]

#ifndef DDFFILE_H
#define DDFFILE_H
#include "bfile.h"

class Value {
 char *strValue;
 double dValue;
 long iValue;
 int len;
 int dec;
public:
 Value(int l=0,int dec=0);
 Value(char *s,int l=10,int dec=0);
 Value(double t,int l=10,int dec=0);
 Value(long i,int l=10);
 ~Value()
 {
 if ( strValue!=0 )
 {
 my_free(strValue,__FILE__,__LINE__);
 }
 }

 void setVal(char *s);
 void setVal(double t);
 void setVal(long i);
 void operator=(long l)
 {
 setVal(l);
 }
 void operator=(double d)
 {
 setVal(d);
 }
 void operator=(char *s)
 {
 setVal(s);
 }
 void setLenDec(int l,int d);
 void clear();
 operator char *() { return strValue; }
 operator long() { return iValue; }
 operator double() { return dValue; }
 int getLen() { return len; }
 int getDec() { return dec; }
};
struct FILEDDF {
 int fileID;
 char fileName[20];
 char fileLoc[64];
 int flags;
};
struct FIELDDDF {
 int fieldID;
 int fileID;
 char fieldName[20];
 char btDataType;
 int offset;
 int fieldSize;
 char decimals;
 int flags;
 class Value value;
} ;
struct INDEXDDF {
 int fileID;
 int fieldID;
 int keyNumber;
 int segNumber;
 int flags;
};
// this is what is needed to keep track of Btrieve currency.
struct BOOKMARK {
 char pos[4];
 int key;
};
class ddfFile {
 char physicalName[22];
 char logicalName[70];
 bfile *theFile;
 struct FIELDDDF *fields[256];
 struct FILEDDF fileDDF;
 int numFields;

 int fieldsToData();
 int dataToFields();
public:
 ddfFile(char *lname,char *owner=0);
 ~ddfFile();
 void setValue(char *t,int fnum) { fields[fnum]->value.setVal(t); }
 void setValue(long l,int fnum) { fields[fnum]->value.setVal(l); }
 void setValue(double d,int fnum) { fields[fnum]->value.setVal(d); }
 void nullOut();
 int getFieldID(char *);
 char *getFieldName(int);
 char *getValue(char *);
 char *getValue(int fnum);
 void toData(int fnum,void *t);
 int get_status()
 {
 if ( theFile==0 )
 {
 return 9999;
 }
 else
 {
 return theFile->get_status();
 }
 }
 int unlock()
 {
 return theFile->unlock();
 }
 void set_key_num(int key_num)
 {
 theFile->set_key_num(key_num);
 }
 // overload operator++
 int operator++(int )
 {
 char temp[255];
 return(get_rec(temp,B_GET_NEXT));
 }
 // overload operator--
 int operator--(int )
 {
 char temp[255];
 return(get_rec(temp,B_GET_PREV));
 }
 // Gets a record. Uses key 0 unless you have set key number.
 int get_rec(char far *keystr,int op=B_GET_EQ);
 // SEE BELOW...
 int put_rec(char far *keystr=0,unsigned tlen=0,int just_update=0);
 // Insert. Made a separate function to force an insert.
 int insert(char far *keystr=0,unsigned tlen=0);
 // self explanatory...except that if keystr==0, positioning is not done
 int del_rec(char far *keystr=0) { return(theFile->del_rec(keystr)); };
 int getBookmark(struct BOOKMARK *theMark);
 int gotoBookmark(struct BOOKMARK *theMark);
};
#endif




[LISTING TWO]

#include "stdio.h"
#include "stdlib.h"
#include "ddffile.h"

int loadFields(int currentFileID,struct FIELDDDF *fields[]);
char *btrvConvert(int type,char *src,char *dst,int slen,int sdec,
 int dlen,int ddec);
Value::Value(int l,int d)
{
 strValue=0;
 dValue=0.0;
 iValue=0L;
 len=l;
 dec=d;
}
Value::Value(char *s,int l,int d)
{
 strValue=my_malloc(l+2,__FILE__,__LINE__);
 strcpy(strValue,s);
 dValue=(atof(s));
 iValue=(atol(s));
 len=l;
 dec=d;
}
Value::Value(double t,int l,int d)
{
 strValue=my_malloc(l+2,__FILE__,__LINE__);
 sprintf(strValue,"%*.*f",l,dec,t);
 dValue=t;
 iValue=(long)t;
 len=l;
 dec=d;
}
Value::Value(long i,int l)
{
 strValue=my_malloc(l+2,__FILE__,__LINE__);
 sprintf(strValue,"%*ld",l,i);
 dValue=(double)i;
 iValue=i;
 len=l;
 dec=0;
}
void Value::setLenDec(int l,int d)
{
 if ( strValue!=0 )
 {
 my_free(strValue,__FILE__,__LINE__);
 strValue=0;
 }
 strValue=my_malloc(l+2,__FILE__,__LINE__);
 len=l;
 dec=d;
}
void Value::clear()
{
 if ( strValue!=0 )

 {
 sprintf(strValue,"%*s",len," ");
 }
 dValue=0.0;
 iValue=0;
}
void Value::setVal(char *s)
{
 if ( strValue==0 )
 {
 strValue=my_malloc(len+2,__FILE__,__LINE__);
 }
 strcpy(strValue,s);
 dValue=(atof(s));
 iValue=(atol(s));
}
void Value::setVal(double t)
{
 if ( strValue==0 )
 {
 strValue=my_malloc(len+2,__FILE__,__LINE__);
 }
 sprintf(strValue,"%*.*f",len,dec,t);
 dValue=t;
 iValue=(long)t;
}
void Value::setVal(long i)
{
 if ( strValue==0 )
 {
 strValue=my_malloc(len+2,__FILE__,__LINE__);
 }
 sprintf(strValue,"%*ld",len,i);
 dValue=(double)i;
 iValue=i;
}
ddfFile::ddfFile(char *lname,char *owner)
{
 strcpy(physicalName,pNameFromLName(lname,fileDDF));
 strcpy(logicalName,lname);
 theFile=new bfile(physicalName,0,owner);
 if ( theFile !=0 && theFile->get_status()==BERR_NONE )
 {
 numFields=loadFields(fileDDF.fileID,fields);
 }
}
ddfFile::~ddfFile()
{
 int loop;
 delete theFile;
 for ( loop=0 ; fields[loop]!=0 ; loop++ )
 {
 delete fields[loop];
 }
}
int ddfFile::getFieldID(char *name)
{
 int loop;
 for ( loop=0 ; fields[loop]!=0 && name!=0 ; loop++ )

 {
 if ( !(memicmp(name,fields[loop]->fieldName,(strlen(name)))) )
 {
 return(loop);
 }
 }
 return(255);
}
char *ddfFile::getFieldName(int i)
{
 if ( i<numFields )
 {
 return(fields[i]->fieldName);
 }
 return(fields[255]->fieldName);
}
char *ddfFile::getValue(char *fname)
{
 return((char *)fields[getFieldID(fname)]->value);
}
char *ddfFile::getValue(int fnum)
{
 return((char *)fields[fnum]->value);
}
int ddfFile::get_rec(char far *keystr,int op)
{
 theFile->get_rec(keystr,op);
 dataToFields();
 return(theFile->get_status());
}
int ddfFile::put_rec(char far *keystr,unsigned tlen,int just_update)
{
 fieldsToData();
 return(theFile->put_rec(keystr,tlen,just_update));
}
int ddfFile::insert(char far *keystr,unsigned tlen)
{
 fieldsToData();
 return(theFile->insert(keystr,tlen));
}
int ddfFile::fieldsToData()
{
 int loop;
 for ( loop=0 ; loop<numFields ; loop++ )
 {
 btrvConvert(fields[loop]->btDataType+100,(char *)fields[loop]->value,
 theFile->get_data()+fields[loop]->offset,
 fields[loop]->value.getLen(),fields[loop]->value.getDec(),
 fields[loop]->fieldSize,fields[loop]->decimals);
 }
 return(loop);
}
int ddfFile::dataToFields()
{
 int loop;
 char temp[255];
 for ( loop=0 ; loop<numFields ; loop++ )
 {
 memset(temp,EOS,255);

 btrvConvert(fields[loop]->btDataType,
 theFile->get_data()+fields[loop]->offset,temp,
 fields[loop]->fieldSize,fields[loop]->decimals,
 fields[loop]->value.getLen(),fields[loop]->value.getDec());
 fields[loop]->value.setVal(temp);
 }
 return(loop);
}
void ddfFile::nullOut()
{
 int loop;
 for ( loop=0 ; loop<numFields ; loop++ )
 {
 fields[loop]->value.clear();
 }
 return;
}
void ddfFile::toData(int fnum,void *t)
{
 if ( fnum<255 )
 {
 btrvConvert(fields[fnum]->btDataType+100,(char *)fields[fnum]->value,
 (char *)t,
 fields[fnum]->value.getLen(),fields[fnum]->value.getDec(),
 fields[fnum]->fieldSize,fields[fnum]->decimals);
 }
}
int ddfFile::getBookmark(struct BOOKMARK *theMark)
{
 theMark->key=theFile->get_key_num();
 theFile->get_pos(theMark->pos);
 return(theFile->get_status());
}
int ddfFile::gotoBookmark(struct BOOKMARK *theMark)
{
 theFile->set_key_num(theMark->key);
 theFile->set_pos(theMark->pos);
 dataToFields();
 return(theFile->get_status());
}

[LISTING THREE]


#include "stdio.h"
#include "stdlib.h"
#include "string.h"
#include "ctype.h"

extern "C" int del_trsp(char *);
char negchars[]={'J','K','L','M','N','O','P','Q','R','}',0};
char poschars[]={'A','B','C','D','E','F','G','H','I','{',0};

double numerictof(char *numstr)
{
 int lastchar;
 int isneg=0;
 double retf=0.0;


 del_trsp(numstr);
 lastchar=numstr[(strlen(numstr))-1];
 if ( !(isdigit(lastchar)) && lastchar!='.' )
 {
 int loop;
 for ( loop=0 ; negchars[loop]!='\0' && !(isneg) ; loop++ )
 {
 if ( negchars[loop]==lastchar )
 {
 numstr[(strlen(numstr))-1]='0'+loop;
 isneg=1;
 }
 }
 for ( loop=0 ; poschars[loop]!='\0' ; loop++ )
 {
 if ( poschars[loop]==lastchar )
 {
 numstr[(strlen(numstr))-1]='0'+loop;
 break;
 }
 }
 retf=(atof(numstr));
 if ( isneg )
 {
 retf*=-1;
 }
 }
 else
 {
 retf=(atof(numstr));
 }
 return(retf);
}
char *numerictostr(double numf,int len=9,int decimals=2)
{
    static char retstr[30];
    int lastcharpos;
    int lastchar;
    if ( numf>=0.0 )
    {
        sprintf(retstr,"%0*.*f",len,decimals,numf);
    }
    else
    {
        sprintf(retstr,"%0*.*f",len,decimals,(numf*(-1)));
        lastcharpos=(strlen(retstr))-1;
        lastchar=retstr[lastcharpos];
        retstr[lastcharpos]=negchars[lastchar-'0'];  /* index by digit value */
    }
    return(retstr);
}
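The character tables above implement a zoned-decimal "overpunch": the final
digit and the sign of the number share one character. As a minimal standalone
sketch of the same decoding rule (function name mine; it mirrors numerictof()
using the listing's own tables, where 'A'..'I','{' stand for a final digit
0..9 with a positive sign and 'J'..'R','}' for 0..9 with a negative sign):

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Same tables as the listing: the index in the table gives the digit. */
static const char neg[] = "JKLMNOPQR}";
static const char pos[] = "ABCDEFGHI{";

/* Decode a numeric string whose last character may carry an overpunched
   sign; the buffer is modified in place, as in numerictof(). */
static double overpunch_tof(char *s)
{
    size_t last = strlen(s) - 1;
    const char *p;
    if (isdigit((unsigned char)s[last]) || s[last] == '.')
        return atof(s);
    if ((p = strchr(neg, s[last])) != NULL) {
        s[last] = (char)('0' + (p - neg));
        return -atof(s);
    }
    if ((p = strchr(pos, s[last])) != NULL)
        s[last] = (char)('0' + (p - pos));
    return atof(s);
}
```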



[LISTING FOUR]

#include "stdio.h"
#include "stdlib.h"
#include "string.h"

#include "bfile.h"

extern "C" {
 char *my_malloc(int num,char *pfile,int line);
 int my_free(char *,char *pfile,int line);
};
#include "ddffile.h"
#include "realcnvt.h"
void showData(ddfFile& file)
{
    char temp[255];
    if ( (file.get_rec(temp,B_GET_LO))==BERR_NONE )
    {
        do {
            printf("\nName: %-12.12s, %-12.12s ID: %10ld Zip Code: %s",
                   file.getValue("LAST NAME"), file.getValue("FIRST NAME"),
                   (atol(file.getValue("ID"))), file.getValue("ZIP"));
        } while ( ((file)++)==BERR_NONE );
    }
}
void main()
{
    ddfFile file1("TEST 1");
    ddfFile file2("TEST 2");
    if ( file1.get_status()==BERR_NONE )
    {
        printf("\n\nFILE 1\n\n");
        showData(file1);
    }
    else
    {
        printf("\nStatus %d on FILE 1",file1.get_status());
    }
    if ( file2.get_status()==BERR_NONE )
    {
        printf("\n\nFILE 2\n\n");
        showData(file2);
    }
    else
    {
        printf("\nStatus %d on FILE 2",file2.get_status());
    }
}



















September, 1993
Porting from Workstations to PCs


Compiling and executing compute- and data-intensive applications on PCs




Barr E. Bauer


Barr is a research scientist at Arris Pharmaceutical, 385 Oyster Point Blvd.,
South San Francisco, CA 94080. He can be reached at bauer@arris.com.


Microsoft's Fortran Powerstation 1.0 is a 32-bit Fortran compiler with a
Windows development environment that aims at moving compute- and
data-intensive Fortran applications from high-performance workstations to
low-cost PCs. To this end, Powerstation employs a single memory model and
comes with an integrated DOS extender that lets you develop and run
number-crunching programs once confined to UNIX, VAX, or supercomputing
platforms. In this article, I'll describe my experiences in porting a
data-intensive UNIX-based simulation program to the PC platform. But first, a
quick overview of the Powerstation environment is in order.


Powerstation Overview


Powerstation is a development environment in transition. On one hand, it
generates 32-bit code targeted at the 80386/486, but on the other, it uses
16-bit Windows as the development environment. Nonetheless, memory management
is provided by the DOS extender rather than by Windows. Programs are developed
in the Windows-hosted development environment, but when executed are
restricted to a DOS window within Windows or DOS itself.
Powerstation adheres to the Fortran-77 language standard, but implements some
Fortran 90 elements as extensions. Nonportable extensions that address
deficiencies in Fortran I/O have also been added. To facilitate porting from
other platforms, VAX and IBM extensions are also accepted. Also included is a
complete 32-bit DOS-based graphics library.
The Windows development environment uses a toolbar that iconizes all the
common file handling, searching, compiling, and debugging commands.
Multimodule programs are managed as projects, and a pull-down menu item
puts all building, execution, and project maintenance operations in one place.
Project modules and dependencies are handled within the Powerstation
environment, rather than through nmake (which is still supplied). Single
complete programs are recognized as such and can be compiled and executed
without the need to define a project.
Powerstation also supports color syntax highlighting. Statement lines are
colored by column position and comments are uniformly colored in a contrasting
color. Column 6 (the continuation line) is colored green to provide a
partition between the statement number (columns 1--5) and the statement
(columns 7--72). Statement text beyond column 72 is colored differently as a
warning.
For 32-bit support, Powerstation includes a dedicated version of the Phar Lap
DOS extender. The compilation process adds a binding step, and the resulting
disk image sizes are stunningly small. Previous versions of Microsoft Fortran
treated data as static and included large, empty arrays in the disk image.
Powerstation incorporates transparent dynamic memory allocation into the
program, which saves disk space and load time. At the beginning of program
execution, the DOS extender is loaded before the program proper begins
execution. It is placed in the path when Powerstation is installed, but if you
want to execute on another machine, the extender must be moved with the
program and located either in the same directory as the program, or in the
path. (My package included no documentation on the DOS extender, information
programmers need. Nor was there adequate information or examples on
optimization; guidance on getting the best performance really should have
been included, especially since Fortran has traps that can steal it.)


Large Data Objects


The first Powerstation feature I examined was its ability to compile and
execute programs with huge, realistically sized data objects that can be
either arrays or single data objects exceeding 65K in size. Since memory
access is managed by the undocumented DOS extender, I had to determine its
practical and functional limits. Because performance is always an issue, I
also wanted to measure the performance loss when nested loops access array
elements inefficiently. Memory access can be probed by increasing the array
size in a simple test program until performance degrades. Example 1 fills a
square array of real*4, then writes selected array-element values. Overall
performance is measured by the total execution time; overhead is measured
from the start to the execution of the first statement (a write statement).
The final write statement is included because many optimizers can
delete statements (including do loops) if the assignment variables are not
subsequently used by the program. The parameter statement allows the program
to be easily scaled to new array sizes. Example 1 accesses the two-dimensional
array in column-major order; that is, the first index represents contiguous
memory addresses and is controlled by the inner do loop. Although backwards to
C and other languages that explicitly handle multidimensional arrays, this is
standard Fortran and an important component of optimization. Separate compiled
versions of Example 1 for each array size were created with and without time
optimization in release mode. Test programs made in debug mode performed about
the same as their corresponding release-mode versions. I timed and executed
all programs on a 16-MHz 386SX with 6 Mbytes of RAM, the slowest possible
platform for this compiler.
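The same probe translates directly to C, with the caveat that C stores arrays
in row-major order, so the fast direction is the last subscript, the reverse
of Fortran. A small sketch (not from the article; the timing harness and the
article's larger array sizes are omitted):

```c
#include <stdio.h>

#define DIM 512

static float r[DIM][DIM];

/* Fast in C: the last index varies in the inner loop, so successive
   iterations touch contiguous addresses (row-major order). */
static void fill_contiguous(void)
{
    for (int outer = 0; outer < DIM; outer++)
        for (int inner = 0; inner < DIM; inner++)
            r[outer][inner] = (float)((inner + 1) * (outer + 1));
}

/* Slow in C: indices inverted, so successive iterations land DIM*4 bytes
   apart -- the analog of the article's Example 2. */
static void fill_strided(void)
{
    for (int outer = 0; outer < DIM; outer++)
        for (int inner = 0; inner < DIM; inner++)
            r[inner][outer] = (float)((inner + 1) * (outer + 1));
}
```

Both functions store the same values; only the access pattern, and therefore
the cache and paging behavior, differs.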
The performance of Example 1 is excellent, at least up to extremely large
array sizes; see Table 1. The overhead is composed of the time required to
load the DOS extender and the time to establish the extended memory management
for the data in the program. The sharp performance degradation in going to the
1024x1024 array is likely due to the DOS extender activity, which is probably
memory-page swapping. I was unable to locate the point where the DOS
extender's execution behavior changed, but in any case, three Mbytes is still
an awful lot of data.
The maximum overhead incurred by the memory manager can be obtained simply by
inverting the array indices of the two-dimensional array in Example 1, forcing
each cycle of the nested do loop to access a noncontiguous array element; see
Example 2. If the array is large enough and memory pages are small enough, the
memory manager can be forced to fetch an entire page of array elements just to
access one element. This maximizes the overhead and kills performance,
although the program still executes correctly. All the array sizes of Example
2 are shown in Table 1 and run identically to the contiguous access cases
except for the 1024 case, which required 16 hours to complete due to the
accumulated overhead. The good news is that execution performance is recovered
by properly indexing the array while being mindful of how the array is stored,
and that the Phar Lap DOS extender handles arrays up to 3 Mbytes without
demands on performance.
Memory management is new to many PC users but old hat to UNIX users; it
covers both extended/virtual memory and cache memory. Under UNIX, virtual
memory is handled by moving memory pages to and from disk, which dramatically
steals performance even when the array's elements are addressed contiguously.
UNIX has never handled virtual memory well, so the performance solution to
running very large programs is to buy more memory, rather than rely on the
virtual-memory manager. You can get a flavor of this problem by running the
largest array-size versions of Example 1 in a DOS window under Windows 3.1,
which has a primitive virtual-memory manager: Execution is very slow and the
disk holding the swap space is very active. (I advise buying more memory
instead of relying on virtual memory.)
Cache memory is used to speed up execution and is really a miniature version
of the virtual-memory paging problem, but with smaller page sizes. The
processor's ability to find a data item in cache, versus having to fetch it
from main memory, is referred to as the "hit" rate; when it's low, so is
performance. Here, the solution is to be mindful of addressing. As the
distinction between UNIX workstations and PCs disappears, performance issues
familiar to UNIX programmers will have to be addressed by PC developers.


The Simulation Program


The program I ported into the PC environment is a molecular simulation program
called "Ann," which determines the optimum conformational state and most stable
energy for a floppy molecule using simulated annealing. Ann is a compact,
efficient 1800-line Fortran program that uses approximately 1.7 Mbytes of
data. The program, written by Stephen Wilson of NYU, is available from NYU in
DOS, MAC, and other formats (call 212-998-3519 or contact franc@es.nyu.edu for
details). I'd previously ported Ann from its original VAX platform to the
Silicon Graphics Indigo workstation. Now was the time for a PC version.
Simulated annealing is an ideal approach to optimization problems that have a
large number of configurational states. The algorithm requires (1) a method
for randomizing states, (2) an objective function for measuring the "energy"
of states, and (3) a control-parameter analog of "temperature." The algorithm
works by moving from state to state under the control of a probability: State
changes that lead to lower "energy" are always allowed, while state changes
that lead to higher "energy" are sometimes allowed. At high "temperatures,"
all state changes are allowed. A slow cooling of the system gradually renders
the higher energy states inaccessible until ultimately only the optimum state
remains. The analogy to annealing is actually very good and, in fact, the
process works best when the "cooling" is done slowly. Simulated annealing is
described in detail in Numerical Recipes: The Art of Scientific
Computing by W.H. Press, et al. (Cambridge University Press, 1986) and
"Simulated Annealing" by Michael P. McLaughlin (DDJ, September 1989).
The problem with molecules is that the number of states grows geometrically
with the number of bonds. Medicinally-interesting molecules seem to have lots
of bonds, so finding the optimum, or global-minimum energy state in a sea of
local minima, is a challenge comparable to finding a needle in a haystack.
Fortunately, simulated annealing is a good model for this problem:
Conformational states can be randomized by selecting and rotating the bonds of
the molecule at random using what are known as "Monte Carlo methods," a real
energy can be calculated, and the real temperature can be employed.
I started with the VAX version as originally supplied. Compilation times on my
386SX were less than two minutes in time-optimized release mode. The compiler
spotted a multiple declaration error tolerated by the VAX Fortran compiler in
which a COMMON BLOCK element was initialized by an identical DATA statement in
two different modules. Another problem was the use of VAX "slang" in the OPEN
statement directives. The behavior of Fortran I/O statements is modified by
directives in the statement of the form directive="OPCODE". For the most part,
the opcodes are unique, so the VAX compiler tolerates a "slang" in which only
the OPCODE without quotes need be listed in the I/O statement. Complicating
this is a lack of standards on file-sharing directives and opcodes. The PC's
simple, single-user environment let me delete the file-sharing directives
without compromising the program. From start to finish, porting took only an
evening.
Performance of Ann on my 386SX under DOS was 900 seconds (15 minutes) for the
original sample problem that's supplied with the code. I also ported the VAX
code to my SGI R4000 Indigo Elan (an 85 MIPS/16 Mflop workstation) and it
compiled without change or error in --O3 (maximum optimization) mode and
executed in 15 seconds. In other words, my lowly 16-MHz 386SX delivered
roughly 2 percent of the performance of a top-performing workstation. Note,
however, that at approximately 1.5 MIPS/0.2 Mflops, the 386SX performed better
than the microVAX-II I used for similar simulations in 1986--and the 386 cost
a fraction of the microVAX's price!
I also compiled several other off-the-shelf compute- and data-intensive
simulation programs, again with minimal or no changes, and performance was
comparable to that of Ann. I then ran Linpack, which executed in 735 seconds
on my PC and 8 seconds on the R4000 Indigo. Considering the differences in the
code, the Linpack results are in the same performance ballpark as Ann.


Code Transportability


For some time, I've been trying to use a PC as a development platform for
programs that would ultimately earn their keep on a UNIX computer, because
the application-development environment is better on the PC than on, say, an
SGI workstation. My hope was that by adhering to portability standards, I
could develop and run code on any platform, including the PC.
For the most part, Powerstation makes this a reality, but there are still
issues that make code porting between different computers difficult. The
process of porting Ann isolated a recurring problem with nonstandard I/O
statements. Table 2 compares the directive modifiers of the OPEN statement in
Powerstation and the SGI F77 compiler. Fortunately, the most
useful directives and opcodes are standard. In some cases the opcode becomes
the directive, but generally the difference between the two involves
differences in how the opened file is accessed. Variation in opcodes can be
handled by substituting directive=string_var, which is allowed, and selecting
the appropriate opcode at run time. Although this works, it adds a layer of
complexity. Variation in directives cannot be handled or aliased and must be
dealt with through code simplification, as I did in Ann.
Table 1: Timing results (time optimized).
==============================================================================
Array        Array         Time          Time
Dimension    Size          Contiguous    Noncontiguous    Overhead
                           (sec)         (sec)            (sec)
==============================================================================
 128         64.0 Kbytes    4.3              4.5           4.0
 512          1.0 Mbyte     6.6              6.8           4.8
 900          3.1 Mbytes   12.2             12.5           5.9
1024          4.0 Mbytes   74.0          57344.0          16.0
==============================================================================
Table 2:
Comparison of OPEN statement directives and opcodes.
==============================================================================
 Fortran Powerstation Silicon Graphics F77
 directive opcode directive opcode
==============================================================================
 unit = integer expression unit = integer expression
 access = APPEND access = APPEND
 DIRECT DIRECT
 SEQUENTIAL SEQUENTIAL
 KEYED
 blank = NULL blank = NULL
 ZERO ZERO
 blocksize = integer expression (no equivalent)
 err = label number err = label number
 file = character expression file = character expression
 form = FORMATTED form = FORMATTED
 UNFORMATTED UNFORMATTED
 BINARY BINARY
 SYSTEM
 iostat = integer expression iostat = integer expression
 mode = READ readonly (no variations)
 WRITE
 READWRITE
 recl = integer expression recl = integer expression
 share = DENYRW shared (no variations)
 DENYWR
 DENYRD
 DENYNONE
 status = OLD status = OLD
 NEW NEW
 UNKNOWN UNKNOWN
 SCRATCH SCRATCH
 carriagecontrol = FORTRAN
 LIST
 NONE
 associatevariable = integer expression
 defaultfile = character expression
 disp = KEEP
 SAVE
 PRINT
 PRINT/DELETE
 SUBMIT
 SUBMIT/DELETE
 key = address expression
 maxrec = integer expression
 recordtype = FIXED
 STREAM_LF
 VARIABLE
==============================================================================
[EXAMPLE 1]
c measure consecutive memory addressing performance
      program arraytest1

      integer DIM, inner, outer
      parameter (DIM=1024)
      real r(DIM,DIM)
      write (*,*) 'start loop'
      do outer = 1,DIM
         do inner = 1,DIM
            r(inner,outer) = float(inner*outer)
         enddo
      enddo
      write(*,*) 'end loop'
      write(*,*) r(DIM/3,DIM/3), r(DIM/2,DIM/2), r(DIM*2/3,DIM*2/3),
     1           r(DIM,DIM)
      stop 'done'
      end
[EXAMPLE 2]
c measure nonconsecutive memory addressing performance
      program arraytest2
      integer DIM, inner, outer
      parameter (DIM=1024)
      real r(DIM,DIM)
      write (*,*) 'start loop'
      do outer = 1,DIM
         do inner = 1,DIM
            r(outer,inner) = float(inner*outer) ! indexing inverted
         enddo
      enddo
      write(*,*) 'end loop'
      write(*,*) r(DIM/3,DIM/3), r(DIM/2,DIM/2), r(DIM*2/3,DIM*2/3),
     1           r(DIM,DIM)
      stop 'done'
      end

































September, 1993
Modeling Systems with Polynomial Networks


Tools for predicting behavior




Peter D. Varhol


Peter is an assistant professor of computer science and mathematics at Rivier
College in Nashua, New Hampshire.


From the factory floor to the stock market, just about any process that
produces data can be modeled in quantitative terms. Once the model is built,
you can go about experimenting with the system, trying to predict future
behavior--and in many cases, successful prediction can mean making or saving a
lot of money.
Neural networks are one approach to behavioral modeling, but require
specialized knowledge of the neural models (of which there are dozens) along
with an understanding of the subtleties of constructing and fine-tuning a
network. These skills require years of training and experience and don't come
cheaply. To some, it seems like black magic. My experiences with designing
neural nets taught me that an intuitive "feel" for the data is just as
important as training in the technology.
Over time, other tools for intelligent decision-making, such as many
expert-system shells, have evolved to where the subject-matter experts
themselves are able to do a substantial amount of knowledge-base development.
Rule-based languages such as Exsys and KBMS are intended to include the
nonprogrammer as an active participant in the development team.
AIM, from AbTech Corporation, uses a different approach to the modeling
problem, but one that is very similar, at least in concept, to the neural net
(AbTech refers to the approach as "abduction technology"; hence the name). Each
input parameter is placed into a polynomial, and different combinations of
polynomials attempt to minimize the error between the derived output and the
expected output. The result is a "polynomial network" that is straightforward
and easily mastered by a nonexpert, and the resulting algorithm runs far more
quickly than a neural network.


Different from Neural Networks


It's natural to compare polynomial networks to neural nets, if only because of
similar names and uses. The comparison is best made with the neural model
known as "backpropagation," a supervised-training model that feeds back the
error between the estimated and actual outputs.
Both are used to model nonlinear systems, and both use a network approach.
Both use a training data set to create the network, and are tested and
evaluated on a different data set. Even the result looks somewhat the same--at
first glance, the polynomial in AIM can easily be taken for the more-complex
operations that go on in a neural-net processing element.
Upon closer examination, however, similarities break down. The neural network
uses a series of processing elements, arranged in layers, that combine and
evaluate input data. AIM forms polynomials of individual input variables and
cross-products of variables. It combines the results of a network of
polynomials to produce an output.
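A toy illustration of that structure (the coefficients and function name here
are invented for this sketch, not AbTech's code): each input passes through a
linear node, and a combiner node mixes the node outputs with square and
cross-product terms, which is where the nonlinearity comes from.

```c
/* Toy two-input polynomial network in the style of AIM's generated code:
   linear "normalizer" nodes feed a combiner with square and cross-product
   terms. All coefficients are made up for illustration. */
static double poly_net(double x1, double x2)
{
    double n1 = -0.1 + 0.9 * x1;               /* linear node for input 1 */
    double n2 =  0.2 + 1.1 * x2;               /* linear node for input 2 */
    return 0.05 + 0.7 * n1 - 0.3 * n2          /* combiner: a "double"    */
         + 0.15 * n1 * n1 + 0.08 * n1 * n2;    /* square and cross terms  */
}
```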
The neural network is said to "train," based on the output error propagated
back into each layer of the network. This error is used to adjust the
weightings at each processing element in the network to produce a slightly
different output for the next trial.
AIM, on the other hand, does no backward propagation. For each input variable,
AIM computes the best fit between that input and the output using a simple,
least-squares linear-regression function. This results in a series of linear
functions that, individually, probably don't do a very good job of modeling
the system. However, these linear functions are combined with one another to
produce the polynomial network. AIM attempts to fit the training data to a
number of polynomial networks, then analyzes the error to produce more refined
models.
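That first per-variable step is ordinary least-squares regression. As a
generic sketch of it (not AbTech's code), the intercept and slope minimizing
squared error over n points can be computed in closed form:

```c
#include <math.h>
#include <stddef.h>

/* Fit y = a + b*x by least squares over n points; writes intercept a and
   slope b. Assumes the x values are not all identical. */
static void linfit(const double *x, const double *y, size_t n,
                   double *a, double *b)
{
    double sx = 0, sy = 0, sxx = 0, sxy = 0;
    for (size_t i = 0; i < n; i++) {
        sx  += x[i];        sy  += y[i];
        sxx += x[i] * x[i]; sxy += x[i] * y[i];
    }
    *b = (n * sxy - sx * sy) / (n * sxx - sx * sx);
    *a = (sy - *b * sx) / n;
}
```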
This combination is how the application supports nonlinear modeling. The final
model is the combination of linear-regression functions that produce the least
error, subject to certain constraints discussed shortly. Each polynomial
equation may also contain cross-products from two variables, known as
"doubles," or three variables, known as "triples."
One of the pitfalls of using any mathematical approach to modeling a system is
that it is possible to reproduce the behavior of any system with a
specific-enough model. The problem is that this model generalizes poorly, and
is unable to model similar systems with any reliability. For example, a
ten-degree polynomial can accurately model a system for which you have
collected ten data points. However, if the ten data points are only a sampling
of system behavior, completely modeling the behavior of those ten points makes
the model less likely to accurately predict the behavior of the entire system,
or of similar systems.
The same thing is true with neural networks, where it's possible to
"overtrain" the network by reducing the acceptable mean-training error to
extremely low levels. An overtrained network responds to spurious readings in
the data, and does not generalize well past the original training data set.
To combat this, AIM uses a "complexity penalty," computed as
KP = (2 x CPM x K x S^2) / N, where KP is the complexity penalty, CPM is the
complexity-penalty multiplier, K is the initial number of coefficients, N is
the size of the training data set, and S^2 is the sample estimate of the
system variance.
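In code, the penalty is just that product (a direct transcription of the
formula; the parameter names are mine):

```c
/* Complexity penalty: KP = 2 * CPM * K * S^2 / N.
   cpm: complexity-penalty multiplier; k: initial coefficient count;
   s2: sample estimate of the system variance; n: training-set size. */
static double complexity_penalty(double cpm, double k, double s2, double n)
{
    return 2.0 * cpm * k * s2 / n;
}
```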
The user can change the complexity penalty multiplier to produce a simpler or
more complex network. The CPM defaults to 1; making the value larger increases
the complexity penalty, which biases the software toward a less-complex
network.
AIM also has two other adjustments that can affect the complexity of the
model; see Figure 1. The first is the overfit penalty. The larger the value of
the overfit penalty, the greater the number of more-complex polynomial
networks that are not even considered. This makes the run time shorter and
results in a more general model. The second factor is the carving limit.
Setting this value high will enable AIM to automatically remove, or carve out,
terms from individual polynomials that do not contribute much to the overall
solution. This, once again, makes the overall model simpler.


Head to Head: Polynomial vs. Neural Nets


In the article "Neural Nets for Predicting Behavior" (DDJ, February 1993),
James Farley and I used a neural network approach to create a behavioral model
of an electronic wind sensor. After hours of tinkering with neural network
parameters, we achieved an average wind-speed error of 0.7 meters/second, and
an average wind-direction error of 2.1 degrees.
I resolved to start with the same data sets we started with in that project,
and use AIM, rather than neural nets, observing the differences in the process
of model building. My first attempt with one data set produced an average
wind-direction error of just over 14 degrees, or about the same as we had
achieved with a neural net on the same data set. What wasn't the same was the
amount of time it took to get to this point. The neural network required
setting the number of processing elements, number of hidden layers, learning
rule, transfer function, and a myriad of other factors. Total time to get to
this point with neural networks: about two weeks. With AIM, it took ten
minutes, plus the 96 seconds required to create the model using a 66 MHz 486.
Knowing where I was going to end up cut out a lot of the dead ends that went
into the original neural-net model. Preparation of the data set and
examination of the results took about half of the total time in the original
project, and I was able to avoid much of a duplication of effort. While a good
spreadsheet or statistical analysis package is still a necessity, AIM provides
basic information on the result of the modeling process, as well as a basic
data-viewing facility.
Even so, the time saved--and the results achieved--by using AIM were impressive.
The end result was that my AIM model produced an average error in wind speed
of 0.75 meters per second, and an average error in wind direction of 2.7
degrees. This is slightly worse than the neural network, but was accomplished
in the space of two afternoons.
My final polynomial network was about as complex as the complexity limitations
allowed it to be; see Figure 2. This is because of the overall poor quality
of the wind-sensor data, where the wind was filtered through a fine wire mesh
surrounding the sensors. While it protected the sensors, this mesh also caused
currents and eddies that confused any traditional attempt at modeling. I
should note that the final neural-net model was also highly complex, with 3
hidden layers and 25 processing elements.


AIM vs. Other Statistical Techniques


Like backprop neural networks, AIM can be used to model classification
problems. Many statistical techniques exist for classifying data,
including factor analysis, cluster analysis, and discriminant analysis. In
addition, AIM itself uses linear regression as a starting point for its
modeling process.
All of these are powerful techniques in exploratory data analysis, but offer
only linear solutions. It is, of course, possible to make the data input into
any of these techniques nonlinear, simply by adding a nonlinear
transformation. However, any way of adding nonlinearity to a conventional
approach assumes that you know the structure of the underlying, nonlinear
model, something that neither AIM nor neural nets require.


Translating the Model to Code


Since AIM produces networks of polynomials, these are relatively easy to
translate into C code. AIM does this automatically, producing generic but very
well-commented C with the polynomials encoded (see Listing One, page 96). The
user can choose the form of the input and output variables (individual
variables read from, and written to, files, input and output with arrays, or
input and output by use of external references), data type (integer or single-
or double-precision reals), and the inclusion of debugging information. The
code generated from all of the networks I created compiled without error using
Turbo C++.

The code generated from AIM is also considerably simpler, running under 4
Kbytes, versus more than 10 Kbytes for the neural-net solution. An examination
of the code shows that virtually all of this difference is in the definition
of the polynomials in AIM versus the processing elements in the neural net.
There's more complexity in each processing element, and more processing
elements than polynomials.
The most disappointing thing about AIM is that, aside from setting the
complexity penalty multiplier, overfit penalty, and carving limit, the user has
little input into the model building process. Out of necessity, this limits
the size and scope of the model. In this case, AIM limits the size of the
polynomials produced to three degrees, although the network approach also lets
the model consider cross-products.
In even a single type of neural network, the user can constantly fiddle with
the number of hidden layers, the number of processing elements in the hidden
layers, the learning rules and transfer functions, the momentum, and a large
number of other variables. Further, there are no hard and fast rules for
making these changes. The selections are a result of equal parts experience,
intuition, and an understanding of the underlying structure of the neural
model.
As a data analyst, I found the process of adjusting the neural-network
parameters to be exciting and challenging. It was also very time-consuming.
In my efforts, I produced polynomial models that were almost as good as the
neural-net model in a fraction of the time.


Better, Worse, or Indifferent?


Does this mean that all, or even most, prediction and classification problems
should be solved with AIM's polynomial networks, rather than neural networks?
Probably not. I was, after all, able to achieve somewhat better results on the
wind sensor model using neural nets.
The question becomes one of degree of accuracy vs. cost of getting there. AIM
is usable by just about any educated professional, and there is little need
for intensive training or years of experience. However, in most cases it will
only get you so far. Additionally, the lack of backward propagation of errors
means that the models are not self-adjusting. With a more restrictive modeling
technique, and no feedback, you'll only get the best model it is capable of
producing, rather than the best possible.
While my comparisons have been with the backpropagation neural-network model,
there are other classes of neural networks loosely categorized as
"unsupervised learning models." These models don't require output data to
compare with their own computed outputs, so there is no error to propagate
backward.
What, then, do these models do? For the most part, they classify. Working in a
manner similar to discriminant analysis or factor analysis, they look for
outputs that logically group together based on some nonlinear combination of
the inputs. Some of my current research is leading me to use unsupervised
neural-net models to classify computer network traffic into fuzzy categories
such as Lightly, Moderately, or Heavily loaded. AIM is only peripherally
useful to this effort.
The conclusion, then, is that the neural-network model is more powerful and
flexible, but anyone who has developed a neural network can tell you that it
is hardly plug-and-play. Before turning to an often costly and time-consuming
neural-network development effort, a system designer would be better served by
spending a few hours with AIM. If it produces results within specification,
there's no need to add the complexity of a neural net. In how many problems is
this the case? I don't know every problem in every field, of course, but I'd
be willing to bet that AIM could be the answer to many that now depend on
neural nets.
 Figure 1: Basic AIM user interface.
 Figure 2: Final polynomial network.
_MODELING SYSTEMS WITH POLYNOMIAL NETWORKS_
by Peter D. Varhol

[LISTING ONE]

#include <stdio.h>

#define pow1(x) (x)
#define pow2(x) ((x)*(x))
#define pow3(x) ((x)*(x)*(x))
#define LIMIT(v,mn,mx) ((v>(mx))?(mx):((v<(mn))?(mn):v))

/*******************************************************
 ABDUCTIVE NETWORK SUBROUTINE: AIMnet()
 Generated by AbTech's Abductory Induction Mechanism
 INPUTS: the following 5 double(s):
 Var_2
 Var_3
 Var_7
 Var_4
 Var_1
 OUTPUTS: Pointer(s) to the following 1 double(s):
 Var_8
 STATISTICS:
 Var_2 : Mean = 0.098811, Sigma = 1.01098
 Min = -1.349, Max = 1.52
 Var_3 : Mean = 3.18207, Sigma = 0.575878
 Min = 2.007, Max = 4.154
 Var_7 : Mean = 20.7932, Sigma = 0.940033
 Min = 18.8, Max = 22.5
 Var_4 : Mean = -0.228888, Sigma = 0.779203
 Min = -1.39, Max = 0.809
 Var_1 : Mean = 3.19586, Sigma = 0.589959
 Min = 1.999, Max = 4.212
 Var_8 : Mean = 180, Sigma = 105.501
 : Min = 0, Max = 360
 : KP = 33.0%, FSE = 67.0%
 : Predicted Error = 31.8675
*******************************************************/
AIMnet(Var_2,Var_3,Var_7,Var_4,Var_1,Var_8)
double Var_2 ; /* input variable */
double Var_3 ; /* input variable */
double Var_7 ; /* input variable */
double Var_4 ; /* input variable */
double Var_1 ; /* input variable */
double *Var_8 ; /* output variable */
{
double node2 ; /* working variable */
double node3 ; /* working variable */
double node7 ; /* working variable */
double node12 ; /* working variable */
double node4 ; /* working variable */
double node11 ; /* working variable */
double node10 ; /* working variable */
double node1 ; /* working variable */
double node9 ; /* working variable */

#ifdef DEBUG
 printf("AIMnet: received Var_2 = %g.\n",(double)Var_2);
 printf("AIMnet: received Var_3 = %g.\n",(double)Var_3);
 printf("AIMnet: received Var_7 = %g.\n",(double)Var_7);
 printf("AIMnet: received Var_4 = %g.\n",(double)Var_4);
 printf("AIMnet: received Var_1 = %g.\n",(double)Var_1);
#endif /* DEBUG */
 /* node2 -- Var_2 */
 node2 = -0.0977376 + 0.989137*LIMIT(Var_2,-1.349,1.52);
 /* node3 -- Var_3 */
 node3 = -5.5256 + 1.73648*LIMIT(Var_3,2.007,4.154);
 /* node7 -- Var_7 */
 node7 = -22.1196 + 1.06379*LIMIT(Var_7,18.8,22.5);
 /* node12 -- Triple */
 node12 = 0 - 0.397622 - 1.28001*node2 + 0.214025*node3
 - 0.790493*node7 + 0.241116*pow2(node2)
 + 0.232504*pow2(node3) + 0.154497*node2*node3
 - 0.132984*node3*node7 - 0.241643*node2*node3*node7
 + 0.475993*pow3(node2) + 0.0166738*pow3(node3)
 + 0.259022*pow3(node7) ;
 /* node4 -- Var_4 */
 node4 = 0.293746 + 1.28336*LIMIT(Var_4,-1.39,0.809);
 /* node11 -- Double */
 node11 = 0 + 1.52034*node12 + 0.491108*node12*node4
 - 0.614602*pow3(node12) ;
 /* node10 -- Double */
 node10 = 0 + 1.43086*node11 + 0.562305*node11*node4
 - 0.543749*pow3(node11) ;
 /* node1 -- Var_1 */
 node1 = -5.41709 + 1.69503*LIMIT(Var_1,1.999,4.212);
 /* node9 -- Triple */
 node9 = 0 + 1.29422*node10 - 0.16041*node10*node1
 + 0.190084*node10*node3 - 0.182997*pow3(node10) ;
 /* node8 -- Var_8 */
 *Var_8 = 0 + 180 + 105.501*node9 ;
 /* perform output limiting on Var_8 */
 *Var_8 = LIMIT( *Var_8, 0, 360 );
#ifdef DEBUG
 printf("AIMnet: returning Var_8 = %g.\n",*Var_8);
#endif /* DEBUG */
}







September, 1993
PROGRAMMING PARADIGMS


A Tour of AppleScript




Michael Swaine


I'd been using the beta of AppleScript, as well as betas of a couple of
third-party products that support AppleScript, so when AppleScript Developer's
Toolkit 1.0 shipped this summer, I was eager to talk about it.
And it turns out that there are things to say about Apple's long-awaited
system-level scripting software, and reasons to look into it, even if you
don't think you'd ever use it, even if you don't think you'll ever use a
Macintosh.
AppleScript is Apple's idea of what a scripting system can and ought to be on
a 1993-vintage GUI and operating system. That's not the same thing as, say,
little languages on UNIX systems, or batch files on DOS systems. But it's also
not the same thing as the scripting systems we can expect to see on
next-generation object-oriented operating systems. What's intriguing is that
AppleScript could serve as a testing ground for such future scripting systems.
Believing, as I do, that object-oriented operating systems and scripting are
made for one another, I find the prospect especially appealing.
And it should be interesting to follow the fortunes of AppleScript over the
next year. It's always interesting to follow Apple's fortunes, if not always
satisfying. I recently wrote an upbeat piece for another publication about
things that Apple was doing right, and between the time I turned it in and it
saw print, Apple announced disappointing profits, replaced its president,
threatened to lay off up to 2000 employees, and cut R&D. (In publishing, cycle
time is three months and a lot can happen between fetch and execution.)
Apple's successes in its early attempts to get third-party developers to
embrace the interapplication communication technology that AppleScript relies
on were about as mixed as President Clinton's results in trying to get bills
through Congress earlier this year. Now that there's been some success on this
front (Apple's, that is), it makes sense to launch AppleScript. But the launch
means another campaign to get developers to embrace another Apple technology:
There's more to supporting AppleScript than just supporting Apple events, and
AppleScript's success is purely a question of how many AppleScript-supporting
applications come out, how soon.


What AppleScript Is


The core of AppleScript is an operating system component called the
"AppleScript extension." Although actually using AppleScript to do something
meaningful involves the support of application developers, installing
AppleScript is no more complicated than dropping this file into the
appropriate folder. What the AppleScript extension does is to interpret
AppleScript commands sent to it and to dispatch Apple events to the
appropriate target.
The source from which the AppleScript extension gets its commands and the
destination to which it sends its Apple events are where third-party
applications come in, but it's not just applications that can talk to
AppleScript. The possible sources for AppleScript commands and the possible
destinations for Apple events suggest its potential:
The Script Editor. A script editor is another key element of AppleScript, and
Apple supplies one. It's not difficult to use, and users will find that
writing scripts with it is roughly on the level of writing HyperTalk scripts.
But this supplied script editor is not sacrosanct: Third-party developers can
write their own and replace it.
An Application. An AppleScript-compatible application can send AppleScript
commands in response to user or internal events. Or it can do its usual tricks
in response to Apple events coming in from the outside. An application can be
a sender or a receiver.
A Script Application. There's also something called a "script application," an
AppleScript script that's been compiled and saved as a double-clickable
application. Script applications can be sources or destinations for the
dispatching actions of the AppleScript extension. While they are much like
real applications, they don't relate to AppleScript in quite the same way,
and so are worth considering separately. Script applications can also act as a
kind of script server for scripts and Apple event-savvy applications.
The Finder. Using the Finder as a destination is one of the ways in which
AppleScript can drive operating system functions, although technically the
Finder is really an application. It isn't yet a scriptable one, but a
scriptable Finder is promised real soon now.
The AppleScript Extension and Scripting Additions. Here's another way
AppleScript can drive the operating system. A half-dozen core commands are
handled by AppleScript itself (that's in addition to its regular duties in
interpreting the language: evaluating expressions, resolving references, and
so forth).
When you work through the combinations of all these sources and destinations, and
when you consider that the source and the destination don't have to be on the
same machine or in the same network zone, you can see that "using AppleScript"
covers a lot of different scenarios.


What It's Good For


With AppleScript, an application could implement its own internal scripting
language by sending messages to itself through AppleScript. In fact, all the
actions of an application could be mediated by AppleScript. Menu choices and
selection operations, and for that matter, pretty much anything that signals a
potential content change, could result in AppleScript code being fired off and
Apple events being returned to be handled by handlers that perform the
actions.
One application could parasitically drive another application to do things the
first application can't do. A word processor could call on a spelling checker
whenever a document is closed. A drawing program could call up a 3-D
application to add depth to a drawing, rotate it, project a shadow, and send
it back as a 2-D drawing again.
A user could set up a batch-job script that starts several applications and
pipes a document through them one at a time, making each application massage
the document a little before passing it on.
And so on.
Since AppleScript is an extensible language with a replaceable editor, it can
present itself to users in a variety of guises, with varying degrees of access
to its power. UserLand Frontier and CE Software QuicKeys can both drive
AppleScript, but at quite different levels of difficulty and power. Since any
(AppleScript-supporting) application can drive the AppleScript extension and
add to the AppleScript vocabulary (there are other ways to extend the
vocabulary as well), AppleScript has no fixed interface and no fixed set of
capabilities.


Commands


AppleScript is a verbose and easy-going language that will be familiar to
HyperTalk scripters. It lets you slip in "the"s and other syntactic sugar to
kid yourself into thinking it's English. There are the standard sorts of
compound statements: if and repeat, but also some less-common compounds:
considering and ignoring. These are chiefly used by AppleScript to control
string comparisons; see Example 1.
There's also the tell statement, which is crucial since it specifies the
destination for the command(s). It can do so with more precision than implied
above, specifying rather minuscule elements of objects as command
destinations. The forms in Example 2 do the same thing.
Who actually does the work in these cases is a question. Normally, it's the
application that handles the actual command that the AppleScript extension
passes it, although there's a situation in which a command may be directed to
an application object and handled by a subroutine in a script. There are
really only four kinds of AppleScript commands: application commands (handled
by applications), AppleScript commands (handled by the AppleScript extension),
scripting additions (handled by their own extensions), and user-defined
commands (subroutines handled by handlers within a script).


Objects


Commands are sent to and operate on objects. Syntactically, the objects will
typically appear as the parameters of commands. There are system objects,
which AppleScript knows about (applications, zones, machines) and application
objects, which individual applications know about (words and characters, for
example). More precisely, all AppleScript-supporting applications in a
category like word processing will know about the same basic collection of
text-related classes of objects, such as lines, words, characters, and
insertion points. Individual applications may know about other classes of
objects that no other applications know about. A running application, or any
script driving it, will operate on instances of these classes.
Commands typically tell objects to perform actions or return values. Objects
can have properties (text objects can have, for example, size, style, font,
alignment, and color properties) and elements (elements of a word-processing
document would include words, characters, and so on).
There are also what are called "script objects." Like system objects and
application objects, script objects can have properties and can respond to
commands. These are objects defined within scripts; see Example 3. Elsewhere
you could send a command to this object: tell duck to quack.
If applications can define their own commands and objects, how do you know
what vocabulary an application supports? A language with an unknown vocabulary
is not very useful. The answer is that every AppleScript-supporting
application is expected to make its supported vocabulary explicit in a
standard way, and the script editor knows how to display this dictionary for
you in a convenient and readable manner.



Values


AppleScript supports the more obvious value classes: Boolean, class, integer,
real, string, list, and date. It also supports record (defined as a collection
of parameter-value pairs), data (uninterpreted binary data), constant
(application-defined special values), and reference (a reference to an
object).
References are heavily used in AppleScript to identify application objects.
You create a value of class reference using the ref operator (also written a
reference to). The value of the first of the references in
Example 4(a) is a string; the value of the second is a structure within
AppleScript that points to that string.
AppleScript works hard to make sense of references. There are many reference
forms that it recognizes. Scripts can easily refer to the first or last or nth
or nth-from-last or middle element of any organized object, an arbitrary
element, every element, elements in a range, or elements satisfying some
Boolean condition. Objects can also be referred to by name, ID, or relative
position with respect to another object, where that is meaningful. Also,
AppleScript uses tell statements and default object values to make sense of
incomplete references, if possible. If you look at Example 2, in every example
the object reference for the delete command was incomplete, and AppleScript
completed it from default objects specified by one or more enclosing tell
commands.
The property reference form is another form of reference. Its syntax is
straightforward: name of window 1 references the name property of that window
object. Application objects and system objects and script objects can have
properties; application objects and script objects can define their own
properties: footStyle of myDuck.
It's also possible to refer to properties of records (a record being a
collection of properties and their values): phone of {name: "Bill", phone:
"555-9861"}.
Variables, like properties, hold values. There are two commands for assigning
values to variables: copy and set. These behave identically, copying the
source value to the destination variable, except in the cases of lists,
records, and script objects. When the source object is one of these types, the
set command creates a variable that shares data with the original object, so
that when the original changes, the assigned variable changes with it.
There are other differences between properties and variables as value holders.
Variables are assumed local unless declared global; the opposite is true for
properties, which are persistent. For example, in Example 4(b), the second
line shows the syntax for defining a property. The property barks retains its
last value when you close the script. It is only reset to 0 when the script is
recompiled.


Subroutines


You can write your own handlers, called "subroutines." These can be recursive,
as in Example 5(a). They can use positional parameters, as above, or labeled
parameters. The labeled parameter form supports two dozen special labels and
can be defined using either the word on or the word to. Example 5(b) provides
a few examples of first lines of labeled parameter subroutine definitions.
One special kind of handler is the command handler, written in a script but
attached to a particular application object. This means that you can assign
behavior to objects created by applications. Applications must be written to
support attachment in order for this to work.
Another special handler is the try handler, which is used for error handling.
Bracketing a block of code by the try command lets you specify error handlers
for error messages that may be returned in that block.


AppleScript and Apps


So what does it mean for an application to support AppleScript? There are,
first of all, two directions of support: There's the ability to send
AppleScript commands and the ability to respond to them. Then there are the
three Apple-defined levels of support. An application can be scriptable,
meaning that it responds to AppleScript commands. (There are levels of support
within this, because there are several classes of commands: required, core,
application-suite commands, and an application's own special commands.) An
application can be recordable, meaning that users can create scripts by
turning on a watch-me mode and performing normal application actions. That's a
step beyond scriptable. And an application can be attachable, meaning that
users can write scripts, perhaps in the script editor, and attach them to
objects that the application knows about, so that the behavior of those
objects is controlled by the user's script.
But the big question is, who will support AppleScript? We'll find out over the
course of the next year. It'll be interesting to watch.

Example 1: Controlling string comparisons using AppleScript
ignoring case and punctuation
  if "It's" = "its" then beep 1
  considering punctuation
    if "It's" = "its" then beep 2
  end considering
end ignoring
Example 2: The tell statement
tell application "Text Editor"
 delete character 1 of word 4 of line 6 of document "Ridley 3/18"
end tell
tell word 4 of line 6 of document "Ridley 3/18" of application "Text Editor"
 delete character 1
end tell
tell application "Text Editor"
  tell document "Ridley 3/18"
    tell line 6
      tell word 4
        delete character 1
      end tell
    end tell
  end tell
end tell
Example 3: Script objects
script duck
  bill: yellow
  footCount: 2
  footStyle: webbed
  to quack
    open bill
    beep
    close bill
  end quack
end duck
Example 4: (a) AppleScript values; (b) defining a property
(a)
word 6 of line 4
ref word 6 of line 4
(b)
script dog
 barks: 0
 set barks to barks + 1
 display dialog barks
end dog
Example 5: (a) recursive handlers, or subroutines; (b) labeled parameter
subroutine definitions.
(a)
on factorial (n)
  if n = 1 then
    return n
  else
    return n * factorial (n-1)
  end if
end factorial
(b)
on FahrenheitToCentigrade of temp
to searchFiles of filesToSearch for theString
to lowerCase of charsInString above minChar given sortOrder







September, 1993
C PROGRAMMING


C++ Exception Handling




Al Stevens


A while back I plugged Philippe Kahn's latest musical venture, a compact disk
with performances featuring Philippe in the company of some heavyweight jazz
musicians. I said then that I did not know how you could get the disk. At the
Borland International Conference in San Diego I asked Philippe about it. "You
can't," he said. It seems his legal advisors were worried about the image of a
CEO who was off playing music while the company foundered in the marketplace,
so they put the kibosh on any distribution of the disk. Too bad, because it
really is a nice piece of work.
The conference was the usual Borland bash. David Intersimone, resident BI
evangelist and morale officer, roamed the hallways, parties, and meeting rooms
with a camcorder, capturing quotes and happenings to share with the poor BI
employees minding the store in Scotts Valley. He even climbed the podium and
scanned the crowd as we waited for the keynote address. That keynote consisted
of Philippe telling us how software was going to be developed in the next
decade and Andy Grove, CEO of Intel, peddling the Pentium. Given my bad
hearing, a cavernous hall with the acoustics of the Grand Canyon, and their
accents, I couldn't keep up.


Errors and Intercepts


Exception handling is the next hot C++ language feature. Using it, a program
can intercept and process exceptional conditions--errors, usually--in an
orderly, organized, and consistent manner. Exception handling is not
implemented in many C++ compilers yet. The only released MS-DOS compiler that
supports the feature is the new Watcom C++ compiler. (I'll be looking at the
Watcom compiler in a future column.) I first discussed exception handling in a
tutorial book called Teach Yourself C++, Third Edition (MIS:Press), which came
out this summer. That discussion and this one are not based on any hands-on
experience but rather on what I've read about the subject, so the code
fragments that I use for examples might not be quite right. Furthermore, my
reactions to how the feature is specified and how it might be implemented are
not to be taken as gospel. Pragmatic programmers can view exception handling
as an experimental feature until more compilers implement it with consistent
behavior and interface. It will be, however, a powerful and useful language
feature, and I look forward to the time when all the compilers include it and
we can all use it.
Bjarne Stroustrup describes exception handling in The C++ Programming
Language, Second Edition (Addison-Wesley, 1991). The Annotated C++ Reference
Manual, by Ellis and Stroustrup (commonly called the ARM, Addison-Wesley,
1990) defines exception handling too, so the feature isn't new, having been
described in print for about three years now. It's just that there are few
compilers available that implement it.
Exception handling allows one part of a program to detect and report
exceptional conditions and another part to handle them. This is an appropriate
order. The classes and functions in libraries typically know how to detect
errors without necessarily knowing what to do about them. An application's
code will understand how to deal with errors but cannot always detect them.
Exceptions are not restricted to errors, by the way. A programmer can use
exception handling to manage all kinds of variable finite-state machine
architectures. In practice, however, we think of exceptions as error
conditions, or at least as unexpected conditions for which the program makes
necessary side trips. The underlying theory is that a function, which can be
buried many function calls deep and which may be hidden in a library, detects
an exception that it cannot deal with and that it must report to the
application at the surface, which must deal with the exception.
Consider a math-library function that detects overflow, underflow,
divide-by-zero, and other error conditions in its input data. Is the error one
that is caused by user-entered data or one that the program computed
internally? Is it a validation condition or a bug in the program? The library
function doesn't know, so an appropriate error-handling strategy depends on
the application. The program might display an error message on the console or
in a dialog box; it might ask the user to enter better data; it might
terminate itself. Library functions shouldn't presume to know the best
exception-handling strategies for all exceptions in all applications. On the
other hand, applications cannot know how to detect all possible exceptions.
This division of responsibility for error detection, reporting, and handling
isn't new. C libraries have supported it for years. Exception handling in C++
adds to the strategy an orderly way for the exception detector to report the
exception to the exception handler. Somehow, the detecting function must
return control to the handling function through an orderly sequence of
function returns. The detecting function can be many function calls deep. An
orderly return to the higher level of the handler function requires, at the
very least, a coordinated resetting of the stack.


Exception Handling in C


C programs usually test library function return values for errors, and they
use setjmp and longjmp to manage exceptions. The first approach, which uses
things such as errno and NULL or ERROR function return values, is reliable but
tedious. Programmers sometimes avoid or overlook some of the possibilities.
The setjmp/longjmp approach is closer to what C++ exception handling does: it
provides an orderly and automatic way to reset the stack to its state at a
specified place higher in the function call hierarchy.
For example, a compiler's syntax checker can be many levels deep in a
recursive descent parser when it detects a syntax error. The compiler doesn't
need to terminate. Instead, it wants to report the error and go back to where
it can read the next line of code and continue. The program uses setjmp to
identify that place and longjmp to get to it. The longjmp call resets the
stack to its state as recorded in the jmp_buf by the setjmp call. The initial
setjmp call returns 0. The longjmp call jumps to a return from setjmp, which
makes setjmp seem to return the specified error code, which should be
non-zero.
There are anomalies in this scheme, however. As you'll soon learn, C++
exception handling doesn't solve them, either. Consider a function like
Example 1. Forget for the moment that a real program would test for exceptions
to fopen and malloc. The two calls represent resources that the program
acquires before and releases after the longjmp. The calls could be in
functions that (1) are called after the setjmp operation, and (2) have
themselves called the parse function. The point is that the longjmp occurs
before those resources are released. Therefore, every exception that this
program detects represents two system resources that are lost--a heap segment
and a file handle. In the case of the FILE* resource, subsequent attempts to
open the same file would fail. If each pass through the system opened a
different file--a temporary file with a system-generated file name, for
example--the program would fail when the operating system ran out of file
handles. We programmers traditionally solve this problem by structuring our
programs to avoid it. Either we manage and clean up resources before calling
longjmp, or we do not use longjmp when there are interim resources at risk. In
Example 1, the problem is solved by moving the longjmp below the free and
fclose calls. It isn't always that simple, however.
Resetting the stack in a C program involves resetting the stack pointer to
where it pointed when setjmp was called. The jmp_buf stores everything that
the program needs to know to do that. This procedure works because the stack
contains automatic variables and function return addresses. Resetting the
stack pointer discards the automatic variables and forgets about the function
return addresses, all of which is correct behavior, because the automatic
variables are no longer needed and the interim functions will not be resumed.


Exception Handling in C++


Using longjmp to unwind the stack in a C++ program does not work, however,
because automatic variables on the stack include objects of classes, and those
objects need to execute their destructor functions. Consider Example 2. No
doubt the constructor for the String class allocates memory for the string
value and its destructor deletes that memory. If an exception occurs, the
String destructor does not execute because longjmp resets the stack and jumps
to setjmp before the String object goes out of scope. The stack memory
occupied by the String object itself is reclaimed, but the heap memory its
pointer refers to is never deleted, because the destructor does not execute.
This is a problem that C++ exception handling solves. The exception-handling
throw operation--the analogue to longjmp--unwinds the stack in a way that
calls the destructors of all the automatic objects. If the throw occurs from
within the constructor of an automatic object, the object's destructor is not
called, although the destructors of objects embedded in the throwing object
are called.
C++ functions that can sense and recover from errors execute from within a try
block that looks like Example 3(a). Code that is executing outside of any try
block is not able to detect or handle exceptions. Try blocks may be nested.
The try block typically calls other functions that are able to detect
exceptions.
A try block is followed by a catch exception handler with a parameter list as
shown in Example 3(b). There can be multiple catch handlers with different
parameter lists, one handler after the other in the source file. Each catch
handler is distinguished by its parameter list. The parameter itself does not
have to be named. A named parameter declares an object, and the exception
detection code can pass a value in the parameter. If the parameter is unnamed,
the exception detection code can jump to the catch exception handler merely by
naming the type.
To detect an exception and jump to a catch handler, a C++ function issues the
throw statement with a data type that matches the parameter list of the proper
catch handler. The throw statement unwinds the stack, cleaning up all objects
declared within the try block by calling their destructors. Next, throw calls
the matching catch handler, passing the parameter object.
Example 3(c) begins to bring it all together. In this example, the program
enters a try block, which means that functions called directly or indirectly
from within the try block can throw exceptions. In other words, the foo
function can throw exceptions and so can any function called by foo, and so
on.
The catch exception handler function immediately following the try block is
the only handler in this example. It catches exceptions that are thrown with
an int parameter.
Catch handlers and their matching throw statements can have a parameter of any
type. The parameter may be an automatic variable within the block that uses
throw, even if the catch parameter list specifies a reference. In this case,
the throw statement builds a temporary object to pass to the catch handler.
The automatic object in the throwing function is allowed to go out of scope.
The temporary object is not destroyed until the catch handler completes
processing.
When a try block has more than one catch handler, a throw executes the one
that matches the parameter list. That handler is the only one to execute
unless it throws an exception to execute a different catch handler. When the
executing catch handler exits, the program proceeds with the code following
the last catch handler.
You can specify the exceptions that a function may throw when you declare the
function as shown in Example 3(d). If a function includes such an exception
specification, and the function throws an exception not given in the
specification, the exception is passed to a system function named
unexpected(). The unexpected function calls the latest function named as an
argument in a call to the set_unexpected function, which returns its current
setting. A function with no exception specification can throw any exception.
A catch handler with an ellipsis (...) for a parameter list catches all exceptions. In
a group of catch handlers associated with a try block, the catch-all handler
must appear last.
You can code a throw with no operand in a catch handler or in a function
called by one. The throw with no operand rethrows the original exception.
An uncaught exception is one for which there is no catch handler specified or
one thrown by a destructor that is executing as the result of another throw.
Such an exception causes the terminate function to be called. You can specify
a function for terminate to call by calling the set_terminate function, which
returns its current value. If no set_terminate function call has been made,
terminate calls abort.
You must decide in your design how to differentiate the exceptions, and code
the catch handlers with their distinguishing parameter lists. You might code
only one catch handler with an int parameter and let the value of the
parameter determine the error type. This approach assumes that you have
control of all of the throws in all of the functions in all of the libraries
that you use.
No doubt conventions will emerge. One way is to use class definitions to
distinguish exceptions and categories of exceptions. A throw with a publicly
derived class as its parameter is caught by a catch handler with the base
class as its parameter. Consider Example 4. FileError is an abstract base
class with a pure virtual function. Its derived classes are NotFound and
Locked. The only catch handler for this category of exception is the one with
the FileError reference parameter.
It does not know which of the exceptions was thrown, but it calls the
HandleException pure virtual function, which automatically calls the proper
function in the derived class. Note that I am not recommending this particular
set of exceptions for the fstream library; I am merely using the example to
illustrate the technique. There might be good reasons to avoid throwing
exceptions from within standard stream libraries. I'll address that concern
later in this discussion.
There are other ways to enumerate exceptions. Instead of classes, you can use
enumerated types and have switch statements in the catch handlers. Publishers
of libraries will document the conventions that they use to throw exceptions,
and your catch handlers will use those conventions, perhaps using several
different conventions in one application. Eventually standards must emerge. If
they do not, chaos will reign, and we will be reminded of the function and
variable name collisions that many incompatible C function libraries create.
Recall the discussion earlier about the setjmp/longjmp anomaly with unreleased
resources. C++ exception handling does not solve that problem. Consider the
condition in Example 5 where the String object is allocated from the heap by
the new operator. If the exception is thrown, the delete operation is not
performed. In this case, there are two complications. The memory allocated on
the heap for the String object is not released, and its destructor is not
called, which means that the memory that its constructor allocated for the
string value is lost as well.
The same problem exists with dangling, open file handles, unclosed screen
windows, and other such system resources. If the program just shown seems easy
to fix, remember that the throw could occur from within a library function far
into a stack of nested function calls.
Programming idioms have been suggested that address this problem. In the
second edition of his book, Dr. Stroustrup suggests that all such resources
could be managed from within automatic instances of resource-management
classes. Their destructors would release everything as the throw unwinds the
stack. Another approach is to make all dynamic heap pointers and file and
window handles global so that the catch handler could clean everything up.
These methods sound cumbersome, however, and they work only if all of the
functions in all of the libraries cooperate.



Exceptions in the Standard C++ Library


The ANSI X3J16 C++ Standards Committee is adopting and adapting the Stroustrup
specification for exceptions. Exceptions are having a significant impact on
X3J16's library committee, because it is moving toward a standard that
folds exception handling into those library functions that can detect
exceptions. It is properly argued that a language which includes exception
handling strategies as an integral language component is in bad company if its
library does not make good use of the feature.
But consider the implications of a class or function library that throws
exceptions. Programs using that library cannot ignore exceptions unless it is
acceptable for them to abort when the exceptions occur. That's both good news
and bad news. Programs should not, as a rule, ignore exceptions, and such a
library can enforce that rule. Programs that allocate memory from the heap
without testing, for example, should expect to be tossed out if the heap is
exhausted. In the old days, those programs proceeded until a dereferencing of
the null pointer took its nasty toll. With an exception-savvy new operator,
however, those programs will abort if they do not bother to catch the
exception. Programmers who object to having the new operator throw an
exception can always use set_new_handler to override the behavior.
I think that was the good news. So what's the bad news?
The heap-exhaustion example is one that is typically used to justify libraries
that throw exceptions. There are, however, legitimate programming idioms that
ignore--or, at least, defer--the management of some exceptions. For example,
you might not always want to test, immediately after you open a file, whether
the file exists. The FileError example described earlier did just that, but it
might not be what you want to do. Furthermore, if an exception-throwing
library is a new version of an older one that didn't throw exceptions,
existing code that uses the library might break.
We worry about the X3J16 committee inventing language because its
predecessor, the X3J11 C committee, wasn't supposed to do that. Prior art is the
best way to test the viability of a new and highly complex language feature.
If it works in the trenches, it will wash in the standard. The reverse might
not be true. Understanding that, the X3J16 library committee is proceeding
carefully in applying this concept. The principal issues are which library
functions should throw exceptions and which exceptions they should throw. One
extreme would ignore exception handling altogether, furthering the C tradition
where programmers are responsible for everything and programs that ignore
exceptions crash. The other extreme would throw exceptions for every
condition that any library function detects, in which case we might as well
change the name of the main function to try. Somewhere in between is nirvana,
and the committee is searching for it.


The Consequences of Exception Handling


Exception handling as a C++ language feature has consequences for C++
programmers because it changes the language. It is not a planned,
well-integrated change, however. It is an add-on, an accoutrement, an
afterthought. Because of its heritage, C++ already has constructs that tend to
get in the way of what exception handling tries to do, giving rise to
contrived solutions such as the resource-management classes suggested by
Stroustrup, solutions that need constant nurture and attention to keep them
healthy.
Besides the additional burden of understanding levied on the programmer, there
is a performance penalty that an inadequate compiler implementation can exact.
It makes one wonder how this thing will work. To unwind the stack
properly, the run-time code needs to maintain a trail from the throw back up
through all the function returns and block exits to the try so that it can
call automatic object destructors in an orderly sequence. Does every call to
every function and every exit from every block need to test to see if it
should resume normally or if a throw is telling it to ignore everything
immediately following and proceed to its own exit? The outer application
cannot even benefit from the compiler's presumed knowledge that code is
executing outside a try block because every function except main is subject to
a call from within one, and the binding occurs at run time not at compile
time.
How about library functions? Will they all have the unwinding overhead? It
seems to me that they would have to. What about extern "C" functions? How will
the compilers be smart enough to know not to return to them? They will
certainly not have exception-handling unwinding tests and they would continue
processing merrily along as if the exception had not happened. Will C
programmers pay the price because strcmp needs to participate in C++ exception
handling? I hope not. You couldn't simply reset the stack of every C function
the way longjmp does. A C function can call a C++ function through a callback
pointer, and that C++ function needs an unwind even if the C function does
not.
If you do not unwind the stack by a controlled sequence of returns up through
the function tree, then each instantiation of an automatic object must be
recorded somewhere in near-global space so that the throw can call the
destructors. That in itself represents a significant overhead because it adds
hidden processes to constructors. Can the compiler infer from the absence of a
declared destructor that a class does not need destruction during the
unwinding? These, of course, are the meanderings of an uninitiated outsider. I
don't know how to implement the feature because I haven't tried to solve the
problem. I'm having trouble figuring out a reliable way to make it work badly,
much less efficiently. The implementers will no doubt be wrestling with all of
these issues. Bjarne Stroustrup says that he implemented exception handling in
the laboratory before he proposed its specification in the ARM. Watcom has a
compiler that implements it now. How well these implementations work remains
to be seen.
I hope they work well because the concept of exception handling is a Good
Thing. We need to understand it so that we can exploit it in our designs. We
need to expand our understanding of the libraries that we use to know which
exceptions they throw and what our programs should do about them. We need to
analyze the exception strategy of every library to ensure that careless
library designers do not create potential collisions of the
exception-identifying parameters. We need to be mindful that exceptions thrown
from somewhere in a library can strand dynamic resources unless we
meticulously plan for each such circumstance. Finally, we need to keep a
watchful eye for new versions of tried and true libraries that now throw
exceptions where they never did before.


Plauger's Purpose


From July 1986 until June 1993, P.J. (Bill) Plauger wrote a column for
Computer Language magazine called "Programming on Purpose." His tenure ended
when the magazine changed its name and emphasis and, essentially, ceased to be.
His audience would now be elsewhere, it seems. I'm not sure, but I think that
the seven years he spent at that assignment is a record for a
programmer-oriented column in a national magazine. Plauger's columns were a
breath of fresh air. Always the pragmatist, he took aim at icons,
disintegrated graven images, and constantly reminded us that far more code
rises from the trenches than floats down from the towers. He stumps the
lecture circuit, too, addressing conferences of programmers, teaching his
sometimes irreverent, often entertaining, and always on-target philosophies
about programming. He is an anomaly, a good programmer who can speak and
write.
The "Programming on Purpose" columns are now available in three volumes from
Prentice Hall. They are called, of course, Programming on Purpose, and the
volumes are subtitled, "Essays on Software Design," "Essays on Software
People," and "Essays on Software Technology." I have the first volume and will
soon have the other two. Bill follows each essay with an "afterword," which
allows him to reflect today on what he wrote in the past and how his views
have held up in light of subsequent developments in the industry. Most stood
the test of time rather well, I thought.
New things and ideas are usually received by the elders of the community in
one of three ways: non-negotiable utter rejection on one end; zealous and
rhapsodic total immersion on the other; and open-minded skepticism somewhere
on the fence. Reading these essays, I had fun watching Dr. Plauger approach
object-oriented programming with caution and suspicion and then eventually
accept it, but for different reasons than the popular ones. I remember a
discussion we had about C++ several years ago. He was reluctant to give it
full measure, saying, "I'm a procedural kind of guy." Now he is an active
participant in the formal standardization of the C++ language.
This trilogy takes its place next to the writings of Brooks and Weinberg in my
collection of books necessary for healthy and happy attitudes toward
programming. Highly recommended.

Example 1:
void foo()
{
    FILE *fp = fopen(fname, "rt");
    char *cp = malloc(1000);
    /* parse the input */
    /* ... */
    if (error)
        longjmp(jb, ErrorCode);
    free(cp);
    fclose(fp);
}
Example 2:
void parse()
{
    String str("Parsing now");
    // parse the input
    // ...
    if (error)
        longjmp(jb, ErrorCode);
}
Example 3:
(a)
try {
    // C++ statements
}
(b)
try {
    // C++ statements
}
catch(int err) {
    // error-handling code
}
(c)
int main()
{
    // --- the try block
    try {
        foo();   // call a lower function
    }
    // --- catch exception handler
    catch(int errcode) {
        // error-handling code
    }
    return 0;
}
// --- program function
void foo()
{
    // C++ statements to do stuff
    if (error)
        throw -1;
}
(d)
void f() throw(char *, int)
Example 4:
class FileError {
public:
    virtual void HandleException() = 0;
};
class Locked : public FileError {
public:
    void HandleException();
};
class NotFound : public FileError {
public:
    void HandleException();
};
void foo();   // defined below; bar needs the declaration
void bar()
{
    try {
        foo();
    }
    catch(FileError& fe) {
        fe.HandleException();
    }
}
void foo()
{
    // ...
    if (file_locked)
        throw Locked();
}
Example 5:
void foo()
{
    String *str = new String("Hello, Dolly");
    // ...
    if (file_locked)
        throw Locked();
    delete str;   // never reached if the throw occurs
}


September, 1993
ALGORITHM ALLEY


Diving into Windows Bitmaps: Part Two




Tom Swan


I'm not much of a gambler. Never mind what Jeff Duntemann had to say about my
skills at the rubber-chicken toss in Las Vegas at the Circus Circus hotel one
Comdex ago (DDJ, March 1991). True, I beat the feathers off Jeff that evening
as we catapulted flaccid poultry into pots rotating on a lazy Susan ringed by
other chicken-hearted players. But you mustn't believe my friend Jeff when he
claims high-stakes innocence. As I recall, it took three of us to drag him
away complaining "this time for sure!" from the quarter avalanche machine--or
whatever they call those jukebox-size boxes into which one flings quarters at
layers of coins resting under oscillating pushers that promise to shove riches
onto your lap on every turn. Jeff told me that he can tell, just by looking,
which machine is about to pay off. Maybe he's right, but I'll stick to
chickens.
Actually, during that same trip, I did win $6.50 on five aces at a nickel
poker slot machine, so perhaps I do have a smoldering Midas touch. Frankly, I
doubt it, but when it comes to testing algorithms, who couldn't use a
high-power lucky star? Rubber chickens and two-bit avalanches might not help
in software development, but sometimes, there's simply no other way to prove
an algorithm's correctness except to trust in chance.


Monte Carlo Method


The term "Monte Carlo Method," named after the famed Monaco casino, came into
being during the mid-'40s from its use in mathematical problems that were
solved with random numbers. Now the term generally refers to test procedures
that simulate a computer program's input data with a random-number generator.
A typical Monte Carlo test repeatedly puts a program through its paces until
an error is detected, or until the programmer decides enough testing is
enough. In short, to "Monte Carlo" your code, as some would say, you write a
program to create sample data files at random, and then feed those files into
the program until you are satisfied that it's working correctly.
How do you know when to stop a Monte Carlo test? You don't. It's a gut
feeling--like the one you get when you just know the dealer is about to turn
up an ace. That feeling might steer you wrong in blackjack, but in
programming, if you run a Monte Carlo test long enough--and if your test
procedures are sound--it's a good bet the code is working. For a Monte Carlo
test to be trustworthy, however, it should follow other tests that prove
assumptions you've made about a program or algorithm. Only after you are
reasonably sure that a program is working should you subject it to Monte Carlo
scrutiny. Just throwing a bunch of random numbers at a program won't do any
more for your code than pitching elastic birds into pots has done for my bank
account.
The value of the Monte Carlo method is especially high for a program that
parses data files of many different shapes and sizes, such as the Windows
bitmap compressor introduced last month. With it you can test how well a
program handles common data sequences. You can also test the code's outer
limits. But you can never test all possible sequences the program will ever
meet. That's where the Monte Carlo method comes in. The secret is to create
random test files that closely resemble the real data the program will spend
its life consuming.
Last month, I introduced algorithms and two test programs for compressing and
decompressing Windows bitmap files. To simplify debugging, rather than mangle
real bitmaps, I programmed the tests to operate on phony bitmaps stored in
text files. This month, I'll list the remaining test programs (including a
Monte Carlo sample-file generator) and the final C++ utility that can compress
real 256-color bitmap files. I'll also point out a few quirks in the
algorithms that gave me trouble.


Test Suite


Listing One (page 132), CREATE.CPP, is a C++ program that creates sample
bitmap files in the format illustrated in Figure 1, repeated from last month.
This isn't a real bitmap file, of course. It's a text representation of pixels
that makes debugging and testing easier. The first two values represent the
number of pixels per line (0A) and the number of scan lines in the image (04).
The remaining values are hexadecimal bytes, one per pixel.
Just outputting pixel values at random would create unsuitable test files with
few same-color runs. Such files would not exercise the algorithm's primary
compression technique in which pixel groups such as 07 07 07 07 are run-length
encoded as two values, 04 07 (four pixels of color 07). To create more
realistic test files in the format illustrated by Figure 1, CREATE uses a
simple but effective method listed in Pascal in Example 1. First, set a pixel
to a random value from 0 to 255. Then, in a For loop (NP represents the number
of pixels in a scan line), use a random function to decide whether to change
the current pixel to another value. In this way, the program outputs pixel
runs as might be found in a typical bitmap. The key here is to use an
expression such as If (random(100)>=80) to decide whether to change an aspect
of the output rather than directly use random()'s result. In this case, the if
statement causes pixels to change color only about 20 percent of the time.
Listing Two (page 132), COMPARE.CPP, completes the test suite by comparing two
sample bitmap files, ignoring the two-value information line at top (see
Figure 1). Listing Three (page 132), MONTE.BAT, uses CREATE and COMPARE along
with TPACK and TUNPACK from last month's column to test the bitmap compression
and decompression algorithms in Monte Carlo style. Run MONTE.BAT with all four
C++ programs compiled in the current directory. The test ends automatically
upon detecting any errors, or you can press Ctrl+C to end the test when you
are satisfied the code is working.


Bitmap Compressor


Listing Four (page 132), BPACK.CPP, is the final bitmap compression utility. I
wrote the code in C++, but I did not use IOStreams or classes, so the program
should be easily ported to Pascal, C, and other languages. The program can
compress only 256-color, Windows bitmap files. If you don't have any files in
that format, load any 16-color bitmap into Windows Paintbrush, and save it
under a different name with Type set to "256-Color bitmap." Or, just create a
new image. If your file is named MYBITS.BMP, compress it with the DOS command
BPACK MYBITS.BMP PACKED.BMP, then compare file sizes. Most files compress to
about one half of their original size, but in some cases, BPACK has produced
compression ratios as good as 80 percent.
Of course, you can also use a compression utility such as PKZIP or LHA to
compress bitmap files. To view images compressed this way, however, you first
have to decompress them to other files. To view images compressed by BPACK,
you simply view them. Most Windows display drivers can directly display packed
bitmaps. Unfortunately, though, not all Windows programs can do the same. For
example, loading a packed bitmap into Paintbrush produces the error "The
format of this file is not supported." (You'd think Windows' own programs
would recognize bitmap compression, but evidently, this isn't the case.)
To view a packed bitmap image, therefore, you must use another utility such as
the BITZOOM.CPP program that I wrote for Borland's The World of ObjectWindows
video, or the program MDIBITS.PAS in my book, Borland Pascal 7.0 Programming
for Windows (Bantam, 1993). The Pascal program has a bug, however, that
prevents it from loading packed bitmaps. To repair the problem, add a global
Word variable named Result to file UBITMAP.PAS, and change all instances of
BlockRead(X,Y,Z) to BlockRead(X,Y,Z,Result). Without the Result parameter,
BlockRead forces a run-time error of 100 if the number of bytes read from a
file does not match the requested quantity in parameter Z. The repaired files
are included on disk and on line with this month's files.


Quirks


The following are some suggestions for improving BPACK, and a few details
about quirks in the compression algorithms. See last month's column for
explanations about terms used in these items:
Absolute-mode runs must have a minimum of three pixels, forcing runs of one
and two pixels to be encoded as 01 0X or 01 0X 01 0Y where 0X and 0Y are pixel
values. This fact isn't mentioned anywhere in Microsoft's documentation.
Normally, delta escape codes are used to define outlines of foreground images
to be displayed on fixed backgrounds. For general-purpose compression,
however, delta escape codes are of little value, and they are difficult to
program in a general way. For these reasons, BPACK does not generate delta
escape codes.
Bitmaps with little redundancy might take more space when compressed. A better
compression program could detect this condition and warn users not to bother
compressing the file.
Short, same-color runs embedded inside different-color, absolute-mode runs
aren't handled as efficiently as possible. For example, it would be better to
encode the pixels 01 02 03 04 04 04 01 02 03 as a straight absolute-mode run
rather than as two absolute runs separated by the run-length encoded unit, 03
04. I suspect that the compression algorithm could deal with this and similar
cases better than it does now, but I stopped short of making the necessary
improvements.
Not all Windows display drivers correctly handle compressed bitmaps. If yours
doesn't, complain loudly to your hardware vendor. There's no excuse for a
Windows display driver not recognizing bitmap compression.
The bitmap compression algorithm is "horizontally oriented." That is, runs of
same-color pixels are detected only on pixel rows, not columns. A revised
algorithm could compare the results of horizontal and vertical run-length
encoding, and select the better of the two. (Windows would no longer recognize
the resulting file, however.)
BPACK handles only 256-color (one byte per pixel) bitmaps. As a project, you
might consider adding support for 16-color images. The compression algorithm
is nearly identical in both cases. If you write the code, I'd like to see the
result.


Your Turn


I may not be a gambler, but I'll wager this isn't the last "Algorithm Alley"
on data compression--one of the most fascinating subjects in programming. Feel
free to send comments, code, and algorithms to me in care of DDJ. You can also
upload text files to my CompuServe ID, 73627,3241. Or, if you want to chat in
person, try the rubber chicken arena at Circus Circus. I'm usually there.

Figure 1: Sample "fake" bitmap text file
0A 04
01 01 01 02 02 02 02 02 03 03
01 02 03 04 05 06 07 08 09 0A
01 02 03 04 04 04 01 02 03 04
01 02 01 02 01 02 01 02 01 02
Example 1: Generating pixel runs at random
pixel := random(256);
for I := 0 to NP - 1 do
begin
 if (random(100) >= 80)
 then pixel := random(256);
 Write(pixel)
end;


[LISTING ONE]

/* ----------------------------------------------------------- *\
** create.cpp -- Create random test bitmap file **
** Copyright (c) 1993 by Tom Swan. All rights reserved. **
\* ----------------------------------------------------------- */

#include <iostream.h>
#include <iomanip.h>
#include <fstream.h>
#include <stdlib.h>
#include <time.h>

typedef unsigned char Byte;

void Error(const char *msg);
void RandomizeScanLine(Byte *sl, int np);
int RandomRange(int low, int high);
void PutByte(Byte b);

int main()
{
 int np; // Number of pixels per scan line
 int ns; // Number of scan lines

 cout << setiosflags(ios::uppercase);
 cout << setfill('0') << hex;
 randomize();
 np = RandomRange(10, 640);
 ns = RandomRange(4, 480);
 cout << setw(2) << np << ' ' << setw(2) << ns << endl;
 Byte *sl = new Byte[np]; // Allocate scan line
 if (!sl) Error("out of memory");
 while (ns-- > 0) {
 RandomizeScanLine(sl, np);
 for (int i = 0; i < np; ++i)
 PutByte(sl[i]);
 cout << endl;
 }
 delete[] sl; // array form matches new Byte[np]
 return 0;

}
// Display error message and halt
void Error(const char *msg)
{
 cerr << endl << "Error: " << msg << endl;
 exit(1);
}
// Insert random pixel values into scan line sl
void RandomizeScanLine(Byte *sl, int np)
{
 int pixel = random(256);
 for (int i = 0; i < np; ++i) {
 if (random(100) >= 80)
 pixel = random(256); // 20% chance of new pixel value
 sl[i] = pixel;
 }
}
// Return integer at random from low to high
int RandomRange(int low, int high)
{
 return low + random((high - low) + 1);
}
// Write byte b in hex in 2 columns with leading 0
// plus one blank to cout
void PutByte(Byte b)
{
 cout << setw(2) << (unsigned int)b << ' ';
}



[LISTING TWO]

/* ----------------------------------------------------------- *\
** compare.cpp -- Compare files created by tpack and tunpack **
** Copyright (c) 1993 by Tom Swan. All rights reserved. **
\* ----------------------------------------------------------- */

#include <iostream.h>
#include <iomanip.h>
#include <fstream.h>
#include <stdlib.h>

void Error(const char *msg);
int main(int argc, char *argv[])
{
 int np, ns; // Number of pixels, number of scan lines
 int b1, b2; // Bytes from files 1 (original) and 2 (converted)

 if (argc <= 2)
 Error("filenames missing (enter two names)");
 ifstream ifsOriginal(argv[1], ios::in);
 if (!ifsOriginal)
 Error("unable to open file #1");
 ifstream ifsConverted(argv[2], ios::in);
 if (!ifsConverted)
 Error("unable to open file #2");
 // Skip number of pixels np and number of scan lines ns
 // at beginning of original test data

 ifsOriginal >> hex >> np >> ns;
 while (!ifsOriginal.eof()) {
 ifsOriginal >> hex >> b1;
 ifsConverted >> hex >> b2;
 if (b1 != b2) Error("Files are different");
 }
 cerr << endl << "Files match" << endl;
 return 0;
}
// Display error message and halt
void Error(const char *msg)
{
 cerr << endl << "Error: " << msg << endl;
 exit(1);
}



[LISTING THREE]

@echo off
rem
rem monte.bat -- Monte Carlo batch file for bitmap tests
rem
:REPEAT
echo.
echo Starting new test
echo Deleting old bitmap.00? files
del bitmap.00?
echo Creating bitmap.000 test file
create >bitmap.000
echo Packing bitmap.000 to bitmap.001
tpack bitmap.000 >bitmap.001
echo Unpacking bitmap.001 to bitmap.002
tunpack bitmap.001 >bitmap.002
echo Comparing bitmap.000 and bitmap.002
compare bitmap.000 bitmap.002
if errorlevel 1 goto ERROR
goto REPEAT
:ERROR
echo.
echo ERROR: File mismatch found!!!
echo.
:END



[LISTING FOUR]

/* ----------------------------------------------------------- *\
** bpack.cpp -- Pack (compress) a Windows bitmap file **
** Copyright (c) 1993 by Tom Swan. All rights reserved. **
\* ----------------------------------------------------------- */

#include <stdio.h>
#include <stdlib.h>
#include <alloc.h>
#include <windows.h>


// Miscellaneous definitions
#define FALSE 0
#define TRUE 1
// State-machine definitions
#define READING 0 // General reading mode
#define ENCODING 1 // Encoding same-color pixel runs
#define ABSMODE 2 // Absolute-mode encoding
#define SINGLE 3 // Encoding short absolute-mode runs
#define ENDOFLINE 4 // End of scan line detected
// Type declarations
typedef unsigned char Byte;
struct BitmapStruct {
 BITMAPFILEHEADER bfh; // Bitmap file header
 BITMAPINFOHEADER bih; // Bitmap info header
 RGBQUAD *bmiColors; // Pointer to color table (variable size)
 int clrSize; // Number of colors in table
};
// Function prototypes
void Instruct();
void Error(char *msg);
int Odd(int v);
void ReadBitmapHeader(FILE *inpf, BitmapStruct &rbs);
int IsCompressible(BitmapStruct &rbs);
void WriteBitmapHeader(FILE *outf, BitmapStruct &rbs);
void WriteBitmapBits(FILE *inpf, FILE *outf, const BitmapStruct &rbs);
void PackRLE8(FILE *outf, int np, const Byte *sl);
void PutByte(FILE *outf, Byte b);

// Global variable
long imageSize;
int main(int argc, char *argv[])
{
 BitmapStruct bs;
 puts("\nBitmap file compressor by Tom Swan");
 if (argc <= 2) Instruct();
 FILE *inpf = fopen(argv[1], "rb");
 if (!inpf) Error("Can't open input file");
 ReadBitmapHeader(inpf, bs);
 if (!IsCompressible(bs)) Error("Cannot compress this file");
 FILE *outf = fopen(argv[2], "wb");
 if (!outf) Error("Can't open output file");
 printf("Compressing file...");
 WriteBitmapHeader(outf, bs); // Write dummy header
 WriteBitmapBits(inpf, outf, bs); // Compress bitmap image
 bs.bih.biCompression = BI_RLE8; // Mark bitmap as compressed
 bs.bih.biSizeImage = imageSize; // Modify image size value
 fseek(outf, 0, SEEK_END); // Seek to eof for next statement
 bs.bfh.bfSize = ftell(outf); // Modify file size in bytes
 WriteBitmapHeader(outf, bs); // Write real header
 fclose(inpf);
 fclose(outf);
 printf("\n%s --> %s\n", argv[1], argv[2]);
 delete[] bs.bmiColors; // array form matches new RGBQUAD[clrSize]
 return 0;
}
// Display instructions and exit program
void Instruct()
{
 puts("\nSyntax: BPACK infile outfile");

 puts("\nEnter the name of a bitmap (infile) to compress.");
 puts("The program packs the bitmap if possible, and");
 printf("stores the results in a new file (outfile).\n");
 puts("The original bitmap file is not changed in any way.");
 puts("This version is limited to 8-bit (256-color) files");
 puts("that are not already compressed.");
 exit(0);
}
// Display error message and exit program
void Error(char *msg)
{
 printf("\nERROR: %s\n", msg);
 exit(1);
}
// Return true if v is odd
int Odd(int v)
{
 return v & 0x01;
}
// Read bitmap headers and color table into rbs
void ReadBitmapHeader(FILE *inpf, BitmapStruct &rbs)
{
 if (fread(&rbs.bfh, sizeof(rbs.bfh), 1, inpf) != 1)
 Error("Cannot read bitmap file header");
 if (fread(&rbs.bih, sizeof(rbs.bih), 1, inpf) != 1)
 Error("Cannot read bitmap info header");
 if (rbs.bih.biClrUsed != 0) {
 rbs.clrSize = (int)rbs.bih.biClrUsed;
 } else switch (rbs.bih.biBitCount) {
 case 1: rbs.clrSize = 2; break;
 case 4: rbs.clrSize = 16; break;
 case 8: rbs.clrSize = 256; break;
 case 24: rbs.clrSize = 0; break;
 default: Error("biBitCount not 1, 4, 8, or 24");
 }
 if (rbs.clrSize == 0) Error("clrSize == 0");
 rbs.bmiColors = new RGBQUAD[rbs.clrSize];
 if (!rbs.bmiColors)
 Error("bmiColors is null. Out of memory.");
 if (fread(rbs.bmiColors, sizeof(RGBQUAD),
 rbs.clrSize, inpf) != rbs.clrSize)
 Error("Cannot read color table");
}
// Returns true if bitmap header rbs is compressible
// Required format: MS DIB, uncompressed, 8-bit (256-color)
int IsCompressible(BitmapStruct &rbs)
{
 if (rbs.bfh.bfType != 0x4d42) return FALSE;
 if (rbs.bih.biSize != sizeof(BITMAPINFOHEADER)) return FALSE;
 if (rbs.bih.biBitCount != 8) return FALSE;
 if (rbs.bih.biCompression != BI_RGB) return FALSE;
 return TRUE;
}
// Write bitmap headers and color table in bs to outf
void WriteBitmapHeader(FILE *outf, BitmapStruct &rbs)
{
 rewind(outf);
 if (fwrite(&rbs.bfh, sizeof(rbs.bfh), 1, outf) != 1)
 Error("writing bitmap file header");

 if (fwrite(&rbs.bih, sizeof(rbs.bih), 1, outf) != 1)
 Error("writing bitmap info header");
 if (fwrite(rbs.bmiColors, sizeof(RGBQUAD),
 rbs.clrSize, outf) != rbs.clrSize)
 Error("writing color table");
}
// Read pixel data from inf, compress and write to outf
void WriteBitmapBits(FILE *inpf, FILE *outf, const BitmapStruct &rbs)
{
 int np; // Number of pixels per scan line
 int ns; // Number of scan lines
 int slSize; // Size of one scan line in bytes
 Byte *sl; // Pointer to scan line
 // Assign miscellaneous sizes
 np = (int)rbs.bih.biWidth;
 ns = (int)rbs.bih.biHeight;
 // Allocate scan line buffer
 if (Odd(np))
 slSize = np + 1; // Must have an even number of bytes
 else
 slSize = np;
 if (slSize <= 0) Error("slSize <= 0");
 sl = new Byte[slSize];
 if (!sl) Error("out of memory");
 // Read and compress scan lines
 while (ns-- > 0) {
 if (fread(sl, 1, slSize, inpf) != slSize)
 Error("reading pixel scan line");
 PackRLE8(outf, np, sl);
 }
 delete [] sl; // Array delete to match new Byte[slSize]
 PutByte(outf, 0); // Mark end of bitmap
 PutByte(outf, 1);
}
// Compress and write np pixels in sl to output file outf
void PackRLE8(FILE *outf, int np, const Byte *sl)
{
 int slx = 0; // Scan line index
 int state = READING; // State machine control variable
 int count; // Used by various states
 Byte pixel; // Holds single pixels from sl
 int done = FALSE; // Ends while loop when true
 int oldcount, oldslx; // Copies of count and slx

 while (!done) {
 switch (state) {
 case READING:
 // Input:
 // np == number of pixels in scan line
 // sl == scan line
 // sl[slx] == next pixel to process
 if (slx >= np) // No pixels left
 state = ENDOFLINE;
 else if (slx == np - 1) { // One pixel left
 count = 1;
 state = SINGLE;
 } else if (sl[slx] == sl[slx + 1]) // Next 2 pixels equal
 state = ENCODING;
 else // Next 2 pixels differ
 state = ABSMODE;
 break;
 case ENCODING:
 // Input:
 // slx <= np - 2 (at least 2 pixels in run)
 // sl[slx] == first pixel of run
 // sl[slx] == sl[slx + 1]
 count = 2;
 pixel = sl[slx];
 slx += 2;
 while ((slx < np) && (pixel == sl[slx]) && (count < 255)) {
 count++;
 slx++;
 }
 PutByte(outf, (Byte)count); // Output run-length-encoded unit
 PutByte(outf, pixel);
 state = READING;
 break;
 case ABSMODE:
 // Input:
 // slx <= np - 2 (at least 2 pixels in run)
 // sl[slx] == first pixel of run
 // sl[slx] != sl[slx + 1]
 oldslx = slx;
 count = 2;
 slx += 2;
 // Compute number of bytes in run
 while ((slx < np) && (sl[slx] != sl[slx - 1]) && (count < 255)) {
 count++;
 slx++;
 }
 // If same-color run found, back up one byte
 if ((slx < np) && (sl[slx] == sl[slx - 1]) && (count > 1))
 count--;
 slx = oldslx; // Restore scan-line index (always)
 // Output short absolute runs of less than 3 pixels
 if (count < 3)
 state = SINGLE;
 else {
 // Output absolute-mode run
 PutByte(outf, 0);
 PutByte(outf, (Byte)count);
 oldcount = count;
 while (count > 0) {
 PutByte(outf, sl[slx]);
 slx++;
 count--;
 }
 if (Odd(oldcount))
 PutByte(outf, 0); // End run on word boundary
 state = READING;
 }
 break;
 case SINGLE:
 // Input:
 // count == number of pixels to output
 // slx < np
 // sl[slx] == first pixel of run
 // sl[slx] != sl[slx + 1]
 while (count > 0) {
 PutByte(outf, 1); // Run of length 1
 PutByte(outf, sl[slx]);
 slx++;
 count--;
 }
 state = READING;
 break;
 case ENDOFLINE:
 PutByte(outf, 0);
 PutByte(outf, 0);
 done = TRUE;
 break;

 default:
 Error("unknown state in PackRLE8()");
 break;
 }
 }
}
// Write byte b to output file outf. Increments global imageSize variable
void PutByte(FILE *outf, Byte b)
{
 if (fwrite(&b, 1, 1, outf) != 1)
 Error("writing byte to output file");
 imageSize++;
}

September, 1993
UNDOCUMENTED CORNER


Documenting Documentation: The Windows .HLP File Format, Part I




Pete Davis


Pete works for a small consulting firm as a programmer/analyst writing
client-server software in Windows 3, OS/2, DOS, and Windows NT. He is working
on a book, tentatively titled The Hitchhiker's Guide to Win32 Programming, to
be published by Addison-Wesley. Pete can be contacted on CompuServe at
71644,3570.




Introduction




by Andrew Schulman


In my introductions to this column for the past few months, I've been
desperately seeking someone who would provide us with the file format used by
the Windows help system. But even though I knew that .HLP files were
important, and that many Windows developers would love to get their hands on
this information, I was surprised by the overwhelming response to this
request. Thousands of DDJ readers wrote in, explaining they had already
"cracked" a bit of the .HLP file format, and that they were anxious to share
information with others who might have figured out other parts. Well,
actually, it was only about half a dozen readers who had figured out parts of
the .HLP format, but that's still pretty good for something like this.
Why is the .HLP file format important? For starters, there are a number of
WinHelp-related tools available (RoboHelp and DocToHelp spring to mind).
However, all these tools produce Rich Text Format (RTF) files, which must then
be fed into Microsoft's extremely slow help compiler to produce a .HLP file.
Clearly, it would be nice if someone other than Microsoft knew how .HLP files
work, so that we might get better and faster help compilers than the one from
Microsoft, just as companies such as SLR produce better and faster linkers
than Microsoft's. Knowing the .HLP file format would also make it possible to
write help decompilers, and to produce a WinHelp viewer that runs under DOS,
without requiring Windows.
It's important to realize that we're not just talking about "help" in the
narrow F1 sense. As Ron Burk noted in his article on undocumented WinHelp
(DDJ, June 1993), Microsoft has a broader view of WinHelp. The .MVB files used
by Microsoft's Multimedia Viewer are really just extended .HLP files.
Microsoft in fact is using this extended WinHelp system as the core for a new
type of application: Both the Microsoft Developer Network (MSDN) and Cinemania
CD-ROM products are essentially huge .MVB files. Yes, huge--MSDNCD.MVB (which
we'll examine later on) is 270 megabytes, and CINMANIA.MVB is 139 Mbytes!
Using the WinHelp file format, you can see how these applications (both of
which are excellent, by the way) are put together.
As I said, a surprising number of you wrote in with information about WinHelp.
Many thanks to Brian Walker, Carl Burke, and Lou Grinzo in particular for
their contributions! Pete Davis (working with Ron Burk, editor of Windows/DOS
Developer's Journal) did a great amount of reverse engineering, put together
sample code, and wrote this month's article.
The central insight in Pete's article is that .HLP and .MVB files are
organized around a B-tree structure he calls WHIFS, the WinHelp Internal File
System. In somewhat the same way that a Stacker or DoubleSpace "volume" looks
like a single file to DOS, but actually holds all of your files, likewise a
single .HLP file is just a collection of WHIFS files. Interestingly, Microsoft
documents this in a roundabout way, when the Windows SDK mentions the
possibility of "baggage" files and of getting at them via some WinHelp macros.
Pete and Ron figured out that everything in the .HLP file--titles, keywords,
phrases, the help text itself--is treated in exactly the same way as
third-party baggage. All the help data in a .HLP file is kept in various
internal WHIFS files which have names such as |SYSTEM, |Phrases, and |TOPIC. It's
the initial pipe (|) character that distinguishes these built-in files from any
baggage. The same format is used for .HLP files under Win32, on both Intel and
MIPS machines (though there doesn't seem to be a Phrases file).
This is a big topic, and there will barely be room to squeeze everything into
even a two-part article. This month, Pete explains the basics of the WHIFS
B-tree system, and explains a few of the internal files, such as SYSTEM which,
for example, contains any WinHelp macros. Next month, Pete enumerates the rest
of the internal files, including TOPIC, which contains the actual help text.
Because the text is usually compressed, the article will show how to
decompress it. Part Two will also describe a full-blown HelpDump program.
After the conclusion of Pete's .HLP article next month, it looks like the next
topic will be Novell's "proprietary" NetWare Core Protocol (NCP). See? We
don't just pick on Microsoft in this column. Any company that has an important
software interface that is kept under wraps is fair game here. Send your
suggestions to me at 76320,302 on CompuServe (that's 76320.302@compuserve.com
from the Internet).
Who needs to know about the Windows .HLP file format? After all, Microsoft
provides a compiler and third-party vendors supply WinHelp development tools.
What more could you need?
Well, there are a lot of things that could be done with solid information on
.HLP files. Vendors could create development environments with help
compilation. New WinHelp viewers could be written with extensions. Using the
file system built into WinHelp, you could write a viewer with additional
features not supported by WINHELP.EXE or VIEWER.EXE. These features could be
added through the use of additional WHIFS files, and still allow the help
files to be viewed by WINHELP.EXE (though without the additional features, of
course).
One thing I think we'd all like to see is a help compiler that's just plain
faster than Microsoft's. Also, there's currently no way to reverse the help
compilation as there is with Microsoft's DOS help compiler.
The term "WinHelp file" includes not just .HLP files but also Multimedia
Viewer (.MVB) files, such as those used in Microsoft's Cinemania and MSDN CD-ROM
products. The Multimedia Viewer is essentially an extended WinHelp, so many of
the things we'll talk about apply to both WinHelp and VIEWER.
While there are important differences between Windows 3.0 and 3.1 help files,
I'll focus here almost entirely on 3.1 files.


The .HLP Internal File System


The key to the WinHelp file format is that it's based on a structure I call
the WinHelp Internal File System (WHIFS). WHIFS is like DOS in a way. It
provides a directory of files and a way to find the files. The difference is
that all of the WHIFS files are kept inside a single .HLP file. As far as DOS
is concerned, the .HLP file is just one file, but to WinHelp, the .HLP file
contains several (possibly many) files. All that is needed to get at the
information in WinHelp is a simple interface to the WHIFS. Once you have
access to the WHIFS, everything else is available to you.
The WHIFS is actually documented by Microsoft, in an odd way, in its
description of "baggage" files in the Programming Tools manual in the Windows
3.1 SDK. Baggage files are explained a little more clearly in the MSDN CD-ROM,
in the section on creating Viewer files. Baggage files are simply files that
you can list in the help project file, which the help compiler will then
import into the help file and store in its own WHIFS.
What isn't documented is that a .HLP file contains many more internal files
than just the baggage files included by the user. In fact, everything is in
the WHIFS files, from keyword B-trees, to the Phrases table to the Topic file
that contains the actual help text. Even the list of available fonts is kept
in a WHIFS file. The key insight for understanding the .HLP file format is
that what Microsoft says refers only to baggage files in fact refers to
everything in the .HLP or .MVB file. Everything is in the WHIFS except for a 16-byte
header at the beginning of the file; this header contains the WHIFS's location
within the .HLP file.
This means that you don't necessarily have to write code from outside WinHelp
to get at this information. You could access WinHelp's internal file system by
using the WinHelp Internal File System commands. This is described in the
Multimedia Developer's Kit and the Microsoft Developer's Network CD. Both
refer to VIEWER.EXE, but, as Ron Burk explained in the June "Undocumented
Corner" (DDJ, June 1993), the same commands are available in WinHelp.
Microsoft's intention is that you use these to access your baggage files, but
it turns out that you can access the WinHelp internal files in the same way.
That's Ron Burk's specialty, though, and if you want to know more, you'll have
to read his upcoming book, tentatively titled WinHelp for Programmers and
Technical Writers. (All of my work in reverse-engineering WinHelp was done
with Ron, who edits the Windows/DOS Developer's Journal.) In this article,
I'll be developing DOS programs to manipulate .HLP files, so I won't be able
to exploit the built-in Windows functionality.


Help File Header


WHSTRUCT.H in Listing One (page 136) shows all the key data structures that
make up a .HLP or .MVB file. The .HLP file starts off with a structure called
HELPHEADER (see Listing One). The first field is a 4-byte magic number,
0x00035F3F, which appears to be constant in all .HLP and .MVB files. You can
use this field to verify that you have a valid .HLP or .MVB file. The second
HELPHEADER field is the long file offset of the crucial Windows Help Internal
File System header (called WHIFSBTREEHEADER in Listing One). The other fields
don't seem very important.
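As a minimal sketch of this step, reading and validating the header might look like the following. The HELPHEADER struct is redeclared locally so the fragment stands alone, and GetWHIFSOffset is a hypothetical helper name, not code from the listings:

```c
#include <assert.h>
#include <stdio.h>

/* Local redeclaration of HELPHEADER from Listing One. Here it is only
   used to round-trip the same struct we write, so host field sizes
   don't matter; real on-disk parsing would use packed 4-byte fields. */
typedef struct {
    unsigned long MagicNumber;  /* 0x00035F3F in every known .HLP/.MVB */
    long WHIFS;                 /* file offset of the WHIFS B-tree header */
    long Negative1;
    long FileSize;              /* size of the entire .HLP file */
} HELPHEADER;

/* Return the WHIFS header offset, or -1 if this isn't a help file */
long GetWHIFSOffset(FILE *f)
{
    HELPHEADER hdr;
    rewind(f);
    if (fread(&hdr, sizeof(hdr), 1, f) != 1) return -1L;
    if (hdr.MagicNumber != 0x00035F3FL) return -1L;
    return hdr.WHIFS;
}
```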


WHIFS Header



The WHIFS header (WHIFSBTREEHEADER) is central to the entire WinHelp file. As
with several other parts of WinHelp, it is based on a B-tree. For something
that has been kept under wraps so long, the B-tree implementation in WinHelp
is actually fairly straightforward.
As with all B-trees, the WHIFS B-tree has two types of nodes, index and leaf
nodes. In this case, the leaf nodes and index nodes have slightly different
headers, described later. The combination of the header (index or leaf) and
the data following the header create what will be referred to as "a page," so
a 1K page with a 16-byte header would only have 1024-16=1008 bytes of data.
The B-tree has an initial header that describes the dynamics of the B-tree
itself. There are fields for the number of pages, the number of levels, the
root page, and so on. With this information you can then traverse the B-tree,
and find internal files that make up the .HLP file.
Immediately following the header are the actual nodes of the B-tree. Once the
header is read in, you should save the current file location, which marks the
beginning of the first WHIFS node. The WHIFS uses 1-Kbyte pages (elsewhere
WinHelp uses a 2-Kbyte B-tree page size). Thus, to find the file offset of
any node in the WHIFS B-tree, you simply multiply the page number you're
looking for by 1024L (note that WinHelp files can be quite large, so that the
file offset must be a long), and add in the location of the first node (page
0). This is demonstrated in the sample program HELPDIR.C, Listing Two (page
136).
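The offset arithmetic just described fits in one line. WHIFSPageOffset is a hypothetical helper name, and whifsStart is assumed to be the file position saved right after reading the WHIFS header:

```c
#include <assert.h>

/* File offset of WHIFS B-tree page 'page': page 0 starts immediately
   after the WHIFSBTREEHEADER, and WHIFS pages are 1024 bytes each.
   The math is done in long because .HLP files can exceed 64K. */
long WHIFSPageOffset(long whifsStart, int page)
{
    return whifsStart + (long)page * 1024L;
}
```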
The NSplits field in the WHIFS header is the number of "splits" that the
B-tree has suffered. When a leaf node of a B-tree has filled up and a record
is added, then the node is split into two nodes.
The RootPage field tells you which page is the root of the B-tree. Page
numbers start at 0. TotalPages is the number of pages in the entire B-tree.
This is usually used to let you know when you've read in the last record of
the B-tree.
Naturally, NLevels is how many levels there are in the B-tree. All the leaves
are at the same level, so NLevels lets you know when you have reached a leaf
node.
TotalWHIFSEntries is simply a count of all of the filenames in the WHIFS
B-tree. This count has little use, other than to confirm the totals given in
each node of the B-tree or else to know, before traversing the tree, how much
space to allocate for a file list.


The WHIFS B-tree


Each page of the B-tree has a node header. The index nodes and leaf nodes have
slightly different headers. These node headers are generic and work with all
of the B-trees in the entire WHIFS. The index nodes are referred to as nodes
and the leaf nodes as leaves.
NEntries in the index nodes (BTREEINDEXHEADER) indicates how many keys are in
the data page. As discussed earlier, where n is the number of key values,
there will be n+1 pointers to lower nodes (leaf or index).
The leaf node header (BTREELEAFHEADER) has two extra fields, PreviousPage and
NextPage, which are part of a double-linked list to other leaf node headers in
the B-tree. This makes it easy to traverse all of the filenames
alphabetically. You simply follow from one page to the next instead of backing
up to the index page and finding the next entry.
If the only node in the B-tree is the root node (in other words, if there's
only one page in the tree), it will be a leaf and not an index. You will need
to use the leaf header to read it and both PreviousPage and NextPage will be
set to -1.
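The leaf-chain traversal can be sketched with the links alone. This is an illustrative fragment, not code from the article: the LeafLink struct stands in for the PreviousPage/NextPage fields of BTREELEAFHEADER, with the pages assumed already read into memory:

```c
#include <assert.h>

/* Minimal stand-in for the linkage fields of BTREELEAFHEADER */
typedef struct {
    int PreviousPage;   /* index of previous leaf page, -1 if none */
    int NextPage;       /* index of next leaf page, -1 if none */
} LeafLink;

/* Walk the doubly linked leaf chain from 'first', counting pages.
   Real code would process each leaf's directory entries here. */
int CountLeaves(const LeafLink *pages, int first)
{
    int n = 0, p = first;
    while (p != -1) {
        n++;
        p = pages[p].NextPage;
    }
    return n;
}
```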


WHIFS Directory and File Header


The WHIFS directory is simply a list of files in the data of the leaf nodes of
the WHIFS B-tree. Each filename is a variable-length ASCIIZ (null-terminated)
string. Following the null is a long integer pointing to the file's header
relative to the beginning of the help file. This is the information we really
care about! Since the strings are of variable length, this isn't a proper C
structure, and isn't shown in Listing One. Figure 1 shows an annotated hex
dump of various portions of a .HLP file (it happens to be MSMAIL32.HLP from
the MIPS subdirectory of Windows NT).
Files that are used by WinHelp internally are prefixed with the | (pipe)
character; see Table 1. For example, the Keyword B-tree file is called
|KWBTREE. Baggage files included by the user, on the other hand, are stored
under the file's original name as specified in the help project file.
Each file in the WHIFS file system begins with a file header (FILEHEADER in
Listing One), holding the size of the file, including the header, and the size
of the file without the header, followed by a single-byte null (0x00).
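Since a directory entry (ASCIIZ name followed by a long offset) isn't a proper C structure, it has to be picked apart by hand. The following is a sketch under the layout described above; ReadDirEntry is a hypothetical helper, and the offset is assembled byte-by-byte as a 4-byte little-endian value so the code doesn't depend on the host's sizeof(long):

```c
#include <assert.h>
#include <string.h>

/* Parse one WHIFS directory entry at p: copy out the null-terminated
   filename and the 4-byte little-endian file offset that follows it.
   Returns a pointer just past the entry (the next entry, if any). */
const unsigned char *ReadDirEntry(const unsigned char *p,
                                  char *name, long *offset)
{
    size_t len = strlen((const char *)p);
    memcpy(name, p, len + 1);                  /* name plus its null */
    *offset = (long)p[len + 1]
            | ((long)p[len + 2] << 8)
            | ((long)p[len + 3] << 16)
            | ((long)p[len + 4] << 24);
    return p + len + 1 + 4;
}
```

Run against the bytes in Figure 1(b), this would yield the name bag.ini and the offset 0x10.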


From WHIFS to WinHelp


So far we've covered the basic structure of WHIFS. Nothing much here sounds
like it has anything to do with WinHelp! It's just a generic file system using
B-trees. But this is what's required to get at the internal files which are
used for everything else in WinHelp.
Listing Two presents HELPDIR.C, which merely lists all of the WHIFS files in
the WHIFS B-tree. The method used is to traverse the B-tree to the first leaf
node. From there it uses the linked list to traverse the rest of the leaf
nodes. Sample output from HELPDIR is shown in Figure 2, for the same .HLP file
that was hex dumped in Figure 1.
Now that you know how to get to the internal files within WinHelp, it's time
to learn something about the files you're going to be reading. Each of these
files is found in the WHIFS directory and starts with the standard FILEHEADER
structure. Each file has a different purpose and is used by WinHelp in one way
or another. For example, the SYSTEM file holds global information, such as the
amount of compression used. The TOPIC file contains the actual help text
(possibly compressed). These internal files provide the actual help display
that the user sees.


The SYSTEM File


Applications that interpret the TOPIC file or Phrases file should first read
the SYSTEM File (SYSTEMHEADER in Listing One) to determine if any compression
is used. It also tells which version of the help compiler was used to compress
the file. In a few files I've seen, the flag values appear to take on almost
random values. This has caused me some heartache, so the flags in
Listing One (NO_COMPRESSION_310, and so on) should be used with care.
In .HLP files generated by the 3.10 help compiler, the SYSTEMHEADER is followed by a
group of records (SYSTEMREC) that contain information that is needed within
the help file in general. Each record has a record type (HPJ_TITLE, and so on
in Listing One). You must keep track of how far you are through the SYSTEM
file to know if there are any more records (you need the file header with the
file length). It seems the copyright notice shows up in all 3.10 compiles,
although the copyright notice is one null byte if no copyright notice appears
in the .HPJ file.
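The record walk can be sketched as follows. This is an assumption-laden fragment rather than the article's code: WalkSystemRecs is a hypothetical helper that reads the two header words little-endian from an in-memory copy of the SYSTEM file body, and 'remaining' is the byte count left according to the FILEHEADER's file length:

```c
#include <assert.h>
#include <stddef.h>

/* Callback invoked once per SYSTEMREC; data points at DataSize bytes */
typedef void (*SysRecFn)(unsigned type, unsigned size,
                         const unsigned char *data);

/* Walk SYSTEMREC records (RecordType word, DataSize word, then data),
   stopping at the end of the file body or at a truncated record.
   Returns the number of complete records seen. */
int WalkSystemRecs(const unsigned char *p, long remaining, SysRecFn fn)
{
    int count = 0;
    while (remaining >= 4) {
        unsigned type = p[0] | ((unsigned)p[1] << 8);
        unsigned size = p[2] | ((unsigned)p[3] << 8);
        if (4L + (long)size > remaining) break;  /* truncated record */
        if (fn) fn(type, size, p + 4);
        p += 4 + size;
        remaining -= 4L + size;
        count++;
    }
    return count;
}
```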
For the 3.00 compiler, the rest of SYSTEM, following the SYSTEMHEADER, is
reserved for the TITLE. The 3.0 SYSTEM file has a fixed length of 54 bytes,
including the file header. The 3.0 compiler does not create SYSTEMREC records.
A most interesting record type is MACRO_DATA (0x04). As discussed in Volume 4
(Resources), Chapter 15 of the Windows 3.1 SDK, .HLP files can include calls
to macros. For example, RegisterRoutine() takes a DLL name, function name, and
format specification for the function's arguments (U represents an unsigned
long, S represents a far string, and so on). DLL functions registered this way
can then be used in the .HLP file. Because they can link to anything in a DLL,
.HLP files can thus act as applications.
Listing Three (page 137) shows WHMACROS.C, which will list all the macros
included in a .HLP or .MVB file. Sample output from the MSDN CD-ROM and
Microsoft Cinemania is shown in Figure 3. WHMACROS is essentially a
mini-disassembler for the code stored in WinHelp files. Its implementation,
however, is trivial. Macros are stored verbatim (even abbreviations such as JI
for JumpId() are preserved). In Figure 3, you can see that the Search button
is attached to ExecFullTextSearch() in FTUI.DLL (a DLL which provides
full-text search capabilities). Likewise, in the MSDN CD, the previous and
next buttons are attached to the Navigator() function in MSDNCD.DLL.


The Phrases File


The Phrases file is a list of phrases that are used as a sort of compression
of the TOPIC file (which we'll discuss next month). The Phrases file is simply
a collection of phrases that appear in the TOPIC text a certain number of
times. If the help compiler decides that a phrase appears often enough to
warrant replacement, it adds the phrase to the Phrases file and, in the TOPIC
file, puts a reference to the phrase in place of the actual text.
The basic phrase header (PHRASEHDR) is a simple two-word record. The first
word is the number of phrases in the phrase list. The second word is unknown,
but seems to always contain 0x0100.
If the 3.10 help compiler was used with COMPRESSION=HIGH, an alternate phrase
header (ALTPHRASEHDR) is used. It contains one additional field, PhrasesSize,
which is simply the amount of space required by all of the offsets and phrases
in the phrase list. The reason is that the phrases file, when
COMPRESSION=HIGH, is compressed using a simple compression technique. (Next
month I'll discuss the compression technique and provide the code for
decompressing the Phrases file.)
Following the phrase header is an array of offsets, followed by the phrases
themselves. The offsets are used to determine the start and end of a phrase in
the phrase table. There are NumPhrases+1 offsets, where the last offset points
to one byte past the last letter of the final phrase. This is used to calculate the
length of the last phrase.
The entire Phrases file, except for the header, is kept in one contiguous
block of memory. When the phrases are decompressed or loaded into memory, all
offsets point correctly to the phrases. For a given phrase number, n, the
phrase begins at offset[n] and is of the length offset[n+1]-offset[n].
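The offset arithmetic for extracting a phrase is simple enough to show directly. PhraseLength is a hypothetical helper, with the NumPhrases+1 offsets assumed already loaded into an array as described:

```c
#include <assert.h>

/* Length of phrase n: phrase n runs from offset[n] up to (but not
   including) offset[n+1], so the extra final offset makes the last
   phrase's length computable the same way as all the others. */
unsigned PhraseLength(const unsigned *offset, int n)
{
    return offset[n + 1] - offset[n];
}
```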


That's It?


Well, no. I said there were a lot of structures, and we've covered only a few
of the ten system files. For example, all the actual help text is in the
TOPIC file, which I haven't covered yet. I'll have to cover as many as I can next
month. I'll also be providing HelpDump, a program that will dump information
about the various files in a help file. Also, next month, I'll discuss the
compression algorithm used on the Phrases and TOPIC file when using the 3.10
help compiler with COMPRESSION=HIGH.

Before wrapping it up, I want to give credit where credit is due. Much of the
work here was done with or by Ron Burk, the true expert on WinHelp
programming. Several other people found and/or corrected a lot of the
information--Lou Grinzo, Carl Burke, and Brian Walker.
As with any undocumented features, this information needs to be used with
care. For one thing, it probably isn't 100 percent accurate; it certainly
isn't complete. I'd be more than happy to hear from anyone who has some more
insights into the WinHelp format.
Table 1
WinHelp internal files.
==============================================================================
 File       Description
==============================================================================
 bmx        Bitmap files, numbered (bm0, bm24, bm12, and so on. Do not
            start with a |)
 |CONTEXT   Context topic table
 |CTXOMAP   Context mapping to topics
 |FONT      Fonts available to help file
 |KWBTREE   Keyword B-tree file
 |KWDATA    Keyword mappings to topic file
 |KWMAP     Map into the |KWBTREE for quick access
 |Phrases   A list of phrases used for compression of the |TOPIC file
 |SYSTEM    Contains mostly information from the .HPJ file
 |TOMAP     List of pointers to topics
 |TOPIC     Contains the actual help text (usually compressed)
 |TTLBTREE  Topic titles B-tree
 baggage    Appears under the filename exactly as specified in help project
==============================================================================
Figure 1: Annotated hex dump of portions of a .HLP file. (a) All .HLP files
 start with a HELPHEADER. The first long is the .HLP magic number (0x035F3F).
 The next long is the file offset of the WHIFS header (in Figure 1(b), that's
 0x041F); (b) the WHIFS starts off with a WHIFSBTREEHEADER, immediately
 followed by the WHIFS directory, which contains null-terminated file names
 followed by the individual WHIFS file's offset within the larger .HLP
 file. Here, bag.ini is at offset 0x10, |CONTEXT is at 0x0362B3, and
 |CTXOMAP is at 0x032B02; (c) each internal file begins with a FILEHEADER
 structure, which specifies the file's size both with and without the header,
 followed by a 0. Here, bag.ini is 0x040F bytes with the header, and 0x0406
 without. The file data itself (evidently, some kind of initialization file)
 starts immediately after the header.
==============================================================================
(a)
 D:\MIPS>dump msmail32.hlp -bytes 8
 00000000 3F 5F 03 00 1F 04 00 00 ?_......
(b)
 D:\MIPS>dump msmail32.hlp -offset 0x041f
 0000041f 2F 04 00 00 26 04 00 00 04 3B 29 02 04 00 04 7A /...&....;)....z
 0000042f 34 00 00 43 3A 5C 7E 68 63 35 00 09 02 62 6D 00 4..C:\~hc5...bm.
 0000043f 00 00 00 00 00 FF FF 01 00 01 00 1E 00 00 00 C1 ................
 0000044f 02 1E 00 FF FF FF FF 62 61 67 2E 69 6E 69 00 10 .......bag.ini..
 0000045f 00 00 00 7C 43 4F 4E 54 45 58 54 00 B3 62 03 00 ...|CONTEXT..b..
 0000046f 7C 43 54 58 4F 4D 41 50 00 02 2B 03 00 7C 46 4F |CTXOMAP..+..|FO
 ; ... etc. ...
(c)
 D:\MIPS>dump d:\mips\msmail32.hlp -offset 0x10
 00000010 0F 04 00 00 06 04 00 00 00 0D 0A 5B 62 61 67 2E ...........[bag.
 00000020 69 6E 69 5D 0D 0A 67 72 6F 75 70 63 6F 75 6E 74 ini]..groupcount
 00000030 3D 31 34 0D 0A 67 72 6F 75 70 31 3D 42 61 63 6B =14..group1=Back
 00000040 75 70 0D 0A 67 72 6F 75 70 32 3D 43 6C 69 70 62 up..group2=Clipb
 ; ... etc. ...
==============================================================================
Figure 2: HELPDIR output for the .HLP file hex dumped in Figure 1.
==============================================================================
 D:\MIPS>c:\ddj\helpdir msmail32.hlp
 bag.ini 0x00000010
 |CONTEXT 0x000362B3
 |CTXOMAP 0x00032B02
 |FONT 0x000327D2
 |KWBTREE 0x00033255
 |KWDATA 0x00032ED5
 |KWMAP 0x0003323E
 |SYSTEM 0x0000084E
 |TOPIC 0x00000A53
 |TTLBTREE 0x00034A84
 bm0 0x00037AE2
 ; ... etc. ...
==============================================================================
Figure 3: Selected macros in Microsoft Cinemania and the MSDN
CD-ROM, as displayed by WHMACROS.
==============================================================================
C:\DDJ>dir d:\content\*.mvb
CINMANIA MVB 139104719 08-18-92 12:00a
C:\DDJ>whmacros d:\content\cinmania.mvb
RegisterRoutine("ftui","InitRoutines","SU")
RegisterRoutine("ftui","ExecFullTextSearch","USSS")
; ...
InitRoutines(qchPath,1)
; ...
CreateButton("ftSearch", "&Search", \
 "ExecFullTextSearch(hwndApp, qchPath, `', `')")
; ...
C:\DDJ>dir d:\*.mvb
MSDNCD MVB 270353088 04-05-93 5:58p
C:\DDJ>whmacros d:\msdncd.mvb
RegisterRoutine("msdncd", "Navigator", "USS")
Navigator(hwndApp, "Load", qchPath)
; ...
CreateButton("btn_prv","<<I&ndex","Navigator(hwndApp,\"Prev\",\"\")")
CreateButton("btn_nxt","Inde&x>>","Navigator(hwndApp,\"Next\",\"\")")
==============================================================================
_UNDOCUMENTED CORNER_
edited by Andrew Schulman
written by Pete Davis

[LISTING ONE]

/* WHSTRUCT.H--Windows Help File Internal Records--Pete Davis and Ron Burk,
 June 1993. See "Undocumented Corner," DDJ, September 1993 */

typedef unsigned long DWORD;
typedef unsigned int WORD;
typedef unsigned char BYTE;

#define HELP_MAGIC 0x00035F3FL

/* Help file Header record */
typedef struct HELPHEADER {
 DWORD MagicNumber; /* 0x00035F3F */
 long WHIFS; /* File offset of WHIFS header */
 long Negative1;
 long FileSize; /* Size of entire .HLP File */
} HELPHEADER;
/* File Header for WHIFS files */
typedef struct FILEHEADER {
 long FilePlusHeader; /* File size including this header */

 long FileSize; /* File size not including header */
 char TermNull;
} FILEHEADER;
/* Help Directory BTREE */
typedef struct WHIFSBTREEHEADER {
 char Magic[18]; /* Not exactly magic for some .MVB files */
 char Garbage[13];
 int MustBeZero; /* Probably shows up when Help > ~40 megs */
 int NSplits; /* Number of page split Btree has suffered */
 int RootPage; /* Page # of root page */
 int MustBeNegOne; /* Probably shows up when B-Tree is HUGE!! */
 int TotalPages; /* total # of 2Kb pages in Btree */
 int NLevels; /* Number of levels in this Btree */
 DWORD TotalWHIFSEntries;
} WHIFSBTREEHEADER;
/* Modified B-Tree Node header to handle a pointer to the page */
typedef struct BTREENODEHEADER {
 WORD Signature; /* Signature word */
 int NEntries; /* Number of entries */
 int PreviousPage; /* Index of Previous Page */
 int NextPage; /* Index of Next Page */
 char *BTData; /* Pointer to B-Tree's data */
} BTREENODEHEADER;
/* Modified B-Tree Index header to handle a pointer to the page */
typedef struct BTREEINDEXHEADER {
 WORD Signature; /* Signature word */
 int NEntries; /* Number of entries in node */
 char *IdxData;
} BTREEINDEXHEADER;
/* Phrase header for uncompressed Phrases file */
typedef struct PHRASEHDR {
 int NumPhrases; /* Number of phrases in table */
 WORD OneHundred; /* 0x0100 */
} PHRASEHDR;
/* Phrase header for compressed Phrases file */
typedef struct ALTPHRASEHDR {
 int NumPhrases; /* Number of phrases in table */
 WORD OneHundred; /* 0x0100 */
 long PhrasesSize; /* Amount of space uncompressed phrases requires */
} ALTPHRASEHDR;
/* Flags for SYSTEM header Flags field below: Unfortunately, none of these
 flags are particularly solid. The 0x0004 works MOST of the time. Another
 flag, 0x0008, appears both in Win32 .HLP files, and in files with Phrase
 compression but without LZ77 compression. */
#define NO_COMPRESSION_310 0x0000
#define COMPRESSION_310 0x0004
#define SYSFLAG_300 0x000A
/* Header for SYSTEM file */
typedef struct SYSTEMHEADER {
 BYTE Magic; /* 0x6C */
 BYTE Version; /* Version # */
 BYTE Revision; /* Revision code */
 BYTE Always0; /* Unknown */
 WORD Always1; /* Always 0x0001 */
 DWORD GenDate; /* Date/Time that the help file was generated */
 WORD Flags; /* Values seen: 0x0000 0x0004, 0x0008, 0x000A */
 } SYSTEMHEADER;
/* Types for SYSTEMREC RecordType below: note that other record types,
 such as 0x0A, 0x0B, 0x0C, 0x0D, show up in the large .MVB files used
 by the MSDN CD-ROM and Cinemania products. */
#define HPJ_TITLE 0x0001 /* Title from .HPJ file */
#define HPJ_COPYRIGHT 0x0002 /* Copyright notice from .HPJ file */
#define HPJ_CONTENTS 0x0003 /* Contents= from .HPJ */
#define MACRO_DATA 0x0004 /* RData = 4 nulls if no macros */
#define ICON_DATA 0x0005 /* Data for Icon */
#define HPJ_SECWINDOWS 0x0006 /* Secondary window info in .HPJ */
#define HPJ_CITATION 0x0008 /* Citation= under [OPTIONS] */
/* Secondary Window Record following type 0x0006 System Record */
typedef struct SECWINDOW {
 WORD Flags; /* Flags (See Below) */
 BYTE Type[10]; /* Type of window */
 BYTE Name[9]; /* Window name */
 BYTE Caption[51]; /* Caption for window */
 WORD X; /* X coordinate to start at */
 WORD Y; /* Y coordinate to start at */
 WORD Width; /* Width to create for */
 WORD Height; /* Height to create for */
 WORD Maximize; /* Maximize flag */
 BYTE Rgb[3]; /* RGB for background */
 BYTE Unknown1; /* No known use */
 BYTE RgbNsr[3]; /* RGB for non scrollable region */
 BYTE Unknown2; /* No known use */
} SECWINDOW;
/* Values for Secondary Window Flags */
#define WSYSFLAG_TYPE 0x0001 /* Type is valid */
#define WSYSFLAG_NAME 0x0002 /* Name is valid */
#define WSYSFLAG_CAPTION 0x0004 /* Caption is valid */
#define WSYSFLAG_X 0x0008 /* X is valid */
#define WSYSFLAG_Y 0x0010 /* Y is valid */
#define WSYSFLAG_WIDTH 0x0020 /* Width is valid */
#define WSYSFLAG_HEIGHT 0x0040 /* Height is valid */
#define WSYSFLAG_MAXIMIZE 0x0080 /* Maximize is valid */
#define WSYSFLAG_RGB 0x0100 /* Rgb is valid */
#define WSYSFLAG_RGBNSR 0x0200 /* RgbNsr is valid */
#define WSYSFLAG_TOP 0x0400 /* On top was set in HPJ file */
/* Help Compiler 3.1 System record. Multiple records possible */
typedef struct SYSTEMREC {
 WORD RecordType; /* Type of Data in record */
 WORD DataSize; /* Size of RData */
 char *RData; /* Raw data (Icon, title, etc) */
 } SYSTEMREC;
/* Header for TOMAP file */
typedef struct TOMAPHEADER {
 long IndexTopic; /* Index topic for help file */
 long Reserved[15];
 int ToMapLen; /* Number of topic pointers */
 long *TopicPtr; /* Pointer to all the topics */
 } TOMAPHEADER;


[LISTING TWO]

/* HELPDIR.C -- List all internal files with a Windows .HLP file.
 WHIFS = Windows Help Internal File System -- Pete Davis, June 1993
 bcc helpdir.c
 See "Undocumented Corner," DDJ, September 1993 */
#pragma pack(1)
#include <conio.h>

#include <string.h>
#include <stdio.h>
#include <stdlib.h>
#include "whstruct.h"

#define PAGE_SIZE 1024L /* 1k pages -- must be long! */

void fail(const char *s) { puts(s); exit(1); }

int main(int argc, char *argv[]) {
 HELPHEADER HelpHdr;
 WHIFSBTREEHEADER WHIFSHdr;
 BTREENODEHEADER WHIFSNode;
 int file, aPage, c;
 long WHIFSStart, FileOffset;
 FILE *HelpFile;

 if ((HelpFile=fopen(argv[1], "rb")) == NULL)
 fail("can't open file");
 /* Get Help header, go to WHIFS and get WHIFS Header */
 fread(&HelpHdr, sizeof(HelpHdr), 1, HelpFile);
 if (HelpHdr.MagicNumber != HELP_MAGIC)
 fail("not a Windows help file");
 fseek(HelpFile, HelpHdr.WHIFS, SEEK_SET);
 fread(&WHIFSHdr, sizeof(WHIFSHdr), 1, HelpFile);
 /* WHIFS starts after the WHIFSHdr */
 WHIFSStart = HelpHdr.WHIFS + sizeof(WHIFSHdr);
 file=1;
 /* Goto WHIFS Root */
 fseek(HelpFile, WHIFSStart + (PAGE_SIZE * WHIFSHdr.RootPage), SEEK_SET);
 /* Find the first leaf node */
 while (file < WHIFSHdr.NLevels) {
 /* if it's not a leaf, we don't need last 2 fields */
 fread(&WHIFSNode, 4, 1, HelpFile);
 /* Find page pointer to first node in index */
 fread(&aPage, sizeof(int), 1, HelpFile);
 fseek(HelpFile, WHIFSStart + (PAGE_SIZE * aPage), SEEK_SET);
 file++;
 }
#ifdef DO_MACROS
{
 extern void do_macros(FILE *HelpFile, long WHIFSStart);
 do_macros(HelpFile, WHIFSStart);
}
#else
 /* Go through linked list of leaf nodes */
 for (;;) {
 if (! fread(&WHIFSNode, sizeof(WHIFSNode)-2, 1, HelpFile))
 break;
 /* List all entries in node */
 for (file = 1; file <= WHIFSNode.NEntries; file ++) {
 while (c = fgetc(HelpFile))
 putchar(c);
 fread(&FileOffset, sizeof(FileOffset), 1, HelpFile);
 printf(" \t0x%08lX\n", FileOffset);
 }
 if (WHIFSNode.NextPage == -1)
 break;
 else

 fseek(HelpFile,WHIFSStart+(WHIFSNode.NextPage*PAGE_SIZE),SEEK_SET);
 }
#endif
 return 0;
}


[LISTING THREE]

/* WHMACROS.C -- Get macros from a .HLP file. Used by HELPDIR.C if #define
 DO_MACROS -- Pete Davis and Andrew Schulman,
 bcc -DDO_MACROS whmacros.c helpdir.c
 See "Undocumented Corner," DDJ, September 1993 */

#pragma pack(1)
#include <conio.h>
#include <string.h>
#include <stdio.h>
#include <stdlib.h>
#include "whstruct.h"

extern void fail(const char *s);

#define PAGE_SIZE 1024L /* 1k pages -- must be long! */

void do_macros(FILE *HelpFile, long WHIFSStart)
{
 BTREENODEHEADER WHIFSNode;
 SYSTEMHEADER SystemHdr;
 SYSTEMREC SystemRec;
 FILEHEADER FileHdr;
 long SystemOffset=0, FileOffset, FileStart;
 char filename[20], *data;
 int *Offsets;
 int c, i, file, txt;
 /* Find the System file. */
 do {
 fread(&WHIFSNode, sizeof(WHIFSNode) - 2, 1, HelpFile);
 /* Search all entries in node */
 for (file = 1; file <= WHIFSNode.NEntries; file ++) {
 i = 0;
 while ( c = fgetc(HelpFile) )
 filename[i++]=c;
 filename[i] = 0;
 fread(&FileOffset, sizeof(FileOffset), 1, HelpFile);
 if (strcmp(filename, "SYSTEM") == 0) {
 SystemOffset = FileOffset;
 break;
 }
 }
 if (WHIFSNode.NextPage != -1)
 fseek(HelpFile, WHIFSStart + (WHIFSNode.NextPage * PAGE_SIZE),
 SEEK_SET);
 } while (WHIFSNode.NextPage != -1);
 if (! SystemOffset)
 fail("Can't locate SYSTEM file");
 /* Get System header */
 fseek(HelpFile, SystemOffset, SEEK_SET);
 fread(&FileHdr, sizeof(FileHdr), 1, HelpFile);

 fread(&SystemHdr, sizeof(SystemHdr), 1, HelpFile);

 FileStart = SystemOffset + sizeof(FileHdr) + sizeof(SystemHdr);
 FileOffset = 0;
 while (FileOffset < FileHdr.FileSize) {
 fseek(HelpFile, FileStart + FileOffset, SEEK_SET);
 fread(&SystemRec, sizeof(SystemRec)-1, 1, HelpFile);
 FileOffset += (sizeof(SystemRec) + SystemRec.DataSize - 1);
 if (SystemRec.RecordType == MACRO_DATA) {
 if (! (data = (char *) malloc(SystemRec.DataSize+1)))
 fail("insufficient memory");
 fread(data, SystemRec.DataSize, 1, HelpFile);
 data[SystemRec.DataSize] = '\0';
 printf("%s\n\n", data);
 free(data);
 }
 }
}




September, 1993
PROGRAMMER'S BOOKSHELF


Under Lock and Key




Lynne Greer Jolitz


UNIX System Security
David A. Curry
Addison-Wesley Professional Computing Series, 1992, 279 pp.
$32.25
ISBN 0-201-56327-4
UNIX Installation, Security, and Integrity
David Ferbrache and Gavin Shearer
Prentice-Hall, 1993, 305 pp.
$34.00
ISBN 0-13-015389-3
In the early days of computing, security in computer systems was not the
primary concern of administrators, since computers were generally set up as
centralized systems with terminals located in controlled areas, and networks
were not yet commonplace. (Incoming modem lines on the public telephone
network were the major security headache of system administrators.) Until the
mid-80s, in fact, enterprising students who punched holes in security often
ended up working for those very groups and firms they'd penetrated. (Kevin
Poulsen, awaiting Federal charges for illegal computer access, was hired by
SRI after a rash of system break-ins.) Still, security holes remained a
bemused topic of conversation, and were not considered serious except by a few
predictors of doom.


The Boom in Security from the Internet Worm


Security is like insurance--it's a nuisance to pay for, until a disaster
occurs. This lesson was illustrated during the "Morris worm" incident which
caused the immediate contamination of thousands of systems and the
resultant shutdown of the NSF Internet. In the aftermath, security awareness
was raised to an all-time high from which it has gradually eroded as everyone
loses interest until the next crisis. Such is the boom-and-bust cycle of
computer security.
What was different about the Morris worm was that the intruding program took
advantage of networking and operating systems standardization to allow
automatic propagation of itself onto freshly compromised systems. This meant
that, like Von Neumann exponentiating machines, the Morris worm could rapidly
scale its ability by the cascade effect of dedicating an exponential number of
hosts to the effort. In addition, because the program added to its information
store of "ways to break the system," the worm had greater "growth" potential
than an ordinary computer virus because it could, again, leverage the network
to pass back information and "learn" better how to break into more systems.
In sum, the Morris worm neatly demonstrated the vulnerability of computer
networks, and made network-wide system security mandatory instead of an
abstract research topic. To aggravate things, the rise of high-powered
low-cost systems attached to the network has made security a part of systems
design, planning, and administration long before it became a "popular" topic
of conversation. With the number of Internet hosts now approaching 1 million
and growing, security merits primary consideration before placing any system
on the global network.


A Site Administrator's View of Security


When it comes down to it, security is the mundane part of computer
administration. You put the software equivalent of a padlock on resources,
files, and accounts, rotate the assignment of keys to users of these items,
and track when attempts to unlock them are made. When initiating security
procedures, however, an understanding of the users and environment is crucial
to creating a secure, yet acceptable, work environment. A book which covers
security should be comprehensive in all aspects of security; otherwise, you
don't have a secure system. Security, whether for a house or a file server, is
only as strong as its weakest link.
UNIX System Security is geared towards the system administrator and is
engaging in its "tales" of security woes. The book is also categorized in much
the same way that a systems administrator would no doubt view security:
account security, filesystem (or, more properly, "data") security, and network
security in general, followed by specific types of systems (securing
workstations, for example), policies, and references. While it meanders
somewhat through its intermingling of security procedures and needs, its
hands-on cookbook approach should be of great use to any goal-oriented site
administrator who prefers the historical approach to security--the "finger in
the dike" view.
At the same time, this choice of organization is a flaw in UNIX System
Security. The book does not go into as much depth as necessary, allowing a bit
of cookbook knowledge to delude you into thinking you know everything. There's
no overview of what security actually is (you have to go to the National
Computer Security Center's famed Orange Book to find out). Security is a broad
term that means different things to different groups, so it's important to
define the kind of security mechanism in question and its resulting effect. For
example, there's no comparison of account security (the most common and the
simplest to implement) versus data security (much harder) or network security
(a combination of data and account security plus the actual physical
arrangement of the network itself, an area also given short shrift in books on
network architecture and management).
But more importantly for a book geared to site administrators, there's little
perspective offered on the differing needs of various sites--a government site
versus one in the private sector, for instance--but instead, it seems to be
biased towards educational-site experiences. For example, government time and
energy is often oriented towards "air gap" security to avoid penetration or
subversion of the system. The private sector, on the other hand, tends to view
those "within" the system (such as employees) as possible security
problems--hence the focus on auditing, logs, and transaction files. Neither of
these considerations is directly discussed, primarily because an educational
site prefers a more open and free exchange of ideas and viewpoints (and also,
because they don't usually have money to throw at procedures and personnel).
The short shrift given to auditing, in particular, is an oversight for any
private-sector site administrator. This is especially the case as modern
computer systems with integral security auditing at the per-file and
per-process level become available.
Policies with respect to software, passwords, and so forth are also discussed
in UNIX System Security, but these policies have an educational-site bias, and
system administrators should refer to their site guidelines before
implementing any of these suggestions. (If your site doesn't have guidelines,
it's time to establish them.) Legal issues regarding site policies and
policing and software licenses and copyrights are also volatile and undefined
at this time, and the legal examples should be read with a grain of salt. Yet,
for naive institutions that never considered such policies necessary, it does
bring them back into the "real" world.


System Security at a Glance


For a more traditional overview of UNIX security, UNIX Installation, Security,
and Integrity is welcome. Written in a concise and direct form, this book
fills out the topic and is careful in discussing security categories. After
breaking down the main-system security into appropriate categories (filesystem
security, account security, and process security on the local system), it
discusses cryptography and network security. It also deals with security
monitoring and auditing procedures. Thus, the last word in its title actually
has meaning.
One item I appreciated was the careful differentiation between trusted and
regular systems. The authors went so far as to include a mention of hardware
security support, an oft-forgotten area which should be covered in every
security book.
The reference sections of both books contain useful papers and books, including
the Orange Book and some of Robert Morris's papers on security (which may have
influenced his son's "worm" work) and brief discussions of secure software
(such as Kerberos). In concentrating on recent works, however, some of the
classic works were ignored, including studies on the KSOS System (Ford
Aerospace) which are worth mentioning for their scope and depth.


Conclusion


UNIX System Security should become popular among site administrators
struggling to get a handle on security needs--especially since most
vendor-specific manuals don't cover those well-known security "holes" which
can cause grief. For a more thorough and concise view of security,
administrators should also obtain UNIX Installation, Security, and Integrity.
But for a real understanding of security in the 1990s, check the references
and attend the security conferences. That's where the action is.






September, 1993
OF INTEREST
Microway has released NDP Fortran-90, a 32-bit Fortran-90 compiler for
protected mode DOS, OS/2, DesqView, and UNIX. NDP Fortran-90, optimized for
Intel 32-bit processors (386, 486, Pentium, and i860), supports ANSI X3J3
specifications for array syntax and operations, pointers (alias techniques
similar to C++'s reference operator), free formatted source code, CASE and DO
statements, improved I/O, dynamic storage, and user-defined types. NDP
Fortran-90 also provides C-like functions, recursion, and module definitions
enabling you to encapsulate logic and functionality.
Additionally, Microway is shipping NDP Lapack, a linear algebra library
package used for analysis and solution of systems of simultaneous linear
algebraic equations, linear least-squares problems, and matrix Eigenvalue
problems. In all, NDP Lapack, which is a complete rewrite of Linpack, contains
over 1000 routines, including support for single- and double-precision,
complex, and complex*16.
NDP Fortran-90 sells for $295.00 while NDP Lapack costs $150.00 for either the
royalty-free binary library or source-code distribution, or $250.00 for both.
Microway is offering a white paper which details the Fortran-90 improvements,
and includes code examples and optimization tips. Reader service no. 20.
Microway
P.O. Box 79
Kingston, MA 02364
508-746-7341
If you're looking for lots of low-cost storage capabilities, but CD-ROM
doesn't meet your read/write needs, low-cost digital-audio tape (DAT) may be
the solution to your problems. To this end, DataImage has released Sail-SDK, a
C library that allows programmers to integrate high-storage capacities and
direct retrieval capabilities of digital data storage 4mm DAT drives into DOS
or Windows applications. One unique aspect of the DataImage approach is that,
although the data is stored sequentially, users can retrieve information by
direct access via a DLL (for Windows) or run-time library (for DOS).
The SDK provides an interface that supports a maximum of seven DAT SCSI
drives. The 36 functions supported by the API include those for system
initialization, backup/restore operations, read/write, querying the hardware,
and the like. The current generation of reusable DAT cassettes provide 4
Gigabytes of storage and sell for about $20.00 each. The SDK sells for
$750.00. Reader service no. 21.
DataImage
20 Meadow Street
East Hartford, CT 06108-3216
203-291-1830
A Windows-compatible version of the Prime Factor Fast Fourier Transform (FFT)
subroutine library has been announced by Alligator Technologies. Prime Factor
uses the Cooley-Tukey Fast Fourier Transform, and a DFT to allow analysis of
any data-set size. The package can also be used to transform any data-type
array, including floating-point data types from 4 to 10 bytes as well as
integer data types from 2 to 8 bytes. With an added DLL, Prime Factor supports
compilers such as Visual Basic, Visual C++, Turbo C++ for Windows, and Borland
C++ for Windows.
Prime Factor FFT performs one- and two-dimensional FFTs on arrays up to the
allocatable, protected-mode extended-memory space. It optimizes execution by
automatically detecting math processor type, and subroutines are handcoded in
assembly language for speed and accuracy. Sample programs are provided in
Basic and C. Prime Factor FFT sells for $395.00. Reader service no. 22.
Alligator Technologies
17150 Newhope Street, #114
Fountain Valley, CA 92728-9706
714-850-9984
The Windows Sound System driver development kit (DDK) from Microsoft is now
available for licensing to OEMs and audio semiconductor manufacturers. The DDK
has a new driver and mixer architecture that enables any audio chipset to
function with future enhancements to the Microsoft Windows Sound System.
The kit consists of driver, VxD, and hardware-installation module source code;
instructions for modifying the source code; and test programs and sound
control applications. The $450.00 license fee enables manufacturers to modify
source code and distribute derivative object code without incurring additional
royalty fees. Reader service no. 23.
Microsoft Corp.
One Microsoft Way
Redmond, WA 98052-6399
206-882-8080
CS-MAP 5.0, a system-mapping library from Mentor Software, now supports
multiple regression for conversion of international geographic coordinates
between any of the 48 geodetic datums supported. Version 5.0 adds six more
features, such as enabling applications to convert NAD27-based (North American
Datum of 1927) grid coordinates to HARN-based (High Accuracy Reference
Network) grid coordinates in one pass through the data. CS-MAP supports the
Canadian National Transformation for converting Canadian coordinates based on
NAD27 to NAD83. It computes the grid-scale factor, convergence angle, and
calculates the actual geodetic azimuth and distance defined by two geographic
locations. CS-MAP 5.0 sells for $450.00. Reader service no. 24.
Mentor Software
3907 E. 120th Ave., Suite 200
Thornton, CO 80233-1600
303-252-9090
An ANSI Fortran 77 compiler for 32-bit extended DOS optimized for the Intel
486 and Pentium processors is now available from Absoft. The compiler is
compatible with MS-DOS 6.0 (or equivalent), Phar Lap 386SDK, MetaWare High
C/C++, and 32-bit Microsoft C/C++, Windows 3.1, and Intel or Weitek's math
coprocessors. Absoft Fortran 77 contains statement extensions from VAX/VMS,
Cray, Sun, and Microsoft Fortrans so code can be ported with little change.
Requirements are 4 Mbytes of RAM. The price is $750.00. Reader service no. 25.
Absoft
2781 Bond Street
Rochester Hills, MI 48309
313-853-0050
In other Fortran news, the Numerical Algorithms Group has announced that it
has licensed its Fortran 90 technology to Microsoft. NAG's technology provides
Microsoft with a compiler front end on which to build future Microsoft Fortran
90 compilers. Reader service no. 26.
Numerical Algorithms Group Inc.
1400 Opus Place, Suite 200
Downers Grove, IL 60515-5702
708-971-2337
For Macintosh developers who want to port their applications to the PC
platform, Farallon is offering the Timbuktu AppleTalk Developer's Kit (ADK)
for building network Windows/DOS applications that interoperate with Macintosh
computers.
The ADK provides the necessary tools to develop cross-platform AppleTalk
applications--e-mail, databases, groupware, and the like. The ADK costs
$5000.00. Reader service no. 27.
Farallon
2470 Mariner Square Loop
Alameda, CA 94501
510-814-5100
Integrated Computer Solutions is offering a Display PostScript SDK for the X
Window System. Available in source form, the SDK consists of tools, such as
sample code, example applications and utilities specifically for the X
environment, and reference materials for programming. The SDK, which initially
supports SunSoft Solaris, DEC ULTRIX, IBM AIXWindows, and SGI platforms, sells
for $500.00. Reader service no. 28.
Integrated Computer Solutions
201 Broadway
Cambridge, MA 02139
617-621-0060
MetaWare has announced its High C/C++ compiler 3.1. This compiler supports
Windows 3.1 and the Pentium processor. Having worked previously with Intel's
386/486 and i860 processors, MetaWare's compiler already contains many of the
global optimizations such as instruction scheduling, inlining, and
loop-strength reduction. High C/C++ sells for $795.00. Reader service no. 29.
MetaWare
2161 Delaware Ave.
Santa Cruz, CA 95060-5706
408-429-6382

WinScope from the Periscope Company is a debugging tool for Windows
developers. It provides the ability to capture and display Windows messages
and API calls including parameter names and return values, message filter
hooks, ToolHelp notifications, and debug kernel messages. Breakpoints provide
the ability to analyze the occurrence of a specific combination of these
events or pass control to a source-level debugger such as CodeView. Selected
events are recorded in the trace buffer, which can then be searched for specific
events, displayed in various ways, printed, or saved to disk for later
examination. WinScope sells for $249.00. Reader service no. 30.
The Periscope Company
1475 Peachtree Street, Suite 100
Atlanta, GA 30309
800-722-7006
TAPI, the Telephony Applications Programming Interface, will provide a
standard set of functions for developing phone-control applications in
Windows. TAPI, which was proposed by Intel and Microsoft, effectively provides
a standard way to integrate the telephone and PC. Among the type of products
TAPI will likely spawn are: visual call control for call forwarding,
conference and call transfer; integration of e-mail, voice mail, and fax;
desktop audio and video conferencing; and wide area networking that allows PCs
to use the telephone network for both voice and data transmission.
Version 1.0 of the specification is available on CompuServe in the Intel
Access Forum (type GO INTELACCESS). Alternatively, you can get a copy by
FAXing the Windows Telephony Coordinator at Microsoft (206-936-7329), or by
sending e-mail to telephon@microsoft.com. Reader service no. 31.
Microsoft Corp.
One Microsoft Way
Redmond, WA 98052-6399
Quantify from Pure Software is a software development tool that detects and
identifies performance bottlenecks. Quantify measures the performance of an
entire application and its components and pinpoints the functions that use the
most system resources. Quantify, which is targeted for SPARC workstations
running SunOS 4.1x, requires no special compiler, debugger, or editor. The
tool sells for $1198.00. Reader service no. 32.
Pure Software Inc.
1309 S. Mary Ave.
Sunnyvale, CA 94087
408-720-1600
Inmark Development Corporation announced the release of zApp 2.0. zApp is a
portable C++ application framework that offers cross-platform portability
through object-oriented C++ classes, which now number over 200. Other new
features include forms support, MDI dialogs, dynamic sizers (to simplify
screen design), object persistence, scrolling classes, full DDE, advanced
debugging, expanded documentation and sample programs, and online help.
zApp is available for Windows, DOS, and OS/2. For Windows, zApp supports
Borland C++ 3.1, Microsoft Visual C++, and Zortech C++; for DOS, Borland C++
and Microsoft C++; for NT, the NT C++ compiler; for OS/2 2.x, the Borland C++
OS/2 compiler. The price is $495.00, except for OS/2, which is $695.00. Reader
service no. 33.
Inmark Corporation
2065 Landings Drive
Mountain View, CA 94043
415-691-9000
A Smalltalk ANSI standardization committee, X3J20, has been formed by the
likes of IBM, Digitalk, Knowledge Systems, Object Technology, Quasar Knowledge
Systems, Easel, ParcPlace, and others. The goal of a Smalltalk standard is to
provide a stable base for application development and to improve portability
between different Smalltalk implementations. The first organizational meeting
was held in June 1993 in Washington, D.C.
You can obtain a draft of the IBM proposed standard from IBM by calling IBM
Publications and requesting the IBM ITSC redbook, Smalltalk Portability: A
Common Base, publication #GG24-3903. The book sells for $20.00. Reader service
no. 34.
IBM Publications
800-879-2755
Although foremost known for its C/C++ compilers, Watcom has begun shipping a
visual implementation of Rexx called "VX REXX" for the OS/2 2.0 operating
system. VX REXX provides a visual development environment that enables you to
dynamically create and modify GUI objects at both edit and run time. In doing
so, VX REXX uses the OS/2 2.x system object model (SOM) to implement its
object-oriented GUI run-time support, enabling extensibility and
interoperability with other languages including C/C++. VX REXX provides an
interactive debugger and project management facility, and visual GUI forms
designer. VX REXX sells for $299.00. Reader service no. 35.
Watcom
415 Phillip Street
Waterloo, Ontario
Canada N2L 3X2
519-886-3700



September, 1993
SWAINE'S FLAMES


Twenty Questions, Roughly


Multimedia mogul Marc Canter calls interactive television "the fulfillment of
multimedia." Computer companies (Apple, Microsoft, Hewlett Packard, IBM,
Silicon Graphics, Zenith), communications companies (AT&T, Bell Atlantic,
Comsat, General Magic, PBS, TCI, Whittle Communications), and others (3DO,
General Instruments, Industrial Light and Magic, Kaleida Labs, Sony, Time
Warner, US West) are investing and partnering and turf-taking in interactive
TV. And now, the best minds in the field think they've hit upon interactive
TV's killer app, the interactive application that will make it pay off big.


The Home Shopping Network.


I don't know about you, but for me, this raises some questions. As a matter of
fact, everything I read lately raises questions.
Why did three magazines in one month decide to portray virtual reality pioneer
Jaron Lanier in remarkably similar ways? Wired (June/July, 1993) and The Red
Herring (premiere issue, 1993) make him look like a religious icon, Upside
(June, 1993) like a combination wizard and orchestra conductor, all three
emphasizing the beatific glow of his eyes.
I understand being hypnotized by those eyes. And I know why Jaron was news
that month. He had just lost his company and his virtual-reality patents to
his vulture capitalists. Jaron, I'm happy to say, is weathering the setback
well and has started a new VR company, but these messianic images give the
impression that he died and rose from the grave. He's a nice guy and
incredibly mellow, but do they think he's Jesus Christ?
Speaking of mellow, are the Republicans really going to run Bob Dole for
President in 1996? A man who comes across like Richard Nixon without the
personal warmth?
Which reminds me, the latest news from Microsoft is Cablesoft, Microsoft's
not-yet-final-at-press-time deal with TCI and Time-Warner with the object of
putting together a standard interactive television system. Standard by clout,
as it were. Are you as thrilled as I am?
My question is, who's going to write the software?
Can you say "Windows for Interactive Television?" Sure you can, but can Joe
Couchpotato? My guess is, Microsoft will get into consumer-product user
interfaces when it acquires a company that knows how to do it. Relating to
ordinary people is not something that the Microsoft culture prepares you for.
Then again, I wouldn't have guessed that Larry Flynt published Maternity
Beauty & Fashion, or that Pretty Woman would marry Eraserhead, so what do I
know?
I know better than to offer advice to Apple's new president, Michael Spindler,
although I may be the only Apple watcher to exercise such restraint. How would
you like to be in Spindler's shoes? Your boss grooms you for his job for two
years, letting you run the company while he's off schmoozing with politicians.
Then when profits look gloomy and the company needs to cut a thousand
employees or so, he tells you this might be a good time for you to take over.
Now was that nice, John?
Is there even one other person troubled by this one? One other person who
memorized the names of all the Mercury astronauts? Scott Carpenter, John
Glenn, and the rest? And then was totally confused when an actor named Scott
Glenn was cast in one of the Mercury astronaut roles in the movie The Right
Stuff? And now can never remember the name of the actor who played John Glenn?
It sure would be nice to know I'm not alone.
There's a whole conference this fall on electronic books. Help me on this one.
I understand electronic dictionaries and reference books, even picture books
turning into electronic moving-picture books, but a lot of the interest of the
players in this game seems to be focused on electronic novels. Now, is there
any existing or even planned hardware platform that you know of that is
capable of holding the text of a book and displaying it readably and that
doesn't weigh a lot more than a book? Granted, one platform can support many
books, but how many novels do you lug around with you at one time? And what
battery stays charged long enough to read a novel?
If we're going to replace books with something else, shouldn't the something
else be at least as good?
I'm just asking.
Michael Swaine
editor-at-large


October, 1993
EDITORIAL


When Enough Isn't Enough


Live and learn. I used to think that in the music business, "C&W" stood for
"country & western." Now I find out that singer Garth Brooks and a cabal of
record distributors have redefined it as something more along the lines of
"cheap & whiny." For Brooks, who's sold $400 million worth of albums in the
past couple of years, more than enough money apparently isn't enough.
The bug in Brooks' billfold is that if one of his fans buys a pre-owned, used
CD, he doesn't get a cut. Musicians such as Brooks are only paid royalties the
first time an album is sold. Although used CDs make up a niggling 1 percent of
a $9 billion annual business, Brooks and his big-time record-label
buddies--Sony, Thorn EMI's CEMA, Matsushita's UNI, Time Warner's
Warner/Elektra/Atlantic, and the like--are incensed that you and I are able to
buy used CDs. For their part, the distributors are withholding advertising
support and have threatened not to accept returns of compact discs, including
defective ones. Brooks, too, has made sacrifices--before backing down, he
threatened not to allow his new album to be sold in stores that stock used CDs.
The Independent Music Retailers Association retaliated with a class-action
suit, alleging these tactics are in violation of the Sherman Anti-Trust and
Robinson-Patman Acts.
Garth doesn't get it. According to the Copyright Act's doctrine of first sale,
royalties are paid only the first time an album is sold. After that, the album
is freely transferrable. That's the law. The same goes for books.
When you buy a book or CD under normal circumstances, what you've really
bought is a license for single-user, non-commercial access to and use of the
media and information stored on it. The copyright holder (author, artist,
publisher, estate, or whoever) retains ownership of the information, but you
own the media. You can make cassette copies of a CD or photocopies of a book
for personal purposes, but if you sell the CD/book you should also include or
destroy all copies of it.
The same goes with software. Used software is, in fact, one area where vendors
such as Microsoft and Borland are in general agreement. Borland's no-nonsense
License Statement, for instance, says you can "copy [the software] onto a
computer. . .and you make archival copies of the software for the sole purpose
of backing up your software and protecting your investment." Both Borland and
Microsoft say you can box up and sell the software as long as you dispose of
all copies (whether on floppies or on your hard disk). The key here is that
the software "can't be used by two different people in two different places at
the same time." And guess what? The new owner doesn't have to pay a penny to
the vendor. (Borland and Microsoft do treat the issues of technical support
and upgrade rights differently, however.)
Brooks is not without his politically-correct public-relations wits, however.
He isn't so crass as to say that he wants the extra bucks; instead, it's his
songwriters who need to make a living and "feed their children." But when
taking on sellers of used CDs, Brooks is tilting at the wrong windmills. What
he should be questioning is why there's a market for used CDs in the first
place. That's simple. At $17 each (soon to climb to $18), CDs are overpriced.
That price may have been justified ten years ago, but manufacturing costs are
now only about $2 per disc. If Brooks is truly concerned about the amount of
bologna on his songwriters' sandwiches, he might rethink his reported nearly
50 percent per CD royalty rate.
The music industry might also consider how it conducts business. Any
enterprise that has a 90 percent failure rate with new releases has to recoup
its losses somehow, but double-dipping its customers isn't a long-term answer.
In considering the novel idea of better serving its customers, the recording
industry might give listeners more bang for their 17 bucks. Specifically,
the de facto industry standard is to include ten songs per album, a number
that has its roots in 40-minute vinyl LPs which could only handle five songs
per side without losing fidelity. CDs can hold much more. (Can you imagine the
Dr. Dobb's CD with a single month's issue?)
With cassette-tape sales--not CDs--making up more than 65 percent of his
business, you wonder why Brooks opened up this can of guitar strings in the
first place. The digital age has altered everything, and Brooks may be
readying his pockets for future means of distribution. Cassettes will
disappear and CDs will change. IBM and Blockbuster Video already have a plan
for downloading digital music from online databases, then storing it on
writable CDs. Software will be available from similar sources.
There's no question that licensing arrangements for digital data will change
over the coming years. But if the Garth Gang gets its way, the biggest change
won't be how you use the information, but how much you pay for it.
Jonathan Erickson
editor-in-chief










































October, 1993
LETTERS


CPU Detection




Dear DDJ,


It was great reading about the CPU detection algorithms in Dr. Dobb's
("Processor Detection Schemes," DDJ, June 1993). It's nice to see that
magazines such as yours are willing to publish information about this topic.
Incidentally, we've had to deal with the same problem of identifying CPUs. It
turns out that there is an incredible amount of information to be gained from
CPUs made by other manufacturers like Cyrix, IBM, and Chips&Technologies.
For example, IBM's documentation already includes information for new
products up to the up-and-coming 486DX3 99-MHz chip! IBM's documentation is
the only one to lay out the undocumented information returned by the CPUID
command. Cyrix has started including an identification register that is
I/O-memory mapped and absolutely the easiest to use...far better than a CPUID
command since it does not generate an exception if the CPU doesn't support
CPUID. Chips&Technologies had some really great ideas about power management,
but most of their key accounts fell through and they killed off the chip.
A slight omission from your article was the op-code for the CPUID command on
the Pentium. No problem for those who have a 586-ready assembler or possess
the data sheets, but unfortunate for those who don't. Next time, include the
op-code!
Aki Korhonen
Emeryville, California


Truly 3-D




Dear DDJ,


Yes, "true 3-D is easier than it looks" ("Algorithms for Stereoscopic
Imaging," DDJ, April 1993), but I think it is even easier than the impression
that some readers may have been left with after reading the article.
In the first place, you don't need a machine like a Silicon Graphics
workstation. Low-quality stereo has been available for years on the Atari and
Amiga computers. Medium- and low-quality stereo systems can be set up on PCs
using VGA and other graphics displays and equipment from the companies
mentioned in the article, Stereographics and Tektronix. Another source of
low-cost stereo glasses that should be mentioned is the 3-D TV Corporation
(San Rafael, California).
Secondly, the algorithms themselves can be much simpler. By way of example, an
alternative approach to the key projection computation makes use of inherent
symmetries and the stereo geometry. The resulting computation is much more
efficient than the authors'. A function using this technique, called
RealToStereo, converts a real 3-D point into two pixel values, one for the
left image and one for the right. I'd also like to share something I've
learned about "jaggies" in stereoscopic graphics.
In this approach, we view the monitor screen as the image plane of a camera
(double meaning intended!) and define two 3-D frames of reference: the Camera
frame and the Object frame. The Object frame is anything you want it to be; it
is defined by the objects to be viewed. The Camera frame, on the other hand,
is defined to have its origin located at the focal point (FP) of the "camera"
and its Z-axis lie on the line of sight (LOS) of the camera. The image plane
is normal to the LOS and a distance f from the FP (f for focal "distance"). In
a real camera, the image plane lies behind FP, but it is easier to visualize
the plane as being in front: rays of light from an object point pass through
the image plane before they converge at FP. The point where the LOS crosses
the image plane is called the "principal point" (PP). For a further discussion
of the camera model, see The Manual of Photogrammetry published by the
American Society of Photogrammetry, Falls Church, Virginia.
Let's first consider a standard, monoscopic camera. To find out where a ray
from an object point will intercept the image plane, the Object coordinates of
the point (x,y,z) are transformed to Camera coordinates (X,Y,Z); the Camera
coordinates are projected onto the image plane; and finally the image plane
coordinates are scaled up or down into pixels.
For efficiency, use 3x3 matrices instead of 4x4 matrices in 3-D coordinate
transformations. The basic transformation from Object space to Camera space is
C=T+M*O, where C, T, and O are 3-vectors and M is a 3x3 matrix. T defines
translation and M defines rotation (it is composed of the angle cosines
between the various axes). This computation consists of nine multiplies and
nine additions. Duvanenko et al. used the Foley and van Dam formulation that
requires 4-vectors and a 4x4 matrix and significantly more computations: 16
multiplies and 12 additions, as they pointed out. Terms get calculated even
though they are identically 0. So, if you don't have dedicated hardware to
carry out 4-D calculations, as on the Silicon Graphics machines, and you're
concerned about computational efficiency, stay in the 3-D world.
In Camera coordinates, the transformation to the image plane is a simple
application of the projective, or perspective, transformation; see Example
1(a).
The resulting 2-vector I is the position of the ray on the image plane
relative to the PP. This is readily scaled into pixels. Thus, the entire
conversion from a 3-D point to a 2-D pixel (for a single view) requires only
one 3-vector-matrix multiply, a 3-vector addition, two scalar divisions, and
two scalar multiplies (pixel scaling factors can be combined with f).
Now, what is a stereoscopic camera? It has two focal points, but the same
image plane. The focal points are separated by a distance S in the X
direction. It simplifies things at this stage to think about three different
Camera frames (no, really!): one defined by the left FP, one defined by the
right FP, and one that would be defined by a focal point at the center of the
line joining the two actual focal points. In other words, the left Camera
frame is simply displaced from the center frame by a distance (-S/2) along the
X axis, and the right frame, by (+S/2).
The above form of the transformation equation is also (very!) convenient
because it separates the translation terms from the rotation terms. To project
an object point through the left FP, say, requires only a shift in the X
direction, in other words, CL=C+S, CR=C-S, where S=(S/2, 0, 0), CL and CR are
3-D positions in left and right Camera frames, respectively, and C is
calculated above. Now the projective transform is performed for each focal
point essentially as above, but is further simplified because only the x
component is different; see Example 1(b).
The image plane points are then scaled to pixels, and so forth. The result is
a pair of pixel locations, each relative to their respective principal points.
(At this point, you may want to change origins or shift the pixels relative to
each other, see the notes in the code listing.)
The point is, of course, that if you are concerned about speed of computation,
significant saving can be made by calculating both rays at the same time,
rather than by simply applying the single focal point projection (via whatever
method you use) twice. In particular, note that the vector-matrix multiply
need be done only once.
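The combined left/right projection described above might be sketched as follows. This is a hypothetical reconstruction, not the author's actual listing: the type names, the exact RealToStereo signature, and the parameter conventions (M row-major, S the full separation between focal points) are all assumptions.

```cpp
struct Vec3 { double x, y, z; };
struct StereoPixel { double lx, ly, rx, ry; };

// Sketch of the letter's RealToStereo computation: one shared 3x3
// rotation M and translation T map an Object point into the center
// Camera frame (C = T + M*O); then each eye's projection differs only
// in the x component, shifted by half the separation S.
StereoPixel RealToStereo(const Vec3 &o, const double M[3][3],
                         const Vec3 &T, double f, double S)
{
    // Object -> Camera: one vector-matrix multiply and one addition,
    // done only once for both eyes.
    Vec3 c = {
        T.x + M[0][0]*o.x + M[0][1]*o.y + M[0][2]*o.z,
        T.y + M[1][0]*o.x + M[1][1]*o.y + M[1][2]*o.z,
        T.z + M[2][0]*o.x + M[2][1]*o.y + M[2][2]*o.z
    };
    // Example 1(b): the y projection is computed once and shared, which
    // guarantees the two images agree exactly in y.
    double foverz = f / c.z;
    double y = foverz * c.y;
    return { (c.x + S/2) * foverz, y,
             (c.x - S/2) * foverz, y };
}
```

Scaling the results into pixel units (and any shift of origin) would follow as in the text.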
This has two other important advantages. First, only one rotation matrix is
used. Therefore, when you want to rotate Object space with respect to Camera
space, only one 3x3 (or, 4x4, if you insist) matrix multiply is needed--again
saving much computational time. Second, separating out the "separation term"
in the formula in Example 1(b) leads to a good way to deal with a very subtle
problem that can arise in stereoscopic graphics.
Suppose you're drawing a line in 3-D space on a plane that is parallel to the
image plane and located some distance from it in the Z direction, and suppose
this line is tilted, say, five or ten degrees from the Y (or y) axis. On the
monitor screen, the pixel representation of the line is two lines, one for
each focal point. These lines must be exactly parallel; if they are not, any
disparity in the x location of the pixels that make up the lines will look to
the viewer like a displacement in the Z direction; that is, in or out of the
screen.
This is particularly noticeable where there are jaggies, the line breaks that
must occur when drawing diagonal lines on a pixel-based screen. If the line
breaks do not occur at precisely the same positions along the two lines,
alternate segments appear to jump in and out of the screen, when you are
wearing stereoscopic glasses. This will occur if the lines start and stop at x
locations that are not shifted by exactly the same amount. Of course, another
requirement is that the y locations of the end points be exactly equal; this
is guaranteed by the formula in Example 1(b).
The 3-D coordinate transform is usually done in floating point, but, at some
stage, the quantities are reduced to integers for the pixel representation. If
you aren't careful about truncation or round-off effects when converting
floating-point numbers to integers, the shift in x position at the top and
bottom of the two lines can be slightly different. Suppose, for example, the
top end of the line is located at an x coordinate of 6.4 and the bottom end is
at 4.1, and suppose the separation distance between focal points is 2.4. The
transformation for left and right pixel locations involves the sum and the
difference of the same term, (S/2). The numerical results are something like
6.4+1.2=7.6 and 6.4-1.2=5.2 at the top, and 4.1+1.2=5.3 and 4.1-1.2=2.9 at
the bottom. When these values are converted, by truncation (as in a type cast
in C) or by rounding, the pixel difference at the top will be 2 (or 3) and the
difference at the bottom will be 3 (or 2). The lines will not actually be
parallel on the screen.
The solution is to add and subtract the separation term after conversion to
integers. This method is used in the listings; it also speeds up the
calculation a bit.
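The numeric example above can be checked with a small sketch; the function names here are mine, not from the listings.

```cpp
struct Pair { int left, right; };

// Naive: shift by the half-separation in floating point, then truncate
// to pixels (as a C type cast would).
Pair naive_pixels(double x, double halfsep)
{
    return { (int)(x + halfsep), (int)(x - halfsep) };
}

// The fix: convert to an integer first, then add and subtract the same
// integer shift, so both eyes always move by exactly the same amount.
Pair shifted_pixels(double x, double halfsep)
{
    int xi = (int)x;
    int si = (int)halfsep;
    return { xi + si, xi - si };
}
```

With the text's numbers (top x = 6.4, bottom x = 4.1, half-separation 1.2), the naive version yields left-right pixel differences of 2 at the top but 3 at the bottom; the shifted version yields 2 at both ends, so the two lines stay parallel on screen.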
M.A. Weissman
Federal Way, Washington


More C vs. C++




Dear DDJ,


Reading David Smead's letter ("Letters," DDJ, May 1993), written in response
to Scott Guthery's "A Curmudgery on Programming-language Trends," I found it
interesting that he applied the usual double standard when comparing C to
assembler and then C++ to C. When comparing C with assembler, he stated that
"writing in C is faster than writing in assembler," which is true, largely
because C supports the concept of structure. However, when comparing C with
C++, he stated that "everything you need already exists in C," which is
misleading.

I would observe that, using his own arguments, "everything you need already
exists in machine code, so do we need C?" and "writing object-oriented
programs is faster in C++ than in C."
The reason I use C++, rather than C, to write OO programs is because C++
supports object orientation. It enables the technology, and I would certainly
dispute, in the context of OO, that "C is more than powerful enough to program
in." It isn't, as it provides no support for classes, objects, inheritance, or
polymorphism.
Edward Kenworthy
Cannock Staffs, England


Shorter Snippet




Dear DDJ,


"C Snippet #37" of December 1992 looks to me too cumbersome. My idea of doing
it in C is presented like this:
 #include <ctype.h>
 char * rmlead (char *str)
 {
 char *obuf = str;
 for ( ; *obuf && isspace(*obuf); obuf++)
 ;
 return obuf;
 }
No caveat about the original, no destroying the input, no unnecessary copying,
no two-level logic, no avoidable break. And much shorter.
Jony Rosenne
Tel Aviv, Israel


Cognitive Expression Processors




Dear DDJ,


In "Swaine's Flames" (July 1993), Michael almost states the point of his
essay. Taking some license here, his point is that the term "Artificial
Intelligence" only applies to things we don't understand. As we develop the
algorithms and processes to mimic human intelligence, the demystification
provides more specific and exact terminology--just as two hundred years ago
everything from refraction to weather, flight to the stars, and procreation to
electricity was bundled under the term "magic."
Because of this, like magic, Artificial Intelligence will never be achieved.
In the next generation, when desktop thinking machines are controlling our
aircraft, reading to our children, and chatting with us at the check-out
lines, we will doubtless call them simply "Cognitive Expression Processors."
Matthew Strebe
San Diego, California
Example 1: Truly 3-D
(a)
Ix = f * Cx / Cz, Iy = f * Cy / Cz
(b)
ILx = (Cx + S) * f / Cz, ILy = f * Cy / Cz
IRx = (Cx - S) * f / Cz, IRy = ILy













October, 1993
Programming Language Guessing Games


If C++ is the answer, what's the question?




P.J. Plauger


P.J. Plauger has recently compiled his long-running "Programming on Purpose"
column into a series of books of the same title. He can be contacted at
pjp@plauger.com.


Speculating about programming languages is a popular indoor sport that
inspires powerful passions, both in players and onlookers. Try telling a
Boston Celtics fan that the team is ailing and well past its prime. If you
have any teeth left after that exercise, try telling a C programmer that C is
a primitive and obsolete language.
You'll have about as much luck convincing a C++ advocate that the language is
deficient in any way. The arguments in its defense are legion, and all couched
in intellectual terms. But the driving engine behind many of the statements is
emotional conviction, pure and simple.
I find nothing wrong with either of these positions. I genuinely believe that
C is far from dead as a popular programming language. I also am convinced that
C++ has an interesting present and a bright future. And I have no problem with
programmers forming emotional attachments to the tools of their trade. (I do
prefer, however, that people learn to distinguish emotion from reason. Each
works best in its own arena.)
When it comes to programming languages (and text editors, and operating
systems), I am essentially Darwinian. It is much more fruitful, to me at
least, to simply observe which languages:
Sell themselves to programmers
Become widely used
Host applications that are commercially successful.
You can hate Cobol, Fortran, Basic, or C all you want. Still, you really ought
to notice that these have had their successes by the above metrics. C, in
fact, has to be the hands-down winner to date in the natural selection game.
Similarly, you can be the greatest object-oriented purist in the world, and
hate C++ for all its ugly pragmatism and impurity of paradigm. (There, we got
that word out of the way early.) Still, you really ought to notice that C++ is
also well on its way to being a major success by the above metrics. And
whatever object-oriented language is second in popularity is a very distant
second to C++.
Here's where my Darwinian leanings raise a red flag.
For all its promise, C++ is also, unequivocally, a complex language--and
history has not been kind to complex programming languages. Three that spring
to mind are PL/I, Algol 68, and Ada. Each is a product of a different decade
and a different (potential) user community, but each has followed a similar
trajectory.
First there is the perceived need. Existing programming languages simply lack
all the features now demanded by more sophisticated programmers, who now work
on much larger projects. A small group forms, consisting of designers who
inhabit that turbulent ocean midway between sophisticated users and
experienced compiler writers. Before you know it, they have taken an existing
lily and gilded it beyond recognition. The result shines brighter than what
went before, but is also substantially heavier.
Then comes the business of selling. The dividing line blurs between the demand
pull of putative customers and the supply push of vendors with a stake in the
new technology. It's always the most sophisticated, and most outspoken,
programmers who sign up first. With all those good (stylish) ideas packed in
there, the new language has something for just about anyone's taste.
Invariably, production programmers are told repeatedly that they are on the
verge of unemployment, or even extinction, if they don't switch over to this
new technology as soon as possible.
The rising tide follows. Conferences fill up first with tutorials, then with
more and more technical sessions on the wonderful new language. Magazines
(like this one) run special articles, then special theme issues, then standing
columns on the topic. New conferences and magazines emerge just to serve the
new constituency. Books appear like mushrooms after a rain. Those who haven't
had occasion to try the new language begin to doubt their personal sanity, or
the viability of their current employers.
At the crest, programmers for this wonderfully complex new language command
premium salaries. Recruiters steal company phone books to make cold calls on
the experts. The most popular lunchtime game is the five-line program with the
caption, "Betcha can't tell what this does." Maintaining and enhancing code in
older languages goes from a second-class activity to third class.
Then the doubts creep in. A competitor beats you to market even though still
mired in that old fashioned programming language of yore. Your enterprise
suffers a few minor project disasters from complexity overload, or one really
big one. Salary differentials get really out of hand. One by one, new projects
get specified in terms of simpler programming languages, over the protests of
the elite.
In the twilight days, the number of hotshots has been reduced to an absolute
minimum. Now they're stuck on maintenance. Even worse, they have to suffer
interviews by analysts charged with retooling the most useful pieces of the
old, expensive systems in the current (simpler) language of choice. The
hotshots grumble about management stupidity (hardly a new theme among
programmers) and about how weak and unsafe programming languages have become.
Their only consolation is that no programming language ever dies completely,
thanks to past investments in code. They have a sinecure that will last until
retirement, if they choose to stop advancing in their careers.
The consolation for the rest of us is that each language shapes the ones to
come. Future designers steal the best bits and leave the worst failures to rot
on the vine. Sic transit gloria mundi.
What causes some programming languages to follow this orbit? Well, in some
ways, all languages do so. The ones I've singled out simply had a rise and
fall that was faster and more spectacular than many people expected. In that
sense, they were all oversold. And, as I conjectured at the outset, they are
all overly complex.
So where does that leave C++? If it's following the inevitable fate of complex
languages, then it's arguably still in the "rising tide" stage. We can trust
that its popularity and importance are both still growing. But we can also
trust that it will not be the programming language of choice in the year 2000.
On the other hand, historical analogies are never exact. C++ is firmly rooted
in a highly successful language. It overcomes some of the recognized
shortcomings of C at an opportune moment in history. Perhaps demand pull will
be strong enough to rescue the language from gravitational collapse.
I can see that happening, however, only if we address the complexity issue
wisely and in time. We need to appreciate just what sorts of complexity lead
to untimely death, and what sorts of features turn out to be worth rescuing.
I opined at greater length about PL/I, Algol 68, and Ada in a magazine article
several years ago. (See "Programming on Purpose: The Central Folly," Computer
Language, April 1988. It also appeared in a collection. See Essay 4 in
Programming on Purpose III: Essays on Software Technology, Prentice Hall,
1993.) I'll be brief in repeating the relevant parts of that essay.
My observation was that designers of complex languages like guessing games.
They seem to feel that programmers want to write the absolute minimum for each
piece of code. It is then up to the translator to decipher that terse code, by
predictable rules, into an executable program. A good translator is one that
can do the job with a minimum of clues from the source code.
APL and C are also terse, but not by the same metric. These two languages use
lots of operators that need only a character or two. Thus, they encourage a
style that is brief, even cryptic, at least to those who aren't comfortable
with mathematical notation. The nearest thing to the kind of guessing games
I'm talking about is the mixed-mode arithmetic permitted by C (and its
predecessor Fortran). Depending on the types of the two operands, the compiler
has to guess whether a + operator, for example, should perform an addition of
floating-point, fixed-point, or pointer operands, in any of several possible
representations for each. Even so, we're talking about a brief table, or a
handful of rules.
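That handful of rules can be seen in a couple of lines (shown here as C++, which inherits C's table for the built-in operators):

```cpp
// The translator's "guess" for the built-in + is driven by a short
// table of operand types: int + double converts the int and performs a
// floating-point add; pointer + int scales by the pointed-to size.
double add_mixed(int i, double d) { return i + d; }    // floating add
int    third(const int *p)        { return *(p + 2); } // pointer add
```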


PL/I


PL/I inherited the same attitude from Fortran, but then went wild. First, it
proliferated data types. Besides floating-point versus fixed-point (now with a
scale factor) arithmetic types, PL/I lets you specify practically any
combination of binary or decimal base, real or complex format, and a broad
range of precisions. That handful of mixed-mode rules for Fortran explodes to
pages of explanation--and any number of questionable guesses. A classic PL/I
gotcha is the test IF 1 THEN .... The 1 is treated as a one-digit real fixed
decimal constant. To make a Boolean test, the translator converts it to the
bit string 0001, then tests the leading bit, which is 0. Thus, both 0 and 1
test false. An
unreasonable conclusion from a series of apparently reasonable micro
decisions. Yee hah.
C++ extends the guessing game one level farther. You can add to the set of
permissible operands for the arithmetic operators. Just write "overloaded"
definitions of the built-in operator functions with declarations such as
myclass operator+(myclass&, myclass&).
Now the excitement really builds. A C++ translator has to consult lists of
permissible operand combinations. Some are built-in, some user defined. Other
lists prescribe chains of candidate conversions that might bridge between the
actual operands and a permissible combination. Almost invariably, more than
one combination of conversion chains and operator overloading fill the bill.
That means the language has to decide whether the "best" choice is
sufficiently better than the second best to favor it, or whether it is wiser
to diagnose an ambiguity.
All of this machinery evolved from the simple desire to not require that a
programmer write "obvious" type conversions explicitly.
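A minimal sketch of that machinery follows. The operator declaration is the one from the text; the conversion constructor is my addition (and I've made the references const so the conversion can bind), purely to show one conversion chain the translator must weigh.

```cpp
// A user-defined operand type for +, plus an implicit conversion that
// gives the translator a candidate chain to consider for m + 2.
struct myclass {
    int v;
    myclass(int n) : v(n) {}    // implicit conversion from int (my addition)
};

myclass operator+(const myclass &a, const myclass &b)
{
    return myclass(a.v + b.v);
}

// For m + 2 the translator matches the overload above by converting the
// int operand to myclass through the constructor; the same chain works
// with the operands in either order.
```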
PL/I also insisted on challenging the parser. The language has no reserved
keywords, so you can write horrors like
IF IF = THEN
 THEN THEN = ELSE
 ELSE ELSE = IF
Each of the four distinct tokens here has two quite distinct meanings.
C++ pushes parsing technology to the extreme. It has more operators than C,
and even more with multiple meanings. You can't parse the language in a single
pass, or with a finite amount of lookahead. In fact, the committee
standardizing C++ still occasionally argues about certain abstruse parses.
Most of the issues boil down to having the translator guess whether a sequence
of tokens constitutes a type designation or an expression. A small problem in
C has mushroomed into a major subtopic in C++. All to avoid having the
programmer learn too many operators, or write long-ish keywords.


Algol 68



Which brings us to Algol 68. I could easily take that language to task for
parsing problems, because its grammar has an infinite number of productions.
But that amounts to shooting at life rafts. Having an endless grammar is only
one of the ways in which Algol 68 took guessing games to new heights. Where it
really stands out is in the shorthand it permits the programmer.
You know how to call a function in C, by writing something like f(x,y). If the
function has no arguments, you write f(). Those empty parentheses are a clear
signal that a function is being called. Under similar circumstances, Algol 68
lets you omit the parentheses and just write f to call the function. No big
deal. Even Pascal is that tolerant.
But I'm not done yet. In C, if you want the contents of an object designated
by a pointer, you write something like *px or **ppx. If you want to talk about
the pointer itself, you write px or ppx, or even *ppx. Not so in Algol 68.
That language wants you to omit the stars. It then guesses from context how
many to put back. Take your favorite C or C++ program, erase all the empty
parentheses and indirection operators, and see how readable a result you get.
The Algol 68 manual waxes eloquent for several pages, in fact, about the rules
for second-guessing the translator. If they're that hard to make clear, you
can guess how hard it is for translator writers to get them right. Or for
future readers of the code to guess the same way as the original author. And
all to save writing a few parentheses and stars.
C++ does a little of this sort of thing, in places like the bodies of member
functions. But it's nowhere near as ambitious as Algol 68 in this particular
guessing game. Don't worry, it makes up for it in other arenas.
I mentioned operator overloading earlier, as a special case of function
overloading. You can use the same function name with a host of different
argument combinations. (They're called "function signatures" in conjunction
with the name). The C++ translator then considers not just argument types, but
also the number of arguments, and that ineffable figure of merit from above,
in determining which of several samely yclept functions to call.
We're still not done. You can specify default argument values, as in double
sinq(double angle, int quadrant = 0);. Leave off the trailing argument(s) and
the translator supplies the default values for you. The matching rules now
have to consider matches with and without various numbers of the default
arguments filled in. I won't even mention the complexities involved in
matching a function like printf that accepts a varying-length list of
arguments.
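The sinq declaration above can be sketched as follows; the body is hypothetical (shift the angle by quarter-turns), since the text gives only the signature.

```cpp
#include <cmath>

// Leave off the trailing argument and the translator fills in
// quadrant = 0, so sinq(x) and sinq(x, 0) call the same function.
double sinq(double angle, int quadrant = 0)
{
    return std::sin(angle + quadrant * 1.5707963267948966); // pi/2 per quadrant
}
```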
And we're still not done. The C++ Standard now calls for two kinds of
"template" facilities. One kind lets you parameterize class definitions--so
you can describe, for example, a stack of arbitrary objects. The other kind
lets you parameterize functions, as in template<class T> T *find_in(const T *,
T);. Now if you write char *p = find_in("hello", 'l'); the translator has to
guess what arguments to the template would result in arguments to the function
that match the function call. (As humorist Dave Barry would say, "I am not
making this up.") In this case, the answer should be that T stands for type
char. Not a hard matter to figure out here. But the standards committee is
still trying to figure out how hard a guessing game is arguably too hard to
ask of a C++ translator.
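The find_in deduction above can be made concrete; the template signature is the one from the text, while the body is a hypothetical implementation (a scan over a zero-terminated sequence).

```cpp
// For find_in("hello", 'l') the translator deduces T = char from the
// arguments: "hello" decays to const char * and 'l' has type char.
template<class T> T *find_in(const T *s, T v)
{
    for (; *s; ++s)                 // scan a zero-terminated sequence
        if (*s == v)
            return const_cast<T *>(s);  // declared to return non-const T *
    return 0;
}
```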


Ada


And that leaves Ada, as the last of our odious comparisons. Ada, too, features
operator overloading and some interesting parsing issues. But I find Ada's
particular claim to guessing-game fame lies in its tolerance for name
ambiguity.
In Ada, you can nest namespaces in ways both useful and exotic, then write a
name with enough qualifiers to trace a clear path from where you write the
name to where it is properly declared. But then you can start erasing
qualifiers and leave it to the translator to figure out which ones got elided.
The basic rule seems to be, "If the translator has any chance at resolving the
ambiguity, then it must permit the ambiguity and endeavor to resolve it the
way it thinks best."
C++ has adopted much the same attitude. The most interesting examples of the
"betcha can't tell" variety involve guessing which of several candidate
declarations matches up with a given utterance of a name. What makes life even
more interesting is that the matchup can occur with a declaration later in the
program. Only rarely do multiple candidates get diagnosed as ambiguous.
Usually, the translator is obliged to favor one choice over all others, even
if that isn't your favorite choice.
You might also be interested to learn that the C++ standards committee, at its
last meeting in Munich, has voted to add namespace structuring to the
language. Get ready for a host of new and interesting code snippets.


Back to the Future?


So what does all this mean for the future of C++? At the very least, it means
that we must develop some rigorous coding rules for avoiding the worst
surprises. Good starts in this direction are C++ Programming Guidelines, by
Tom Plum and Dan Saks (Plum Hall, 1991), and Scott Meyers' excellent book
Effective C++ (Addison-Wesley, 1992). Meyers states, and justifies, 50 rules
that get you past some real pitfalls.
Pick a subset of the language that minimizes surprises, learn it well, and
don't stray from it.
Subsetting works for users, but not translator writers. Vendors of C++
compilers still face mastering all that complexity, if only to pass validation
suites and to run those "betcha can't tell" examples properly. It is a telling
observation that you can still count the number of distinct C++
implementations on your fingers. You can count the number that purport to
match last year's draft C++ standard on your thumbs. By contrast, at the same
point in the standardization of C, dozens of implementors attended each
committee meeting. And they represented perhaps half of all the
implementations of C in the world.
Rex Jaeschke recently speculated in a different direction in these pages. (See
"C/C++ Standardization: An Update," DDJ, August 1993.) He discussed the
possibility that the more successful parts of C++ might find their way back
into C. He went so far as to provide a list of candidate features.
Since that article appeared, Rex and I have attended the latest meeting of the
ISO C Standards committee (WG14) in London. That group has decided to start
work very soon on the next revision of the C language, even though the process
doesn't officially have to start for another two years. There is simply too
much pertinent activity in POSIX, internationalization, and C++ to stand idly
by while everyone else is innovating.
Please don't take this as an invitation to send in your favorite extension (or
fix) to Standard C. It will be months before WG14 even decides how to proceed
on reviewing potential revisions to Standard C. But please do take this as an
indication that C is a living and flexible language, even if it is also an
international standard.
A very real possibility is that C may look like C++ sooner than many people
have thought likely. If that thought frightens you, know that many of us who
care about C are a bit frightened as well. We're not in a hurry to test the
aerodynamic properties of a gold-plated butterfly.
I think by now you know my criterion for retrofitting concepts from C++. I
have no problem with classes, overloading, templates, or even nested
namespaces. All have their demonstrable uses. Where I personally draw the line
is with the guessing games. Practically every issue I've raised with C++ in
this article can be avoided. Just add the odd type cast, or template
instantiation, or name qualifier to your program and the need for guessing
games ends. C can remain a proper subset of C++ and still capture much of its
added power. And it can stay relatively simple to implement, and to
understand, by refusing to guess the programmer's intent beyond a reasonable
doubt.
Will this scenario actually come to pass? Will C and C++ merge? Or will a new
dialect, with a name like C+, emerge from the fusion? I can't say for certain,
but it looks like a believable scenario. Your guess is as good as mine.





























October, 1993
The C+@ Programming Language


Its mature foundation class library makes a difference




Jim Fleming


Jim is one of the founders of Unir Technology Inc., and can be contacted there
at 184 Shuman Blvd., Naperville, IL 60563; or at 708-305-0600 or
1-800-222-8647.


The C+@ programming language, an object-oriented language derived from AT&T
Bell Labs' Calico programming language, was developed to provide programmers
with a true object-based language and development environment. C+@ (pronounced
"cat") has the syntax of C and the power of Smalltalk. Unlike C++, C+@
includes a mature class library with more than 350 classes used throughout the
system. The C+@ compiler itself is written in C+@, and all of the source for
the class libraries is included with development systems.
The Calico project was started at AT&T Bell Labs in the early '80s, after the
introduction of Smalltalk and at the same time as C++. Calico was originally
used for rapid prototyping of telecommunication services; hence, its heavy
emphasis on keeping the language syntax simple and showcasing the power of the
graphical development environment.
The name "C+@" is derived from C++ and Smalltalk. The @ method is used in
Smalltalk (and C+@) to create objects of class Point from an x and y
component. For example, the expression 12 @ 34 indicates that the @ method
should be applied to the receiver 12 with the argument 34. Both 12 and 34 are
objects of class Integer and are used to initialize the new Point object; see
Figure 1. Points are used in graphics and other 2-D applications.


C+@ Overview


In C+@, all data items are objects and behave uniformly. The programming
environment is dynamic, responsive, and supports multideveloper collaboration
using an open source storage facility with hierarchical views.
C+@ was designed to be a comfortable companion language to C and C++ rather
than an extension to C. It retains C's expression syntax and control
statements. C functions and data objects can be accessed directly from C+@.
C+@ supports most of the features of Smalltalk as well as multiple
inheritance. The complete C+@ graphical development environment is written in
C+@ and provides a powerful edit/compile/debug cycle. A GUI builder is
included for quickly building applications visually by drawing them on the
screen. C+@ can also be used in a command-line mode as an interactive
object-oriented shell programming language.
C, C++, and C+@ each have a place in the development environment. C+@ offers
significant advantages for rapid prototyping and for providing the
object-oriented "backbone" of an application. C still offers an excellent
solution where performance takes precedence over productivity. For most large
applications, supplementing C with C+@ provides an effective trade-off between
application speed and rapid delivery.
The binaries produced by the C+@ compiler are independent of the underlying
machine architecture. Without recompiling, applications can be moved from
SPARC to 68000 to Intel x86, and so on. C+@ is not interpretive--the binaries
are encoded using a sophisticated "beading" technique developed at Bell Labs.
Because of the streamlined language design, the C+@ compiler produces these
portable binaries with extraordinary speed, without the need for preprocessing
or front ends.
C+@ can be used in applications involving pen-based computing, real-time
systems, GUIs, object-oriented database servers, and the like. For the past
eight years, C+@ has been evolving and has reached a state where AT&T has
decided to release it to the commercial world. C+@ is not intended to replace
C (or C++), but certainly provides a streamlined companion. (Unir Corporation
has released versions of C+@ for SPARC-compatible workstations.)


C+@ and Smalltalk


Comparisons between C+@ and C++ aren't really fair because C++ is
syntactically very complex and designed to support a range of programming
styles. C++ was conceived as an extension to C and, over the past ten years,
has enabled programmers to experiment with object-oriented programming. Just
because a program is written in C++ doesn't mean it's object-oriented. In many
cases, C++ programmers don't use any of the object-oriented extensions of the
language--they use C++ as a better C.
The best language to compare C+@ to is Smalltalk. Like Smalltalk, C+@ was
designed for object-oriented programming from the outset. C+@ supports garbage
collection and can be used as a typeless language. In C+@, the notion of a
pointer is not visible to programmers because all variables can contain
pointers. With C+@, new programmers can bypass all of the cryptic syntactic
puzzles that continue to fill C++ programming books and have nothing to do
with developing an application or maintaining high-quality software. C+@ is
much easier to learn than C or C++.
Just as the Smalltalk environment is written in Smalltalk, the C+@ development
environment is completely written in C+@.
The primary difference between C+@ and Smalltalk lies in how the control-flow
constructs of methods are written. In Smalltalk, conditional statements are
implemented by sending messages, with blocks of code as arguments; evaluation
is based on the value of the Boolean object. This is similar to the C construct
<expression>?<statement>:<statement>;. This form sometimes makes Smalltalk
programs difficult to read and maintain. C+@ supports all of the C
control-flow constructs and can be easily read by C programmers.
Like Smalltalk, C+@ documentation refers to objects, classes, instances,
methods, and messages. All C+@ data items are objects and are classified into
classes which are themselves objects. Each object is said to be an instance of
a particular class and the data fields of an object are called "instance
variables." The class itself is called the "distinguished instance" and can
contain global class variables. An object can be manipulated by invoking its
methods via messages; only instance methods can access the encapsulated
instance variables. Class methods are used to access global class variables.
Programming in C+@ is done by writing class descriptions that define class
variables, instance variables, class methods and instance methods. These
classes and methods are loaded as required, based on the interaction of the
various objects in a workspace. Objects are instantiated by the distinguished
instance which responds to class methods.
The class descriptions also describe the inheritance relationships of classes
and methods. C+@ supports direct and delegate inheritance. Direct inheritance
is commonly used with abstract classes which don't contain instance variables
but provide common behavior to other classes; delegate inheritance is used
when a class wants to take advantage of the capabilities of another class.
Methods not recognized by a receiver will be passed to a delegate without the
sender's knowledge.
For example, the Smalltalk Rectangle class can be created as in Figure 2. As
the C+@ code shows, a class can be created with two instance variables,
similar to a C struct with two fields (origin and corner). Unlike C, the
keyword var indicates that these variables can contain values of any type. In
Figure 2, the method origin_corner_ can be used to create an instance of a
Rectangle object which references two Points created using the @ method. The
class method origin_corner_ is applied to the distinguished instance of
Rectangle. The origin_corner_ class method causes a new Rectangle object to be
created (or instantiated). A reference (the address of the instance) is
returned from origin_corner_ and stored in the variable p.
Methods for the class Rectangle can be written as small sequences of
statements that look similar to C functions. These methods are grouped in the
surrounding braces of the class Rectangle {...} construct, ensuring that
variables and methods for a class are grouped in one syntactic construct that
can be stored in a separate ASCII file. This provides a form of source-code
encapsulation, allowing you to use standard UNIX tools (editors, make, and the
like).
As with Smalltalk, C+@ supports binary and unary/keyword methods. The binary
methods of C+@ and Smalltalk are very similar; see Figure 3. Smalltalk
differentiates the unary methods, which take no arguments, from the keyword
methods, whose arguments are marked by colons (:). C+@ only supports the
dot-method form used in C++ and in many
respects is more consistent than Smalltalk. The lack of a keyword construct
has not been a limitation. Smalltalk keyword methods can easily be converted
to C+@ using a simple convention.
Although C+@ and Smalltalk are similar at the high level of class and method
design, the actual control-flow statements are very different. In Smalltalk,
messages to objects are used to control the flow of execution in a method. In
C+@, the control-flow statements look like C statements, making it easier for
experienced C programmers to pick up the language. In some cases algorithms
have been converted from C to C+@ with little change in the control-flow
source.
In Figure 4, the receiver (queue) is sent the message isEmpty which returns a
Boolean, (True or False). In the C+@ example, this result is tested and the
variable index is set to 0, or the result of applying the next method to the
receiver (queue). In the Smalltalk example, the ifTrue:ifFalse: message is
sent to the Boolean that results from applying the isEmpty method to the
receiver (queue). When ifTrue:ifFalse: is sent, two arguments are passed and
evaluated depending on the value of the receiver. If the value is True, then
the block [index := 0] will be sent a message to be evaluated. If the value is
False, then the block [index := queue next] is evaluated, again by sending the
block a message.
C+@ supports most of the features of Smalltalk with a C syntax. Since all of
the C operators are handled as method selectors, operators like << and >> can
be invented to shift integer values left or right. These same operators could
be defined in another class to indicate some sort of special shifting,
sorting, or movement operation. For example, the << operator applied to a
Window class object may mean "move it left."
C+@ array constructs are handled just like any other method. The C+@ compiler
is capable of rearranging the source code to allow array methods to be handled
in this manner.
When the C+@ compiler encounters a statement such as a[i]=j;, the receiver a
is sent the []= method with the arguments (i,j). This is equivalent to writing
a.[]=(i,j) which most C programmers would not recognize. Therefore, the
compiler handles the conversion. Array access is handled in a similar manner.
For example, the C+@ statement i=a[j]; results in the [] method being sent to
the receiver a with the argument j. This is equivalent to i=a.[](j); which,
once again, would be foreign to a C programmer. The C+@ compiler handles the
shift and allows you to think in terms of standard C.
Multidimensional arrays are also available in C+@, although the syntax of the
indexes is slightly different from C. For example, if a is a two-dimensional
array, then it's legal to write a[2,5]=j;. The C+@ compiler converts this to
a.[]=(2,5,j) and the correct []= method is selected based on the number of
arguments.
Because the standard C array operator is implemented as a normal method in
C+@, you can have statements such as array["manager"]="mary wilson";. This is
equivalent to array.[]=("manager","mary wilson"). The C+@ foundation class
library contains classes for Arrays, Trees, Lists, and Tables. Many of these
classes support the []= array assignment method and the [] array access method
for indexes other than integers.
The Smalltalk block construct allows small fragments of C+@ programs to be
grouped together into objects of class Block, and passed around. These objects
can be sent messages to execute the fragment of code encapsulated in the Block
object. This is useful when an algorithm needs to be applied to all of the
members of an array. An instance method of objects of class Array can be used
to iterate over each array element. The block of code can be executed for each
element without knowledge of the contents of the block.


C+@ Class and Method Definition


C+@ programming consists of declaring a series of classes that contain not
only a description of the fields used to store data in an object but also the
routines (or methods) used to manipulate these data fields. In Figure 5, the
field definitions for instance variables and the methods are nested inside a
class Name {...} construct. Inheritance relationships and class variables
can also be defined (although they don't apply in Figure 5).
In Figure 5, there are two instance variables (x and y). Every time an object
of class Point is created, for example, the object's data area will have two
fields that can be accessed in the methods for the object by referring to the
variables x and y. In general, these variables will likely contain integer
values, but since they were declared using the keyword var, any class of
objects can be stored in a Point object's instance variables.

For class Point, there are class methods and instance methods. This is similar
to Smalltalk. The class method x_y_ can be used to construct (or create) an
object of class Point. The create keyword in the first line of the method
manufactures an object of class Point with two instance variables, x and y.
The instance methods can be used to operate on the newly-created object. The
setXY instance method is used to initialize the instance variables of the
object and the x and y methods are used to access the values stored in the
instance variables of the object. Methods look like functions in C, and class
methods are denoted with the keyword class.
In all methods, the return value is passed back in a variable that was
arbitrarily called r. Unlike Smalltalk, C+@ supports multiple return values
which are useful in debugging when extra information needs to be returned with
the primary result. The extra information can be ignored when the message is
returned.
Once class Point is defined (see Figure 5), it can be used as in Figure 6. A
Point object can be created by invoking the x_y_ class method. This method is
applied to the distinguished instance object. Once Point is created (via the
method x_y_) a pointer is returned and stored in the variable a_point. The
instance methods (x or y) can be sent to the object referenced by the
variable, a_point. The print message can then be sent to the object returned
from the respective instance method. As shown, the print method causes a value
to be printed.
A class can have more than one method defined with the same name but the
number of arguments must be different. In Figure 7, Point can be extended to
include methods that can be used to set the instance variables x or y. These
methods will each have an argument and therefore can be distinguished from the
original methods that were used to access the instance variables.
The keyword private (Figure 7) can be used instead of method to prevent
external access to the method. The setXY instance method is used in the x_y_
class method to initialize the instance after creation.


The C+@ Foundation Classes


The Clock class (Listing One, page 106) is one of over 350 classes included in
the foundation class library. This class illustrates many of C+@'s features,
including inheritance.
Besides providing a flavor for the syntax of C+@, the Clock class illustrates
how lightweight processes or threads are supported. The variable process is
initialized to contain a Process object that acts as a scheduler for updating
the clock. The method clockLoop is sent to the Process object and a Timer
object is eventually created to block the process.
The instance variables in the Clock class are tagged with the Class of the
object that the variable will likely reference. In C+@, this can be used to
provide you with additional information, but isn't essential. All of the
instance variables could have been defined in the typeless manner.
One of the main advantages--and real tests--of true object-oriented systems is the
ability to reuse other people's classes. To this end, numerous people have
contributed to the more than 350 reusable C+@ classes. The foundation class
library is organized into a hierarchy using a standard file system. The
library includes demos, basic library classes, graphics classes, the C+@
compiler, and various tests and system classes. There are also a variety of
serial and visual tools that can be used for examples. Figure 8 illustrates a
small subset of the directories used to organize the source for the foundation
classes.
The binaries for all of the classes are also included with nondeveloper
versions of the system. The binaries for the classes are also contained in
standard files and are dynamically loaded when used. Usually the binaries are
organized into a few directories, and are accessed via a view-path philosophy
so that new classes can be tested by one user without impacting another user.
The binaries are portable across various machine architectures. This allows
you to develop a class library on a SPARC workstation and move it to an Intel
x86-based system without recompiling the source code. This makes it especially
attractive for developers who must ship their application for several target
architectures.
Besides the foundation classes, hundreds of other, more specialized classes
have been developed for projects at AT&T Bell Labs. The largest such project
was a prototype switching system used for demonstrations of
customer-programmable telecommunication services. This system consists of 175
classes above and beyond the foundation class library and over 75,000 lines of
code.
 Figure 1: Creating an object of class Point in C+@. x and y are
instance-variable names, 12 and 34 are instance-variable values.
 Figure 2: Creating an object of class Rectangle in C+@.
Figure 3: Method selector syntax; (a) C+@; (b) Smalltalk
(a)
 C+@ Binary Methods
 a=b + c;
 p=12 @ 34
 C+@ Unary/keyword Methods
 p=Rectangle.origin_corner_(12@34,100@200);
 x=p.origin;
(b)
 Smalltalk Binary Methods
 a := b + c
 p := 12@34
 Smalltalk Keyword Method
 p := Rectangle origin: 12@34 corner: 100@200
 Smalltalk Unary Method
 x := p origin
Figure 4: Conditionals: (a) C+@; (b) Smalltalk
(a)
if(queue.isEmpty) {
 index=0;
}
else{
 index=queue.next;
}
(b)
queue isEmpty
 ifTrue: [ index := 0]
 ifFalse: [ index := queue next]
 Figure 5: C+@ source-code structure for a class called Point
 Figure 6: Class methods and instance methods for class Point.
Figure 7: C+@ method selection based on argument count.
class Point
{
/* instance variables */
var x; /* x coordinate */
var y; /* y coordinate */
/* class methods */
class method (r) x_y_ (a_x, a_y)
{

/* create an object of class Point and set x=a_x and y=a_y */
 r = create;
 r.setXY(a_x,a_y);
}
/* instance methods */
private setXY (x_coordinate, y_coordinate)
{
 x = x_coordinate;
 y = y_coordinate;
}
method (r) x
{
 r = x;
}
method (r) x (value)
{
 x = value;
 r = self;
}
method (r) y
{
 r = y;
}
method (r) y (value)
{
 y = value;
 r = self;
}



 Figure 8: Overview of C+@ Foundation Class Library
_THE C+@ PROGRAMMING LANGUAGE_
by Jim Fleming

[LISTING ONE]

class Clock {
/* An instance of Clock is a window system application
* displaying an analog clock face with Roman Numerals. */
inherit View view;
/* constants */
const minExtent = (96@112);
const font = Font.new("roman.8");
const iconFont = Font.new("bold.12");
/* instance variables */
BlankView analog;
Bitmap icon;
Integer minuteHand;
Integer hourHand;
Point center;
Point minutePoint;
Point hourPoint;
Rectangle dayRectangle;
Process process;
/* CATEGORY: Creation -- Create an instance of a Clock application. */
class method (_) new
{
 _ = create;

 Layer.new(_);
}
/* RESTRICTED CATEGORY: Initialization -- Initialize an instance of Clock. This
 * method is required by the View paradigm. It links into the view hierarchy by
 * using "aView" as a delegate (of class View). */
method initialize (aView)
{
 Point p;
 Integer ox, oy, cx, cy;
 Integer y;
 Rectangle r;
 view = aView;
 p = view.extent;
 if (p.x < minExtent.x || p.y < minExtent.y) {
 view.topView.initializeFailed(true);
 return;
 }
 /* Create a BlankView */
 ox = 0;
 oy = 0;
 cx = p.x;
 cy = p.y;
 r = Rectangle.origin_corner_(ox@oy, cx@cy);
 analog = BlankView.basicNew;
 thisSelf.newSubView(r, 1, analog);
 thisSelf.init;
}
/* RESTRICTED CATEGORY: Window System Event Handling */
/* Receive a window event. If it is a resize event then adjust our subViews. */
method (_) windowEvent(minor, p)
{
 Integer ox, oy, cx, cy;
 Integer y;
 Rectangle r;
 if (minor != Event.WINDOW_RESIZE_EVENT) return;
 p = view.extent;
 if (p.x < minExtent.x || p.y < minExtent.y) return;
 _ = thisSelf;
 process.terminate;
 ox = 0;
 oy = 0;
 cx = p.x;

 cy = p.y;
 r = Rectangle.origin_corner_(ox@oy, cx@cy);
 analog.adjustSubView(r);
 hourPoint = nil;
 minutePoint = nil;
 thisSelf.init;
}
method deleteLayerExit
{
 process.terminate;
}
/* RESTRICTED CATEGORY: Private -- Initialize an instance of Clock. Draw analog
 * clock face and start a surrogate process to update clock time periodically. */
private init
{
 Point p, q;

 Integer radius, xor, i, j;
 Float sin30, sin60;
 Integer n30, n60;
 Integer charHeight, charWidth;
 /* set up bitmap for clock icon */
 icon = Icon_OL.icon("Clock", " ");
 /* set up clock face */
 charHeight = font.charHeight("A");
 charWidth = font.charWidth("A");
 p = analog.rectangle.extent;
 p.y(p.y - 16);
 dayRectangle = Rectangle.origin_corner_(0@p.y,analog.rectangle.corner);
 if (p.x > p.y)
 radius = (p.y / 2) - 2;
 else
 radius = (p.x / 2) - 2;
 center = (p.x / 2)@(p.y / 2);
 for (i=0; i < 3; i = i + 1)
 analog.circle(center, radius-i, Bitmap.F_STORE);
 sin30 = (Float.fromInteger(30) * Float.radiansPerDegree)
 .sin;
 sin60 = (Float.fromInteger(60) * Float.radiansPerDegree)
 .sin;
 radius = radius - charHeight*2;
 n30 = (sin30 * Float.fromInteger(radius)).asInteger;
 n60 = (sin60 * Float.fromInteger(radius)).asInteger;
 q = (n30@n60) + center;
 thisSelf.centerString(analog, "V", font, q);
 q = (n60@n30) + center;
 thisSelf.centerString(analog, "IV", font, q);
 q = (radius@0) + center;
 thisSelf.centerString(analog, "III", font, q);
 q = (n60@(-n30)) + center;
 thisSelf.centerString(analog, "II", font, q);
 q = (n30@(-n60)) + center;
 thisSelf.centerString(analog, "I", font, q);
 q = (0@(-radius)) + center;
 thisSelf.centerString(analog, "XII", font, q);
 q = ((-n30)@(-n60)) + center;
 thisSelf.centerString(analog, "XI", font, q);
 q = ((-n60)@(-n30)) + center;
 thisSelf.centerString(analog, "X", font, q);
 q = ((-radius)@0) + center;
 thisSelf.centerString(analog, "IX", font, q);
 q = ((-n60)@n30) + center;
 thisSelf.centerString(analog, "VIII", font, q);
 q = ((-n30)@n60) + center;
 thisSelf.centerString(analog, "VII", font, q);
 q = (0@radius) + center;
 thisSelf.centerString(analog, "VI", font, q);

 j = radius / 20;
 for (i=1;i<j;i=i+1)
 analog.circle(center, i, Bitmap.F_STORE);
 hourHand = radius / 2;
 minuteHand = radius * 3 / 4;
 /* create clock process */
 process = Process.new(thisSelf);
 process.clockLoop;

}
method centerString (aView, aString, aFont, aPoint)
{
 Integer x,y;
 Point p;
 x = aFont.xOfString(aString);
 y = aFont.yOfString(aString);
 p = aPoint - ((x/2)@(y/2));
 aView.string(aFont, aString, p, Bitmap.F_STORE);
}
method time (hours, minutes)
{
 Integer hq, mq;
 Integer length, x, y;
 Float hrads, mrads;
 String digital;

 hours = hours % 12;
 if (hours == 0)
 digital = "12";
 else if (hours < 10)
 digital = " %".sprintf(hours);
 else
 digital = "%".sprintf(hours);
 if (minutes < 10)
 digital = digital // ":0%".sprintf(minutes);
 else
 digital = digital // ":%".sprintf(minutes);
 Icon_OL.newString(digital, icon);
 thisSelf.layer.newIcon(icon);
 if (hours < 3)
 hq = 1;
 else if (hours < 6)
 hq = 2;
 else if (hours < 9)
 hq = 3;
 else
 hq = 4;
 if (minutes < 15)
 mq = 1;
 else if (minutes < 30)
 mq = 2;
 else if (minutes < 45)
 mq = 3;
 else
 mq = 4;
 hours = (hours % 3) * 60 + minutes;
 if (hq == 2 || hq == 4) hours = 179 - hours;
 minutes = minutes % 15;
 if (mq == 2 || mq == 4) minutes = 14 - minutes;
 hrads = Float.fromInteger(hours/2) * Float.radiansPerDegree;
 mrads = Float.fromInteger(minutes*6) * Float.radiansPerDegree;
 length = Float.fromInteger(hourHand);
 x = (hrads.sin * length).asInteger;
 y = (hrads.cos * length).asInteger;
 if (hq == 1)
 y = -y;
 else if (hq == 3)
 x = -x;
 else if (hq == 4) {
 x = -x;
 y = -y;
 }
 analog.batchOn;
 if (hourPoint != nil)
 analog.vector(center, hourPoint, Bitmap.F_XOR);
 hourPoint = center+(x@y);
 analog.vector(center, hourPoint, Bitmap.F_XOR);
 length = Float.fromInteger(minuteHand);
 x = (mrads.sin * length).asInteger;
 y = (mrads.cos * length).asInteger;
 if (mq == 1)
 y = -y;
 else if (mq == 3)
 x = -x;
 else if (mq == 4) {
 x = -x;
 y = -y;
 }
 if (minutePoint != nil)
 analog.vector(center, minutePoint, Bitmap.F_XOR);
 minutePoint = center+(x@y);
 analog.vector(center, minutePoint, Bitmap.F_XOR);
 analog.batchOff;
}
method clockLoop
{
 Date time, lastTime;
 lastTime = Date.now - 3601*24;
 for (;;) {
 if (!view.layer.isAlive) {
 /* Our layer has been deleted */
 Process.running.terminate;
 }
 time = Date.now;
 if (time.minute != lastTime.minute) {
 thisSelf.time(time.hour, time.minute);
 if (time.dayOfMonth != lastTime.dayOfMonth) {
 analog.rectf(dayRectangle, Bitmap.F_CLR);
 thisSelf.centerString(analog,
 time.dayOfWeekString[0,3] << ", " <<
 time.monthString <<
 " " << time.dayOfMonth.asString,
 iconFont, dayRectangle.center);
 iconFont, dayRectangle.center);
 }
 }
 lastTime = time;
 Timer.sleep(60 - Date.now.second);
 }
}
/* end of class Clock */
}









October, 1993
The Parasol Programming Language


An object-oriented language that supports network and parallel computing




Robert Jervis


Bob is an independent consultant and can be reached at Wizard Consulting
Services Inc., 17645 Via Sereno, Monte Sereno, CA 95030 or
bjervis!rbj@uunet.uu.net.


A programming language is a manifesto from its creator declaring what's good
and bad in programming. The good becomes a feature, the bad an error.
I created Parasol (Parallel Systems Object Language) to implement an operating
system. In 1982 I tried to build an extensible version of UNIX, using a spare
PDP-11 at work after hours. I started coding the system in C and got a
multitasking kernel running, but extensibility proved to be a problem. I set
aside the project until I owned my own computer and had the time to go
further. By the time I had resumed the project in 1989, I had become convinced
C wasn't up to the task, so I designed Parasol.
Although C and Smalltalk are its primary sources, Parasol's design was
influenced by many languages, including C++, CLU, Algol, and Turbo Pascal.
Parasol had to be as efficient as C, while incorporating some aspect of the
object-oriented capabilities of Smalltalk. When I designed Parasol I was
working on a C++ compiler project, so I knew that C++ implemented classes in a
way that avoided the performance issues of Smalltalk.
I made two decisions at the outset which determined the general outline of
Parasol: While using C as a starting point for ideas, Parasol did not have to
accept ANSI C code; and secondly, instances of classes did not have to be
"first class" objects.
This latter point is important. In making a programming language, it would be
nice if all objects could be treated as uniformly as possible. In APL, arrays
are considered first class because almost any operator that can be applied to
a scalar value can be applied to an array. In Smalltalk, all objects are first
class because they have a type derived from a single common ancestor and,
other than some necessary magic glue in some of the low-level classes to do
arithmetic and control flow efficiently, all Smalltalk classes are written in
Smalltalk itself.
Smalltalk lets you add classes that are just as capable as built-ins because
the language syntax is very simple. By contrast, C has a complex type
declaration and expression syntax and a rich set of scalar numeric types and
operators. So to make user-defined classes first class, a language such as C++
must add many features like references, operator overloading, constructors,
and destructors. C++ is made more complicated by the need for all that syntax
to define new types.
I avoided this with Parasol. All structures in Parasol are considered classes
and can have method functions defined for them. Since they're structures, they
can't be used with arithmetic operators. Consequently, a minimum of new
concepts is needed in Parasol beyond what is already found in C.
In the last three years I've added distributed and parallel programming
constructs to Parasol, including interprocess messages and multithreaded
applications.
The Parasol language itself (including the name) is in the public domain. The
current implementation is for a 32-bit stand-alone operating system that uses
DOS disk files. It is available as unsupported shareware and includes the
operating system and all source code for the compiler and libraries. This
implementation is still a research project, so no promises about bugs. It does
run on most desktop 80386/486 DOS systems, but it doesn't recognize DOS 6
compressed disk partitions. A SPARC/UNIX implementation is in the works.
Parasol 0.5 is available electronically from DDJ (see "Availability," page 3).
Alternatively, registered versions can be purchased directly from me.


Declaration Syntax


Parasol declaration syntax is closer to Pascal than to C. Example 1(a) is a
simplified version of the general form of a Parasol declaration. Functions are
declared in almost the same way; see Example 1(b). A new type name can be
introduced with the code in Example 1(c). For objects, you can declare more
than one name in a single declaration by simply using a list of identifiers
separated by commas; see Example 1(d).
Like many Algol-like languages, Parasol is block structured (although more in
the spirit of C than Pascal, since functions can't be nested). All symbols
must be unique within their own scope of definition, but symbol names can have
distinct declarations in any number of different scopes. A reference in an
expression always refers to the symbol definition in the "closest" enclosing
scope. If you refer to a symbol defined in the same scope, it doesn't matter
whether the definition occurs before or after the reference. Thus the pairs of
statements in Examples 1(e) and 1(f) are equivalent in the body of a function.
Local declarations in a function body don't have to occur at the top of each
block; as in C++, they can occur anywhere in the block.
Exposure determines how accessible a symbol is outside its own scope. A
Parasol exposure can be public, private, or visible. For example, an object
declared at file scope can be global (accessible in other modules) by using
public or visible exposure, or local to its own file by using private
exposure. In most circumstances, the exposure of an object will default to
private if you don't specify otherwise. This encourages encapsulation, since
you must explicitly decide which symbols are public to the outside world.
Public objects can be read or written anywhere, but visible objects can only
be modified from within their own scope. Private objects can neither be read
nor written outside their own scope. A public integer might be declared as in
Example 2(a).
Parasol's numeric types include two integral and one floating-point type:
signed, unsigned, and float. These are the only truly built-in Parasol numeric
types. You can specify a size in bits for each of these types. A variety of
type synonyms are predefined for commonly used sizes. The actual amount of
memory an object consumes is left up to the compiler, however. For example,
Example 2(b) shows some of the predefined type names and their sizes for the
Intel 32-bit implementation.
Note that if you omit a size, the compiler picks a default. For example, plain
signed, unsigned, and float objects all happen to be 32 bits wide on the Intel
implementation.
You should use the type synonyms, especially for the floating-point types,
because the exact sizes will vary from one machine to another. The actual
compiled sizes are chosen to be at least as large as declared, while still
being an efficient fit for the performance of the machine. Thus, int
would be typically either 16, 32, or 64 bits wide. The long type must be the
widest integer size available on that machine and is typically either 32 or 64
bits wide.
You can declare integral bit fields with exact sizes by defining them within a
packed structure. You should only resort to exact bit sizes in declarations
for externally specified data formats, like system control blocks.
The ref keyword means "pointer-to." Parentheses enclose the arguments to a
function. Unlike C, empty parentheses mean that no arguments are allowed. For
example, an integer absolute-value function might be declared as in Example
2(c). Square brackets, on the other hand, declare an array. A buffer of 512
bytes might be declared as in Example 2(d).
More complex data types are constructed by stringing declarators together in
left-to-right order. For example, Example 2(e) declares x as a pointer to a
function returning a pointer to an array of ten singles.


Classes


You declare a class by enclosing a list of declarations inside curly braces,
with some optional modifiers in front of the curly braces. This enclosed list,
as in Example 3(a), creates a new structure or class type.
Example 3(b) declares a structure named point (for a 2-D graphics package). In
this example, the symbols x and y are members of the class. The whole
declaration gives this new type a name: point. Objects of type point can now
be declared and manipulated.
Class modifiers are union, packed, or inherit. The union keyword declares a
C-like union, where all object members overlap one another. The packed keyword
signals that bit-sized members should be packed into words as densely as
possible. The inherit keyword (followed by the name of a class type) declares
that this is a derived class.
You can declare anything inside a class, including other classes. There are
some differences between declarations made inside a class and outside. Objects
declared inside a class are not static by default, but instead are fields of a
structure. Functions can be declared inside a class as well, where they are
called "methods." Methods are not called in quite the same way as normal
functions.
You call a method by designating an object of its class in the call
expression. The syntax requires that you name the object to the left of the
method name; see Example 3(c). Listing One (page 110) shows an example of an
object, O1, with one public method, func. The main routine contains a call to
that method.


Inheritance


Parasol supports only single inheritance so that when you declare a derived
class, you can name only one base class. The memory for an object is laid out
fairly simply, with the memory for the members of the base class first,
followed by the memory for the derived class members; see Figure 1. Space is
allocated in a derived class for newly defined members, even if they have the
same name as a member of the base class.
Parasol is like C++ in that you can refer to exposed (public or visible)
members of a class using the dot or arrow operators (depending on whether you
have an object or a pointer to the object). You can also refer to members from
within method function code.
Just as local block scopes nest within the body of a function, in Parasol the
body of a class forms a scope, which all of the enclosed methods share.
Listing Two (page 110) illustrates a method, hypot, which computes the
hypotenuse of a right-triangle for the coordinates of the point object. This
method refers to the two members, x and y, of the enclosing class. Remember
that members are just fields in a structure, so these references must be to
some object. Since you must mention an object in a call to any method, it is
this object that you are actually referring to. In effect, the address of the
mentioned object is passed as a hidden parameter in a call to a method.

In more complicated situations (such as where you have derived classes), the
chain of base classes forms a set of nested scopes as well. Thus, when matching
names to variables, after the compiler has exhausted all the local block
scopes inside a function, it next looks in the list of members of the
enclosing class. If the desired symbol is not found there, before the compiler
moves to examine the scope enclosing the type (usually file scope), it looks
along the chain of base classes. The effect of this is that you can redefine
methods in subclasses.
You can use two keywords to access the hidden object parameter. The self
keyword is a pointer to the object passed to the method. Its type is a pointer
to the enclosing class. In a derived class, you can also use the keyword super
to refer to the same object. This keyword has the same value as self but is a
pointer to the base class of the object.
The super keyword comes in handy if you want to call methods in the base class
that have been redeclared in the subclass.
Inheritance provides an excellent way to help organize and document
interfaces. By exploiting the redefinition of methods in derived classes, you
can design a much more structured and well-organized program than with C.


Polymorphism


The Parasol windowing library defines a set of common capabilities that all
windows share. Thus all windows have a redraw function that gets called when a
window is moved or resized. In C, you would have to implement such a
capability in a couple of different ways.
One way is to write a master redraw function built around a huge switch
statement. For every type of window in the program, you would have a case that
controls the redraw of that window. This means that every time you add a new
window type to your program, you need to modify this function. Since the
window library has several functions besides redraw, there are several
different switch statements in different places, each requiring updating
whenever a new window type is added.
A better way, in C, is to define a set of function pointers that you store
with each window. At run time, when you actually create a window, the
appropriate function pointers are copied to the object. That way, when you
need to redraw a window, you simply call through the pointer. This has the
advantage of allowing you to cleanly add window types without having to change
existing code.
Parasol (like C++) provides convenient syntax to make the function pointer
solution easy to implement. In Parasol, you simply declare a method with the
keyword dynamic. Then the compiler arranges for an object of the type
containing the method to use a run time pointer in all calls of the method. In
Listing Three (page 110), a window class is defined, containing a redraw
function that accepts no arguments and produces no return value. Then, an
editor class is created that defines a new version of redraw that does the
specific redrawing operations for an editor.
There are some restrictions on dynamic functions in Parasol. First of all, for
statically-bound methods, there are no restrictions on how you redefine the
arguments or the return value of the function. They can be arbitrarily changed
in a subclass. A dynamic function, on the other hand, must be redeclared with
the same arguments and return type as it was originally defined in the base
class.
If you look again at Listing Three, the function at the bottom is passed a
pointer to a window object, and it calls the redraw function. If the argument
passed actually points to an editor and not a window, the call will
automatically go to the version of the redraw method for an editor at run
time. Because the caller doesn't know the actual object it is calling, at
compile time, the arguments and return type must be fixed for all versions of
the redraw function.
Note also that the body of the editor's redraw function uses super seemingly
to call itself. In fact, because this call uses super, the code generated is a
call to the redraw of the window class (the base for editor). This is a common
construct in the windowing library.
C++ provides essentially the same capabilities as dynamic functions, but just
calls them "virtual" functions instead. Parasol does allow one element of
flexibility that I haven't found in C++ compilers: If you assign a pointer to
a subclass object to a pointer to the base class, C++ rejects this assignment.
Parasol accepts it. As long as you are copying from a more specific subclass
to the more general base, this copy is considered legal in Parasol (though not
in the other direction, of course). In Listing Three, you can assign the
address of an editor object to a pointer to window, but not the address of a
window object to a pointer to editor.


Units


One of the real shortcomings of C++ is that class definitions are written in
header files and included in each compilation unit. Consequently, common
information describing a class is replicated in all these separate modules.
Methods have to be written outside the class so that they can be placed
outside the headers. A great deal of effort has been put into C++ compilers to
overcome the inefficiencies and complexities that arise from these
constraints.
Parasol overcomes these limitations by changing the program structure. In
Parasol, you simply write source units (analogous to modules in Modula-2 or
units in Turbo Pascal). By declaring an object public in a unit, that object's
definition is available to other compilations. To gain access to a unit's
public symbols (types, objects, and functions), another unit must explicitly
include the unit; see Example 4(a).
One advantage of this scheme is that public symbols can be duplicated in
different units of the same program. This means that libraries obtained from
different sources won't have public symbols that clash. If two units sharing a
common public symbol are both included into a third source, you can still
disambiguate references using the :: operator; see Example 4(b).


Messages and Threads


Parasol allows you to define special objects that can exchange messages with
other processes. An object receiving messages must be defined as a subclass of
the built-in class called "external." The subclass then defines a set of
methods, each marked with the gate keyword. A client can send messages to the
object by first obtaining a special far pointer to it from the operating
system or messaging library. You actually send a message by simply calling a
gate method using the far pointer. The Parasol compiler generates code to send
the arguments (as the body of the message) and wait for a reply, which then
becomes the return value.
These capabilities make for a natural scheme for defining client/server
applications. The server is simply an object subclassed from external, and the
client is any Parasol program.
External objects are designed to operate as separate processes. The Parasol
library includes facilities to start and control these threads. The
object-oriented capabilities of Parasol have proven useful even for
multithreaded applications. Since Parasol's libraries are designed to use
objects, there are few if any static variables (which tend to cause trouble
for multithreaded applications).


Conclusion


I began designing Parasol with the idea of fixing some syntax problems in C
and adding a minimum of new features. The changed syntax certainly presents a
barrier for people with large bodies of C code, but the binary import/export
mechanism of units and the object-oriented extensions are real enhancements to
C. Parasol is simpler than C++, so I spend my time coding solutions, not
exploring exotic features.
Parasol has a number of features I haven't even mentioned here, including
exceptions, but altogether it is still a fairly compact language. The compiler
I've written is fast (compiling over 60,000 lines per minute on my 66MHz 486),
and the code is as good as unoptimized C. Parasol should optimize at least as
easily as C, I just haven't had time to write an optimizer. I'm now working on
a Parasol-to-C translator to make the language more readily available to
people.
Parasol began as yet another object-oriented language, and while it has
advantages, who needs another one of those? Now that Parasol has messages and
threads, it is more than just another OO language. Network and parallel
computing need new languages that give the programmer some help. I think
Parasol does just that.
Example 1: (a) A typical Parasol declaration; (b) declaring a function; (c)
introducing a new type name; (d) declaring more than one name in a single
declaration; (e) and (f) these statements are equivalent in the body of a
function

(a)

name: exposure type-declaration = initializer;


(b)

name: exposure type-declaration = { statements; }


(c)

name: exposure type type-declaration;


(d)


i, j, k: int;


(e)

i: int;
i = 7;


(f)

i = 7;
i: int;



Example 2: (a) declaring a public integer; (b) predefined type names and
their sizes; (c) declaring an integer absolute-value function; (d) declaring a
buffer of 512 bytes; (e) declaring a pointer to a function returning a
pointer to an array.

(a)

i: public int;


(b)

byte: public type unsigned[8];
short: public type signed[16];
int: public type signed[32];
long: public type signed[64];

single: public type float[32];
double: public type float[64];
extended: public type float[80];


(c)

abs: (x: int) int = {
 return x >= 0 ? x : -x;
 }


(d)

buffer: [512] byte;


(e)


x: ref () ref [10] single;



Example 3: (a) Creating a new structure or class type; (b) declaring a
structure named point; (c) Parasol syntax requires that you name the object to
the left of the method name

(a)


class-modifiers {
declarations;
}


(b)

point: type { public:
 x, y: short;
 };


(c)

object method ( arguments );


Example 4:

(a)
Unit a:
 xyz: public int;
Unit b:
 include a;
 ... xyz ...


(b)

Unit aa:
 xyz: public int;
Unit bb:
 xyz: public double;
Unit cc:
 include aa, bb;
 ... aa::xyz ...
 ... bb::xyz ...

 Figure 1: Memory layout for a Parasol object.
_THE PARASOL PROGRAMMING LANGUAGE_
by Bob Jervis
[LISTING ONE]
// O1 is an object with class type.
// The type of O1 is anonymous.
O1: {
 hidden: int;
 public:
 record: (i: int) =
 {
 hidden = i;
 }
 func: (i: int) int =
 {
 return i * 3 + hidden;
 }
 };
main: entry () = {
 x: int;
 O1 record(3);

 x = O1 func(5); // Method call
 printf("Value is %d\n", x); // Prints 'Value is 18'
 }
[LISTING TWO]
include math;
point: type {
 x, y: short;
 hypot: () single =
 {
 f, g: short;
 f = x;
 g = y;
 return sqrt(f * f + g * g);
 }
 };
[LISTING THREE]
window: type { public:
 redraw: dynamic () = { ... }
 };
editor: type inherit window { public:
 redraw: dynamic () =
 {
 super redraw();
 ...
 }
 };
func: (p: ref window) =
 {
 p redraw();
 }

October, 1993
The Sather Programming Language


Efficient, interactive, and object oriented




Stephen M. Omohundro


Stephen does research on learning and computer vision, as well as developing
Sather at the International Computer Science Institute, 1947 Center Street,
Berkeley, CA 94704. He wrote the three-dimensional graphics for Mathematica
and was a co-designer of Star-Lisp for the Connection Machine. He can be
contacted at om@icsi.berkeley.edu.


Sather is an object-oriented language which aims to be simple, efficient,
interactive, safe, and nonproprietary. One way of placing it in the "space of
languages" is to say that it aims to be as efficient as C, C++, or Fortran; as
elegant and safe as Eiffel or CLU; and to support interactive programming and
higher-order functions as well as Common Lisp, Scheme, or Smalltalk.
Sather has parameterized classes, object-oriented dispatch, statically checked
strong typing, separate implementation and type inheritance, multiple
inheritance, garbage collection, iteration abstraction, higher-order routines
and iters, exception handling, constructors for arbitrary data structures,
and assertions, preconditions, postconditions, and class invariants. This article
describes a few of these features. The development environment integrates an
interpreter, a debugger, and a compiler. Sather programs can be compiled into
portable C code and can efficiently link with C object files. Sather has a
very unrestrictive license which allows its use in proprietary projects but
encourages contribution to the public library.
The original 0.2 version of the Sather compiler and tools was made available
in June 1991. This article describes Version 1.0. By the time you're reading
this, the combined 1.0 compiler/interpreter/debugger should be available on
ftp.icsi.berkeley.edu and the newsgroup comp.lang.sather should be activated
for discussion.


Code Reuse


The primary benefit object-oriented languages promise is code reuse. Sather
programs consist of collections of modules called "classes" which encapsulate
well-defined abstractions. If the abstractions are chosen carefully, they can
be used over and over in a variety of different situations.
An obvious benefit of reuse is that less new code needs to be written. As
important is the fact that reusable code is usually better written, more
reliable and easier to debug because programmers are willing to put more care
and thought into writing and debugging code which will be used in many
projects. In a good object-oriented environment, programming should feel like
plugging together prefabricated components. Most bugs occur in the 10 percent
or so of newly written code, not in the 90 percent of well-tested library
classes. This usually leads to simpler debugging and greater reliability.
Why don't traditional subroutine libraries give the same benefits? Subroutine
libraries make it easy for newly written code to make calls on existing code
but don't make it easy for existing code to make calls on new code. Consider a
visualization package that displays data on a certain kind of display by
calling display-interface routines. Later, the decision is made that the
package should work with a new kind of display. In traditional languages,
there's no simple way to get the previously written visualization routines to
make calls on the new display interface. This problem is especially severe if
the choice of display interface must be made at run time.
Sather provides two primary ways for existing code to call newly written code.
"Parameterized classes" allow the binding to be made at compile time and
"object-oriented dispatch" allows the choice to be made at run time. I'll
demonstrate these two mechanisms using simple classes for stacks and polygons.


Parameterized Classes


Listing One (page 112) shows a class which implements a stack abstraction. We
want stacks of characters, strings, polygons, and so on, but we don't want to
write new versions for each type of element. STACK{T} is a "parameterized
class" in which the parameter T specifies the stack element type. When the
class is used, the type parameter is specified.
For example, the class FOO in Listing One defines a routine which uses both a
stack of characters and a stack of strings. The type specifier STACK{CHAR}
causes the compiler to generate code with the type parameter T replaced by
CHAR. The specifier STACK{STR} similarly causes code to be generated based on
STR. Since character objects are usually eight bits and strings are
represented by pointers, the two kinds of stack will have different layouts in
memory. The same Sather source code is reused to generate different object
code for the two types. We may define a new type (such as triple-length
integers) and immediately use stacks of elements of that type without
modifying the STACK class. Using parameterized classes adds no extra run time
cost, but the choice of type parameter values must be made at compile time.


Object-oriented Dispatch


Listing Two (page 112) shows an example of object-oriented dispatch. The class
$POLYGON is an "abstract" class which means it represents a set of possible
object types called its "descendants" (in this case TRIANGLE and SQUARE).
Abstract classes define abstract interfaces which must be implemented by all
their descendants. Listing Two only shows the single routine
number_of_vertices:INT which returns the number of vertices of a polygon.
TRIANGLE's implementation returns the value 3, and SQUARE's returns 4.
Routines in the interface of an abstract type may be called on variables
declared by that type. The actual code that's called, however, is determined
at run time by the type of the object which is held by the variable. The class
FOO2 defines a routine with a local variable of type STACK{$POLYGON}. Both
TRIANGLE and SQUARE objects can be pushed onto stacks of this type. The call
s.pop might return either a triangle or a square. The call
s.pop.number_of_vertices calls either the number_of_vertices routine defined
by TRIANGLE and returns 3 or the number_of_vertices routine defined by SQUARE
and returns 4. The choice is made according to the run-time type of the popped
object. The names of abstract types begin with a $ (dollar sign) to help
distinguish them (calls on abstract types are slightly more expensive than
non-dispatched calls).


Strong Typing


The Sather type system is a major factor in the computational efficiency,
clarity, and safety of Sather programs. It also has a big effect on the "feel"
of Sather programming. Many object-oriented languages have either weak typing
or none at all. Sather, however, is "strongly typed," meaning that every
Sather object and variable has a specified type and that there are precise
rules defining the types of object that each variable can hold. Sather is able
to statically check programs for type correctness--if a piece of Sather code
is accepted by the interpreter or compiler, it's impossible for it to assign
an object of an incorrect type to a variable.
Statically checked, strong typing helps the Sather compiler generate efficient
code because it has more information. Sather avoids many of the run-time
tag-checking operations done by less strongly typed languages.
Statically checked, strongly typed languages help programmers to produce
programs that are more likely to be correct. For example, a common mistake in
C is to confuse the C assignment operator = with the C equality test ==.
Because the C conditional statement if(...) doesn't distinguish between Boolean
and other types, a C compiler is just as happy to accept if(a=b) as if(a==b).
In Sather, the conditional statement will only accept Boolean values,
rendering impossible this kind of mistake.
Languages like Beta are also strongly typed, but not statically checkable.
Consequently, some type checking must be done at run time. While this is
preferable to no type checking at all, it reduces the safety of programs. For
instance, there may be a typing problem in obscure code that isn't exercised
by test routines. Errors not caught by the compiler can make it into final
releases.
Sather distinguishes "abstract types," which represent more than one type of
object, from other types, which do not. This has consequences for both the
conceptual structure and the efficiency of programs. An example which has been
widely discussed is the problem of the add_vertex routine for polygons. This
is a routine which makes sense for generic polygons but does not make sense
for triangles, squares, and so on. In languages which do not separate abstract
types from particular implementations, you must either make all descendants
implement routines that don't make sense for them, or leave out functionality
in parent classes.
The Sather solution to this is based on abstract types. The Sather libraries
include the abstract class $POLYGON, which defines the abstract interface that
all polygons must provide. It also includes the descendant class POLYGON,
which implements generic polygons. The add_vertex routine is defined in
POLYGON but is not defined in $POLYGON. TRIANGLE and SQUARE, therefore, do not
need to define it.
Run-time dispatching is only done for calls on variables declared by abstract
types. The Sather compiler is, itself, a large program written in Sather which
uses a lot of dispatching. The performance consequences of abstract types were
studied by comparing a version of the compiler in which all calls were
dispatched to the standard version (Lim and Stolcke, 1991). The use of
explicit typing causes one-tenth the number of dispatches and an 11.3 percent
reduction in execution time.


Separate Implementation and Type Inheritance



In most object-oriented languages, inheritance defines the subtype relation
and causes the descendant to use an implementation provided by the ancestor.
These are quite different notions; confusing them often causes semantic
problems. For example, one reason why Eiffel's type system is difficult to
check is that it mandates "covariant" conformance for routine argument types
(Meyer, 1992). This means a routine in a descendant must have argument types
which are subtypes of the corresponding argument types in the ancestor.
Because of this choice, the compiler can't ensure argument expressions conform
to the argument type of the called routine at compile time. In Sather,
inheritance from abstract classes defines subtyping while inheritance from
other classes is used solely for implementation inheritance. This allows
Sather to use the statically type-safe contravariant rule for routine argument
conformance.


Multiple Inheritance


In Smalltalk and Objective-C, each class only inherits from a single class. In
Sather, classes can inherit from an arbitrary number of classes, a property
called "multiple inheritance." This is important because it commonly occurs in
modeling physical types. For example, there might be types representing "means
of transportation" and "major expenditures." The type representing
"automobiles" should be a descendant of both of these. In Smalltalk or
Objective-C, which only support single inheritance, you'd be forced to make
all "means of transportation" be "major expenditures" or vice versa.


Garbage Collection


Languages derived from C are usually not "garbage collected," making you
responsible for explicitly creating and destroying objects. Unfortunately,
these memory-management issues often cut across natural abstraction
boundaries. The objects in a class usually don't know when they are no longer
referenced and the classes which use those objects shouldn't have to deal with
low-level memory-allocation issues.
Memory management done by the programmer is the source of two common bugs. If
an object is freed while still being referenced, a later access may find the
memory in an inconsistent state. These so-called "dangling pointers" are
difficult to track down because they often cause errors in code far removed
from the offending statement.
"Memory leaks," caused when an object is not freed even though there are no
references to it, are also hard to find. Programs with this bug use more and
more memory until they crash. Sather uses a "garbage collector" which tracks
down unused objects and reclaims the space automatically. To further enhance
performance, the Sather libraries generate far less garbage than is typical in
languages like Smalltalk or Lisp.


Interactive, Interpreted Programming


Sather combines the flexibility of an interactive, interpreted environment
with very high-efficiency compiled code. During development, the well-tested
library classes are typically run compiled, while the new experimental code is
run interpreted. The interpreter also allows immediate access to all the
built-in algorithms and data structures for experimentation. Listing Three
(page 112) is an example of an interactive Sather session.


Iteration Abstraction


Most code is involved with some form of iteration. In loop constructs of
traditional languages, iteration variables must be explicitly initialized,
incremented, and tested. This code is notoriously tricky and is subject to
"fencepost errors." Traditional iteration constructs require the internal
implementation details of data structures like hash tables to be exposed when
iterating over their elements.
Sather allows you to cleanly encapsulate iteration using constructs called
"iters" (Murer, Omohundro, and Szyperski, 1993) that are like routines, except
their names end in an exclamation point (!), their bodies may contain yield
and quit statements, and they may only be called within loops. The Sather loop
construct is simply loop ... end. When an iter yields, it returns control to the
loop. When it is called in the next iteration of the loop, execution begins at
the statement following the yield. When an iter quits, it terminates the loop
in which it appears. All classes define the iters until!(BOOL), while!(BOOL),
and break! to implement more traditional looping constructs. The integer class
defines a variety of useful iters including upto!(INT):INT, downto!(INT):INT,
and step!(num,step:INT):INT. Listing Four (page 112) shows how upto! is used
to output digits from 1 to 9.
Container classes, such as arrays or hash tables, define an iter elts!:T to
yield the contained elements and an iter called set_elts!(T) to insert new
elements. Listing Four shows how to set the elements of an array to successive
integers and then how to double them. Notice that this loop doesn't have to
explicitly test indices against the size of the array.
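Sather iters are close cousins of generator functions in later languages: a yield suspends the iter and a subsequent call resumes it after the yield. As an analogy only (Python generators, not Sather iters), here is a sketch of upto! and of setting and doubling array elements without explicit index tests:

```python
def upto(first, last):
    """Yield successive integers, in the spirit of Sather's upto! iter."""
    i = first
    while i <= last:
        yield i        # suspend; resume here on the next call
        i += 1

# Analogue of Listing Four: emit the digits 1 through 9.
digits = "".join(str(i) for i in upto(1, 9))
print(digits)          # 123456789

# Set elements to successive integers, then double them, with no
# explicit comparison of an index against the array size.
a = [0] * 10
for k, v in zip(range(len(a)), upto(1, 10)):
    a[k] = v
a = [2 * x for x in a]
print(a)               # [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
```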
The tree classes have iters to yield their elements according to the "pre,"
"post," and "in" orderings. The graph classes have iters to yield the vertices
according to depth-first and breadth-first search orderings.
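The payoff of iteration abstraction is clearest for structured containers: the traversal logic lives with the class, not at each call site. A minimal sketch of an "in" ordering iter, again as a Python-generator analogy rather than Sather code:

```python
from dataclasses import dataclass
from typing import Iterator, Optional

@dataclass
class Node:
    val: int
    left: "Optional[Node]" = None
    right: "Optional[Node]" = None

def inorder(n: Optional[Node]) -> Iterator[int]:
    """Yield elements in the 'in' ordering; callers never see node internals."""
    if n is not None:
        yield from inorder(n.left)
        yield n.val
        yield from inorder(n.right)

tree = Node(2, Node(1), Node(3))
print(list(inorder(tree)))   # [1, 2, 3]
```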


The Implementation


The first version of the Sather compiler was itself written in Sather by
Chu-Cheow Lim and has been operational for several years. It compiles into C
code and has been ported to a wide variety of machines. It is a fairly large
program with about 30,000 lines of code in 183 classes (this compiles into
about 70,000 lines of C code).
Lim and Stolcke extensively studied the performance of the compiler on both
MIPS and Sparc architectures. Because the compiler uses C as an intermediate
language, the quality of the executable code depends on the match of the C
code templates used by the Sather compiler to the optimizations employed by
the C compiler. Compiled Sather code runs within 10 percent of the performance
of handwritten C code on the MIPS machine and is essentially as fast as
handwritten C code on the Sparc architectures. On a series of benchmark tests
(towers of Hanoi, 8 queens, and the like) Sather performed slightly better
than C++ and several times better than Eiffel. The new compiler performs
extensive automatic inlining and so provides more opportunities for
optimization than typical handwritten C code.


The Libraries


The Sather libraries currently contain several hundred classes and new ones
are continually being written. Eventually, we hope to have efficient,
well-written classes in every area of computer science. The libraries are
covered by an unrestrictive license which encourages the sharing of software
and crediting authors, without prohibiting use in proprietary and commercial
projects. Currently there are classes for basic data structures, numerical
algorithms, geometric algorithms, graphics, grammar manipulation, image
processing, statistics, user interfaces, and connectionist simulations.


pSather


Sather is also being extended to support parallel programming. An initial
version of the language "pSather" (Murer, Feldman, and Lim, 1993) runs on the
Sequent Symmetry and the Thinking Machines CM-5. pSather adds constructs for
programming on a distributed-memory, shared-address machine model. It includes
support for control parallelism (thread creation, synchronization), an SPMD
form of data parallelism, and mechanisms to manipulate execution control and
data in a nonuniform-access machine. The issues which make object-oriented
programming important in a serial setting are even more important in parallel
programming. Efficient parallel algorithms are often quite complex and should
be encapsulated in well-written library classes. Different parallel
architectures often require the use of different algorithms for optimal
efficiency. The object-oriented approach allows the optimal version of an
algorithm to be selected according to the machine it is actually running on.
It is often the case that parallel code development is done on simulators
running on serial machines. A powerful object-oriented approach is to write
both simulator and machine versions of the fundamental classes in such a way
that a user's code remains unchanged when moving between them.


Conclusion


I've described some of the fundamental design issues underlying Sather 1.0.
The language is quite young, but we are excited by its prospects. The user
community is growing, and new class development has become an international,
cooperative effort. We invite you to join in its development!


Acknowledgments



Sather has adopted ideas from a number of other languages. Its primary debt is
to Eiffel, designed by Bertrand Meyer, but it has also been influenced by C,
C++, CLOS, CLU, Common Lisp, Dylan, ML, Modula-3, Oberon, Objective C, Pascal,
SAIL, Self, and Smalltalk. Many people have contributed to the development and
design of Sather. The contributions of Jeff Bilmes, Ari Huttunen, Jerry
Feldman, Chu-Cheow Lim, Stephan Murer, Heinz Schmidt, David Stoutamire, and
Clemens Szyperski were particularly relevant to the issues discussed in this
article.


References


ICSI Technical reports are available via anonymous ftp from
ftp.icsi.berkeley.edu.
Lim, Chu-Cheow and Andreas Stolcke. "Sather Language Design and Performance
Evaluation." Technical Report TR-91-034. International Computer Science
Institute, Berkeley, CA, May 1991.
Meyer, Bertrand. Eiffel: The Language. Prentice Hall, New York, NY, 1992.
Murer, Stephan, Stephen Omohundro, and Clemens Szyperski. "Sather Iters:
Object-oriented Iteration Abstraction." ACM Letters on Programming Languages
and Systems (submitted), 1993.
Murer, Stephan, Jerome Feldman, and Chu-Cheow Lim. "pSather: Layered
Extensions to an Object-Oriented Language for Efficient Parallel Computation."
Technical Report TR-93-028. International Computer Science Institute,
Berkeley, CA, June 1993.
Omohundro, Stephen. "Sather Provides Non-proprietary Access to Object-oriented
Programming." Computers in Physics. 6(5):444-449, 1992.
Omohundro, Stephen and Chu-Cheow Lim. "The Sather Language and Libraries."
Technical Report TR-92-017. International Computer Science Institute,
Berkeley, CA, 1992.
Schmidt, Heinz and Stephen Omohundro. "Clos, Eiffel, and Sather: A
Comparison," in Object Oriented Programming: The CLOS Perspective, edited by
Andreas Paepcke. MIT Press, Cambridge, MA, 1993.
_THE SATHER PROGRAMMING LANGUAGE_
by Stephen M. Omohundro

[LISTING ONE]

class STACK{T} is
 -- Stacks of elements of type T.
 attr s:ARR{T}; -- An array containing the elements.
 attr size:INT; -- The current insertion location.

 is_empty:BOOL is
 -- True if the stack is empty.
 res := (s=void or size=0) end;

 pop:T is
 -- Return the top element and remove it. Void if empty.
 if is_empty then res:=void
 else size:=size-1; res:=s[size]; s[size]:=void end end;

 push(T) is
 -- Push arg onto the stack.
 if s=void then s:=#ARR{T}(asize:=5)
 elsif size=s.asize then double_size end;

 s[size]:=arg; size:=size+1 end;

 private double_size is
 -- Double the size of `s'.
 ns::=#ARR{T}(asize:=2*s.asize); ns.copy_from(s); s:=ns end;

 clear is
 -- Empty the stack.
 size:=0; s.clear end

end; -- class STACK{T}

class FOO is
 bar is
 s1:STACK{CHAR}; s1.push('a');
 s2:STACK{STR}; s2.push("This is a string.") end;
end;




[LISTING TWO]

abstract class $POLYGON is
 ...
 number_of_vertices:INT;
end;

class TRIANGLE is
 inherit $POLYGON;
 ...
 number_of_vertices:INT is res:=3 end;
end;

class SQUARE is
 inherit $POLYGON;
 ...
 number_of_vertices:INT is res:=4 end;
end;

class FOO2 is
 bar2 is
 s:STACK{$POLYGON};
 ...
 n:=s.pop.number_of_vertices;
 ...
 end;
end;


[LISTING THREE]

>5+7
12

>40.intinf.factorial
815915283247897734345611269596115894272000000000

>#OUT + "Hello world!"
Hello world!

>v::=#VEC(1.0,2.0,3.0); w::=#VEC(1.0,2.0,3.0);
>v+w
#VEC(2.0, 4.0, 6.0)

>v.dot(w)
14.0

>#ARRAY{STR}("grape", "cherry", "apple", "plum", "orange").sort
#ARRAY{STR}("apple","cherry","grape","orange","plum")



[LISTING FOUR]

>loop #OUT+1.upto!(9) end

123456789

>a::=#ARRAY{INT}(asize:=10)

>loop a.set_elts!(1.upto!(10)) end
>a
#ARRAY{INT}(1,2,3,4,5,6,7,8,9,10)

>loop a.set_elts!(2*a.elts!) end
>a
#ARRAY{INT}(2,4,6,8,10,12,14,16,18,20)























































October, 1993
The Liana Programming Language


More than a language, a tool for building Windows apps




Ray Valdes


Ray is senior technical editor of DDJ. He can be reached at
rayval@well.sf.ca.us.


Liana is both an object-oriented programming language and a development system
for creating Windows applications. As a programming language, it strongly
resembles C++, except Liana's syntax is smaller and less restrictive. Like
C++, Liana uses classes and member functions to provide encapsulation,
inheritance, and polymorphism. Unlike C++, Liana uses no pointers, offers
automatic memory cleanup and typeless variables, and lacks multiple
inheritance (which some might consider a feature).
As a development tool, Liana leans in the direction of interactive integrated
systems such as Asymetrix Toolbook, Microsoft Visual Basic, or Digitalk
Smalltalk/V for Windows. Like these systems, Liana provides a great deal of
support for creating event-driven programs for the Windows environment.
However, the current release of Liana is much less visual than these
environments, relying solely on ASCII source files created with a text editor
and compiled from the command line. Still, this compile-and-link cycle is
relatively fast, painless, and can be easily done from a DOS window in the
Windows environment. One benefit of the reliance on text files is that all
program components are easily visible, unlike systems such as Visual Basic or
Toolbook (the initial versions), in which programs consist partly of visible
code fragments and partly of "invisible" data structures created by
point-and-click means.
Liana 1.0 was released by Base Technology (Boulder, Colorado) for the Windows
3.0 environment in 1991. Liana 1.0 also runs on Windows 3.1. At this writing,
Version 2.0 of Liana (in beta) adds numerous small enhancements to the
language as well as providing an integrated-development environment (IDE)
similar to those in many programming languages; the Liana IDE allows you to
compile and link via a window/menu/buttonbar interface rather than from the
command line. Base Technology has also demonstrated a 32-bit version of Liana
for Windows NT. However, unless explicitly stated, this article refers only to
the current release of Liana.
The principal design goal of Liana is to simplify and accelerate the process
of writing Windows applications--especially those that are single-person in
size or small in scale. Like Toolbook, Visual Basic, and Smalltalk/V, Liana
works at a level of abstraction much higher than programming the raw Windows
API with C or C++. The Liana system includes an application framework that
packs in a goodly amount of built-in functionality. Consequently, many of the
package's example programs are "one-liners" that concisely illustrate a
particular subject. For example, the Liana version of the classic "hello
world" program C programmers have used since time immemorial is main { (new
window ("Hello world")).show; }.
While Liana does allow you to use the traditional printf("Hello world")
instead of the intriguing statement above, the one-liner illustrates some of
Liana's interesting aspects. The line of code completely defines the main
function of a Liana program. It's similar to main() in a C/C++ program, except
that, in Liana, functions without arguments don't need to be followed by an
empty argument list. The function body that follows the function name consists
of braces containing a single statement. This statement is a nested
expression; the inner term invokes the new operator to create an instance of
class window. The constructor call has one argument, which is the window
title. The constructor returns an object reference which happens not to be
stored anywhere, although it could have been assigned to a variable. Rather,
the show() member function of this anonymous object is invoked. Because show()
has no arguments, its empty argument list is omitted. (If you understood all
the jargon in this paragraph, then you've been reading too many C++ articles!)
Happily, programming in Liana can be accomplished without much of the tedious
baggage thrown onto the shoulders of C++ programmers. For example, the
one-line program does not require a delete to match the call to new. Likewise,
variables don't have to be explicitly declared, but are instantiated by using
them in a statement. These become global in scope. Variables in Liana can be
declared as one of the built-in types (int, real, string, Boolean, array, or
Windows memory block), user-defined types (objects), or the generic type, any.
You may wonder what happened to main in this Liana one-liner, paint { w <<
"Hello world"; }. If you omit main, Liana invokes a default version of main
which creates (and shows) an instance of class window and by default sets the
global variable w to refer to it. As in C++, the left-shift operator (<<) is
overloaded to provide printf-like output capability to the window object. In
the case of this program, output occurs when the main application window needs
repainting, via an automatic call to paint. The paint function is one of
various callback functions Liana automatically invokes whenever an interesting
event happens in the Windows environment.
Examples of interesting events include mouse click, mouse up, keystroke,
window close, window paint, timer, and so on. These correspond to standard
messages in Windows such as WM_LBUTTONDOWN, WM_MOUSEMOVE, WM_KEYDOWN, WM_CLOSE,
WM_PAINT, WM_TIMER, and the like. The Liana language and its built-in
application framework shield you from most of this "Windows grunge" (to use a
Seattle-area term). In fact, the documentation doesn't mention any of the WM_*
messages (although perhaps it should).
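The mechanism underneath this shielding is simple: the framework maps each incoming Windows message to a callback name and invokes the function if the program defines one. A minimal sketch of that callback-by-name dispatch, written in Python for illustration (the names Window, on, and dispatch are inventions, not Liana API):

```python
# Hypothetical sketch of event dispatch, loosely mirroring how Liana's
# framework routes Windows messages to functions like paint or position.
class Window:
    def __init__(self):
        self.handlers = {}
        self.output = []

    def on(self, event, fn):
        """Register a callback, as defining paint or position does in Liana."""
        self.handlers[event] = fn

    def dispatch(self, event, *args):
        fn = self.handlers.get(event)
        if fn is not None:
            fn(self, *args)   # the framework, not user code, makes the call

w = Window()
w.on("position", lambda w, x, y: w.output.append(f"{x},{y}"))
w.dispatch("position", 10, 20)   # as if the mouse had moved
print(w.output)                  # ['10,20']
```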
The one-line program position(x,y) { w.home; w << x << "," << y << " ";}
consists of a callback function that's automatically invoked whenever the
mouse moves. The function "homes" the cursor to the upper left of the main
window, then displays the current x and y coordinates of the mouse.
Although Liana keeps you from getting mired in grunge--both the Windows and
C++ flavors of it--you still have to deal with the essential concepts of
event-driven programming and object-oriented structures. What's cool is that
you can forget about the non-essential concepts: registering a window class,
creating a window, exporting a WndProc in a .DEF file, writing #defines for an
.RC file, making sure you dealt with the issue of whether SS equals DS, and so
on. Likewise, you don't need to deal with similar C++ grunge, such as matching
every new with a delete. However, at some point in using Liana, you'll likely
have to declare a class and define its member functions--but not at the very
start. If you were using a C++ framework such as OWL or MFC, there'd be many
pages of documentation to absorb before you could begin to write code. Tools
such as AppWizard in Microsoft's Visual C++ package do help, but AppWizard
can only be used once, at the very start. AppWizard is like getting a healthy
push on a bike journey down a steep mountain road--you still have to navigate
the sharp curves of C++ and MFC.
The freedom from mundane details that Liana provides can be exhilarating; but
as you might expect, there's a price to pay for this freedom, mostly having to
do with performance and scalability. In many cases, this will be worth it.
Example 1 (from the Liana package) shows a complete program that waits for a
mouse down, then tracks the mouse and rubberbands a line. This program relies
on overriding two standard callbacks, startdrag and drag--functions that are
invoked upon a mouse-click and mouse-move event, respectively.
Example 2, an enhanced version of the program, brings up a window, rubberbands
a line, stores it in an array structure, and redraws all stored lines as
necessary (when the paint callback is invoked). I can't think of any other
language that offers such economy of expression. Example 3 shows a two-line
program that tracks the mouse and displays its position on a status line (a
horizontal pane at the bottom of the main window to display prompts or other
information to the user). As you can see, it takes longer to describe the
program in English than to write it in Liana.
Liana provides programmatic access to the standard user-interface elements in
the Windows environment, such as menus, listboxes, buttons, dialogs,
comboboxes, and icons. Your program instantiates these UI elements dynamically
via code, as opposed to defining them statically in an RC file. This means
that if you want to change the text of a menu item, window title, or dialog
prompt, you have to edit your program's source code--in contrast to Windows RC
files, which can be changed without recompiling source code. However, I've
always considered the distinction between running CC.EXE over source code
versus running RC.EXE over data files to be pretty much meaningless. The two
tasks are similar enough that, in practice, all such activity falls within the
domain of program maintenance. In any case, Liana does provide a command-line
utility called LXMOD that allows you to modify the values of global string
variables (such as dialog prompts or window titles) without recompiling, for
purposes such as internationalizing (localizing) an application.
One benefit of specifying your application's UI via code rather than data is
that it gains more run-time flexibility. Also, Liana's scheme for specifying
items in a dialog uses a higher-level approach than that used by raw RC files.
The constructs in Liana allow you to place items by specifying relationships
(such as "under" or "east") instead of numerical units that have to be changed
if, say, a label string gets longer or a fontsize is bumped up. Listing One
(page 114) is a program that creates a dialog containing a group box, radio
buttons, checkboxes, and edit controls. The disadvantage of Liana's high-level
approach, of course, is that if you want non-default behavior (such as
pixel-by-pixel positioning), it may be difficult or impossible to work around
the built-in constructs. (This, of course, applies to any high-level class
library or application framework that provides rich default behavior. Having
the source code can help, but that approach is also problematic. In the case
of Liana, the source to the application framework is not available in the
basic package, but is available as an extra-cost option.)
One of Liana's biggest advantages, at least for C++ programmers, is its
similarity to C++ syntax. In the linked-list sample program (see "Comparing
Object Oriented Languages" by Michael Floyd, page 104), I basically started
from the C++ version of the code, replacing cout with w and removing keywords
like virtual and friend. Pointers became object references; for example,
MyListElement *next was changed to MyListElement next. The program compiled
and ran, but didn't produce the output I expected. I was a bit stymied by
this, and wished for a debugger with which to step through the code.
Fortunately, Jack Krupansky (Liana's author) found the problem. In the C++
version, class MyList is a friend of class MyListElement; in Liana, this
reference to member data fails silently. Making the structure member next
publicly visible solves the problem (see Listing Two, page 114). In the
meantime, I implemented a different approach that makes all the list classes a
subclass of Linkable (see Listing Three, page 114). I found I could basically
think in C++ and not worry much about the differences between Liana and C++.
Finally, Jack provided a more concise version that satisfies the linked-list
protocol by using Liana's dynamic typing and built-in classes; see Example 4.
The moral here is that converting an existing body of C++ code may be
problematic, but writing Liana code from scratch is easy and natural,
especially if you take advantage of its unique features. (I'm still wishing
for a source-level debugger though.)
As with any system, Liana is not without problems. Some of its deficiencies
will be addressed by the upcoming Liana 2.0. Others are inherent in its
design--namely the performance hit that occurs with an interpreted language.
Also, the syntax of Liana trades off scalability and maintainability in favor
of conciseness, ease of use, and rapid development. For many Windows
applications, this tradeoff will be well worthwhile.
Example 1:
startdrag (x1, y1, x2, y2)
{
 w.line (old_x1 = x1, old_y1 = y1, old_x2 = x2, old_y2 = y2);
}
drag (x, y)
{
 w.xor = true;
 w.line (old_x1, old_y1, old_x2, old_y2 ); // Erase prev line
 w.line (old_x1, old_y1, old_x2 = x, old_y2 = y); // Draw new line
 w.xor = false;
}
Example 2:
startdrag (x1, y1, x2, y2)
{
 w.line (old_x1 = x1, old_y1 = y1, old_x2 = x2, old_y2 = y2);
}
drag (x, y)
{
 w.xor = true;
 w.line (old_x1, old_y1, old_x2, old_y2); // Erase prev line
 w.line (old_x1, old_y1, old_x2 = x, old_y2 = y); // Draw new line
 w.xor = false;
}
enddrag (x, y)
{
 if (! lines) lines = new array;

 lines << new line (old_x1, old_y1, old_x2, old_y2);
}
paint
{
 for (int i = 0; lines && i < lines.size; i++)
 lines [i].draw (w);
}
Example 3:
main { window(); w.status_line_enabled = true; }
position (x,y) { w.status = x+","+y+" "; }
Example 4:
class MyList : array
{
 Print
 {
 for (int i = 0, int n = size; i < n; i++)
 if ((any e = this [i]).isa ("MyList"))
 e.Print();
 else
 cout << e.class_name+": "+e.text+"\n";
 }
};
//-------------------------------------------
void main (void)
{
 MyList list1 = new MyList;
 MyList list2 = new MyList;
 int n1 = 10;
 int n2 = 20;
 point p1 = new point (2,3);
 point p2 = new point (4,5);
 /* build the lists */
 list1 << n1 << n2 << p1;
 /* an obj can be in more than one lst at same time */
 list2 << n2 << p1 << p2;
 list2 << list1; /* we can even put a list into another list */
 /* print the lists */
 cout << "\nLIST1:\n"; list1.Print;
 cout << "\nLIST2:\n"; list2.Print;
}
_THE LIANA PROGRAMMING LANGUAGE_
by Ray Valdes

[LISTING ONE]

//----- Liana program that uses Windows controls in a dialog. ------
main
{ window ();
 w.menu = new menu
 << new menuitem ("&Language Info...");
 cr = "\n";
 d = new language_info_dialog;
 d.lang_name = "Liana";
 d.language_type = "object-oriented";
 d.provides_app_framework = true;
 d.has_ide = false;
}
language_info { if (d.show) w.refresh; }
paint

{ w.home;
 if (d)
 w << " Language name: " + d.lang_name + cr
 + " Language type: " + d.language_type + cr
 + " Provides app framework: " + d.provides_app_framework + cr
 + " Has IDE: " + d.has_ide;
}
class language_info_dialog: dialog
{
 public:
 string lang_name,
 language_type;
 bool provides_app_framework,
 has_ide;
 language_info_dialog
 {
 dialog ();
 this << new labeltext ("Language name:")
 << new edittext (20, "lang_name");
 this [0].under;
 south ();
 this << new groupbox ("Language Type")
 << new radiobutton ("&Procedural")
 << new radiobutton ("&Functional")
 << new radiobutton ("&Object-oriented")
 << new endgroupbox;

 this [2].after;
 this << new checkbox ("Provides app framework")
 << new checkbox ("Has IDE");
 this [6].under;
 east();
 this << new ok_button
 << new cancel_button;
 }
};


[LISTING TWO]

//**** Linked list program that uses put_to ("<<") operator to append elements.
// by Jack Krupansky, 1993.
class MyListData
{
 Print { cout << "Object of class "+this.class_name; }
};
struct MyListElement
{ MyListData data;
 MyListElement next;

 MyListElement (MyListData initialData)
 {
 data = initialData;
 }
};
class MyList : MyListData
{ MyListElement head;
 MyListElement tail;
public:

 put_to (MyListData data)
 {
 MyListElement newElement = new MyListElement (data);
 if (head) tail.next = newElement; // Append to non-empty list
 else head = newElement; // Start from an empty list
 tail = newElement; // Point to the new end of list
 return this;
 }
 Print
 {
 for (int i = 0, MyListElement e = head; e; e = e.next)
 e.data.Print();
 }
};
class MyNumber : MyListData
{ int value;
 MyNumber (int initialValue) { value = initialValue; }
 Print { cout << " Number: " + value + "\n";}
};
class MyPoint : MyListData
{ int x, y;
 MyPoint (int initialX,int initialY)
 { x = initialX; y = initialY; }
 Print { cout << " Point: " + x + "," + y + "\n"; }
};
void main (void)
{ MyList list1 = new MyList;
 MyList list2 = new MyList;
 MyNumber n1 = new MyNumber (10);
 MyNumber n2 = new MyNumber (20);
 MyPoint p1 = new MyPoint (2,3);
 MyPoint p2 = new MyPoint (4,5);
 /* build the lists */
 list1 << n1 << n2 << p1;
 /* an object can be in more than one list at same time */
 list2 << n2 << p1 << p2;
 list2 << list1; // we can even put a list into another list
 /* print the lists */
 cout << "\nLIST1:\n"; list1.Print;
 cout << "\nLIST2:\n"; list2.Print;
}


[LISTING THREE]

//*** Linked list program that subclasses "Linkable", by Ray Valdes, 1993. ***
class Linkable {
 Linkable next;
 Linkable GetNext { return next; }
 SetNext(Linkable n) { next = n; }
 Print { w << "Should override this method.\n"; }
};
class MyPoint : Linkable {
 int x,y;
 MyPoint(int xx,int yy) { x = xx; y = yy; }
 Print { w << "Point ("+ x + "," + y + ")\n"; }
};
class MyNumber : Linkable {
 int value;
 MyNumber(int v) { value = v; }

 Print { w << "Integer " + value + "\n"; }
};
class MyList : Linkable {
 Linkable head,tail;
 int count;
 AddToList(Linkable item) {
 count++;
 if(! head) { head = tail = item; }
 else { tail.SetNext(item); tail = item; }
 }
 Print {
 for(int i = 0, Linkable x = head; i < count; i++, x = x.GetNext)
 x.Print;
 }
};
main { (w = new window).show; w << "Sample List Program in LIANA\n";
 MyList list1 = new MyList;
 MyList list2 = new MyList;
 MyNumber n1 = new MyNumber (10);
 MyNumber n2 = new MyNumber (20);
 MyPoint p1 = new MyPoint (2,3);

 // build the lists
 list1.AddToList (n1);
 list1.AddToList (n2);

 // an object can be in more than one list at same time
 list2.AddToList (n2);
 list2.AddToList (p1);

 // a list can contain another list as an element
 list2.AddToList (list1);
 // print the lists (should also print the content of any sublists)
 w << "LIST1:\n"; list1.Print;
 w << "LIST2:\n"; list2.Print;
 w << "Done.\n";
}

























October, 1993
The Beta Programming Language


An OO language with Simula roots




Steve Mann


Steve is managing director of MADA and can be contacted at 10062 Miller Ave.,
Suite 202-B, Cupertino, CA 95014 or MADA@applelink.apple.com.


New object-oriented applications are announced every day. New object-oriented
languages are another matter, however. When was the last time you saw a new
commercial-grade, object-oriented language implementation? Beta (pronounced
"Bee'-ta") is one such new language.
Development on Beta started in 1976 as a joint venture between universities in
Norway, Denmark, and Finland--the same computer-science community that in the
early '60s developed Simula, the first object-oriented language. Throughout
its formative years, Beta research has been supported by grants from companies
such as Apollo, Apple, and Hewlett Packard. Now, after more than 15 years of
R&D, Mjolner Informatics has released the first commercial Beta
implementation.


Beta Overview


Beta's design is heavily influenced by its object-oriented predecessors,
notably Simula. It's strongly typed, like C++, with most type checking taking
place statically at compile time. The language design is based on a powerful
abstraction mechanism called a "pattern." Example 1 shows the syntax for a
Beta pattern.
Attributes can be many things, the most common being object declarations and
methods specific to the pattern being defined. The Enter statement lists the
values passed to the Do section; Exit lists the values output from the Do
section; and the Do section contains imperative (executable) statements that
perform actions. All syntactic elements of a pattern are optional.
There are several types of patterns, including classes, procedures, functions,
coroutines, processes, and exceptions. There are also three types of pattern
derivatives: subpatterns (like subclasses but more powerful), virtual patterns
(like C++ virtual procedures), and pattern variables (pattern pointer
variables).
You can create any of these derivatives from any of Beta's pattern types,
making Beta a very orthogonal language. Using subpatterns, for instance, you
can create a hierarchy of procedures, functions, coroutines, processes, or
exceptions, where all subpatterns inherit attributes from their superpatterns.
You can also define virtual classes, virtual procedures, virtual functions,
and the like.
Example 2, a simple class pattern defining a bank-account object, has three
attributes: balance, a declaration using a predefined class pattern called
"integer," and Deposit and Withdraw, a pair of procedure patterns. Note that:
Asterisk/parenthesis pairs are used to enclose comments.
A colon (:) indicates a declaration.
The @ character indicates the name of an object type.
Patterns can be nested; the applicable scope rules are similar to Algol.
Imperative statements always read from left to right.
But to write a complete Beta program, you need more than just pattern
definitions. You also need declarations to create real objects and imperatives
to perform actions on those objects; see Example 3, where the & symbol means
new so the expression &account.Deposit means "create a new instance of the
pattern account.Deposit and execute it." This is basically the same as
invoking a procedure. In the finished program, the primary object definition
(Account) and the methods that operate on that account (Deposit, Withdraw, and
Balance) are all patterns. Beta derives much of its elegance and flexibility
from this use of a single syntactic element to define everything and execute
operations.


Reference Attributes and Syntactic Elements


Beta lets you define both static and dynamic reference attributes. Examples 2
and 3 show static object references. You can declare a dynamic reference to
pattern Account with A1: ^Account;. You create objects dynamically by invoking
their pattern name as an executable statement. For instance, &Account[] -->
A1[]; creates a new instance of Account, returns a reference to the instance,
and assigns the reference to A1. The difference between &Account and
&Account[] is important: The first means "generate an instance of Account and
execute it," the second means "generate a new instance of Account without
executing it and return a reference to it."
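That create-and-execute versus create-and-reference distinction has a rough parallel in conventional object languages, where constructing an object and running its body are separate steps. A Python analogy (not Beta; the run method stands in for a pattern's do-part):

```python
class Account:
    def __init__(self):
        self.balance = 0

    def run(self):
        """Stand-in for the pattern's do-part."""
        self.balance += 100
        return self.balance

# Roughly like &Account[] --> A1[]: a new instance, do-part not executed.
a1 = Account()

# Roughly like &Account: create the instance and execute its do-part.
result = Account().run()

print(a1.balance, result)   # 0 100
```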
Beta arrays (called "repetitions") are defined using one of the two forms in
Example 4(a). Array indexes range from 1 to <Name>.range (range is an
intrinsic attribute of all array objects). You can dynamically resize arrays using the
extend attribute, which adds elements to an array. You can reallocate an
array, initializing all the elements to the default value for the array object
type. You can also assign array slices (or parts of arrays).
The Beta for statement is in Example 4(b). <Index> is an integer declared
locally within the scope of the for statement. It cannot be changed within the
loop. The index always starts at 1 and increments by 1. It's possible to
overcome the limitations of the for index structure using other patterns.
The if statement, shown in Example 4(c), is an unusual combination of
traditional If and Case statements. E1 through En are evaluations and I1 through In are imperatives.
The else clause is optional. A Boolean evaluation might be written thus
(albeit awkwardly):
(if (x>0) and (y<0)
 // True then ...
 // False then ...
if)
Beta evaluation (assignment) statements are very flexible. You can use
multiple assignment statements such as 3 --> I --> J. You can also combine
multiple assignments with pattern executions and enter and exit redirection as
in Example 5.


Subpatterns


A subpattern is a pattern that is derived from a superpattern and inherits all
its attributes. (Beta does not support multiple inheritance.)
A subpattern is defined by following a pattern's name with the name of its
superpattern. For instance, Example 6 defines the basic data structures for a
transportation system that has two types of reservations. Each instance of
FlightReservation and TrainReservation has its own explicit attributes, plus
Date and Customer attributes inherited from the superpattern Reservation.
Certain rules define what you can do with subpatterns. First, you can only
assign patterns to a pattern variable of the proper pattern or superpattern,
allowing restricted dynamic run-time binding. Second, attribute references are
restricted to those attributes belonging explicitly to the pattern being
referenced.
It's easy to create a generic pattern that can be used with all objects in a
subpattern hierarchy. Suppose you want to create an array of reservations in
the system, regardless of their type. Example 7 shows how. This group of
patterns works with Flight, Train, and Reservation objects.



Controlling Subpattern Imperative Sequencing


A subpattern automatically inherits the attributes of its superpattern. When a
subpattern is executed, its superpattern's imperative statements are
automatically executed as well. The superpatterns are normally activated first
in the execution sequence, starting with the superpattern directly above a
subpattern in the hierarchy.
When P3 is invoked in the pattern definitions in Example 8(a), the execution
through the pattern hierarchy is from top to bottom; see Example 8(b). Beta
lets you override the default execution sequence between superpatterns and
subpatterns using the inner statement, which forces execution of the
imperative section of the subpattern one level down in the subpattern
hierarchy. If you redefine
P1, P2, and P3 as in Example 8(c), the execution of P3 initiates the sequence
in Example 8(d).
Just as the attributes and imperatives of a subpattern hierarchy are combined
during execution, the Enter and Exit lists are combined, too. The Enter part
of a subpattern is the concatenation of the Enter part of the superpattern
followed by the Enter part of the subpattern. The same is true for Exit.
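Beta's inner inverts the usual override relationship: the superpattern's code brackets the subpattern's. As a loose analogue of Example 8(c) (Python has no inner statement, so hypothetical explicit hook methods stand in for it):

```python
trace = []

class P1:
    def do(self):
        self.inner()        # INNER P1: run the subpattern's part here
        trace.append("1 -> a")

    def inner(self):        # no subpattern below: inner is a no-op
        pass

class P2(P1):
    def inner(self):        # P2's imperative part, run via P1's INNER
        trace.append("2 -> b")
        self.inner2()       # INNER P2: descend one more level

    def inner2(self):
        pass

class P3(P2):
    def inner2(self):       # P3's imperative part
        trace.append("3 -> c")

P3().do()
# trace is now ["2 -> b", "3 -> c", "1 -> a"], matching Example 8(d)
```

The key difference from ordinary method overriding is that the superpattern, not the subpattern, decides where (and whether) the subpattern's code runs.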


Virtual Patterns


Revisiting the reservation system in Example 6, assume that each attribute has
a Display pattern (not shown). Then, add another pattern to display all the
attributes for each pattern; see Example 9. If you execute
TrainReservation.Display, the complete set of attributes for that pattern is
displayed because of the subpattern hierarchy. But if you execute
Reservation.Display, only the Reservation attributes are displayed. The
pattern Reservation.Display has no easy way of displaying attributes below it
in the hierarchy. Clearly you need another solution for dealing with
subpattern hierarchies where you may need to reference a pattern that isn't at
the bottom, but you'd also like to access attributes below the referenced
pattern.
One way of doing this is to define virtual patterns--patterns that can be
extended in subpatterns. Although there are different ways to syntactically
define virtual patterns, the Example 10 refinement to the reservation system
shows the simplest and most direct. With this set of virtual patterns,
Reservation.Display, FlightReservation.Display, and TrainReservation.Display
display the proper attributes in the proper order.
Just as you can define virtual-procedure patterns to create cumulative sets of
actions under one name in a hierarchy, you can do likewise with object
patterns. Beta makes no real distinction between the two.


Basic and Container Libraries


Beta contains a wide variety of component and framework libraries for building
both UNIX and Macintosh applications. Since Beta includes source code for all
libraries, you can study both libraries and sample programs.
Most of Beta's libraries are built on top of simple patterns contained in the
basic libraries, including:
Patterns for all basic data types: integers, Booleans, reals, characters, and
repetitions (arrays).
Streams, for both traditional file streams and internal text strings, and
UNIX-like put and get primitives for integer and real data types.
Exceptions, ForTo, Cycle, and Loop control patterns.
Stream patterns for repetitions, allowing you to apply integer and text put
and get operations to the repetitions.
A real number math function library, including conversion patterns and basic
trig functions.
systemEnv, a library of abstract superpatterns for concurrency operations,
including semaphores, monitors, and coroutines.
File, directory, and path management patterns for both HFS and UNIX file
types.
A process manager that manages interprocess communication via pipes or
sockets.
A general-purpose interface to external modules written in other languages,
based on C and Pascal interface conventions.
A low-level interface that lets you retrieve internal information about
individual objects in the Beta system.
The container libraries include a variety of abstract patterns for defining
complex data structures. Table 1 lists Beta's standard container types.


Macintosh Libraries


Although the basic and container libraries simplify writing the internals of a
program, they don't provide a framework for creating and managing the user
interface. Beta has extensive libraries that provide Macintosh (or X Window)
toolbox access and higher-level abstractions for creating and controlling the
UI.
Beta's toolbox library is basically just wrappers around toolbox calls. The
documentation describes the methods used for handling all the toolbox
interface components--value and VAR parameters, data structures, M68000 traps,
and Beta's external library capabilities.
Beta's toolbox support is reasonably current, including AppleEvents and the
Object Model. If you need wrappers for newer toolbox capabilities or managers,
the Beta documentation describes how to build your own interface patterns.
The MacEnv library, which includes more advanced interface components,
consists of basicMacEnv, controls, fields, scrollingList, figureItem, and odds
and ends. BasicMacEnv, which must be used by all MacEnv programs, contains
abstractions for dealing with menus and the menu bar, windows, interfacing to
the Finder and Clipboard, and handling the mouse, keyboard, and screen. Other
libraries provide supplemental features: the control library, which manages
controls (buttons, small text-entry controls, and scroll bars) and items
intended for programming dialog boxes; the fields library, which has patterns
for displaying pictures in windows and for creating and managing various types
of text editors; the scrollingList library, which has facilities for making
scrolling lists of text strings such as those you might put in a Standard File
dialog; and the figureItem library, a small vector graphics library that
implements graphics objects that may react to a user's mouse actions.
One of the strengths of the MacEnv library is the simplicity of event handling
from the programmer's point of view. All Macintosh events are converted into
invocations of special virtual patterns within the appropriate UI object. The
patterns have default behaviors (when it makes sense) to respond to all
events. To complete or refine the interface, you only have to override those
subpatterns that need further definitions. (You don't have to create a giant
event loop.)
BasicMacEnv manages menus and windows and all the basic event handling in a
program. It defines a menu bar with the three standard menus--Apple, File, and
Edit--and the expected commands. Default behavior is defined for obvious
operations--opening and closing windows, clipboard operations, and so on.
(Unfortunately, the default behaviors skip some basic functionality like
printing and undo.) You can override the hit pattern for a command to further
refine any menu command.
Beta menus have permanent and dynamic menu items. Permanent items (Open and
New on the File menu, and Clipboard commands) are for menu items that are
valid irrespective of the active window; there are default virtual patterns
for all permanent items. Dynamic items are commands that may vary depending on
the state of the active window. You need to explicitly attach and detach
(enable and disable) dynamic items to coordinate them with the active window.
BasicMacEnv includes predefined patterns for all standard Macintosh window
types, plus a wide range of appropriate attributes--title, position, and so
on--with default values. Additionally, there are patterns for opening and
closing windows, plus default behaviors for dragging, zooming, resizing, and
other standard window behaviors. Listing One (page 116) creates a movable
window with zoom and close boxes whenever New or Open is selected from the
File menu (note how BasicMacEnv is invoked).
When the program is launched, three standard commands are automatically
enabled: Open, New, and Quit. myWindow.open creates a resizable nonmodal
window and enables the Close command on the File menu. You close the window by
selecting
the Close command or clicking the close box.


The fields Library


The fields library contains five patterns designed for displaying and
manipulating higher-level data structures: pictureField for displaying
pictures; textField, a sophisticated text-editing field with support for
styled text and a variety of text-editing functions; abstractScroller, which
implements scrolling for any windowItem object; textEditor, a full-fledged
text editor canvas with scrolling capabilities; and scroller, a subpattern of
abstractScroller which implements scrolling for a windowItem in a canvas.
Listing Two (page 116) implements a multiwindow text editor by binding a
textEditor object to each window opened with either the New or Open command on
the File menu. All text editing, Clipboard operations, and menu management are
handled automatically by basicMacEnv. In addition, a font menu with some basic
size choices is created and added to the menu bar at startup. That menu can be
used to modify the text contents of each of the editor windows.
Listing Two shows how easy it is to implement file support, styled text
capabilities, and most of the expected default Macintosh behaviors simply by
overriding some of Beta's library virtual patterns. However, it isn't a
commercial-quality application--there's no industrial-strength file exception
handling, no support for undoing editing commands or reverting to previous
versions, and no support for intelligent font sizing.


The Mjolner Beta System



In addition to the Beta features described in this article, the Mjolner Beta
System includes:
Native code generation.
Automatic garbage collection and storage management.
Separate compilation of program fragments with automatic dependency analysis.
Interfaces to C and assembly language modules (and Pascal on the Mac).
An experimental persistent store.
Process management and concurrency-control patterns.
Application frameworks for X Windows and Motif, as well as the Macintosh Toolbox.
Mjolner also offers some additional tools, including a modifiable Hyper
Structure Editor, a metaprogramming system, a Beta CASE Tool (UNIX only), and
a source-level debugger (UNIX only).
The Mjolner Beta System is not without flaws--the development environment is
static, not dynamic, and it uses the native assembler and linker. You have to
wade through compile, assemble, link, and run cycles to develop programs. The
application frameworks are a good starting point, but they're missing basic
printing, undo, and revert support. Finally, the documentation could be
improved and expanded.
Keep in mind that these minor complaints are about the implementation, not the
language. Beta springs from a unified, object-oriented theoretical foundation.
The result is a very clean implementation of virtually all recognized and
desired object-oriented capabilities in a simple, easy-to-understand language
available in commercial-grade tools. Beta is truly an evolution of
object-oriented technology.
For More Information
Mjolner Beta System
Macintosh, $295.00
UNIX, $2890.00
MADA
408-253-2765 (voice)
408-253-2767 (fax)
10062 Miller Ave.
Suite 202-B
Cupertino, CA 95014
Table 1:
Standard container types.
Function Description
container Abstract superpattern for all containers, including basic container
attributes and exception definitions.
collection Abstract superpattern for all position-independent container
patterns.
multiSet Unstructured collections that allow duplicates.
set Unstructured collections where duplicates are not allowed.
classificationSet Specialized set that lets you dynamically classify other
sets into subsets and supersets, and manage their set operations.
hashTable Basic hash table where you can define the hash ranges and hash
function, with table searching, and table statistic calculations.
extensiblehashTable hashTable where you can rehash the contents of the table or
extend the range of hash values.
arrayContainer Element repetition abstraction, including put, get, and sorting
operations.
sequentialContainer An abstract container for sequentially ordered collections
of elements.
stack A basic stack with push, pop, and top patterns.
queue An ordinary queue.
deque A double-ended queue with patterns for operating on both ends of the
queue.
prioQueue A sequential container where each container element has a numeric
priority.
list A more complex pattern that implements a double-linked list; many list
operations take a list position as an argument.
recList A recursive doubly-linked list.
Example 1:
<Pattern Name>: (#
 <Attribute 1> ;
 <Attribute 2> ;
 .
 <Attribute N> ;
 Enter <Input list>
 Do <Imperatives>
 Exit <Output list>
#);
Example 2:
Account: (# (* a simple bank account class pattern *)
 balance: @integer; (* bank account balance *)
 Deposit: (# (* add `amount' to balance *)
 amount: @integer (* local declaration *)
 enter amount (* input list *)
 do balance + amount -> balance (* action *)
 exit balance (* output list *)
#);
Withdraw: (#(* subtract `amount' from balance *)
 amount: @integer

 enter amount
 do balance - amount -> balance
 exit amount
#); #);
Example 3:
(# (* a program pattern with no name *)
 Account: (# (* a pattern declaration within the unnamed pattern *)
 acct_balance: @ integer;
 Deposit: (#
 amount: @ integer
 enter amount
 do acct_balance + amount -> acct_balance
 exit acct_balance
#);
Withdraw: (#
 amount: @ integer
 enter amount
 do acct_balance - amount -> acct_balance
 exit amount
#);
Balance: (#
 exit acct_balance
#); #);
A: @ Account;
cash_on_hand, balance: @ integer;
do 100 -> &A.Deposit;
 250 -> &A.Deposit;
 75 -> &A.Withdraw -> cash_on_hand; (* $75 on hand *)
 &A.Balance -> balance; (* $275 balance *)
#)
Example 4:
(a)
<Name>: [size] @<Type> ;(* static repetition *)
<Name>: [size] ^<Type> ;(* dynamic repetition *)
(b)
(for <Index>: <Range> repeat <Imperative-list> for)
(c)
(if E0
 // E1 then I1
 // E2 then I2
 . . .
 //En then In
 else I
if)
Example 5:
(#
Power: (# (* compute X^n where n > 0 *)
X, Y: @ real; n: @ integer;
enter (X, n)
do 1 -> Y;
(for i: n repeat Y * X -> Y for)
exit Y
#)
Reciproc: (# (* compute (Q, 1/Q) *)
Q, R: @ real;
enter Q
do (if (Q // 0) then 0 -> R
else (1 div Q) -> R
if )

exit (Q, R)
#);
A, B: @ real;
do (3.14, 2) -> &Power -> &Reciproc -> (A, B);
(* A = 3.14 ^ 2, B = 1/A *)
#)
Example 6:
Reservation:(#
Date: @DateType;
Customer: ^CustomerRecord
#)
FlightReservation: Reservation (#
FlightNo: ^Flight;
SeatNo: ^Seat
#)
TrainReservation: Reservation (#
TrainNo:^Train;
CarriageNo: ^Carriage;
SeatNo: ^Seat
#)
Example 7:
ReservationRegister: (#
Table: [100] ^ Reservation; (* arbitrary size restriction *)
top: @integer;
Insert: (# (* insert a reservation *)
R: ^ Reservation;
enter R []
do R [] -> Table [top+1 -> top] []
#);
NoOfRes: (# (* get number of reservations in system *)
exit top
#);
GetRes: (# (* get reservation number inx *)
inx: @ integer
enter inx
exit Table [inx] [];
#); #)
Example 8:
(a)
P1: (# a: @integer do 1 -> a #);
P2: P1 (# b: @integer do 2 -> b #);
P3: P2 (# c: @integer do 3 -> c #);
(b)
P3 invoked
P2 invoked
P1 invoked
1 -> a
2 -> b
3 -> c
(c)
P1: (# a: @ integer do INNER P1;1 -> a #);
P2: P1 (# b: @ integer do 2 -> b; INNER P2#);
P3: P2 (# c: @ integer do 3 -> c #);
(d)
P3 invoked
P2 invoked
P1 invoked
P2 (via INNER P1) 2 -> b
P3 (via INNER P2) 3 -> c
1 -> a
Example 9: Building a virtual pattern
Reservation: (#
 Date: @ DateType;
 Customer: ^ CustomerRecord
 Display: (#
 do Date.Display;
 Customer.Display; INNER
#); #)

FlightReservation: Reservation (#
 FlightNo: ^ Flight;
 SeatNo: ^ Seat
 Display: (#
 do FlightNo.Display;
 Customer.Display
#); #)
TrainReservation: Reservation (#
 TrainNo: ^Train;
 CarriageNo: ^Carriage;
 SeatNo: ^ Seat
 Display: (#
 do TrainNo.Display;
 CarriageNo.Display;
 SeatNo.Display
#); #)
Example 10:
Reservation: (#
 Date: @DateType; Customer: ^CustomerRecord
 Display:< (#
 do Date.Display;
 Customer.Display; inner
#); #)
FlightReservation: Reservation (#
 FlightNo: ^Flight;
 SeatNo: ^Seat
 Display::< (#
 do FlightNo.Display;
 Customer.Display
#); #)
TrainReservation: Reservation (#
 TrainNo: ^Train;
 CarriageNo: ^Carriage;
 SeatNo: ^Seat
 Display::< (#
 do TrainNo.Display;
 CarriageNo.Display;
 SeatNo.Display
#); #)

_THE BETA PROGRAMMING LANGUAGE_
by Steve Mann

[LISTING ONE]

ORIGIN '~beta/macenv/v3.5/basicmacenv' (* part of the fragment system *)
--- program: descriptor ---(* for handling code modules *)

MacEnv (* include the basic app framework *)
(#
 myWindow: @window (#
 type::< windowTypes.zoom;
 hasClose::< trueObject;
 open::< (#
 do (100, 100) -> position;
 (300, 100) -> size;
 'My window' -> title;
 #); #);
 FileMenu::< (#

 iNew::< (# hit::< (# do myWindow.open #); #);
 iOpen::< (# hit::< (# do myWindow.open #); #);
#); #)


[LISTING TWO]

ORIGIN '~beta/macenv/v3.5/basicmacenv'
[[
--- INCLUDE '~beta/macenv/v3.5/fields'
--- INCLUDE '~beta/macenv/v3.5/macfile'
--- INCLUDE '~beta/basiclib/v1.3/file'
--- program:descriptor---

MacEnv (#
(* === Font Menu Patterns === *)
 FontMenu: @Menu (#
 fontSizes: [4] @integer;
 firstFontOset: @integer;
 theEditor: ^edWindow;
 Name::< (# do 'Font' -> theName #);
(* create the font menu--names on top, sizes on bottom. the
* handler distinguishes between the two using the item number. *)
 open::< (#
 do System.AvailableFonts
 (# do thisFontName[] -> Append #);
 separator -> Append;
 '9' -> Append -> firstFontOset; '12' -> Append;
 '14' -> Append; '18' -> Append;
 9 -> fontSizes [1];12 -> fontSizes [2];
 14 -> fontSizes [3];18 -> fontSizes [4];
 #);
(* override the standard event handler for this menu
* to change the font or size of the current selection *)
 EventHandler::< (#
 EvalStatus::< (# Done::< trueObject #);
 Select::< (#
 do (if theEditor[] <> None // true then
 (if theItem.ItemNumber >= firstFontOset // true then
 fontSizes [theItem.ItemNumber - firstFontOset + 1] ->
 theEditor.ChangeSize;
 else theItem.name -> theEditor.ChangeFont
 if);if);
 #); #); #);
(* === Editor Window Patterns === *)
 edWindow: Window (#
 Type::< WindowTypes.Zoom; HasClose::< TrueObject;
 myText: ^StyledText;myFile: ^macfile;
 opened: @boolean;
(* change the font type of the current text selection *)
 ChangeFont: (#
 aTextStyle: @TextStyle; start,end: @Integer
 enter aTextStyle.name
 do myEditor.Contents.Selection -> (start, end);
 (start, end, aTextStyle.FontID) ->
 myEditor.Contents.SetOneFont
 #);
(* change the font size of the current text selection *)
 ChangeSize: (#

 fontSize, start,end: @Integer
 enter fontSize
 do myEditor.Contents.Selection -> (start, end);
 (start, end, fontSize) ->
 myEditor.Contents.SetOneSize
 #);
(* create a dynamic menu item for the File/Close command *)
 DoClose: @theFileMenu.action (#
 hit::< (# do close #); #);
(* create a dynamic menu item for File/Save command *)
 DoSave: @theFileMenu.action (#
 hit::< (#
 do (if opened // true then myEditor.saveMyEd
 else myEditor.saveAsMyEd if)
 #); #);
(* create a dynamic menu item for File/Save As command *)
 DoSaveAs: @theFileMenu.action (#
 hit::< (# do myEditor.saveAsMyEd #); #);
(* Window event handler--enable/disable the font menu and dynamic
* File commands as current window gets activated / deactivated *)
 EventHandler::< (#
 Activate::< (#
 do this (edWindow)[] -> FontMenu.theEditor[];
 FontMenu.enable;
 DoClose[] -> theFileMenu.CloseItem.Attach;
 DoSave[] -> theFileMenu.SaveItem.Attach;
 DoSaveAs[] -> theFileMenu.SaveAsItem.Attach;
 #);
 Deactivate::< (#
 do none -> FontMenu.theEditor[];
 FontMenu.disable;
 theFileMenu.CloseItem.Detach;
 theFileMenu.SaveItem.Detach;
 theFileMenu.SaveAsItem.Detach;
 #);#);
(* pattern to open an existing file *)
 openFile: (#
 do newFile;
 (if ('','Select a File to Edit: ') ->
 myFile.GetFile -> opened // true then
 myFile.RestoreStyledText -> myText[]
 if)
 exit opened
 #);
(* pattern to create a new file *)
 newFile:(#
 do &StyledText[] -> myText[];
 &macfile[] -> myFile[]
 #);
(* pattern to open a new window *)
 open::< (#
 do 'Untitled' -> title;
 (if opened // true then
 myFile.entry.path.name -> title
 if);
 nextPosition -> Position; (350,400) -> Size;
 (5,10) -> nextPosition.add; myEditor.open
 #);
(* === Text Editor Patterns--scoped inside Window to have ===

* === access to all of the Window attributes === *)
 myEditor: @TextEditor (#
(* track text changes (to disable Save if no changes). *)
 docChanged: @boolean;
 ContentsDesc::< (#
 Textchanged::< (# do true -> docChanged #);
 #);
(* open (create) a new text editor--bind to window *)
 open::< (#
 do this(edWindow).Size->Size;
 true->BindBottom->BindRight;
 myText[] -> Contents.contents;
 myEditor[] -> Target;
 false -> docChanged
 #);
(* save the current document to disk *)
 saveMyEd: (#
 do contents.contents -> myFile.SaveStyledText
 #);
(* save the current doc to disk, but allow a name change *)
 saveAsMyEd: (#
 do (if ('Select a Destination File Name: ',title) ->
 myFile.PutFile // true then saveMyEd;
 myFile.entry.path.name -> title
 if)
 #);
(* close and (optionally) save the current text editor *)
 close::< (#
 do (if docChanged // true then saveAsMyEd if)
 #);#); #);
(* === File Menu Patterns === *)
 FileMenu::< (#
(* File/Open command *)
 iOpen::< (# hit::< (#
 myWindow:^edWindow;
 do &edWindow[] -> myWindow[];
 (if myWindow.openFile // true then
 myWindow.open
 if)
 #);#);
(* File/New command *)
 iNew::< (# hit::< (#
 myWindow:^edWindow
 do &edWindow[] -> myWindow[];
 myWindow.newFile;
 myWindow.open
 #);#);#);
(* === Program start - create the font menu, put it on the ===
* === menu bar, and set the first window tiling position. === *)
 nextPosition: @Point;
 do FontMenu.Open;
 FontMenu.disable;
 FontMenu[] -> Menubar.Append;
 (5,40) -> nextPosition
#)
---]]

October, 1993
The Eiffel Programming Language


A pure OO language designed with reuse in mind




Robert Howard


Robert "Rock" Howard is president of Tower Technology Corporation, creator of
the TowerEiffel system, and chairman of NICE, the Eiffel Consortium. You can
contact him at 2701 Stratford Drive, Austin, TX 78746 or at rock@twr.com.


With software complexity mushrooming because of ever-increasing user
expectations, programmers should avoid "accidental" complexities imposed by
languages and tools whenever possible. The Eiffel programming language,
perhaps more than others, removes unnecessary complexities without limiting
the ability to express the inherent complexity of software.
Eiffel combines object-oriented (OO) capabilities with a unique focus on
software "correctness" and reusability. Eiffel programs can express
abstractions in a clear manner using a syntax that's simple to learn and use.
Eiffel can be used for OO design-specification, or as a full-fledged,
software-engineering tool.
Eiffel is a class-based language that reinforces OO design. It includes
multiple and repeated inheritance, selective exporting, strong type checking,
parameterized classes, dynamic binding, garbage collection, feature renaming,
exception handling, and persistency. Eiffel is implemented via sophisticated
compilers that perform dependency analysis and optimization. To speed the
development cycle, some implementations include interpretation.
Since its introduction in 1986, Eiffel has attracted an international
following in academia and engineering and is used extensively by in-house and
commercial software-development projects. Eiffel is the only OO language
outside of Smalltalk and C++ with multiple commercial implementations. The
Eiffel trademark is owned by the Nonprofit International Consortium for Eiffel
(NICE), an independent consortium that administers and enforces language and
core library standards.
Eiffel's major benefit is the reduction of software-maintenance costs. This
comes about in two ways. First, Eiffel supports the development of correct
software via the use of semiformal assertion technology. Second, Eiffel
supports the development and use of reusable software. These concepts are
intimately linked since software correctness is a prerequisite for effective
software reuse.
Eiffel is a "pure" object-oriented language. Every value is either an object,
a reference to an object, or void. (Void denotes the state of a reference that
is not currently attached to an object.) While Eiffel syntax resembles Pascal,
it was designed as a class-based, statically-typed object-oriented language.
All Eiffel classes implicitly inherit from class ANY, which provides both
shallow and deep copying, cloning, comparison, and so on. Eiffel supports a
clean mechanism that lets you easily add capabilities to this class.
Eiffel has no formal notion of pointers. Conceptually, all feature arguments
are object references. The compiler can "decide" to implement a feature call
in an appropriate manner--given the actual type of the objects used as
arguments. Compared to languages where there are choices between pointer and
reference semantics, Eiffel's uniformity in calling conventions simplifies
class interfaces.
Eiffel has standard mechanisms for interfacing with C, including calling C
from Eiffel, and Eiffel from C. Some implementations include extensions that
allow the use of C macros or inline C within designated Eiffel routines. Some
compiler vendors also include mechanisms for accessing Eiffel object data or
features from C.
Eiffel implementations come with a variety of kernel classes handling basic
types--integers, reals, floats, Booleans, strings, arrays, bits, and so on.
The compiler knows about these types and can produce optimized code when
they're used. Other kernel classes are basic I/O, standard files, and
exceptions. NICE is working to standardize the kernel classes.
It is easy to add persistency to Eiffel. Interactive Software Engineering
(Goleta, CA) provides the STORABLE class, which adds persistence to classes
that inherit from it. Tower Technology (Austin, TX) provides a library
that can encode any Eiffel object into a heterogeneous, packed binary format
that can be decoded into a separate process on another machine, or at a future
time. SIG Computer (Braunfels, Germany) has specialized FILE classes used for
simple persistency.


Improving Software Correctness


Eiffel eliminates the possibility of entire categories of errors--memory
management errors, unintended side effects caused by poor encapsulation, bad
links from erroneous makefiles, improper routine dispatching due to type
errors or confusion, and the like. Eiffel also "attacks" errors in application
logic and semantics via its advanced assertion technology. Assertions support
the contract metaphor, aid feature and class comprehensibility, interact with
the exception-handling capabilities, and improve confidence in system
reliability. Debugging time, especially defect location, is usually cut
drastically when assertions are used in Eiffel classes.
Assertions in Eiffel come in two flavors--preconditions and postconditions.
(Class invariants, loop invariants, loop variants, and checks are also
supported, but for the sake of this argument you can lump them in with
postconditions.) Preconditions require that arguments provided to feature
calls are acceptable, and that the object is in a state such that the feature
call can be handled. Postconditions ensure that a chunk of code performed
correctly, provided an acceptable result, and left the object in a correct
state.
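That division of labor can be sketched with plain assertions. Python has no built-in design-by-contract, and the Account class below is hypothetical (it is not from the article's listings), but it shows where pre- and postconditions sit:

```python
class Account:
    """Sketch of Eiffel-style contracts using plain assertions."""

    def __init__(self):
        self.balance = 0

    def withdraw(self, amount):
        # Preconditions: the *caller* must supply valid input and an
        # object in a state that can handle the call.
        assert amount > 0, "precondition: amount must be positive"
        assert amount <= self.balance, "precondition: insufficient funds"
        old_balance = self.balance
        self.balance -= amount
        # Postconditions: the *body* must have performed correctly and
        # left the object in a correct state.
        assert self.balance == old_balance - amount, "postcondition failed"
        assert self.balance >= 0, "invariant: balance never negative"

acct = Account()
acct.balance = 100
acct.withdraw(30)   # balance is now 70; withdraw(200) would trip a
                    # precondition, blaming the caller rather than the body
```

Unlike Eiffel, nothing here survives inheritance or shows up in an extracted interface; the sketch only illustrates which side of the contract each assertion polices.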
The preconditions, postconditions, and class invariants are treated as
inherent parts of the class interface and are retained under inheritance to
enforce required behavior even for subclasses. Eiffel's automatic interface
extraction tool treats assertions as equally important as routine signatures.
Unlike other languages, semantic assumptions expressed via assertions are
explicit and public.
When a run-time error occurs, the Eiffel exception-handling capability kicks
in. If no handler is designated for a given assertion failure, it's handled by
the Eiffel run-time system, which dumps trace and other information.
Precondition failures denote errors in the feature caller, while postcondition
failures denote errors in the feature body. Thus the liberal use of
assertions usually immediately isolates many run-time defects. (For further
discussion of software correctness, see "Writing Correct Software with Eiffel"
by Bertrand Meyer, DDJ, December 1989.)
Eiffel supports the reuse of classes via inheritance, parameterization, and
composition. The inheritance mechanism allows renaming and redefinition of
features so that derived classes can be easily specialized. Parameterization
goes beyond simple template construction as type parameters can be
constrained. This allows the constraint class's capabilities to be used within
the enclosing parameterized class, enabling the implementation of high-level
abstract data types. Eiffel also includes an indexing clause for locating
reusable classes.


Exploring Eiffel


To examine Eiffel's syntax, let's begin by looking at the linked list example
(see the Eiffel code on page 122, which is part of the article, "Comparing
Object-oriented Languages"). At first glance, the syntax seems similar to
Pascal--assignment via :=, if statements, routine declarations, and the
optional semicolon statement separators are familiar elements borrowed from
Pascal. The loop construct is demonstrated in class MY_LIST. The
from/until/loop/end pattern can also be supplemented by optional loop
invariant and loop variant assertions.
Distinguishing Eiffel from Pascal are the clauses introduced by the inherit,
creation, and feature keywords. The optional inherit clause lists the class or
classes that are inherited. In this clause, it's possible to rename individual
features and otherwise join classes so that naming conflicts don't occur. The
optional creation clause lists procedures that are available to initialize a
newly-created object. The feature clause specifies the export status for one
or more Eiffel attributes and/or routines grouped within it. (Eiffel uses the
term "feature" to denote attributes or routine since, in some cases, they may
be substituted for each other.)
Although not specified by the language definition, current Eiffel
implementations expect that each class definition resides in a separate file
with a suffix of .e. The name of the class need not match the file name,
although that's the convention. Related classes are often grouped together in
directories called "clusters." An important cluster is the "kernel" cluster
that's provided with each Eiffel implementation. The kernel includes the
low-level base classes, including the basic types--integer, real, double,
character, array, string, file, and others.
A configuration file names the clusters where classes may be found for
building an executable. It also names a class as the top-level class, and a
feature as the initial feature to call within the top-level class. Various
debugging options and other system-building parameters are specified.
Interactive Software Engineering and Tower Technology use the "LACE"
configuration file format while SIG Computer uses the "pdl" format. (NICE will
likely standardize on a single format in the future.)
To build the sample Eiffel system, the configuration file specifies DRIVER as
the top-level class and make as the initial feature to call. The code for make
in class DRIVER performs the same work as the C++ main() function.
The !! syntax denotes object creation. Some classes denote one or more
procedures as "creation procedures" by naming them in the creation clause near
the beginning of the class. If a class has no creation procedure, then the
object is created without a call (for example, !!list1). If a class has at
least one creation procedure, then creation requires a call such as !!n1.set( 10 ).
The set_data routine in class ELEMENT can be used either to create an
instance of ELEMENT or as a normal feature call. If required, export controls
can be used to make a routine available only for creation.
The Eiffel class PRINTABLE replaces MyListData in the C++ code example. Note
that PRINTABLE is marked deferred so that no objects of type PRINTABLE can be
created. Instead, it's used as an abstract data type; other classes inherit
from it to get reliable polymorphic behavior. Also, the class LIST is a general-purpose
version of MY_LIST. LIST takes a generic parameter named T. To use LIST, you
declare it with an actual generic parameter; for instance, this_list :
LIST[HAT]. Thus this_list holds objects of type HAT and/or objects that
inherit from class HAT.
Class MY_LIST also has a generic parameter, but there's a constraint on its
parameter: [T->PRINTABLE]. This means that only objects that inherit from
PRINTABLE are allowed to be used wherever T is used within the class. The
result is that the compiler ensures that the call to print_self (which was
defined as a deferred feature in class PRINTABLE) will always work.
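As a rough analogue in Python's typing module (the class and method names here are illustrative, not taken from the article's listings), the constraint [T->PRINTABLE] resembles a bounded type variable: the type checker guarantees that anything stored in the list supports print_self.

```python
# Rough Python analogue of Eiffel's constrained genericity,
# MY_LIST[T -> PRINTABLE], via a bounded TypeVar. Names are
# illustrative; Eiffel enforces this at compile time.
from typing import Generic, TypeVar

class Printable:
    def print_self(self) -> None:
        raise NotImplementedError  # "deferred," as in Eiffel

T = TypeVar("T", bound=Printable)

class MyList(Generic[T]):
    def __init__(self) -> None:
        self.items: list[T] = []

    def add_to_list(self, item: T) -> None:
        self.items.append(item)

    def print_self(self) -> None:
        # The bound guarantees every item has print_self.
        for item in self.items:
            item.print_self()
```

As in the Eiffel version, a call to print_self on every stored item is statically guaranteed to work, because the bound rules out nonconforming element types.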


More Code Notes


Eiffel assertions can have optional labels that are echoed to the standard
error output if the exception caused by an assertion failure is not handled;
see the invariant clause of class LIST.
Every attribute and routine is defined within the scope of a feature clause
that denotes its export status. Feature {NONE} is equivalent to private.
Feature {ANY} is equivalent to public since all Eiffel classes automatically
inherit from class ANY. Feature {LIST} in class ELEMENT is an example of
selective exporting. Only class LIST and classes that inherit from it may
invoke these features.
Eiffel automatically initializes all fields to default values, so some of the
initialization logic from the C++ code is not necessary.

The I/O calls (put_string, put_int, put_newline, and put_char) are from the
SIG's Eiffel/S BASIC_IO class. Interactive Software's Eiffel 3 has a slightly
different I/O library. Tower uses either ISE or Eiffel/S style. NICE is
developing a single I/O standard and other core capabilities.


Extending the Linked List Example


Although the Eiffel linked list implementation is wordier than the C++
implementation, it's safer, more understandable, and does much more. For
example, it includes a general-purpose list, not one specialized for printable
objects. Furthermore, the specialized class MY_LIST is safer since
nonconforming objects can't be added to it.
The routines in Example 1 are additions to class LIST for removing items.
Eiffel will reclaim an item's memory as appropriate. Doing this in C++ is
painful because you must build or include a garbage collector or a
sophisticated reference counting system to know when to call destructors
safely.
The routine in Example 2 can be added to class LIST to return individual
items. In Eiffel, there are two approaches to using this capability. One
approach, shown in Example 3, is to fetch the item as conforming to a type that
includes the needed capabilities and then rely on dynamic typing. The other
approach is shown in Example 4 and uses an assignment attempt. This will
assign an object only if it conforms to the target reference. Otherwise the
reference is set to VOID.
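For comparison, a Python analogy (illustrative names only; Eiffel's ?= is checked by the runtime, not by a helper function): the assignment attempt behaves roughly like a conformance-checked assignment that yields the object on success and a void reference otherwise.

```python
# Rough analogue of Eiffel's assignment attempt (?=): assign only
# if the object conforms to the target type; otherwise the target
# is left void (None). Class names are illustrative.

class Printable: ...
class MyPoint(Printable): ...
class MyNumber(Printable): ...

def assignment_attempt(value, target_type):
    # Succeeds only when the value conforms to the target's type.
    return value if isinstance(value, target_type) else None

item = MyNumber()
pt = assignment_attempt(item, MyPoint)    # not a MyPoint, so None
n = assignment_attempt(item, MyNumber)    # conforms, so the object
```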
These examples show that Eiffel objects retain type information. Because C++
objects don't, the examples are hard to duplicate. Instead, C++ offers only a
dangerous type coercion capability that can't match the safety and flexibility
of these examples.


Development Automation


According to the Eiffel philosophy, compilers should take on many of the
automatable tasks involved in building applications. For example, relying on
programmers for dependency-analysis information (makefiles) is a waste of time
and a source of problems. Requiring the maintenance of separate class
interface files, including the complex include file invocation order, is
another nonproductive effort.
Eiffel compilers take on much of the burden of performing system-wide and
local optimizations. For example, all Eiffel features are "virtual" as far as
the Eiffel programmer is concerned. The Eiffel compiler locates all
appropriate opportunities for in-lining or removing unnecessary code and
eliminating unnecessary dynamic binding. The compiler also eliminates dead
code, leaving programmers free to focus their optimization efforts on choosing
proper data structures and streamlining object interactions.
Eiffel compilers take on the task of fully checking for correct type usage,
without imposing a confusing or limited type system on you. In Eiffel,
subclassing and subtyping are one and the same, making the use of inheritance
particularly powerful and easy. Eiffel supports the more natural, covariant
type system that allows classes to be easily specialized in concert.


Final Thoughts


Eiffel encourages you to use languages and tools appropriately; that is, to
use an OO language for implementing OO designs. You can then use a portable
assembler such as C for low-level optimizations and operating-system
interfaces, or an AI tool or language where a logic-based approach is needed.
As long as the languages and tools can interface appropriately, this approach
yields better results than the use of hybrid languages or tools.
Eiffel's support for semiformal, parameterizable specifications allows the
construction of robust, reusable software components. The clear syntax, coupled
with the elucidation of assumptions via assertions, makes it easier to share
these components with confidence. The result is that the real benefits of
object-oriented programming are more fully realized.
Example 1:
 is_empty : BOOLEAN is
 -- return TRUE if LIST is empty
 do
 Result := ( head = void )
 end ; -- is_empty
 remove is
 -- remove the first list item
 require
 not_empty: not is_empty
 do
 head := head.next ;
 if head = void
 then
 tail := void ;
 end;
 end ; -- remove
Example 2:
 item : T is
 -- return the head of the list
 require
 not is_empty
 do
 Result := head
 end ; -- item
Example 3:
 local
 this_list : MY_LIST[PRINTABLE] ;
 pr : PRINTABLE ;
 do
 -- create the list
 !!this_list ;
 -- add items to the list (not shown)
 ...

 -- fetch an item from the list
 if not this_list.is_empty
 then
 pr := this_list.item ;
 pr.print_self ; -- dynamic binding still works!
 end
Example 4:
 pt ?= this_list.item ;
 if pt /= void
 then -- it was a MY_POINT
 pt.print_self ;
 end ;



EIFFEL SOURCE CODE EXAMPLE

class DRIVER
 -- In Eiffel, the top level driver is an object, too.
inherit
 BASIC_IO
creation
 make
feature {ANY}
 make is
 -- run this test driver
 local
 list1, list2 : MY_LIST[PRINTABLE] ;
 n1, n2 : MY_NUMBER ; -- a kind of PRINTABLE
 p1, p2 : MY_POINT ; -- also a kind of PRINTABLE
 do
 -- create the various objects
 !!list1 ;
 !!list2 ;
 !!n1.set( 10 ) ;
 !!n2.set( 20 ) ;
 !!p1.set( 2, 3 ) ;
 !!p2.set( 4, 5 ) ;

 list1.add_to_list( n1 ) ;
 list1.add_to_list( n2 ) ;
 list1.add_to_list( p1 ) ;
 list2.add_to_list( n2 ) ; -- objects can be in more than one list
 list2.add_to_list( p1 ) ;
 list2.add_to_list( p2 ) ;
 list2.add_to_list( list1 ) ; -- list 1 is an element of list 2

 put_string( "list1:%N" ) ;
 list1.print_self ;
 put_string( "list2:%N" ) ;
 list2.print_self ;
 end -- make
end -- DRIVER
deferred class PRINTABLE
 -- ensures that 'print_self' is implemented
inherit
 BASIC_IO
feature {ANY}
 print_self is

 -- print yourself
 deferred
 end -- print_self
end -- PRINTABLE
class MY_NUMBER
 -- holds and can print an integer
inherit
 PRINTABLE
creation
 set
feature {ANY}
 value : INTEGER ;
 set( new_value : INTEGER ) is
 -- set this number
 do
 value := new_value ;
 end ; -- set
 print_self is
 -- print the value
 do
 put_string( "Number: " ) ;
 put_int( value ) ;
 put_newline ;
 end ; -- print_self
end -- MY_NUMBER
class MY_POINT
 -- holds and can print an x,y pair
inherit
 PRINTABLE
creation
 set
feature {ANY}
 x, y : INTEGER ;
 set( new_x : INTEGER; new_y : INTEGER ) is
 -- set this point
 do
 x := new_x ;
 y := new_y
 end ; -- set
 print_self is
 -- print the value
 do
 put_string( "Point: " ) ;
 put_int( x ) ;
 put_char( ',' ) ;
 put_int( y ) ;
 put_newline ;
 end ; -- print_self
end -- MY_POINT
class ELEMENT[T]
 -- holds an object reference and a
 -- single link to another ELEMENT[T]
creation
 set_data
feature {LIST}
 data : T ;
 next : ELEMENT[T] ;
 set_data( new_data : T ) is
 -- set data to the new_data

 do
 data := new_data
 end ; -- set_data
 set_next( new_next : ELEMENT[T] ) is
 -- set next to the element
 do
 next := new_next
 end ; -- set_next
end -- class ELEMENT
class LIST[T]
 -- a generic linked list class
feature {ANY}
 add_to_list( data : T ) is
 -- add to the end of the list
 local
 new_element : ELEMENT[T] ;
 do
 !!new_element.set_data( data ) ;
 if head = void
 then
 head := new_element ;
 else
 tail.set_next( new_element ) ;
 end
 tail := new_element ;
 end ; -- add_to_list
feature {NONE}
 head, tail : ELEMENT[T] ;
invariant
 tail_next_is_void: tail /= void implies tail.next = void ;
 tail_void_when_head_void: head = void implies tail = void
end -- LIST
class MY_LIST[T->PRINTABLE]
 -- A printable list which holds printable data
inherit
 LIST[T]
 PRINTABLE
feature {ANY}
 print_self is
 -- print the list elements
 local
 el : ELEMENT[T] ;
 do
 from
 el := head
 until
 el = void
 loop
 el.data.print_self
 el := el.next ;
 end ;
 end ; -- print_self
end -- MY_LIST









October, 1993
Dave's Recycled OO Language


Drool over this little language that sports multiple inheritance




David Betz


David is a DDJ contributing editor and can be contacted through the DDJ
offices.


There are many similarities between the languages I've built over the years.
For instance, XScheme (my implementation of the Scheme language) was based on
the byte-code compiler I'd designed for AdvSys, a text adventure-writing
system, and XLisp (my version of Lisp). Similarly, a language called "ExTalk"
formed the basis for the Bob language (DDJ, September 1991).
In this article, I'm presenting an updated implementation of AdvSys called
"Dave's Recycled Object-Oriented Language" (otherwise known as "Drool").
I decided to write Drool a while back after finding my daughter, Rachel,
playing some old Infocom text adventure games that Activision had re-released
in a collection called "The Lost Treasures of Infocom." It was nice to see her
enjoying games that required imagination, unlike the monotonous video games
kids favor these days. She enjoys writing, and, when asked if she would be
interested in writing her own games, she said, "Yes!". So I dusted off a copy
of AdvSys, and brought it up to date.
AdvSys is a simple, object-oriented system for writing text adventure games,
designed to be as small as possible, yet still be capable of implementing
fairly complex games. AdvSys ran quite happily on a CP/M machine with 64K of
RAM, but with the amount of memory available on most current Macintosh or
Windows machines, the size of Drool is less of a concern.
AdvSys consists of a separate compiler and interpreter. A game is written by
entering source code in a text editor, which is then compiled into a data file
that can be used by the interpreter to play the game. This means that to fix a
problem with a game, you have to go back to the source code, edit it, and then
recompile. The new system eliminates this tedious process. It's an interactive
environment with a browser and the ability to build the game a little bit at a
time, testing each piece before moving on to the next.
The old AdvSys represented everything as 16-bit integers. A reference to an
object was simply a 16-bit offset into a table of objects. This made good use
of memory and fit well with my goal of running on small machines. However, it
caused some problems too. There was no way to distinguish a number from an
object reference and so it was impossible to build an automatic memory manager
into the run-time module. Consequently, AdvSys could not create objects at run
time. Every object you wanted to use in a game had to be declared at compile
time.
The new system, Drool, has an automatic memory manager with garbage
collection, and every value is represented by a 32-bit pointer. Numbers are
treated as a special case and are encoded into pointers; they are
distinguished from pointers by setting the low-order bit. Since Drool only
runs on byte-addressed machines, it is easy to guarantee that all addresses
fall on an even-byte boundary and, hence, have their low-order bit cleared.
Thus, any value with its low-order bit set is a number and every value with
its low-order bit cleared is an address. A number is converted to a value by
shifting it left one bit position and ORing the result with one. It is
converted back to a number by shifting the value right one bit.
Other types of objects are represented as pointers to objects in a heap. Each
object has a header that indicates its type, so the memory manager can
distinguish different types and can garbage-collect objects that are no longer
reachable. This makes it much easier to dynamically allocate objects and
ensure that the memory they occupy is freed when the objects are no longer in
use.
The complete source code to Drool is available electronically; see
"Availability," page 3.


The Language


In designing Drool, I wanted to have the system provide a collection of object
types that could be assembled into a simple game without doing any
programming. The problem with that approach is that it can lead to very
predictable games. Once you know the sorts of objects available in the
toolbox, you pretty much know what to expect when you encounter them in a
game. One way around this is to provide object attributes that you can mix
together in interesting combinations to create unique objects. That way, you
can invent a new type of object that would be different from any object in any
other game, still without having to do any programming.
Well, that's the theory anyway. To do that, I figured I needed to add multiple
inheritance to the language. AdvSys only supported single inheritance. In
fact, none of the tiny languages that I had designed supported multiple
inheritance and I'd only recently started to use it myself.
Example 1 is an object definition and the definition of a method to operate on
the object. First, we define an object called weapon with two properties:
weight and damage-points. Notice that unlike most object-oriented languages,
there are no classes in Drool (or AdvSys), only objects. Any object can act
like a class or like an object. After defining the weapon object, we define a
method that applies to the weapon object or any object that inherits from it.
This method is called damage. You send a message to an object by using an
expression like (sword 'damage), where sword is the object and the quoted
symbol after the object is the selector. This selector is used to select a
method for handling the message. In the case of objects that inherit from
weapon, the method we're describing is one that will apply. The getp function
fetches the value of a property of an object. The self variable refers to the
object receiving the message.
We then define another object, magical-weapon, with a single property bonus.
This object also has a method for the damage message. In this case, the method
is more complicated. The function (call-next-method) calls the next method
that applies to the message being sent.
Whenever you send a message, it's possible that more than one method might
apply. If there's a method for that message defined for the object itself,
that method will certainly apply. Also, any methods for that message that are
defined for objects that the receiving object inherits from, will apply.
With a single-inheritance system, method selection is fairly simple. The
most-specific method is the one that will be called. The most-specific method
is defined in the object closest to the object receiving the message in the
object hierarchy. When a method calls the function (call-next-method), the
next-most specific method is called. This can proceed back up the object
hierarchy until there are no more applicable methods.
With systems that support multiple inheritance, things are a bit more
complicated. Since an object can inherit from more than one other object,
there can be more than one applicable method at each level in the hierarchy.
Drool resolves this conflict by choosing the method from the left-most object
(and its ancestors) mentioned in the object definition before proceeding to
the object to its right. For example, the magic-sword object mentioned earlier
first inherits from magical-weapon and then from weapon. This means that when
the damage message is sent to magic-sword, the first method to be called is
the one defined for magical-weapon. Then, when the method for magical-weapon
calls call-next-method, the method defined for weapon is called.
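This left-to-right, depth-first lookup can be sketched as follows (a Python illustration built on my own toy object model, not Drool's actual dispatcher; whether Drool visits a shared ancestor only once is an assumption):

```python
# Sketch of Drool-style left-to-right, depth-first method lookup
# with call-next-method, on a tiny prototype-object model.
# All names are illustrative, not from the Drool source.

class Obj:
    def __init__(self, parents=(), methods=None):
        self.parents = list(parents)      # left-to-right precedence
        self.methods = dict(methods or {})

def applicable_methods(obj, selector, seen=None):
    # Most-specific first: the object's own method, then each
    # parent's chain, leftmost parent first; each object visited once.
    seen = set() if seen is None else seen
    found = []
    if id(obj) not in seen:
        seen.add(id(obj))
        if selector in obj.methods:
            found.append(obj.methods[selector])
        for parent in obj.parents:
            found.extend(applicable_methods(parent, selector, seen))
    return found

def send(obj, selector, *args):
    chain = applicable_methods(obj, selector)
    def call_next(i, *a):
        # call-next-method: invoke the next-most-specific method.
        return chain[i](obj, lambda *a2: call_next(i + 1, *a2), *a)
    return call_next(0, *args)

# weapon answers damage; magical-weapon adds its bonus on top.
weapon = Obj(methods={"damage": lambda self, next_method: 6})
magical_weapon = Obj(parents=[weapon],
                     methods={"damage":
                              lambda self, next_method: next_method() + 3})
magic_sword = Obj(parents=[magical_weapon, weapon])
```

Sending damage to magic_sword finds magical_weapon's method first (it is the leftmost parent), and that method's call-next-method reaches weapon's method, just as the article describes.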
It's probably fairly obvious by now that I'm using a Lisp-like syntax for
Drool. In fact, I've used a subset of Scheme, a simple dialect of Lisp, with
an object system added. For those of you not familiar with Scheme, the let
construct in Example 1 introduces and initializes local variables. In the
example, the let construct defines the variable damage and sets its initial
value to the result of calling the call-next-method function. That variable
will then be available for the duration of the body of the let form, in this
case, the expression (+ damage (getp self 'bonus)).
In addition to inheriting methods, an object also inherits properties. In the
case of the magic-sword object, it inherits the property bonus from
magical-weapon and the properties weight and damage-points from weapon. These
properties are what other object systems call "instance variables." Each
object has its own value for the property. Sometimes, it is handy to have a
group of objects share a property value. Drool allows this by providing shared
properties. When a new object is created, all of its normal property values
are copied from the objects it inherits from, but shared property values are
not copied; they are inherited like methods. Shared properties are defined
like normal properties except that you use the shared-property keyword instead
of the property keyword.
Along with objects and numbers, Drool also provides strings, lists, and
vectors as primitive data types. Table 1 shows the complete syntax for Drool.
Because of the automatic storage management, you can create new objects at run
time. You do this with the clone function. It creates and initializes a copy
of an object. To create a magic-sword, you might use the expression (clone
magic-sword 'weight 20 'damage-points 10 'bonus 8). This creates a copy of the
magic-sword object and sets the weight to 20, the damage-points to 10, and the
bonus to 8.
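A minimal sketch of clone's copy-on-create behavior (Python for illustration; the dict-based object model and names are mine, not Drool's implementation):

```python
# Sketch of a clone-style constructor for prototype objects.
# The dict-based model is illustrative, not Drool's actual code.

def clone(proto, overrides):
    # Normal property values are copied from the prototype, then
    # overridden; shared properties (not modeled here) would be
    # looked up through the parent link instead of being copied.
    new = {"parent": proto, "props": dict(proto["props"])}
    new["props"].update(overrides)
    return new

# The magic-sword prototype with default property values.
magic_sword = {"parent": None,
               "props": {"weight": 0, "damage-points": 0, "bonus": 0}}

# (clone magic-sword 'weight 20 'damage-points 10 'bonus 8)
sword = clone(magic_sword, {"weight": 20, "damage-points": 10, "bonus": 8})
```

Because the new object gets its own copy of each normal property, changing the clone's weight later would leave the prototype untouched.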


Implementation


I'm writing Drool for the Macintosh, and the current implementation includes
an incremental compiler that reads Drool source code and generates byte codes
in memory where an interpreter executes them. Memory is managed by a
stop-and-copy garbage collector. To make it easier to build the complex
networks of objects that make up an adventure game, I've provided an object
browser. Because the environment is interactive, you can design part of a game
and then test it before going back to designing the rest. When you're done, a
save workspace facility allows you to write a data file containing the game.
The saved workspace allows you to play the game without having the source
code.


What Now?


While the Drool language is fairly complete, I'm just beginning the
development environment. To make it easier to use, I'm planning on building a
facility for defining objects using templates instead of source code. Let's
face it, Lisp syntax isn't that easy to master; and any textual language
presents a barrier to nonprogrammers. The templates will allow a game designer
to create objects by combining preexisting objects like weapon and
magical-weapon using a direct manipulation interface. The template would
include fields for all of the inherited properties as well as any new
properties the game designer might want to add. At a higher level, objects
representing actors and locations in the game could be arranged to form the
game world and the behavior of objects at any level could be changed by adding
methods that apply to those specific objects. I've got a lot of work to do.
Language design was probably the easiest part, but I'm hoping that the
resulting system will make it easier to build interesting and challenging
adventure games.
 Example 1: Object definition and the definition of a method to operate on the
object
Table 1: Drool syntax.

Defining Forms
(define <variable> <initial-value>)
(define (<function-name> <formal-argument>...) <expression>...)
(defmethod (<object-name> '<selector> <formal-argument>...) <expression>...)
(defobject <object-name> (<parent-object>...) [<property-definition>] [<shared-property-definition>])

Property Definitions
(property [<property-symbol> <property-value>]...)
(shared-property [<property-symbol> <property-value>]...)

Special Forms
(let ([(<variable> <initial-value>)]...) <expression>...)
(if <predicate-expression> <then-expression> [<else-expression>])
(cond [(<condition-expression> <expressions>...)]...)
(while <looping-predicate> <expression>...)
(and <expression>...)
(or <expression>...)
(begin <expression>...)
(set! <variable> <value-expression>)

Built-in Functions
(not <expression>)
(+ <number> <number>)
(- <number> <number>)
(* <number> <number>)
(/ <number> <number>)
(rem <number> <number>)
(< <number> <number>)
(<= <number> <number>)
(= <number> <number>)
(/= <number> <number>)
(>= <number> <number>)
(> <number> <number>)
(getp <object> <property>)
(setp! <object> <property> <value>)
(cons <car-value> <cdr-value>)
(car <cons>)
(cdr <cons>)
(set-car! <cons> <car-value>)
(set-cdr! <cons> <cdr-value>)
(vector <vector-element>...)
(make-vector <size>)
(vector-size <vector>)
(vector-ref <vector> <index>)
(vector-set! <vector> <index> <value>)
(string <character-code>...)
(make-string <size>)
(number? <expression>)
(string? <expression>)
(symbol? <expression>)
(cons? <expression>)
(vector? <expression>)
(object? <expression>)
(method? <expression>)
(cmethod? <expression>)
(package? <expression>)
(null? <expression>)
(call-next-method)
(clone <object> [<property> <value>]...)
(print <value>...)

Primary Expressions
All numbers are 31-bit integers.
Symbols can be quoted by preceding them with a single quote character.
Strings are enclosed in double quotes.




























October, 1993
The Art of Product Launches


Guerrilla marketing tips from the experts




Diane McGary


Diane is a consultant with Niehaus Ryan Haller Public Relations in South San
Francisco, California. In addition to a BA from University of California at
Berkeley, Ms. McGary is currently pursuing a degree in computer science at
University of California at Santa Cruz. The information presented here was
adapted from a Software Entrepreneurs' Forum seminar held earlier this year.


Ten years ago this month, the Software Entrepreneurs' Forum (SEF) was founded
to help programmers succeed in the business of developing, publishing, and
marketing software. The non-profit SEF continues to provide programs and
services, including monthly technology and business SIGs, dinner meetings
featuring industry speakers, and all-day seminars on business practices and
new technologies. In honor of this decade of service to the software
community, we're sharing with you, in this and upcoming issues, some of the
invaluable information SEF members have been privy to for years. We hope it
helps you on the road to success, just as it has for hundreds of your fellow
programmers.
--editors
Launching a product is not unlike developing software. Good analysis and
design generally yield worthy products. It's the same with product launching.
If you identify requirements and carefully design the launch, then sales,
reseller deals, and even outside funding may soon materialize.
So you have a fantastic idea for a product and the development skills
necessary to bring it into being. What's next? Start with the marketplace
itself. Clearly you need to know if someone else has already developed your
breakthrough. Also critical is building technology to fit the market need. No
matter how tied you are to character interfaces, for example, if the people in
your niche use Windows, your product won't be well-received in the
marketplace. How can you find out what you need to know without draining your
401K?
Kris Olson of Olson & Company, a market research firm, recommends some
creative strategies for getting market statistics "on the cheap." Her avenues
include everything from combing publications such as Computer Reseller News
and PC World to accessing corporate libraries and market research through
friends at big companies. Other sources include Computer Select, a CD-ROM
information source, hardware vendors, trade show presentations and magazine
media kits. Olson reminds researchers to know their users. "We all tend to get
into the product and what it's doing, rather than what people are doing," she
says. Olson also recommends phone, mail and on-line surveys, in-person
interviews, do-it-yourself focus groups, and hiring inexpensive student
interns.
Once you look at the market and decide on the type of product you want to
build, it's time to "position" your product. Even if you developed your
product first, and then looked at the market, that's okay too. Just don't
forget the basic principle: tuning the product to the potential user.
Positioning is a process of defining the market for your product and
positioning your product within that market. This means looking at potential
competitors and developing verbal and even visual messages that will convey to
consumers what your product does. Girand Software Strategies' Laurie Girand,
an Apple Computer veteran with successful launches of 32-bit QuickDraw and
System 7 under her belt, says a good launch creates awareness about your
product and educates and excites customers.
Girand's first rule is to establish clear positioning. "Position the
technology, product, competitors, company, and industry," she said. "You can
do this at little cost through white papers, press releases, data sheets, and
review guides. The cheapest, most effective marketing tactic, however, is
word-of-mouth, through beta programs, product giveaways, e-mail, bulletin
boards, and key influencers."
Marketing guru Steve Koschmann of Aldus Inc. cautions, "While positioning
identifies a product's end benefit by explaining what the product does, that
only gets you to marketing 101. Most folks stop there." Koschmann advocates
taking positioning to the next step by identifying what he calls the end end
benefit, "That something that emotionally connects with a customer and makes
him want to buy your product." Nyquil cold medicine provides one of
marketing's most classic examples of ingenious positioning. Explains
Koschmann, "It's the nighttime cold medicine that relieves all your cold
symptoms so you can rest. That's the benefit. The end benefit is that it
knocks you out and you wake up feeling great!"
After finding your customers, you will probably want to do some test marketing
before betting the farm. Jeffrey Tarter, editor and publisher of SoftLetter,
recommends something he calls "The $10,000 Launch," a test-marketing plan that
scientifically pinpoints prospects, attracts orders, and creates the basis for
a business plan.
Tarter says direct-mail testing will help determine who your customer is,
whether your product is marketable, which features are desirable, what pricing
is most effective, even what "free offers" work best. Spending $10,000 on a
mailing to 10,000 prospects should bring a 1 percent sales return. For a $150
product, 1 percent amounts to $15,000, a $5,000 profit. According to Tarter,
"1 percent is the worst response you'll ever get because you can keep refining
targets as you accumulate more data." Tarter added that the 1 percent figure
scales up. "The point is to build a statistical model needed to attract
investors and resellers," he says.
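The arithmetic behind the plan, using the article's own figures:

```python
# Tarter's "$10,000 Launch" test-mailing arithmetic, using the
# figures quoted in the article.
mailing_cost = 10_000      # dollars spent on the mailing
prospects = 10_000         # pieces mailed
response_rate = 0.01       # the "worst case" 1 percent
unit_price = 150           # dollars per unit

orders = int(prospects * response_rate)   # 100 orders
revenue = orders * unit_price             # $15,000
profit = revenue - mailing_cost           # $5,000
```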
Another avenue that can generate a lot of attention, for a relatively small
investment, is public relations. The keys to PR are education and persistence.
The good news is that, on a small scale, you can do it yourself. Look at the
publications your prospective customers read, then find out which writers and
editors cover products like yours. Call them.
Ed Niehaus of Niehaus Ryan Haller Public Relations offers these PR basics.
"The most important thing you can do is get to know journalists as people, on
a one-to-one basis." Niehaus points out there are various types of media
coverage available, from product mentions to product roundups, to full-blown
reviews. As for your press kit, it should be complete. A good press kit might
include a cover letter, a press release with summary page, product data
sheet(s), a corporate backgrounder, analyst and customer quote sheets, and
artwork. Materials should highlight product benefits and explain how a product
fits into the marketplace. "Timing is critical," Niehaus says, "Talk to the
monthly publications first, then the weeklies, then the dailies. Although it
varies from publication to publication, some have lead times of four to five
months."
Another PR trick of the trade is to call publications and editors in order of
importance. Don't call the editor of your most critical publication
first--test drive your ideas on some publications you aren't too worried
about, get your message straight, then call your "A" list.
InfoWorld networking editor Vance McCarthy has some thoughts on voice mail:
"Be brief. Figure you will get 30 minutes total contact with us. That includes
faxes, demos, and your long voice mail." San Jose Mercury News computing
editor Laurie Flynn explains, "The biggest problem I run into is that the
developer doesn't understand my audience, which is consumer and small
business. I get a lot of calls for MIS products." The bottom line for all
journalists is get to know them, show respect, learn their preferences for
getting information, and be patient.
If you don't have experience in marketing, public relations, or running a
business, it will be immensely valuable to befriend someone who does. Learning
from others' expensive mistakes can save you lots of time and money. Former
MacUser publisher Martin Mazner, currently with Disc Info Systems, has battle
scars earned through six major product launches. He urges developers to
partner up with someone keenly familiar with marketing and business. "You need
someone to clarify the issues to get funding," says Mazner. He suggests
less-traditional ways to get into the sales channel. "Mail order houses,"
Mazner says, "are a growing sector of the distribution market and offer good
value for your ad dollar. And catalogs often include editorial content."
Mazner advocates using "Product Champions," key influential users often quoted
in the major trade publications.
Local user groups and developer organizations can also help you get on track
in marketing and business development. Kaye Caldwell, president of Software
Entrepreneurs Forum, says "It is difficult to network with colleagues when you
are working out of your house. SEF and other organizations like ours provide
meetings and special interest groups that not only educate, but they let
developers talk one-on-one with marketing, PR, legal and administrative
experts for advice on how to build a business."
Robert Benson, a San Francisco Bay Area marketing consultant, offers his own
entrepreneurial launch secrets. He suggests testing your sales presentation at
trade shows by pitching 50-100 people and continually refining your message.
"Show beta software, give out marketing literature and ask for orders," he
says. "Many people starting out don't know how important it is to ask for the
sale." As the launch date approaches, develop a library of ads, direct mail
pieces, and company profiles from competitors and companies selling
complementary products. Pick the best projects and emulate them. More
importantly, pay attention to what industry authors and analysts are saying.
Finally, monitor profits and cash flow, "and learn the difference if you don't
already know it," he adds.
Mail order and catalogs are fine, you say, but your dream is to see your
product on the shelf at the local software chain store, right next to Bill and
Phillipe's offerings. The world of the channel is tough, but Richard Miles of
Re:Launch offers some guerrilla tips. "If you're well-prepared and you know
what you're doing, you can get in, providing your product is good and there's
a market for it," according to Miles. His strategy involves developing an
outreach program where you contact resellers and send them a carefully
designed "Approach Package." It should contain a cover letter, a one-page
complete product information sheet, product reviews, ads, and an advertising
schedule. Miles suggested enclosing your product in a brightly colored box to
make it easily visible during your follow-up call. He also advises patience.
"You'll probably have to repeat this process several times because,
invariably, your package will be lost," he adds.
If you are successful in the U.S. and have a desire to sell software in other
countries, Ellen Elias of Elias International has this advice: "Avoid
launching overseas unless you've taken it very seriously," she explains. "If
you really just want to visit Europe, go on vacation. It's cheaper." If you
are serious, keep these additional challenges in mind: language barriers,
cultural differences, and shipping costs.
Elias recommends repeating US successes. "If Mac Warehouse works for you here,
try mail order overseas." She also advises finding distributors in each
country even if you're already carried by a major reseller. "You'll get no
mind share (from resellers). In Europe, distributors do create demand," Elias
says. "You can consider them your marketing department." Elias also advises
that distributors typically want extra points and tend to significantly mark
up the product, so be prepared.
Legal issues may be the last thing you ever thought of, but they are of
primary importance according to Susan Nycum of the international law firm
Baker & McKenzie. She advises developers on legally protecting software. Nycum
says it's critical to understand ownership rights, whether you are developing
for a company or having someone develop for you. "Consultants own the
copyright in any software developed by them unless rights have been signed
over," Nycum says. Beware also of ownership rules when using code downloaded
from BBSs.
"To protect software as a trade secret, it must be kept secret and disclosed
to third parties under agreement to keep it confidential," Nycum says.
"Although copyrighting only protects the expression of the algorithm,
including its look-and-feel, it is a good idea to declare copyrights and
register them with the U.S. Copyright Office. It costs $20.00 but entitles you
to damages and attorneys' fees if someone infringes after the registration."
Regarding patents, Nycum stresses the importance of signed nondisclosure
agreements while a patent is pending. And once the product is ready for
release, disclaimers of warranties are critical because they help limit
liability.























October, 1993
C++ Manipulators and Applicators


Simplifying the use of C++ I/O streams




Reginald B. Charney


Reg is president of Charney and Day Inc. and a member of ANSI's X3J16
Committee on the C++ language. He is also former chairman of The Canadian
Standards Association's Working Group on Program Design Standards (Z243.21).
Reg can be reached on CompuServe at 70272,3427.


Manipulators and applicators simplify the use of C++ I/O streams and add
elegance to input and output expressions. As a result, the logical connection
between functionality and use is more obvious. They also express complex
procedures more concisely and minimize name-space pollution. That is, names
chosen for manipulators and applicators can also be used for other things,
since function-name overloading will distinguish between uses of the same
name.
Although the terms manipulator and applicator are commonly used for both
functions and classes, this article distinguishes between the two uses. When
manipulators and applicators are used without qualification, the name applies
to both functions and classes.


The Problem


Using function qStr() and the iostream.h package, you'd expect Example 1(a) to
produce the output in Example 1(b). However, the output we actually get is
shown in Example 1(c). The reversed output and pointer value are caused by the
qStr() call in the output expression being evaluated before the chained
operator<< calls.
A compiler reduces all components of an expression to data and operator types.
Next, the user-defined operators of equal precedence are processed in order.
In Example 1(a), all the operators have the same precedence. So, only the
function call needs to be replaced by its return type before evaluating the
expression.
The solution to the reversed output problem is to make qStr() appear as a data
type. The compiler can then evaluate the similar expressions in the expected
order.


Manipulator and Applicator Functions


Manipulator functions modify their environment via side effects. Usually, they
take and return references to the streams in which they are used. Other
arguments normally remain unchanged. Refer to Examples 2(a) and 2(b).
An applicator function has two or more arguments, one of which is a function
pointer. The applicator invokes the function argument, passing the other
arguments to it. Its return type can be anything. Example 3 shows an
applicator function called af().


Manipulator Classes


Manipulator classes solve the problem of replacing a function call with a data
type. These classes encapsulate a manipulator function pointer with its
arguments and an applicator function that is recognized in the streams where
the manipulator class instances will be used.
Manipulator constructor arguments are stored and used later. The arguments
always include a manipulator function pointer. Friend operator functions
overload the stream insert and extract operators and allow the compiler to
recognize an instance of the manipulator class within a stream expression.
These operators are also applicator functions. That is, they invoke the stored
manipulator function using the stored arguments. Example 4 shows manipulator
class output. (Bjarne Stroustrup refers to manipulator class instances as
"function objects.")


Manipulator Interface Functions


In Example 5(a), a manipulator interface function is used to encapsulate the
manipulator function and its arguments into an instance of the manipulator
class. Example 5(b) shows the output.
With the manipulator functions, manipulator classes, and applicator functions
already defined, you're ready to output quoted strings. Example 6 specifies
manipulators explicitly. However, every C++ implementation has an iomanip.h
library header file that contains generic input and output manipulator
classes. Depending on the compiler, they are implemented as macros or as
templates. Example 7 uses templates to get the same results as Example 6.
The iomanip.h header depends on the iostream.h header file. Therefore, it must
appear after the inclusion of iostream.h. The names of the generic manipulator
classes vary with compiler, but most use the names SMANIP, OMANIP, IMANIP and
IOMANIP classes for ios, ostream, istream, and iostream stream classes,
respectively. By default, all compilers generate manipulator classes to handle
int and long manipulator function arguments. They also generate the prototypes
for the standard manipulator functions like setw().
If iomanip.h is implemented using macros and you wish to handle arguments of
types other than int and long, you must do two additional things. First,
ensure that the new type is a single token, like CP in Example 7. The macro
preprocessor uses this single token type name to generate other names. Second,
you must issue an IOMANIPdeclare statement to generate all the generic
manipulator classes for the new type. The new type can be a class. Example 8
uses a macro version of the generic manipulator functions.


Applicator Classes


An applicator class, which is an alternate way of specifying manipulator
interface functions, has one or more constructors and one or more
function-call operators. The applicator class constructors take one
argument--a pointer to a user-defined manipulator function. The pointer is
stored for later use.
The applicator function-call operator associates an applicator instance with a
set of arguments. The stored function pointer and arguments are used in the
manipulator class constructor to create and return an instance of its class.
Example 9 shows a typical applicator definition.
As with manipulators, the iomanip.h header file contains generic applicator
classes, usually called SAPP, IAPP, OAPP and IOAPP for the ios, istream,
ostream, and iostream classes.
Example 10 shows how an applicator class would be used in Example 7 with the
template version of the iomanip.h header.



Implementation Details


The parameterless manipulators endl, ends, flush, dec, hex, oct, and ws are
predefined in the iostream.h header file. The header iomanip.h defines the int
and long manipulators setw(int), setfill(int), and setprecision(int).
The declaration of manipulator interface functions can be placed in header
files while a separate file can contain their definitions.
Manipulators for classes derived from the I/O stream classes must be defined
carefully. They will also apply to the base-class streams unless there is a
non-member applicator defined for the manipulator and the derived class.


Conclusions


For this article, I used three compilers that support templates: Comeau
Computing's C++ v3.0, MetaWare's HIGH C/C++ v3.0, and Borland's C++ v3.1. Only
Comeau's C++ comes with a template version of iomanip.h. The other two
compilers use macro versions of this header file.
Manipulators and applicators are most often used with the I/O streams package.
However, their use can be extended to any type of class which has overloaded
operators and whose designers want elegance and clarity.


Acknowledgments


This article was written as part of my response to an article on C++ design by
Michael Schelkin of Stins Coman, Moscow, Russia, which appeared in the C Users
Group (U.K.) magazine, CVu. He was responding to my article on C++ design in
the C++ Journal (vol 2, no. 2).


References


Charney, R.B. "Data Attribute Notation, Part 1." C++ Journal (vol 2 , 1992).
Charney, R.B. "Data Attribute Notation, Part 2." C++ Journal (vol 2 , 1993).
Dewhurst S.C. and K.T. Stark. Programming in C++. Englewood Cliffs, NJ:
Prentice Hall, 1989.
Eckel, B. C++ Inside & Out. Berkeley, CA: Osborne McGraw-Hill, 1993.
Jensen Partners International, Topspeed C++ 3.02 Class Library Guide, 1991.
Microsoft C/C++ Version 7.0 Class Libraries User's Guide, 1991.
Stroustrup, B. The C++ Programming Language Second Edition. Reading, MA:
Addison-Wesley, 1991.
UNIX System V AT&T C++ Language System Release 2.0 Library Manual, 1989.
Zortech C++ Compiler 3.0 Function Reference Manual, 1991.

Example 1:
(a)
// quote argument string and then
// output it.
ostream& qStr(char* s)
{ return cout<<"'"<<s<<"'"; }
cout << "Output is a " <<
 qStr("string") << ".";
(b)
Output is a 'string'.
(c)
'string'Output is a 0x48f2.
Example 2:
(a)
typedef ostream OS; // an abbrev
// quote argument string and then
// output it.
OS& aMF(OS& os,char* s)
{ return os << "'" << s << "'"; }
aMF(cout, "string");
(b)
outputs on
cout the value:
'string'

Example 3:
typedef long (*FP)(int, int);
// af is applicator function
long af(FP f, int i, int j)
 { return (*f)(i,j); }
long sum(int i, int j)
 { return i+j; }
long dif(int i, int j)
 { return i-j; }
af(sum,2,3); // returns (long)5
af(dif,9,5); // returns (long)4
Example 4:
typedef ostream OS; // an abbrev
template<class T>
class OMC // output manip class
{
 typedef OS& (*MF)(OS&, T);
 MF mf; // manip function
 T a; // arg of type T
public:
 OMC(MF mmf,T aa)
 : mf(mmf), a(aa) { }
 friend OS& operator <<
 (OS& os, const OMC<T>& mc)
 { return (*mc.mf)(os,mc.a); }
};
Example 5:
(a)
// define manipulator interface
// for manipulator function aMF.
OMC<char*> aMI(char* s)
{ return OMC<char*>(aMF,s); }
cout << "Value "<< aMI("string");
(b)
Value 'string'
Example 6:
#include <iostream.h>
typedef ostream OS; // an abbrev
// qSTR - manip function
OS& qSTR(OS& os,char* s)
{ return os << "'" << s << "'"; }
// OMC - Output Manipulator Class
template<class T> class OMC {
 typedef OS& (*MF)(OS&, T);
 MF mf; // manipulator fcn
 T a; // arg of type T
public:
 OMC(MF mmf,T aa)
 : mf(mmf), a(aa) { }
 friend OS& operator <<
 (OS& os, const OMC<T>& mc)
 { return (*mc.mf)(os,mc.a); }
};
// qStr - manip interface
// for manip function qSTR
OMC<char*> qStr(char* s)
{ return OMC<char*>(qSTR,s); }
// sample output expression
cout << "Output is a " <<
 qStr("string") << "\n";
Example 7:
#include <iostream.h>
#include <iomanip.h>
typedef ostream OS; // an abbrev
typedef char* CP; // single token
// qSTR - manip function
OS& qSTR(OS& os,CP s)
{ return os << "'" << s << "'"; }
// qStr - manip interface
// for manip function qSTR
OMANIP<CP> qStr(CP s)
{ return OMANIP<CP>(qSTR,s); }
// sample output expression
cout << "Output is a "
 << qStr("string") << "\n";
Example 8:
#include <iostream.h>
#include <iomanip.h>
typedef ostream OS; // an abbrev
typedef char* CP; // single token
IOMANIPdeclare(CP);
// qSTR - manip function
OS& qSTR(OS& os,CP s)
{ return os << "'" << s << "'"; }
// qStr - manip interface
// for manipulator fcn qSTR
OMANIP(CP) qStr(CP s)
{ return OMANIP(CP)(qSTR,s); }
// sample output expression
cout << "Output is a "
 << qStr("string") << "\n";
Example 9:
typedef ostream OS; // an abbrev
template<class T> class OAC {
 typedef OS& (*MF)(OS&, T);
 MF mf;
public:
 OAC(MF mmf) : mf(mmf) { }
 OMC<T> operator()(T a)
 { return OMC<T>(mf,a); }
};
Example 10:
#include <iostream.h>
#include <iomanip.h>
typedef ostream OS; // an abbrev
typedef char* CP; // single token
// qSTR - manip function
OS& qSTR(OS& os,CP s)
{ return os << "'" << s << "'"; }
OAPP<CP> qStr = qSTR;
// sample output expression
cout << "Output is a "
 << qStr("string") << "\n";








October, 1993
Avoiding Microcontroller Processing Pile-ups


This project won the 68HC16 design contest




Eric McRae


Eric is an embedded-systems design engineer in Redmond, Washington. You can
contact him at 206-885-4107 or on CompuServe at 72223,1242.


The Motorola 68HC16 microcontroller is a 16-bit device which, among other
things, features queued communication for two integrated serial channels, a
general-purpose timer with two 16-bit counters, and support for digital-signal
processing. As with any powerful tool, however, having flexibility and
functionality like that of the HC16 can--if you're not careful--cause more
problems than it solves. In particular, a microcontroller that's capable of
running multiple independent periodic processes could produce erratic results
when several of the interrupts occur at more or less the same time.
In a general sense, this is the design problem I faced with an HC16-based
system which involved many time-based processing routines whose synchronous
execution was critical to safe operation. I had the option of using the very
flexible on-chip timer capabilities, but my previous real-time design
experience cautioned me to take another approach. I wanted to have explicit
control of processing in this system. I also wanted to be able to collect
operational data proving that the system was safely handling its processing
requirements.
The project itself was my entry into a 68HC16-based project design contest
sponsored by Motorola. To be reimbursed for the contest evaluation kit,
designers had to go through a set of evaluation exercises, of which the final
result was the creation of a 5-band audio-spectrum analyzer. I was impressed
that the HC16 had enough horsepower to process a 20 kHz audio sample stream
through five digital filters, display a bar graph output, and still have CPU
time to spare.


Designing a Wheelchair Controller


After seeing a wheelchair-bound person negotiate a ramped curb on a city
street, it seemed that a wheelchair controller would be a good application for
an HC16-based design. The person I saw was controlling the wheelchair with a
chin joystick. The interactions between the acceleration of the chair and his
body's inertial motion, along with the bump of the curb, produced erratic
movement. I was sure I could design something that would be easier to operate
than the equipment he was using. To me, the obvious approach was a
voice-activated wheelchair controller.
The controller I had envisioned would be capable of "learning" how a
particular operator wished to convey commands using arbitrary combinations of
voice, motion, and other inputs. Operators need only be able to reasonably
repeat their versions of commands. They shouldn't be required to give a
particular command in any otherwise predetermined fashion. This allows
ease-of-use by someone with a speech impediment, and makes the controller
independent of the spoken language. Voice and other analog command inputs are
processed utilizing the on-chip DSP features, while the 16-bit CPU controls
inputs from ultrasonic sensors and outputs to drive motors, LCD displays, and
audio communication devices.
Table 1 lists the command set the controller recognizes. The interpretation of
some of these commands depends on the state of the controller. For example,
none of the motion commands are enabled if the wheelchair is stopped. The
operator must first issue the Go command to precondition the controller to
accept subsequent motion commands. This prevents accidental wheelchair
movement.
The Faster and Slower commands can be issued repeatedly for additional
increase or decrease of speed. The directional commands (Left and Right) start
a graceful turning motion. If issued repeatedly, the turn rate will increase.
If issued when the wheelchair is stationary, but preconditioned by the Go and
About commands, Left and Right cause the wheelchair to rotate about itself.
The Stop command, if issued once, causes a graceful slowing to stop; if issued
twice in a row, it will cause a maximum deceleration stop. The Status command
reports controller status in a manner understandable by the operator. The
output mechanism for status can be aural (speech) and/or visual (display
panel). Status information includes estimations of battery reserves, distance
traveled, current time, diagnostic status, and other useful data.
To be safe, the speeds of the drive wheels are monitored and compared with
expected values based on motor drive current. External forces, loss of
traction, and component failure can all cause excessive differences. These
situations are handled by digital motor-control routines which attempt to keep
the drive wheels powered below the "break loose" threshold.
The design includes front and rear ultrasonic ranging. Control software
prevents the operator from hitting any detected obstacle. The anti-collision
processing considers speed so that a wheelchair going full-tilt towards a wall
will be slowed to a graceful stop right at the wall. A docking mode, assisted
by a special power receptacle, controls the fine maneuvering motions necessary
to insert the power plug into a receptacle. All operational features of the
system are monitored during use. Any detected malfunction causes the
controller to take appropriate action, including, but not limited to, stopping
motion and notifying the operator. Figure 1 shows a block diagram of the
control system.
The master command patterns used for comparison with incoming commands, and
the propulsion dynamics of the wheelchair-operator system are determined
empirically during an initial training period. The infrared communication
devices (shown at the top of Figure 1) allow the controller to communicate
with a training computer and technician to facilitate this process.


Control Processing


The control processing for this system requires running a Discrete Fourier
Transform (DFT) on the voice input. The spectral output is grouped into
frequency bands for use by the recognition algorithm. The prefiltered voice
signal is sampled and processed at 6 kHz. Audio output values are also
produced at 6 kHz. The strain and position inputs are examined at 10 Hz by
taking a group of conversions of the analog signal and averaging the results.
The ultrasonic transducers are fired at a rate proportional to the speed of
the wheelchair. The transit times of any echoes are computed and compared with
closure threshold values. Finally, the wheel encoder values are sampled at 20
Hz and the resulting values are processed into the digital control filters
which govern the dynamic motion of the wheelchair. The pulse width modulated
(PWM) control to the motors is adjusted at the same rate.
When I was laying out the processing algorithms, I was struck by two things.
There was a lot of real-time processing to be done, and the processing
routines were mostly independent of each other. I was pretty sure the HC16
could handle the average load. Its built-in hardware helped in that
department. For example, the analog inputs, the PWM outputs, and the audio
output were all handled directly by hardware in the microcontroller. There are
enough independent timer interrupts available in the HC16 to allow most of the
processing to proceed in assorted timer-interrupt handlers. I was concerned,
however, that a system with several independent synchronous processes running
at different frequencies might create problems that could be extremely
difficult to debug. The hardware-based interrupts just didn't leave enough of
a debug trail for me.
The best approach seemed to be to force all the processing to be controlled
from one routine. This is accomplished by using the basic 6 kHz voice input
interrupt service routine as the base dispatch handler. This handler is
re-entrant as shown in Figure 2 and its code is listed in Listing One (page
92). After the basic DFT processing is done, a series of counters are
decremented and examined. If none of the counters reach 0, the interrupt
handler returns. However, if a counter does hit 0, dispatch code is executed.
The dispatch code does an uninterruptable check and set of a flag. If the flag
was already set, a processing overrun has been detected. Otherwise, the
processing handler is called via the counter index. When that routine returns,
interrupts are disabled, the flag is reset, and the counter is reloaded by
adding its reload value (300 for 20 Hz routines) to its current value (which
may have been decremented below 0 by subsequent passes through the 6 kHz
handler). The result is placed back into the counter and interrupts are
enabled. The now somewhat old instance of the 6 kHz interrupt handler exits.
At start-up, the counters for the various processing handlers are loaded with
values that keep the handlers widely separated in time. See Figure 3 for a
diagram of this. These steps assure that the handlers are always invoked with
an unvarying period.
There are several advantages to this approach. The negative counter values
read during the counter reload indicate how many 6 kHz "ticks" the handler
consumed. If you have some idea of the maximum processing time required by
each of your handlers (don't forget to include 6 kHz interrupt processing),
you can convince yourself that your handlers won't ever overrun each other.
This assurance would be elusive if all processing were handled by separate
interrupt handlers.


Conclusion


I'm happy to say that the wheelchair controller described here won first place
in last year's Motorola-sponsored M68HC16 design contest. (Other winners
included Eric Becks of Cheboygan, MI for his EKG recorder design; Larry Korba
of Ottawa, Ontario for his design for a DSP metal detector; Tom Schmit of
Syosset, NY with a single chip modem/controller for a high frequency radio;
and Tom Seim of Kennewick, Washington for his design utilizing the 68HC16 in
pattern recognition and control using neural networks). More importantly, I
hope that systems like this will make life easier for wheelchair-bound people
in the future.
Table 1: Command set.
Command Name Explanation
Go Enables subsequent motion commands, if currently stopped.
Right Turn right.
Left Turn left.
Forward Stop turning and/or proceed straight ahead.
Stop Decelerate and stop all motion.
Faster Speed up.
Slower Slow down.
Back Stop turning and/or proceed straight back.
About Enables subsequent direction for rotate right or rotate left.

Dock Enables "Dock for Charging" mode.
Park Same as Stop, but actively maintain stopped position.
Status Give status report.
 Figure 1: System Block Diagram
 Figure 2: Dispatch Processing
 Figure 3: Control processing during a 0.1 second interval
_AVOIDING MICROCONTROLLER PROCESSING PILEUPS_
by Eric McRae

[LISTING ONE]

/* 6 kHz interrupt handler. Called from timer ISR. Makes use of INTSOFF
** and INTSON macros which disable and enable interrupts. */
void
sixkhz()
{
 int hit, i;
 do6kdft(); /* Handle voice DFT processing */
 hit = -1; /* Assume no counter hits zero */

 for( i = 0; i < MAXCTRS; i++ ) /* Decrement each counter */
 {
 if( ! counter[i]-- )
 { /* if this counter hit 0 */
 if( hit != -1 ) /* If another counter already expired */
 { /* Oh no! Processing pile-up */
 }
 else hit = i; /* Else save index of ctr hitting 0 */
 }
 } /* end of for each counter */
 if( hit != -1 )
 { /* If a counter did hit zero */
 INTSOFF; /* Disable interrupts */
 if( busyN++ ) /* If already running a handler */
 { /* Overrun error if we get here (busy was already set) */
 }
 INTSON; /* Enable interrupts */
 (do_handler[hit])(); /* Invoke processing handler */
 INTSOFF; /* Disable interrupts */
 counter[hit] += resetval[hit]; /* Reload ctr considering */
 /* time already used */
 busyN = 0; /* Clear busy flag */
 INTSON; /* Enable interrupts */
 } /* End of if counter expired */
}

















October, 1993
Networking with Perl


Perl scripts can simplify network communication




Oliver Sharp


Oliver is a graduate student at the University of California, Berkeley, doing
research into parallel programming environments. He can be reached at
oliver@cs.berkeley.edu.


Wary of becoming entangled, many programmers never try to write networked
applications. While connecting computers together can be difficult and
complex, you don't necessarily have to master the alphabet soup of standards
and the wide array of specialized hardware just to get started writing
programs that work over networks. There are software interfaces that hide many
of these details from you.
Although no single interface is supported everywhere, the one that's almost
universally available under UNIX is Berkeley sockets. Perl, a language
designed to handle a wide variety of system administration tasks, makes
handling the socket protocol easier still. This article shows how you can
write Perl scripts that communicate across networks of UNIX machines. For
details on Perl itself, refer to the accompanying textbox entitled, "Perl
Fundamentals."
The Berkeley socket protocol was developed to allow communication between
networked computers. After examining sockets in a general way, I'll present a
Perl application that takes advantage of them. The application, called
"PostIt," allows users on different machines to leave notes for each other,
tagged by a keyword. Of course, I could have written PostIt in C, but Perl
simplifies the socket interface, making the code shorter and easier to
understand. Also, since the names of the socket routines are the same, you can
later scale up to C without much difficulty.


Sockets


A socket is an abstraction of the communication link between two machines. The
easiest way to understand it is to think of it as an extension of a UNIX pipe.
Through a socket, two processes can communicate with each other, whether or
not they are on the same computer. The socket is a two-way link, so each
process can read or write on it just the way that it would use a file
descriptor. In fact, to the standard I/O library, sockets look like file
descriptors and you can pass a socket to any library routine that expects a
descriptor (such as read and write).
Generally, the easiest way to use sockets is to set them up in stream mode,
where they'll act like you'd expect: If you send two messages, they're
guaranteed to arrive in the same order that you sent them. PostIt uses stream
mode because such guarantees make the programmer's life much simpler.
There are, however, alternatives to stream mode. Datagram mode, for example,
models the way the underlying network acts: You send data back and forth in
discrete packets of some particular size. If you want to send more information
than fits in one packet, you divide it up. Packets can arrive in any order and
they can also get lost, often requiring a protocol layer on top of the socket
interface. The advantage to datagrams is that they are more efficient, since
fewer layers of software lie between you and the network; as always, higher
levels of abstraction impose a cost.


PostIt


There are two parts to the PostIt program: a server sitting on a designated
machine that accepts commands, and a client program invoked by the user to
send the commands. Once up and running, the server waits for clients to call
it up with commands, of which there are three kinds: set, get, and die. The
server keeps track of messages, each of which has an associated key word. A
set has a tag and a string value, telling the server to associate the string
with that tag. A get has a tag, and the server returns the string (if any)
associated with that tag. For a get with the tag alltags, the server sends
back a list of all the tags that have information associated with them. The
die command tells the server to terminate itself. Figure 1 shows a simple
dialog using PostIt.
The first thing that happens is that somebody runs the server. In Figure 1,
the third machine runs the server and fields queries from the others. Real
servers (such as the ftp and finger daemons) are, of course, started
automatically when the system boots up; for now, however, I'll start PostIt by
hand.
The next step is that Joe on Machine #2 asks if anyone has left a message
under the tag lunch; the system responds that there's no message. Joe wants to
tell anyone who's interested that he is at the Sandwich House, and leaves a
message. The next person to come along is Mary on Machine #1, who looks to see
what messages are available. She sees the tag lunch, gets the message, and
decides to join Joe. She changes the lunch message to let the rest of their
group know where to go. There is one message per tag, so if somebody sets a
new one, it replaces the old. Since the example is simple, there's no message
protection, message history, or any of the other elaborations you'd want in a
real messaging program.


PostIt Implementation


Perl and C syntax are quite similar. I tried to avoid using the more exotic
features of Perl in PostIt to keep the code straightforward for C programmers.
The server uses a simple strategy for storing data: It creates a file for each
PostIt note in the directory where the server was invoked. The name of the
file is the tag for that note. The server sits in a loop, waiting for clients
to get in touch with it and either leave messages or request them.
Listing One (page 117) is the server code. The first line tells the UNIX shell
that this is a script and should be run by handing it to Perl. The script
starts by stashing away its process ID in the variable $parent_pid. (All
scalar variables in Perl start with a dollar sign.) The variable $$ is a
built-in Perl variable that contains the process ID--Perl has many of these
variables, and they are useful (if rather cryptic) shortcuts.
The next two lines check if the user specified a port number when invoking the
server. If so, I set the variable $port to that number; otherwise, I use 2001.
Port numbers are a simple way for two processes to get in touch with each
other, somewhat like a phone number. If the server process is running on a
machine called "green," it tells green that it will handle any requests to a
given port (2001, say). Any client, whether it is on green or not, can "dial"
to machine green, port 2001, and the system will notify the server that
somebody called.
The problem with port numbers is that you don't know which are available. A
variety of services already use the lower numbers; 21, for example, is the
port used by the file-transfer utility ftp. You can request a specific port,
or you can ask the system to pick any available one (by asking for port number
0). If the server doesn't use the default value, the user will have to specify
the port number to the client. Just as with a phone, if you don't know the
right number, you can't get in touch with the server you are looking for.
The next line tells Perl that if an interrupt signal comes in, it should call
the subroutine suicide (which appears at the bottom of the script), and close
the socket. It is important to close sockets, because they won't be shut down
when a process exits. Some systems have a time-out, but many versions of UNIX
won't recover the socket until a system reboot.
Next, set up some variables that will be used in calling the socket interface
routines. The first is $family, which is set to 2 to indicate that I want
Internet protocols. There are several different protocol families, including
the Xerox NS and internal UNIX protocols. Since I'm going to be communicating
via TCP/IP to other UNIX machines, I'll stick to the Internet protocol.
The next variable, $sock_type, is set to 1 for stream mode. The last variable,
$sockaddr, is a character encoding of some network ID information; unless you
are doing something fancy, you can usually stick to this standard value.
To call socket routines, start by calling getprotobyname, which takes the name
of a protocol and returns three identification keys used by other socket
routines. Perl's string parsing comes in handy, letting you separate the
protocol information into three variables and then recombine with pack. Before
actually creating the socket, I tell Perl to set up a stream called NEW_S that
flushes on every input and output. Otherwise, the socket buffering causes
problems; in a conversation, the processes want to send a message, wait for a
response, and so on. Without automatic flushing, a sent message may sit in a
socket buffer because the system doesn't think it is long enough to be worth
sending yet. The select command sets a stream to be the current one, and $| is
a special variable that controls whether flushing is done automatically.
Next, the call to socket creates a disembodied socket--it has buffering set
up, but isn't connected to anything. The call to bind gives the system some
more information about how the socket will be used. Now you're ready to do
something with the socket, depending on whether you'll be calling somebody or
they'll be calling you. In the case of a server, you want to wait for incoming
calls, so you use listen. The second argument tells the system to allocate
enough space for five processes to wait to get in touch with you.
With the socket setup finished, you can get down to the business of being a
server--sitting in a loop, waiting for clients to connect. Each call to accept
returns when somebody calls, creating a new socket (NEW_S) for that particular
connection. The socket library makes extra sockets to allow servers to be more
responsive. A simple strategy would be to handle incoming clients in order,
forcing everyone to wait until the socket was free. Instead, the original
socket is only used to make the connection. Once a client attaches to it, a
new socket is created for that conversation and will be closed after it is
over.
PostIt uses the typical server strategy of spawning a child process to handle
each conversation; that leaves the parent server free to handle the next
client who comes along. Now you see the point of the argument to listen--it
tells the socket library how many clients should be able to wait until the
server accepts them. Choose a number based on the number of requests you
expect clients to make and the delay between calls to accept. If the line for
server access is too long, the next client will be turned away.
The call to fork creates a child process which looks just like the parent and
has the same streams, variables, and so forth. The two processes can figure
out which they are by looking at the return value from fork. The one that gets
a 0 is the child and is responsible for handling the client. The parent gets a
nonzero return value; since it doesn't need to talk to the client, it closes
the temporary socket NEW_S and loops, calling accept to get the next client.
The child reads the first line from the socket into $command and uses the Perl
command split to peel off the instruction and the tag. PostIt can handle get,
set, and die; anything else is ignored.
To shut down, PostIt uses the Perl kill command to send the parent the UNIX
signal SIGINT. Perl was told earlier to handle that signal in the routine
suicide, which closes the main server socket if it is open, prints out a death
message, and exits.
To handle a get, the program first checks if it has the special tag alltags.
If so, just send the names of all the files in the server's directory. One way
to do this in UNIX is with the command `echo *`; by putting it in backquotes,
you tell Perl to execute the command and replace it on the line with its
output. When print is given a stream as its first argument (NEW_S in this
case), the other argument strings are written to that stream. That's all it
takes to send the list of files back to the client. If you're asked about some
other tag, use the UNIX cat command to send the file. If a file doesn't exist,
nothing is sent.
A set is equally simple. You open a file with the value of $tag as its name,
write the message from the client, and close it. Return ok to the client if
you succeed, nope if you fail.
On the client side of the connection, PostIt uses a standard UNIX trick:
because most of the work is the same regardless of the command you send,
PostIt uses the same source code for all three commands. The script can be
invoked in one of three ways: getinfo <info-tag> [server-machine server-port];
setinfo <info-tag> <value> [server-machine server-port]; and killserver
[server-machine server-port]. Under UNIX, you can save some disk space by
having three separate names for the same file (using links). Alternatively,
you can just make three copies.
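A minimal sketch of the link arrangement (the file created here is a stand-in for the real client script; the names match the three invocations above):

```shell
# Work in a scratch directory. The client decides what to do from $0 --
# the name it was invoked under -- so all three commands can be hard
# links to a single file on disk.
cd "$(mktemp -d)"
printf '#! /usr/local/bin/perl\n' > getinfo   # stand-in for the real client
chmod +x getinfo
ln getinfo setinfo       # second name, same file
ln getinfo killserver    # third name, same file
ls -li getinfo setinfo killserver             # all three share one inode
```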
The first version of the client asks the server for a message with the given
tag; the user can optionally specify a machine and a port to connect to.
Remember that a port is like a telephone number within a given machine. To
connect to the server, the client "rings" on that number; if a server wants to
answer requests, it will have called accept and a connection will be set up.
If no server information is given, we will use the default machine
(master.euphoria.edu) and port number (2001). The second kind of invocation
sets up a note with the given tag and value, again optionally specifying the
server information. The third one tells the server to close up and exit.
First, the client parses the arguments. The variable $0 contains the name used
to invoke the script, so PostIt uses it to figure out whether to get, set, or
die. If it's not just killing the server, you get the tag argument using the
Perl command shift, which takes an array and returns the first element,
removing it and shifting the rest down. If no array is specified, shift uses
the arguments to the script as its default.

For setinfo, you get the contents of the note and make sure that the user
isn't trying to assign a value to the special tag alltags. Having read the
required arguments (the tag if we are getting info; both tag and value if we
are setting), you can check for server information. Put the next two
arguments, if they exist, into $machine and $port. Use the defaults if you
didn't get anything.
The next several lines are similar to the server, though the arguments to the
socket functions are a bit different. The main difference is that you call
connect, specifying the host and port number of the server. Remember that in
the server you called listen, to tell the system that you were waiting for
clients to call us. The connect call to a listening socket creates a
connection. Another difference is that the client uses its original socket to
communicate with, unlike the server, which got a new one for each connection
from accept.
Once connected, PostIt sets the socket to flush I/O and sends the message. For
a setinfo, send the set request, and see if you get back an OK. To get info,
send the get request and wait for a response. If you get back an empty string,
there wasn't a note with that name, so print out a message. Otherwise, print
out the note's contents. Once you have gotten the response from the server,
you're done, so close the socket and exit.


Conclusion


For more complete discussions of UNIX network programming issues, I recommend
UNIX Network Programming, by W. Richard Stevens (Prentice-Hall, 1990) and
Internetworking with TCP/IP, by Douglas Comer (Prentice-Hall, 1988).
For more information on Perl, turn to Programming Perl, by Larry Wall and
Randal L. Schwartz (O'Reilly and Associates, 1990). Larry invented the
language, so this book is the authoritative word. In fact, like many Perl
programs, the skeleton of PostIt comes from an example in the book. Both the
authors and many others are active participants in the Usenet group,
comp.lang.perl, where you can get exhaustive answers to almost any conceivable
question about Perl.


Perl Fundamentals


Perl, developed by Larry Wall, is a tool for solving all the irksome,
automatable file-management tasks that bedevil the system manager and computer
user. I've come to depend on some of the Perl scripts I've written; one
reformats BibTeX bibliography entries to print out on 7x2 sheets of labels.
Another connects to every machine in our local subnet and lists, with their
location, the people who have been active within the past hour. A useful
script (called "zap") that appears in Programming Perl lists currently
executing processes that match a search criterion and lets the user kill them
if desired.
Experienced UNIX users know that many tools are available for handling these
kinds of tasks: grep for searching files, awk for scanning and modifying
files, various shell languages for writing scripts, and so on. Perl can work
with these tools, but it really subsumes many of them. Because Perl scripts
are precompiled at startup time, they are much more efficient than shell
scripts. This is particularly noticeable if your script does any computation;
unlike shell scripts, Perl has built-in computational operators and does not
need to rely on external programs (such as test).
The philosophy behind Perl is not minimalist--Larry Wall was trying to build a
language that makes it easy to solve problems with a minimum of fuss. Perl
syntax is based on C syntax, with many additions and modifications. You don't
absolutely need to know C first, but you will learn Perl much more quickly if
you do.
Perl provides a number of useful features beyond those available in C; one of
the best is associative arrays. An associative array is like a normal array,
except that it is indexed with a string. Suppose you want to count the number
of times that a word appears in a file. If you are reading the words, keeping
them in a variable called $word, you can count them using an associative array
called %count, like this: $count{$word}++;. The curly braces tell Perl that
you are using an associative array; it uses the contents of the $word variable
as an index. Since you are applying a numerical operation to the array
location (that is, increment), Perl knows that this array contains numbers. If
there has not yet been any reference to the location, Perl initializes one,
gives it an initial value of 0, and then increments it. If that key has been
used before and the array already has a location, its contents are
incremented. By taking care of all these issues for you, Perl lets you write
very compact code that does a lot of work.
Perl is available for almost every computing platform in common use, but it
can't communicate over networks unless your operating system supports Berkeley
sockets. Perl can still be used for file manipulation under MS-DOS, MacOS,
AmigaDOS, and other systems that do not natively support socket-based
networking.
--O.S.
 Figure 1: Sample use of PostIt on a network.
_NETWORKING WITH PERL_
by Oliver Sharp

[LISTING ONE]

#! /usr/local/bin/perl
#
# Usage: PostIt [port-number]
#
# This sets up a server, which sits around waiting for requests. There are
# three kinds:
# "set <tag> <value>" - stash this away
# "get <tag>" - get a value associated with tag, or return the
# list of tags if asked for "alltags"
# "die" - commit suicide
#
# If we get a SIGINT signal, close the socket and exit.

$parent_pid = $$; # stash our pid so we can be killed by a child
($port) = @ARGV; # see if we have a port argument
$port = 2001 unless $port; # if not, use 2001

$SIG{'INT'} = 'suicide'; # route SIGINT signal to subroutine suicide

$family = 2; # set up some protocol parameters
$sock_type = 1;
$sockaddr = 'S n a4 x8';
($name,$aliases,$proto) = getprotobyname('tcp');
$me = pack($sockaddr, $family, $port, "\0\0\0\0");

# make the socket, bind it to the protocol, and tell system to start listening
socket(S, $family, $sock_type, $proto) || die "socket: $!\n";
bind(S,$me) || die "Tried to bind socket, got: $!\n";
listen(S,5) || die "Tried to listen, got: $!\n";

select(NEW_S); $| = 1; select(STDOUT); # set auto-flush mode for sockets
select(S); $| = 1; select(STDOUT);


for (;;) {
 ($addr = accept(NEW_S,S)) || die $!; # wait for incoming request
 if (($id = fork()) == 0) { # fork a child to handle request
 $command = <NEW_S>;
 ($whattodo,$tag,$rest) = split(' ',$command,3);
 chop($rest);
 if ($whattodo eq 'get') {
 if ($tag eq 'alltags') {
 print NEW_S `echo *`;
 }
 else {
 print NEW_S `cat $tag`;
 }
 }
 elsif ($whattodo eq 'set') {
 if (open (TAG,">$tag")) {
 print TAG "$rest\n";
 close(TAG);
 print NEW_S "ok\n";
 }
 else {
 print NEW_S "nope\n";
 }
 }
 elsif ($whattodo eq 'die') {
 print "got a kill\n";
 kill 'SIGINT',$parent_pid;
 }
 else {
 print "got unknown request $whattodo";
 }
 close(NEW_S);
 exit;
 }
 close(NEW_S);
}

# when a SIGINT signal comes in, close the socket and exit
sub suicide {
 close S if S;
 print "Suiciding now\n";
 exit;
}


[LISTING TWO]

#! /usr/local/bin/perl
#
# Usage: getinfo <info-tag> [server-machine server-port]
# setinfo <info-tag> <value> [server-machine server-port]
# killserver [server-machine server-port]
#
# This script tries to contact an info server on the given machine and port.
# If the latter aren't specified, it uses defaults. If no server is
# found, it complains. Otherwise, it either gets info about the specified
# tag (if invoked as "getinfo"), sets it (if invoked as "setinfo"), or kills
# the server (if invoked as "killserver"). For getinfo, it returns whatever
# info the server returns. The magic info-tag "alltags" returns the list
# of tags that the server has information about. You aren't allowed to
# setinfo the word "alltags".

if ($0 ne 'killserver') { # if we are getting or setting info, grab tag
 $tag = shift;
}
if ($0 eq 'setinfo') { # get value, if we are doing setinfo
 $value = shift;
 die "That's a magic tag ..." if ($value eq 'alltags');
}

($machine,$port) = @ARGV; # get info about server, if specified
$machine = "master.euphoria.edu" unless $machine;
$port = 2001 unless $port;

$family = 2; # set up protocol parameters
$sock_type = 1;
$sockaddr = 'S n a4 x8';

chop($hostname = `hostname`);
($name,$aliases,$proto) = getprotobyname('tcp');
($name,$aliases,$type,$len,$myaddr) = gethostbyname($hostname);
($name,$aliases,$type,$len,$serveaddr) = gethostbyname($machine);
$me = pack($sockaddr, $family, 0, $myaddr);
$server = pack($sockaddr, $family, $port, $serveaddr);

# create socket, bind to protocol, and try to connect to server
socket(S, $family, $sock_type, $proto) || die "Failed to make socket\n";
bind(S,$me) || die "Failed to bind socket\n";
connect(S,$server) || die "Failed to connect to $machine\n";

select(S); $| = 1; select(STDOUT); # set socket to autoflush

if ($0 eq 'setinfo') {
 print S "set ",$tag," ",$value,"\n";
 $result = <S>;
 if ($result eq "ok\n") {
 print "Succeeded\n";
 }
 else {
 print "Failed\n";
 }
}
elsif ($0 eq 'getinfo') {
 print S "get ",$tag,"\n";
 $result = <S>;
 if ($result eq "") {
 print "Sorry, no info available about $tag\n";
 }
 else {
 print $result;
 }
}
elsif ($0 eq 'killserver') {
 print S "die";
}
else {
 die "I was invoked with a strange name: $0\n";

}

close(S);


October, 1993
Comparing Object-oriented Languages


The familiar linked-list is a good yardstick for comparison




Michael Floyd


Mike is executive editor for DDJ and can be contacted through the DDJ offices
or on CompuServe at 76703,4057.


While many programmers proclaim C++ to be the object-oriented language of the
'90s, others find the quest for the perfect language to be elusive. In fact,
deciding on the right language can be downright confusing. "Pure" OO languages
like Smalltalk include a rich set of reusable classes, while hybrids such as
C++ force you to roll your own. Languages such as Eiffel promote (or impose)
their notion of software correctness, which is supported by the syntax of the
language. Some languages are compiled, others interpreted, and a couple even
provide both. Additionally, many languages are adding support for concurrency
and multithreading. And while most languages are general purpose, others like
Drool are context-specific. In short, there's an endless choice of options and
the right language largely depends upon your specific needs.
To aid in your search, DDJ invited a number of programmers to implement a
simple linked-list class in their favorite language. The linked-list was
chosen because it's widely used and understood, making it easier to compare
how different languages approach a given problem. This, in effect, allows the
code to speak for itself. To this end, you can examine and compare approaches
in C++, Smalltalk, Eiffel, Sather, Objective-C, Parasol, Beta, Turbo Pascal,
C+@, Liana, Ada, and, yes, even Drool.


Specifications and Requirements


Because most traditional compiled languages bind data elements at compile
time, linked-lists are usually homogeneous; that is, a given list can only
store elements of a single data type. Consequently, programmers are constantly
rewriting linked-list code to support new and different data types, and
applications must duplicate code that manages these lists. Object-oriented
languages provide run-time binding (often called "late" or "dynamic binding")
which allows the binding of data elements to be deferred. This feature, which
allows the creation of a heterogeneous list, lets you store different object
types.
We asked our participants to create a double-ended linked-list class that's
capable of handling multiple object types. The class needed to include methods
for creating a new list, adding a list element to the end of the list, and
printing individual list elements. Of course, a full-featured linked-list
class would require methods to get the next and previous elements, get the
head and tail of the list, append, delete, count elements in the list, check
for an empty list, as well as iterators over the elements in the list. DDJ
contributing editor David Betz supplied the C++ version (see Listing One, page
118), which served as an admittedly less-than-scientific specification. We
asked our various programmers to duplicate the functionality of Listing One,
although not necessarily the program structure or even variable names. This
allowed them to approach the linked-list problem in the manner natural to each
language.
Since linked-lists are well-known and widely used dynamic data structures, I
won't belabor Listing One's code; however, I will point out that Listing One
declares two object types, MyNumber and MyPoint. main() constructs instances
of both objects and stores them in their respective lists--numbers are stored
in list1 and points are placed in list2. To demonstrate that the lists are
heterogeneous, main() also places points in list1, numbers in list2, and
finishes off by storing list1 in list2. Finally, the lists are printed and the
program terminates.


What to Look For


It was up to the individual programmer to implement the list-manager code
which, of course, depends on the facilities available in a particular
language. For example, Smalltalk includes an OrderedCollection class that
provides most of the functionality required for this project. Therefore, the
Smalltalk version in Listing Seven, page 122, merely subclasses from class
OrderedCollection, and later creates a list instance, instantiates the
objects, and adds them to the list. In most languages, however, containers are
not a standard part of the language. This is reflected in the number of lines
of code for each language.
Table 1 illustrates this point by summarizing the number of lines of code each
language used in the linked-list project. It's important to keep in mind that
the line count is a deceptive number since it doesn't reflect the underlying
mechanisms being executed. (Not to mention that in most cases blank lines have
been deleted simply to conserve space.) Facilities such as a garbage
collector, exception handlers, built-in support for collections and containers
(as noted above), and inheritance mechanisms all affect the line count. In the
case of automatic garbage collection, for example, the compiler allocates and
manages memory, which frees the programmer from worries such as orphaned
pointers, memory leaks, and the like. It also results in slightly less source
code since constructors (if any) are reduced to initializers and destructors
go away.
Exception handlers beef up the source code a little, but the gain is worth the
pain. Exceptions provide a way of dealing with a failed execution of an
instruction or routine. Without an exception handler, the failure may cause
the system to terminate unpredictably. In Table 1, note that, while exceptions
are provided for in Stroustrup's The Annotated C++ Reference Manual (often
referred to as the ARM), they're still listed as "experimental" and many C++
vendors have yet to implement them.
The most dramatic effect on line count, however, is the syntax of the
language. For example, the C++ version makes use of initializers in class
declarations: MyListElement() includes {data = initialData;} all on a single
line within the declaration. However, the Turbo Pascal version I wrote
(Listing Three, page 119) initializes data within its MyListElement Init
constructor. The constructor requires a minimum of four lines of code. (Note
that source code compactness does not translate into fewer instructions.) I
could cite other examples, but you get the idea.


Programming Notes


Implementing the linked-list code in Smalltalk can be done in either of two
ways. The first approach is to conduct a straightforward translation of the
C++ version into Smalltalk. However, an experienced Smalltalk programmer might
choose an alternative approach, which is to do the minimum necessary to
satisfy the specification; this can be done in four lines of code. The test
program can be satisfied by a
protocol consisting of three methods: Create a list, add an element to a list,
and (recursively) print the elements of a list. Most of this behavior is
already available in class OrderedCollection, one of the standard classes in
Smalltalk/V. Listing Seven, page 122, which was provided by DDJ senior
technical editor Ray Valdes, shows how to subclass OrderedCollection and
implement the modicum of additional behavior (the printOn: method). Of course,
in a real-world setting, a linked-list class would require a richer protocol.
This would require additional code, as would be the case in the other
languages, and would probably require building on top of the from-scratch
version shown in Listing Eight, page 122, rather than subclassing
OrderedCollection.
Also, notice in Table 1 that we've included Ada in the linked-list project.
While not object oriented, Ada is still considered to be object based. Because
Ada currently doesn't support run-time binding, List_Example in Listing
Thirteen (page 124) can only handle homogeneous lists. According to Mike Ruf at
Rational Systems, there are ways to simulate run-time polymorphism in Ada83
(see comments in Listing Thirteen). It's also worth noting that the next Ada
standard is expected to support single inheritance and run-time polymorphism.


Conclusion


As you browse the listings you'll find Robert Jervis's Parasol implementation
(Listing Four, page 120), Jim Fleming's C+@ implementation (Listing Five, page
120), a Liana implementation by Jack Krupansky of Base Technology (Listing
Six, page 121), an Eiffel implementation by Robert Howard of Tower Technology
(Listing Nine, page 122), a Sather implementation by Stephen Omohundro of ICSI
(Listing Ten, page 124), a Beta implementation from Steve Mann of MADA
(Listing Eleven, page 124), and David Betz's Drool (Listing Twelve, page 124).
You'll also find more coverage on each of these languages elsewhere in
this issue. In addition to the C++, Turbo Pascal, Smalltalk, and Ada listings
previously mentioned in this article, Stephen Asbury of NeXT Computer presents
an Objective-C Implementation (Listing Two, page 118).
Finally, the linked-list project is not a contest to find the best
object-oriented language; rather, it's a window into the soul of each
language. And while the number of lines of code may not provide a realistic
comparison between languages, it is how many managers gauge programmer
productivity. This, in fact, may be the poorest of measures, especially when
you factor in code reuse.
Table 1:
Key object-oriented features. I=interpreted. C=compiled.
Language   Garbage Collection   Inheritance   Late Binding   Compiled   Exception Handling   Lines of Code (approx.)
C++ No Multiple Yes C Yes 85
Objective-C No Single Yes C No 120
Turbo Pascal No Single Yes C No 140
Parasol No Single Yes C No 75
C+@ Yes Multiple Yes C No 110
Liana Yes Single Yes I No 33
Smalltalk Yes Single Yes I/C No *44
Eiffel Yes Multiple Yes **I/C Yes 155
Sather Yes Multiple Yes I/C Yes 80
Beta Yes Single Yes C Yes 55
Drool Yes Multiple Yes Byte Code No 45
Ada No ***None ***No C Yes 125
*Line count given is for the minimal implementation; full implementation is
145 lines of code. **Eiffel uses its "melting ICE" technology to interpret
small chunks of code for testing purposes. ***The next Ada standard is
expected to support both single inheritance and run-time binding.
_COMPARING OBJECT-ORIENTED LANGUAGES_
by Michael Floyd


[LISTING ONE]

C++ Implementation by David Betz (Dr. Dobb's Journal)


#include <iostream.h>

/* any object that will appear in a list must inherit from this class */
class MyListData {
public:
 virtual void Print(void);
};
void MyListData::Print(void)
{
 cout << "No print method!\n";
}
class MyListElement {
 MyListData *data;
 MyListElement *next;
public:
 MyListElement(MyListData *initialData) { data = initialData; next = NULL; }
 friend class MyList;
};
class MyList : public MyListData {
 MyListElement *head;
 MyListElement *tail;
public:
 MyList(void) { head = tail = NULL; }
 void AddToList(MyListData *data);
 void Print(void);
};
void MyList::AddToList(MyListData *data)
{
 MyListElement *newElement = new MyListElement(data);
 if (head == NULL)
 head = tail = newElement;
 else {
 tail->next = newElement;
 tail = newElement;
 }
}
void MyList::Print(void)
{
 MyListElement *element;
 for (element = head; element != NULL; element = element->next)
 element->data->Print();
}
class MyNumber : public MyListData {

 int value;
public:
 MyNumber(int initialValue) { value = initialValue; }
 virtual void Print(void);
};
void MyNumber::Print(void)
{
 cout << "Number: " << value << "\n";
}
class MyPoint : public MyListData {
 int x;
 int y;
public:
 MyPoint(int initialX,int initialY) { x = initialX; y = initialY; }
 virtual void Print(void);
};
void MyPoint::Print(void)
{
 cout << "Point: " << x << "," << y << "\n";
}
void main(void)
{
 MyList *list1 = new MyList;
 MyList *list2 = new MyList;
 MyNumber *n1 = new MyNumber(10);
 MyNumber *n2 = new MyNumber(20);
 MyPoint *p1 = new MyPoint(2,3);
 MyPoint *p2 = new MyPoint(4,5);
 /* build the lists */
 list1->AddToList(n1);
 list1->AddToList(n2);
 list1->AddToList(p1);
 list2->AddToList(n2); /* an object can be in more than one list */
 list2->AddToList(p1); /* at the same time */
 list2->AddToList(p2);
 list2->AddToList(list1); /* we can even put a list into another list */
 /* print the lists */
 cout << "list1:\n";
 list1->Print();
 cout << "list2:\n";
 list2->Print();
}


[LISTING TWO]

Objective-C Implementation by Stephen Asbury (Next Computer)

#import <stdio.h>
#import <objc/Object.h>/* Include the header for the Root class */

@interface LinkedListNode:Object /* declare a node class */
{
 id value; /* a node can store an object of any class as its value */
 id next;
}
- setValue:newValue;
- value;

- setNext:newNext;
- next;
- print;
@end
@implementation LinkedListNode /* implement the node class */
- setValue:newValue {value = newValue; return self;}
- value {return value;}
- setNext:newNext {next = newNext; return self;}
- next {return next;}
- print
{ /* In order to allow any object to be added to the list, we first check
 * to see if the value can print, if so have it print */
 if([value respondsTo:@selector(print)] == YES) [value print];
 return self;
}
@end
@interface LinkedList:Object /* declare a linked list class */
{id tail,head;}
- addObjectToList:theObject;
- print;
@end
@implementation LinkedList /* implement the linked list class */
- init /* refine the init method of the super class */
{
 self = [super init];
 tail = head = nil;
 return self;
}
- addNodeToList:newNode /* private method used only by the object */
{
 if(head == nil)
 head = tail = newNode;
 else{
 [tail setNext:newNode];
 tail = newNode;
 }
 return self;
}
- addObjectToList:theObject
{
 id newNode = [[[LinkedListNode alloc] init]
setValue:theObject];
 [self addNodeToList:newNode];
 return self;
}
- print
{
 id element;
 for(element = head;element != nil;element = [element next])
 [element print];
 return self;
}
@end
@interface MyNumber:Object /* declare a number class */
{float value;}
- initValue:(float)aValue;
- print;
@end
@implementation MyNumber /* implement the number class */

- initValue:(float)aValue
{
 self = [super init];
 value = aValue;
 return self;
}
- print {printf("Number %f\n",value); return self;}
@end

@interface Point:Object /* declare a point class */
{float X,Y;}
- initX:(float)anX y:(float)anY;
- print;
@end
@implementation Point /* implement the point class */
- initX:(float)anX y:(float)anY
{
 self = [super init];
 X = anX;
 Y = anY;
 return self;
}
- print { printf("Point %f , %f\n",X,Y); return self; }
@end
void main(void)
{
 id list1 = [[LinkedList alloc] init];
 id list2 = [[LinkedList alloc] init];
 id number1 = [[MyNumber alloc] initValue:10];
 id number2 = [[MyNumber alloc] initValue:20];
 id point1 = [[Point alloc] initX:2 y:3];
 id point2 = [[Point alloc] initX:4 y:5];
 /* Build the Lists */
 [list1 addObjectToList:number1];
 [list1 addObjectToList:number2];
 [list1 addObjectToList:point1];
 [list2 addObjectToList:number2]; /* an object can be in multiple lists */
 [list2 addObjectToList:point1];
 [list2 addObjectToList:point2];
 [list2 addObjectToList:list1];/* lists can contain lists */
 printf("List 1\n");
 [list1 print];
 printf("List 2\n");
 [list2 print];
 exit(0);
}

[LISTING THREE]

Turbo Pascal Implementation by Michael Floyd (Dr. Dobb's Journal)

Program LListObj;

Type
 PMyListData = ^MyListData;
 MyListData = object
 procedure Print; virtual;
 end;

 PMyListElement = ^MyListElement;
 MyListElement = object
 Data : PMyListData;
 Next : PMyListElement;
 constructor Init(initialData: PMyListData);
 end;
 PMyList = ^MyList;
 MyList = object(MyListData)
 Head, Tail: PMyListElement;
 constructor Init;
 procedure AddToList(var Data: PMyListData);
 procedure Print; virtual;
 end;
 PMyNumber = ^MyNumber;
 MyNumber = object(MyListData)
 Value : Integer;
 constructor Init(initialValue: Integer);
 procedure Print; virtual;
 end;
 PMyPoint = ^MyPoint;
 MyPoint = object(MyListData)
 X, Y: Integer;
 constructor Init(initialX, initialY: Integer);
 procedure Print; virtual;
 end;
{ MyListElement Methods }
constructor MyListElement.Init(initialData: PMyListData);
begin
 data := initialData;
 next := nil;
end;
{ MyListData Methods }
procedure MyListData.Print;
begin
 writeln('No Print Method!');
end;
{ MyList Methods }
constructor MyList.Init;
begin
 Head:= nil;
 Tail:= nil;
end;
procedure MyList.AddToList(var Data: PMyListData);
var
 Added : PMyListElement;
begin
 Added := New(PMyListElement,Init(Data));
 If Head = nil then
 begin
 Head := Added;
 Tail := Added;
 end
 Else begin
 Tail^.Next := Added;
 Tail := Added;
 end;
 Added^.Next := nil;
end;
procedure MyList.Print;

var
 Current : PMyListElement;
begin
 Current := Head;
 while Current <> nil do
 begin
 Current^.Data^.Print;
 Current := Current^.Next;
 end;
end;
{ MyNumber Methods }
constructor MyNumber.Init(initialValue: Integer);
begin
 Value := initialValue;
end;
procedure MyNumber.Print;
begin
 writeln('Number: ', Value);
end;
{ MyPoint Methods }
constructor MyPoint.Init(initialX, initialY: Integer);
begin
 X := initialX;
 Y := initialY;
end;
procedure MyPoint.Print;
begin
 writeln('Point: ', X, ',',Y);
end;
{ Main }
Var
 NumList, PointList : MyList;
 Number, Coordinates: PMyListData;
 I, J : Integer;
Begin
 with NumList do begin { create list of numbers }
 Init;
 For I := 1 to 5 do
 begin
 Number := New(PMyNumber, Init(I));
 AddToList(Number);
 end;
 Print;
 end;
 with PointList do begin { create list of points }
 Init;
 For I := 1 to 5 do
 begin
 J := I + 1;
 Coordinates := New(PMyPoint, Init(I,J));
 AddToList(Coordinates);
 end;
 Print;
 end;
 NumList.AddToList(Coordinates); { Coords can be added to Number list }
 NumList.Print;
 readln;
end.


[LISTING FOUR]

Parasol Implementation by Robert Jervis


include file;
/* any object that will appear in a list must inherit from this class */
MyListData: type { public:
Print: dynamic () = {
 printf("No print method!\n");
 }
};
MyList: public type inherit MyListData {
 head: ref MyListElement;
 tail: ref MyListElement;
 MyListElement: type {
 public:
 data: ref MyListData;
 next: ref MyListElement;
 constructor: (initialData: ref MyListData) =
 { data = initialData; next = 0; }
 };
public:
constructor: () = { head = tail = 0; }

AddToList: (data: ref MyListData) = {
 newElement: ref MyListElement = new MyListElement[data];
 if (head == 0)
 head = tail = newElement;
 else {
 tail->next = newElement;
 tail = newElement;
 }
 }
Print: dynamic () = {
 element: ref MyListElement;
 for (element = head; element != 0; element = element->next)
 element->data Print();
 }
};
MyNumber: type inherit MyListData {
 value: int;
public:
constructor: (initialValue: int) = { value = initialValue; }
Print: dynamic () = {
 printf("Number: %d\n", value);
 }
};
MyPoint: type inherit MyListData {
 x: int;
 y: int;
public:
constructor: (initialX: int, initialY: int) = { x = initialX; y = initialY; }
Print: dynamic () = {
 printf("Point: %d,%d\n", x, y);
 }
};
main: entry () =
{

 list1: ref MyList = new MyList[];
 list2: ref MyList = new MyList[];
 n1: ref MyNumber = new MyNumber[10];
 n2: ref MyNumber = new MyNumber[20];
 p1: ref MyPoint = new MyPoint[2,3];
 p2: ref MyPoint = new MyPoint[4,5];
 /* build the lists */
 list1 AddToList(n1);
 list1 AddToList(n2);
 list1 AddToList(p1);
 list2 AddToList(n2); /* an object can be in more than one list */
 list2 AddToList(p1); /* at the same time */
 list2 AddToList(p2);
 list2 AddToList(list1); /* we can even put a list into another list */
 /* print the lists */
 printf("list1:\n");
 list1 Print();
 printf("list2:\n");
 list2 Print();
}

[LISTING FIVE]

C+@ Implementation by Jim Fleming (Unir Corp.)


class MyListData {
 method print
 {
 "No print method".print;
 }
}
class MyListElement {
 inherit MyListData;
 MyListData data;
 MyListElement next;

 class method (_) new (initialData) {
 _ = create;
 _.init(initialData);
 }
 method init (idata) {
 data = idata;
 next = nil;
 }
 method (_) data {
 _ = data;
 }
 method(_) next {
 _ = next;
 }
 method linkTo (listElement) {
 next = listElement;
 }
}
class MyList {
 MyListElement head;
 MyListElement tail;


 class method (_) new {
 _ = create;
 _.init;
 }
 method init {
 head = nil;
 tail = nil;
 }
 method addToList (data) {
 var newElement;
 newElement = MyListElement.new(data);
 if(head == nil) {
 head = newElement;
 tail = head;
 }
 else{
 tail.linkTo(newElement);
 tail = newElement;
 }
 }
 method print {
 var element;
 for (element = head; element != nil; element = element.next) {
 element.data.print;
 }
 }
}
class MyNumber {
 Integer value;
 class method (_) new (initialValue) {
 _ = create;
 _.init(initialValue);
 }
 method init (initialValue) {
 value = initialValue;
 }
 method print {
 ("Number: " // value // "\n").print;
 }
}
class MyPoint {
 Integer x;
 Integer y;
 class method (_) new (initialX,initialY) {
 _ = create;
 _.init(initialX,initialY);
 }
 method init (initialX,initialY) {
 x = initialX;
 y = initialY;
 }
 method print {
 ("Point: "// x // "," // y // "\n").print
 }
}
/* The following can be typed directly into command shell */
list1 = MyList.new;
list2 = MyList.new;
n1 = MyNumber.new(10);

n2 = MyNumber.new(20);
p1 = MyPoint.new(2,3);
p2 = MyPoint.new(4,5);

list1.addToList(n1);
list1.addToList(n2);
list1.addToList(p1);
list2.addToList(n2); /* an object can be in more than one list */
list2.addToList(p1);
list2.addToList(p2);
list2.addToList(list1); /* we can even put a list into another list */
/* print the lists */
"list1:\n".print;
list1.print;
"list2:\n".print;
list2.print;


[LISTING SIX]

Liana Implementation by Jack Krupansky (Base Technology)

class MyList : array
{
 Print
 {
 for (int i = 0, int n = size; i < n; i++)
 if ((any e = this [i]).isa ("MyList"))
 e.Print();
 else
 cout << e.class_name+": "+e.text+"\n";
 }
};
//-------------------------------------------
void main (void)
{
 MyList list1 = new MyList;
 MyList list2 = new MyList;
 int n1 = 10;
 int n2 = 20;
 point p1 = new point (2,3);
 point p2 = new point (4,5);

 /* build the lists */
 list1 << n1 << n2 << p1;

 /* an obj can be in more than one lst at same time */
 list2 << n2 << p1 << p2;

 list2 << list1; /* we can even put a list into another list */

 /* print the lists */
 cout << "\nLIST1:\n"; list1.Print;
 cout << "\nLIST2:\n"; list2.Print;
}

[LISTING SEVEN]

Minimal Smalltalk Implementation by Ray Valdes (Dr. Dobb's Journal)


OrderedCollection subclass: #MyListClass
 instanceVariableNames: ''
 classVariableNames: ''
 poolDictionaries: '' !
!MyListClass class methods ! !
!MyListClass methods !
printOn: aStream
 " Method to display list elements, recursing if necessary "
 aStream cr; nextPutAll: ' ',(self class name),' '; cr.
 self do: [:c | c printOn: aStream. aStream cr ].
 ^self.
!
test
 " Program to test the linked-list class "
 
 | aList1 aList2 aNumber1 aNumber2 aPoint1 aPoint2 aMsg |
 
 aList1 := MyListClass new.
 aList2 := MyListClass new.
 aNumber1 := 10.
 aNumber2 := 20.
 aPoint1 := (2 @ 3).
 aPoint2 := (4 @ 5).
 " build the lists "
 aList1 add: aNumber1; add: aNumber2; add: aPoint1.
 " an object can be in more than one list at same time "
 aList2 add: aNumber2; add: aPoint1; add: aPoint2.
 " we can even put a list into another list "
 aList2 add: aList1.
 " print the lists "
 aList1 printOn: Transcript.
 aList2 printOn: Transcript.
! !


[LISTING EIGHT]

Complete Smalltalk/V Implementation by Ray Valdes (Dr. Dobb's Journal)

"-------------------------------------------------------------"
Object subclass: #MyListClasses
 instanceVariableNames: ''
 classVariableNames: ''
 poolDictionaries: '' !
!MyListClasses class methods ! !
!MyListClasses methods ! !
"-------------------------------------------------------------"
MyListClasses subclass: #TestList
 instanceVariableNames: ''
 classVariableNames: ''
 poolDictionaries: '' !
!TestList class methods ! !
!TestList methods !
test
 " Program to test the linked-list class "
 
 | aList1 aList2 aNumber1 aNumber2 aPoint1 aPoint2 |
 

 aList1 := MyList new.
 aList2 := MyList new.
 aNumber1 := MyNumber new: 10.
 aNumber2 := MyNumber new: 20.
 aPoint1 := MyPoint new: (2 @ 3).
 aPoint2 := MyPoint new: (4 @ 5).
 " build the lists "
 aList1 add: aNumber1; add: aNumber2; add: aPoint1.
 " an object can be in more than one list at same time "
 aList2 add: aNumber2; add: aPoint1; add: aPoint2.
 " we can even put a list into another list "
 aList2 add: aList1.
 " print the lists "
 aList1 print.
 aList2 print! !
"------------------------------------------------------------------"
MyListClasses subclass: #MyListElement
 instanceVariableNames: 'data next'
 classVariableNames: ''
 poolDictionaries: '' !
!MyListElement class methods !
new: someData
 ^super new initialize: someData! !
!MyListElement methods !
initialize: someData
 data := someData.
 ^self!
next
 ^next!
next: anElement
 next := anElement.
 ^self!
print
 data print.
 ^self! !
"-------------------------------------------------------------"
MyListClasses subclass: #MyListData
 instanceVariableNames: ''
 classVariableNames: ''
 poolDictionaries: '' !
!MyListData class methods ! !
!MyListData methods !
print
 Transcript cr; nextPutAll: 'Must override this method'; cr.
 ^self! !
"-------------------------------------------------------------"
MyListData subclass: #MyList
 instanceVariableNames: 'head tail'
 classVariableNames: ''
 poolDictionaries: '' !
!MyList class methods ! !
!MyList methods !
add: someData
 | anElement |
 anElement := MyListElement new: someData.
 head isNil
 ifTrue: [ head := anElement. tail := anElement ]
 ifFalse: [ tail next: anElement. tail := anElement].
 ^self!

print
 | anElement |
 anElement := head.
 [anElement isNil]
 whileFalse: [
 anElement print.
 anElement := anElement next .
 Transcript cr.
 ].
 ^self! !
"-------------------------------------------------------------"
MyListData subclass: #MyNumber
 instanceVariableNames: 'value'
 classVariableNames: ''
 poolDictionaries: '' !
!MyNumber class methods !
new: aValue
 ^super new initialize: aValue! !
!MyNumber methods !
initialize: aValue
 value := aValue.
 ^self!
print
 value printOn: Transcript.
 ^self! !
"-------------------------------------------------------------"
MyListData subclass: #MyPoint
 instanceVariableNames: 'point'
 classVariableNames: ''
 poolDictionaries: '' !
!MyPoint class methods !
new: aPoint
 ^super new initialize: aPoint! !
!MyPoint methods !
initialize: aPoint
 point := aPoint.
 ^self!
print
 point printOn: Transcript.
 ^self! !


[LISTING NINE]

Eiffel Implementation by Robert Howard (Tower Technology)


class DRIVER
 -- In Eiffel, the top level driver is an object, too.
inherit
 BASIC_IO
creation
 make
feature {ANY}
 make is
 -- run this test driver
 local
 list1, list2 : MY_LIST[PRINTABLE] ;
 n1, n2 : MY_NUMBER ; -- a kind of PRINTABLE

 p1, p2 : MY_POINT ; -- also a kind of PRINTABLE
 do
 -- create the various objects
 !!list1 ;
 !!list2 ;
 !!n1.set( 10 ) ;
 !!n2.set( 20 ) ;
 !!p1.set( 2, 3 ) ;
 !!p2.set( 4, 5 ) ;

 list1.add_to_list( n1 ) ;
 list1.add_to_list( n2 ) ;
 list1.add_to_list( p1 ) ;
 list2.add_to_list( n2 ) ; -- objects can be in more than one list
 list2.add_to_list( p1 ) ;
 list2.add_to_list( p2 ) ;
 list2.add_to_list( list1 ) ; -- list 1 is an element of list 2

 put_string( "list1:%N" ) ;
 list1.print_self ;
 put_string( "list2:%N" ) ;
 list2.print_self ;
 end -- make
end -- DRIVER
deferred class PRINTABLE
 -- ensures that 'print_self' is implemented
inherit
 BASIC_IO
feature {ANY}
 print_self is
 -- print yourself
 deferred
 end -- print_self
end -- PRINTABLE
class MY_NUMBER
 -- holds and can print an integer
inherit
 PRINTABLE
creation
 set
feature {ANY}

 value : INTEGER ;
 set( new_value : INTEGER ) is
 -- set this number
 do
 value := new_value ;
 end ; -- make
 print_self is
 -- print the value
 do
 put_string( "Number: " ) ;
 put_int( value ) ;
 put_newline ;
 end ; -- print_self
end -- MY_NUMBER
class MY_POINT
 -- holds and can print an x,y pair
inherit

 PRINTABLE
creation
 set
feature {ANY}
 x, y : INTEGER ;
 set( new_x : INTEGER; new_y : INTEGER ) is
 -- set this point
 do
 x := new_x ;
 y := new_y
 end ; -- set
 print_self is
 -- print the value
 do
 put_string( "Point: " ) ;
 put_int( x ) ;
 put_char( ',' ) ;
 put_int( y ) ;
 put_newline ;
 end ; -- print_self
end -- MY_POINT
class ELEMENT[T]
 -- holds an object reference and a
 -- single link to another ELEMENT[T]
creation
 set_data
feature {LIST}
 data : T ;
 next : ELEMENT[T] ;
 set_data( new_data : T ) is
 -- set data to the new_data
 do
 data := new_data
 end ; -- set_data
 set_next( new_next : ELEMENT[T] ) is
 -- set next to the element
 do
 next := new_next

 end ; -- set_next
end -- class ELEMENT
class LIST[T]
 -- a generic linked list class
feature {ANY}
 add_to_list( data : T ) is
 -- add to the end of the list
 local
 new_element : ELEMENT[T] ;
 do
 !!new_element.set_data( data ) ;
 if head = void
 then
 head := new_element ;
 else
 tail.set_next( new_element ) ;
 end
 tail := new_element ;
 end ; -- add_to_list
feature {NONE}

 head, tail : ELEMENT[T] ;
invariant
 tail_next_is_void: tail.next = void ;
 tail_void_when_head_void: head = void implies tail = void
end -- LIST
class MY_LIST[T->PRINTABLE]
 -- A printable list which holds printable data
inherit
 LIST[T]
 PRINTABLE
feature {ANY}
 print_self is
 -- print the list elements
 local
 el : ELEMENT[T] ;
 do
 from
 el := head
 until
 el = void
 loop
 el.data.print_self
 el := el.next ;
 end ;
 end ; -- print_self
end -- MY_LIST


[LISTING TEN]

Sather Implementation by Stephen Omohundro (ICSI)

-------------------------------------------------------------------
abstract class $MY_LIST_DATA is
 -- Data that will appear in a list must inherit this class.

 print is
 -- Print the data on OUT. Should be redefined in descendants.
 #OUT + "No print routine defined!\n" end;
end;
-------------------------------------------------------------------
class MY_LIST_ELEMENT is
 -- An element of a MY_LIST.

 attr data:$MY_LIST_DATA;
 attr next:MY_LIST_ELEMENT;
end;
-------------------------------------------------------------------
class MY_LIST is
 -- A double ended linked list.
 inherit $MY_LIST_DATA;

 private head, tail:MY_LIST_ELEMENT;

 add_to_list($MY_LIST_DATA) is
 -- Append arg to the end of self.
 new_elt::=#MY_LIST_ELEMENT(data:=arg);
 if head=void then head:=new_elt
 else tail.next:=new_elt end;

 tail:=new_elt end;

 print is
 -- Print the data held in the list on OUT.
 elt::=head;
 loop while!(elt/=void); elt.data.print;
 elt:=elt.next end end;
end;
-------------------------------------------------------------------
class MY_NUMBER is
 -- An integer value that can be held in a MY_LIST.
 inherit $MY_LIST_DATA;

 attr value:INT;

 print is
 -- Print the integer value on OUT.
 #OUT + "Number: " + value + "\n" end;
end;
-------------------------------------------------------------------
class MY_POINT is
 -- A two-dimensional point that can be held in a MY_LIST.
 inherit $MY_LIST_DATA;

 attr x,y:INT;

 print is
 -- Print the point coordinates on OUT.
 #OUT + "Point: " + x + "," + y + "\n" end;
end;
-------------------------------------------------------------------
class MY_LIST_TEST is
 -- Test of MY_LIST.

 main is
 -- Add some numbers and points to lists and print them out.
 list1::=#MY_LIST; list2::=#MY_LIST;
 n1::=#MY_NUMBER(10); n2::=#MY_NUMBER(20);
 p1::=#MY_POINT(2,3); p2::=#MY_POINT(4,5);
 -- Build the lists:
 list1.add_to_list(n1); list1.add_to_list(n2);
 list1.add_to_list(p1);
 list2.add_to_list(n2); -- An object can be in more than one list.
 list2.add_to_list(p1); list2.add_to_list(p2);
 list2.add_to_list(list1); -- Put a list in another list.
 -- Print the lists
 #OUT + "list1:\n";
 list1.print;
 #OUT + "list2:\n";
 list2.print end;
end;


[LISTING ELEVEN]

Beta Implementation by Steve Mann (MADA)

--- program:descriptor---
(#

 MyListData: (# Print:< (# do INNER #) #);
 MyListElement: (# data: ^MyListData; next: ^MyListElement; #);
 MyList: MyListData (#
 head, tail: ^MyListElement;
 AddToList: (#
 newItem: ^MyListData
 enter newItem[]
 do (if head[] = NONE // TRUE then
 &MyListElement[] -> head[] -> tail[];
 newItem[] -> head.data[];
 else
 &MyListElement[]-> tail.next[] -> tail[];
 newItem[] -> tail.data[];
 if);
 #);
 Print::< (#
 element: ^MyListElement
 do head[] -> element[];
 loop (# while::< (# do element[] <> NONE -> value #);
 do element.data.print; element.next[] -> element[];
 #); #); #);
 MyNumber: MyListData (#
 value: @integer;
 Print::< (# do 'Number: ' -> PutText; value -> PutInt; NewLine; #);
 enter value
 #);
 MyPoint: MyListData (#
 x, y: @integer;
 Print::< (# do 'Point: ' -> PutText; x -> PutInt;
 ', ' -> PutText; y -> PutInt;NewLine;
 #);
 enter (x, y)
 #);
(********************** MAIN **************************)
 list1, list2: @MyList;
 n1, n2: @MyNumber; p1, p2: @MyPoint;
do
 10 -> n1; 20 -> n2;
 (2, 3) -> p1; (4, 5) -> p2;

 n1[] -> list1.AddToList;n2[] -> list1.AddToList;
 p1[] -> list1.AddToList;

 n2[] -> list2.AddToList;p1[] -> list2.AddToList;
 p2[] -> list2.AddToList;list1[] -> list2.AddToList;

 'list1: ' -> PutLine; list1.Print;
 'list2: ' -> PutLine; list2.Print;
#)


[LISTING TWELVE]

Drool Implementation by David Betz (Dr. Dobb's Journal)


(defobject MyListElement ()
 (property next nil
 data nil))

(defobject MyList ()
 (property head nil
 tail nil))
(defmethod (MyList 'AddToList data)
 (let ((newElement (clone MyListElement 'data data))
 (tail (getp self 'tail)))
 (if tail
 (setp! tail 'next newElement)
 (setp! self 'head newElement))
 (setp! self 'tail newElement)))
(defmethod (MyList 'Print)
 (let ((element (getp self 'head)))
 (while element
 ((getp element 'data) 'Print)
 (set! element (getp element 'next)))))
(defobject MyNumber ()
 (property value 0))
(defmethod (MyNumber 'Print)
 (print "Number: " (getp self 'value) "\n"))
(defobject MyPoint ()
 (property x 0
 y 0))
(defmethod (MyPoint 'Print)
 (print "Point: " (getp self 'x) "," (getp self 'y) "\n"))
(define (main)
 (let ((list1 (clone MyList))
 (list2 (clone MyList))
 (n1 (clone MyNumber 'value 10))
 (n2 (clone MyNumber 'value 20))
 (p1 (clone MyPoint 'x 2 'y 3))
 (p2 (clone MyPoint 'x 4 'y 5)))
 (list1 'AddToList n1)
 (list1 'AddToList n2)
 (list1 'AddToList p1)
 (list2 'AddToList n2)
 (list2 'AddToList p1)
 (list2 'AddToList p2)
 (list2 'AddToList list1)
 (print "list1\n")
 (list1 'Print)
 (print "list2\n")
 (list2 'Print)))


[LISTING THIRTEEN]

Ada Implementation by Mike Ruf (Rational)

-- Ada83 does not support run-time polymorphism, so only supports homogeneous
-- lists. Ada9X will support run-time polymorphism. There are ways to simulate
-- run-time polymorphism in Ada83, which are discussed in: Seidewitz, Ed,
-- "Object-Oriented Programming Through Type Extension in Ada 9X," Ada
-- Letters, Volume 11, Number 2, Mar/Apr 1991, pages 86-97. Hirasuna, Michael,
-- "Using Inheritance and Polymorphism with Ada in Government Sponsored
-- Contracts", Ada Letters, Volume 12, Number 2, Mar/Apr, 1992, pp. 43-56.
-- Mike Ruf, Rational Systems
-- This package implements a singly-linked list abstraction.
--
generic

 type Element is private;
 with procedure Print (This : Element);
package Printable_List_Generic is
 type List is private;
 function New_List return List;
 -- Returns an empty list.
 procedure Append (To_List : in out List;
 This_Element : in Element);
 -- Propagates the Storage_Error exception if it occurs.
 procedure Print (This : in List);
 -- For each element in the list, calls the Print procedure,
 -- which is supplied as a generic formal parameter.
private
 type Node;
 type Pointer is access Node;
 type Node is
 record
 Contents : Element;
 Next : Pointer := null;
 end record;
 type List is
 record
 First : Pointer := null;
 Last : Pointer := null;
 end record;
end Printable_List_Generic;
package body Printable_List_Generic is
 function New_List return List is
 begin
 return (List'(First => null, Last => null));
 end New_List;
 procedure Append (To_List : in out List;
 This_Element : in Element) is
 New_Node : Pointer := new Node'(Contents => This_Element, Next => null);
 -- The "next" component of the last node is always "null"
 begin
 if To_List.First = null then
 To_List.First := New_Node;
 To_List.Last := New_Node;
 else
 To_List.Last.Next := New_Node;
 To_List.Last := New_Node;
 end if;
 end Append;
 procedure Print (This : in List) is
 Current : Pointer := This.First;
 begin
 while (not (Current = null)) loop
 Print (Current.Contents);
 Current := Current.Next;
 end loop;
 end Print;
end Printable_List_Generic;

package Geometry is
 type Number is new Integer;
 procedure Print (This : Number);
 type Point is
 record

 X : Number := 0;
 Y : Number := 0;
 end record;
 procedure Print (This : Point);
end Geometry;

with Text_Io;
package body Geometry is
 procedure Print (This : Number) is
 begin
 Text_Io.Put_Line ("Number:" & Number'Image (This));
 end Print;
 procedure Print (This : Point) is
 begin
 Text_Io.Put_Line ("Point: " & Number'Image (This.X) &
 "," & Number'Image (This.Y));
 end Print;
end Geometry;

procedure List_Example;

with Geometry;
with Printable_List_Generic;

procedure List_Example is
 package Numbers is new Printable_List_Generic (Element => Geometry.Number,
 Print => Geometry.Print);
 package Points is new Printable_List_Generic (Element => Geometry.Point,
 Print => Geometry.Print);
 Number_List : Numbers.List := Numbers.New_List;
 Point_List : Points.List := Points.New_List;
 N1 : Geometry.Number := 10;
 N2 : Geometry.Number := 20;
 P1 : Geometry.Point := (2, 3);
 P2 : Geometry.Point := (4, 5);
begin
 -- Build the lists
 Numbers.Append (To_List => Number_List, This_Element => N1);
 Numbers.Append (To_List => Number_List, This_Element => N2);
 --
 Points.Append (To_List => Point_List, This_Element => P1);
 Points.Append (To_List => Point_List, This_Element => P2);
 -- Print the lists
 Numbers.Print (Number_List);
 Points.Print (Point_List);
end List_Example;

------------------------------- OUTPUT -------------------------------
Number: 10
Number: 20
Point: 2, 3
Point: 4, 5











October, 1993
PROGRAMMING PARADIGMS


Finding the Key to the Mini-Bar




Michael Swaine


Okay, okay, okay. There will be no politics in this month's offering. I don't
know what came over me recently. I realize that this column is no place for
bleeding-heart, pinko liberal ravings. That's what "Swaine's Flames" is for.
Here's what is on this month's paradigmatic menu:
A quest for chaos
Itty-bitty machines
Camshaft computers
And how to run really light.


Quest for Chaos


I've written here before about Interactive Physics, an impressive product from
Knowledge Revolution that simulates enough of the laws of classical physics to
let you do virtual experiments. Knowledge Revolution's David Baszucki has come
up with a variation on this theme in the form of a tool for the working
engineer called "Working Model."
Looking for a simple physical system to construct in order to get a feel for
the capabilities of Working Model, I turned to Scientific American, not the
most logical place to look, I suppose. The physics in Scientific American is
rarely classical, but a simple, classical-looking diagram immediately caught
my eye. What it was demonstrating was not a simple concept, though: it was an
illustration in an article about chaos theory.
The article, in the August 1993 issue, was called "Mastering Chaos" and was by
William L. Ditto and Louis M. Pecora. It described how chaotic systems can be
nudged into orderly behavior. Chaotic systems are nonlinear and highly
sensitive to initial conditions; they usually appear random. They aren't
random, though. The behavior in a chaotic system is a collection of many
orderly behaviors, and the authors describe ways to disturb a chaotic system
to make it follow one of its regular behaviors.
There are some interesting ideas in the article, but I halted when I saw the
figure on page 80. The figure shows a simple mechanical apparatus consisting
of a ball attached to a spring which is in turn attached to a board. The board
is moved back and forth parallel to the length of the spring, and the ball
bobbles back and forth with it.
Or sort of with it. At low speeds, the ball follows the motion of the board
pretty closely. But as you start jerking the board back and forth more
rapidly, the motion of the ball gets more erratic, and at a certain point it
becomes chaotic.
The figure shows the movement of the ball in two series of graphs. The first
plots position against time, and shows the motion starting out periodic,
getting more messy, and finally going chaotic and looking random. The second
plots velocity against position; it's a state-space map. It gives a different
look at the same positions and movements of the ball, but in this state-space
view, the motion of the ball never looks random. The chaotic stage looks like
the path of a satellite around a double star, never retracing itself, but
occupying only certain regions of (state) space.
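Readers who want to poke at the ball-and-spring behavior without a simulation
package can sketch it numerically. The following Python fragment is my own
illustration, not taken from the article or from Working Model: it integrates a
driven, damped nonlinear oscillator with Runge-Kutta, with all parameter values
chosen arbitrarily for illustration. The position samples correspond to the
position-versus-time graphs, and the (position, velocity) pairs to the
state-space map.

```python
import math

def simulate(gamma, steps=20000, dt=0.01,
             delta=0.3, alpha=-1.0, beta=1.0, omega=1.2):
    """Integrate x'' = -delta*x' - alpha*x - beta*x**3 + gamma*cos(omega*t)
    with classical 4th-order Runge-Kutta on the (x, v) state pair.
    Returns (positions, velocities) sampled at every step."""
    def accel(t, x, v):
        return -delta*v - alpha*x - beta*x**3 + gamma*math.cos(omega*t)

    x, v, t = 0.5, 0.0, 0.0
    xs, vs = [], []
    for _ in range(steps):
        k1x, k1v = v, accel(t, x, v)
        k2x, k2v = v + 0.5*dt*k1v, accel(t + 0.5*dt, x + 0.5*dt*k1x, v + 0.5*dt*k1v)
        k3x, k3v = v + 0.5*dt*k2v, accel(t + 0.5*dt, x + 0.5*dt*k2x, v + 0.5*dt*k2v)
        k4x, k4v = v + dt*k3v,     accel(t + dt,     x + dt*k3x,     v + dt*k3v)
        x += dt/6 * (k1x + 2*k2x + 2*k3x + k4x)
        v += dt/6 * (k1v + 2*k2v + 2*k3v + k4v)
        t += dt
        xs.append(x)   # position vs. time  -> the first series of graphs
        vs.append(v)   # (x, v) pairs       -> the state-space map
    return xs, vs

# Gentle driving settles into regular motion; larger gamma is the regime
# where oscillators of this kind can go chaotic.
xs, vs = simulate(gamma=0.2)
```

Plotting xs against step index, or vs against xs, gives the two kinds of graphs
the figure shows; cranking up gamma is the analogue of jerking the board harder.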
So I implemented this simple system of block and spring and ball using Working
Model. My goal was to duplicate, as closely as possible, the diagram in the
magazine. For the diagram's series of graphs, my working model would have two
live graphs, each tied to the appropriate parameters of the model. When I set
the model running, the graphs would start drawing.
As it goes with these things, I spent too much time getting the objects the
right sizes and shapes and colors. This was not much different from using a
drawing program, and didn't teach me anything except that part of using
Working Model is like using a drawing program. Associating properties with
these objects and creating the necessary forces and setting up the graphs went
faster, and was more interesting.
Then I gave the block a nudge to get it moving, and it was all parameter
tweaking from then on. A lot of parameter tweaking. The motion really wanted
to fall into stable cycles, but eventually I got curves that looked like the
pictures in the magazine, demonstrating a chaotic attractor. I had achieved
chaos.
So what was the point? Just to check out the program, I suppose. I admit that
Interactive Physics and Working Model are only peripherally related to
programming, but I can't believe that these products aren't interesting to
anyone with an analytical mind. The idea of doing virtual physics experiments,
checking the diagrams in Scientific American, fascinates me at least. And more
and more, virtual experiments are becoming a part of what real scientists do.


Itty-Bitty Machines


What K. Eric Drexler does is something else, something he calls "theoretical
applied science"--research that attempts to describe technological
possibilities, with "possible" defined by a current understanding of physical
law, not by the limitations of present-day techniques. In some cases, it
wouldn't seem to require much research to show that a proposed device is
counter to known laws of physics (perpetual-motion machines); in other cases,
it might be more problematic (Star Wars). Drexler is interested in a case
where determining whether an ambitious technological program is consistent
with the known laws of physics is not just a tricky question, but a rich
domain of research in several disciplines. Drexler wants to build
molecule-scale machines and computers by moving molecules around. He calls it
"nanotechnology."
What people call Drexler is "Mister Nanotechnology." He has written
extensively on the field and is the president of Foresight Institute and
Research Fellow at the Institute for Molecular Manufacturing, both
organizations focusing on nanotechnology, both based in Palo Alto, California.
His book Nanosystems: Molecular Machinery, Manufacturing, and Computation
(Wiley Interscience, 1992) is the one to get if you're serious about this
stuff: a broad and deep technical exposition of a nascent technology that
sounds like sheer science fiction.
Specifically, what Drexler is talking about is "the construction of objects to
complex, atomic specifications using sequences of chemical reactions directed
by nonbiological molecular machinery."
On the face of it, it's hard to take seriously the idea of molecule-sized
machines moving molecules around and manufacturing tiny products from them.
The force that holds atoms together in molecules isn't like mortar; it's
selective in what it bonds to what, it's subject to being pulled apart under
quite natural circumstances, and a single bond doesn't provide any rigidity,
since the atoms can rotate around the bond.
But the domain of possible molecular structures is so rich that Drexler not
only can envision molecular machines, he has a choice of approaches to
building them. He leans toward keeping things simple by using diamond-like
substances in which you can, to some extent, keep track of where the atoms
are. It turns out that there are an enormous number of these diamond-like
substances.
There are also an enormous number of questions that have to be answered
preparatory to building nanosystems. And by no means the least of these is
where to start: Nanothinking is complicated by the fact that it will take
nanotechnology to build the tools that nanotechnology will employ. Do you
build the chicken first, or the egg?
Drexler has some at least tentative answers to these questions, but doesn't
close off too many lines of inquiry. In fact, the book looks like what you'd
have to read to begin developing a syllabus for a doctoral program in
molecular nanotechnology. (Drexler has a doctorate in molecular
nanotechnology, but I gather that he sort of put it together himself.)
Where his book gets really interesting is where Drexler starts building
things. He sets out to show, in principle, as a practical exercise in
theoretical applied science, how one could construct familiar mechanical
structures from assemblies of molecules. The familiar mechanical structures
that he describes include bearings, nuts and screws, springs, gears, rollers,
belts, cams, clutches, and ratchets.
He describes in convincing detail how these molecular devices could work.
Basically, these mechanisms are moving parts with sliding or rolling
interfaces. His design guidelines--because that's what they are--for these
mechanisms are based on careful study of the potential energy of interaction
between two mutually inert surfaces in contact.
And on the behavior and properties of such surfaces, Drexler has lots of data,
much of it in the book.


Camshaft Computers


Satisfied that nanocams and nanorods are possible, Drexler extends this line
of thinking to the construction of very small computers, and he extends it in
a surprisingly direct way. His envisioned nanocomputers are mechanical
devices, built from tiny sliding rods and cams and nanomotors. A bit is
encoded by displacement of a rod that can occupy one of two positions.
Why mechanical? The reason appears to be that it's easier to work out the math that way. The
collections of molecules he's dealing with in his models will have electrical
properties, and these can, in principle, be used to carry and store
information; but so can sliding rods, and the tiny distances involved make
mechanical movement at least fast enough to be worth considering, so let's use
'em. So what if we're limited to signal transmission at the speed of sound, if
that means 1/2 ns to go 1 micron, and that's about the width of a CPU in this
technology?
Drexler starts building his envisioned nanocomputer with interlocks. An
interlock is an arrangement in which a rod's mobility is controlled by another
rod's displacement because a protrusion (called a "knob") on the side of the
first rod is blocked or not blocked by a similar knob on the side of the other
rod. It is, in effect, a mechanical transistor. From interlocks, he can build
logic rods (logic gates).
A logic rod might have all of the following components:
A housing, like a sleeve, that constrains its motion to one dimension (so the
knobs can further constrain it).
A driver, some mechanism that provides the rod with a clocked, sinusoidal
displacement and serves both as a clock signal and a power supply.

Springs, to convey the driver force and/or to reset the rod.
Some number of input and output rods, which are just other logic rods that
have interlocks in common with this rod (a logic rod implementing a NAND would
require two input rods and one output rod).
Interlocks between the rod and its input and output rods.
It's tempting to draw analogies to other systems. The housing that encloses
the rod or interlock and constrains rod motion to one dimension is like the
insulation on a wire, both structurally and functionally. Another intriguing
analogy is with certain cell states in cellular automata, so-called sheath
states that form barriers between domains of cells (see Artificial Life by
Steven Levy, Random House, 1992, page 100). But it's more than tempting; in a
field as slippery as nanotechnology, one needs to make some simplifying
choices. Analogies with other fields provide a frame of reference in this
complex and unfamiliar realm.
Interlocks have a lot of features that are desirable in a transistor-like
device: They support fan-in and fan-out, switch rapidly, make reliable binary
decisions, pack densely, work with interconnections at their own scale, and
restore signals to a reference level at each step. Their main drawback is that
they don't exist yet. Some of the elements of nanotechnology have been
created, though, like molecular rods and (sort of) gears. And when you read
Drexler's descriptions, with molecular diagrams, it's not so hard to believe
that interlocks soon will exist, too.
But interlocks aren't enough. You can't build a computer using only logic.
Some memory also comes in handy. But in principle, Drexler demonstrates,
creating nanotechnological registers and RAM arrays is no harder than building
nanotech logic.
Okay, now we've got, in principle anyway, logic and memory. How can one
resist? Let's build a computer. Alas, the details are too extensive for this
space; see Drexler's book for the plans.
He doesn't really "build" anything too ambitious, though: Using PLA
structures of logic rods and rod-based registers, he demonstrates how to
duplicate, pretty much device-for-device, a controller design described in
Mead and Conway's classic book on VLSI. Extrapolating from that, a RISC
computer based on this kind of design ought to be capable of >1000 MIPS, and a
10^6-transistor (really, interlock) CPU should be able to fit in a 400-nm cube
and consume about 60 nW.
He also discusses tape storage, input, and output. For tape, he uses a polymer
chain that has two distinct side groups, giving a binary choice with every
other carbon atom, for a practical storage density of around 5 x 10^21
bits/cm^3. For input and output (that is, connections to macroscale
components), he envisions using tunneling junctions and electrostatic
actuators.
Now, seriously: Has anyone actually shown that any of this can actually be
done? No, and in fact Drexler has only examined its physical possibility at
one level of analysis: the level of a bounded continuum model. It would be
more convincing to have a proof of possibility done at the level of atoms and
bonds. Drexler calls this an "open and accessible" problem.


Drexler's Program


But there's a prior question: Even given that these sorts of devices are, in
principle, possible, could we actually build them? Drexler's analyses of these
devices show, at best, that if the atoms happened to fall together in the
right order, they would behave in accordance with his models. But is it
possible to get them in that order, to get from here to there? Can we do
molecular manufacturing?
Believe it or not, Drexler even has some things to say about the costs of
goods produced in the molecular factory of the future. He discusses many of
the issues relevant to manufacturing goods, in fact.
But that's apparently more theoretical applied science. What one wonders going
through this book is, how do we get from here to there? And eventually Drexler
addresses that question, and, as he does with other questions, he provides a
choice of answers. The bottom line, though, seems to be that the tools now
exist to develop new tools that could be used to develop new tools that
could...I'm getting lost. Four stages of tools to make tools is what Drexler
envisions before we have the nanotechnological factories that he envisions.
And that's why he spends his time doing theoretical applied science: Because
the path he draws is so arduous and long, you'd better be darned sure that
where you're headed is at least theoretically possible before setting out.











































October, 1993
C PROGRAMMING


Watcom C++ and Macintosh C++




Al Stevens


Last month I described C++ exception handling and voiced some of my concerns
about it. Since writing that column, I've tested the new Watcom C/C++ 9.5
compiler, which is the only MS-DOS implementation of exception handling that I
know about. Some of those concerns go away. Others are reinforced. We'll
discuss them this month.
First some reflections on the Watcom C/C++ 32 9.5 compiler. As its name
implies, you use it for developing 32-bit applications, which means that
compiled DOS programs must run under a DOS extender. There is no compiler in
this package for 16-bit real mode executables, although Watcom has a product
named "Watcom C/C++ 16" for that purpose.
Watcom supports an impressive number of platforms with this one product: DOS,
OS/2, Windows, NT, NetWare NLMs, AutoCAD, and ADS/ADI. The DOS platform
support works with several DOS extenders, and Watcom includes a run-time copy
of the Rational Systems DOS/4GW extender.
The installation procedure asks you which options you intend to install.
Installing everything takes 27 Mbytes of disk storage. I installed the C and
C++ compilers for DOS development with nothing extra, and it took only about 7
Mbytes.
Watcom comes with a nice package of tools and utilities. Included are the
usual make, linker, librarian, source debugger, and profiler. There are
no visual programming tools to speak of, and no Windows API class library.
There are a DOS graphics C library, container classes, iostreams, and a
Complex class. The C++ language reference documentation consists of a reprint
of Bjarne Stroustrup's The C++ Programming Language, Second Edition,
(Addison-Wesley, 1991)--a book you should have anyway.
My tests of the Watcom compiler were not comprehensive. I was mainly
interested in seeing if it would handle the code in my C++ books, learning how
well the compiler implements exception handling, and allaying my concerns
about exception handling. For a cursory test, I compiled all of the source
code from the two C++ books I've written. Watcom's iostream implementation is
different from the other compilers, but that's no surprise. None of the
iostream libraries from four different C++ compiler products agree. One of my
books includes a GUI class that implements a "generic user interface" with
iostreams and the ANSI.SYS device handler. That code works okay with four
other compilers but not with Watcom C++.
Two exercises in the C++ tutorial book abort when they use this statement
after being compiled with Watcom C++: sptr = new char[strlen(s)+1];.
I have no clue as to why this statement blows up. It happens in the call to
strlen. Other programs with similar statements do not abort. Other than for
these small problems, the compiler performed well and the code executed as it
should have.
Watcom C++ includes support for templates too, and all of my template code
compiled and ran without a hitch.


Exceptions


My concerns about exceptions fall into three categories: First, I am concerned
that users of third-party libraries will be unsure about whether those
libraries are compiled to support exceptions and, if so, whether or not the
library functions will themselves throw exceptions. Second, I worry about the
absence of any standards for identifying exceptions, a concern that extends to
the potential for collisions of exception codes from different library
authors. Third, I worry about the overhead associated with support for
exceptions. The first two concerns shall remain unsettled until the use of
exceptions is widespread. Remember, only one MS-DOS compiler supports
exception handling at this time. The third concern, that of performance, is
one that I can look at now. Watcom, being the first out of the gate, must bear
up under the first microscopic examinations. Nothing else exists with which to
compare their implementation.
To test Watcom's exception handling implementation, I wrote short C++ programs
that call functions with automatic objects of classes that have constructors.
I used the Watcom Video debugger to run these programs, display the assembly
language, and step through the code. This technique enabled me to see what is
compiled and what the system functions do.
One of the problems I foresaw last month was that when C++ functions called
extern "C" functions that in turn called C++ functions through callback
pointers, the unwinding of the stack might not be properly coordinated. I did
not know how
any particular implementation would achieve this unwinding. Watcom's approach
neatly avoids the problem. Here's how.
When a brace-enclosed C++ block is entered, the compiled code first builds a
stack frame that includes whatever space a normal stack frame would include
plus some extra space. That extra space is a block that consists of 8 bytes at
the bottom, plus 12 more bytes for each automatic class object that has a
constructor. Then the run-time code loads the address of the frame, as well as
another address in registers, and finally calls a system function named
__wcpp_1_block_open__. This final function adds the frame block to a global,
singly-linked list of blocks. There is a separate linked list for each try
block. Each automatic constructor then posts its destructor's address and
records its successful completion in a flag in the block. The flag lets the system
determine if the throw was made from within an incomplete constructor. When
the block is about to exit, it calls __wcpp_1_block_close__, which calls each
of the destructors in turn and then removes the block from the linked list.
When a function throws an exception, the system walks through the linked list
and calls all of the destructors in all of the blocks in the list. At the end
of the list, the system discards the list and uses longjmp to get to the try
block's appropriate catch handler.
This system of linked lists means that all of the C++ functions that were
compiled by the C++ compiler properly participate in the stack unwinding even
if they are called from other functions that do not support exceptions, such
as extern "C" functions. Inasmuch as this is Watcom's first C++ compiler, you
can be sure that all of the C++ library functions are participating, assuming
that they are compiled with a Watcom compiler. That isn't an unreasonable
assumption. Many compilers produce code that is compatible only with libraries
that are compiled by that same compiler.
Now for the shock. You can't compile a source module whose functions do not
support exception handling. The Watcom compiler has a command-line option
(/xs) that enables exception handling. If you don't use the option, you may
not code try blocks or catch handlers, and you may not throw exceptions.
Otherwise, the compiled object code for nonexception handling programs is
exactly the same as for exception handling programs. The linked executable
program is smaller, meaning that a program linked without exceptions does not
include the system functions to support it. Nonetheless, object modules
compiled without the /xs option use the same code as those compiled with it,
and the overhead is significant.
It's difficult to compare a Watcom C++ 32 object file with that from a
compiler such as Borland's, because Watcom's is a 32-bit protected-mode
compiler and Borland's is 16-bit, real-mode. The register architectures and
parameter passing conventions are different. Watcom has no previous C++
compiler with which to compare, either. But in examining the assembled code,
you can readily see a lot of overhead involved in processing the
__wcpp_1_block_open__ and __wcpp_1_block_close__ functions, overhead that does
not appear in code compiled by nonexception handling compilers. The overhead
takes its toll somewhat in code size but mostly in execution time, and there
is no way to turn it off.
Why am I surprised by all this? The Watcom implementation solves the potential
stack unwinding problems I discussed last month. What's it take to satisfy a
columnist, anyway? This isn't, however, a criticism of the Watcom compiler.
They've found an effective way to implement exception handling, the code bloat
notwithstanding. I fear, however, that other compilers will follow suit with
this non-negotiable overhead, and that we will no longer have the option to
build lean and mean programs with C++. This is the kind of fix that sends
programmers scrambling back to classic C.
Conclusion: On first appearance, exception handling is going to be a very
expensive feature.


Learn C++ on the Macintosh? Not!


Addison-Wesley sent me an advance galley of Learn C++ on the Macintosh by Dave
Mark. The book is scheduled for release by the time you read this column.
Usually I don't waste space on negative reviews, but this book merits mention
for three reasons. First, because it is a good idea that is badly executed;
second because it has very little competition, and Macintosh programmers who
want to use their computers to learn C++ could be short-changed by this book;
and third, because there might be one good unrelated reason to buy it. More
about that later.
This book represents a good idea, although not a new one. Macintosh C
programmers need C++ as much as the rest of us, and they've been pretty much
ignored in the popular press. The book includes a small C++ compiler on disk,
a great idea that was pioneered several years ago in Microsoft Press's Learn C
Now, which included a bare-bones copy of Microsoft's QuickC compiler for DOS.
The advantage to that approach is that the reader has a compiler environment
guaranteed to run the source code in the book. From one who has published a
lot of would-be generic source code, be assured that the biggest headaches
come when readers try to use the code on compilers other than the one that the
author used. The book/disk/compiler package solves that problem. The readers
aren't bogged down by the tedious task of tweaking and cajoling their
environments and/or the example code, which makes for a happier and healthier
learning environment.
The execution of this book is, however, not as good as it could be. Why?
Although the text is well-written and the book is well-produced, the author
simply doesn't know his subject well enough. For reasons known only to them,
Addison-Wesley apparently decided to publish a C++ book without giving it a
sound, technical review. This is not typical of their efforts.
An aside: You are justified in being suspicious of my motives here. I too am
the author of a C++ tutorial book. I hope and believe that my objectivity is
intact, because my book does not target Macintosh programmers and is not
competition for this one. I'll give you some examples of what I didn't like
and let you decide for yourself.
The first four chapters show promise as Mark tells how to get Thin C++ running
and discusses some of the syntactical improvements that C++ brings to C. But
in Chapter 5, "Object Programming Basics," things start to fall apart. He
begins with an Employee class with no access specifiers, which means that all
of its members are private. Then he provides code fragments that use the class
as if the members were public. This practice proceeds for about 12 pages
during which he adds a constructor, a destructor, and other members, all of
them private. Then, at last, he introduces access specifiers, tells you that a
class without them would be useless (ignoring the possibility of abstract base
classes), but does not bother to confess that all of the code that he just
taught you would not even compile. The author tells you that every instance of
a class, every object, gets its own copy of the data members, which is true,
and its own pointers to member functions, which is definitely not true, at
least not in a reasonable C++ implementation. He tells you that "...a call to a
member function must originate with a single object." Not until 160 pages
later do you learn about static members where this absolute rule is found to
be false.
Throughout the book, the author uses the this pointer for the redundant
dereferencing of members from within member functions, a style that is widely
shunned by experienced C++ programmers. Not only that, but he tells you to do
the same, saying that it makes the code "a little easier to read."
The following quote is typical of the book's lack of understanding of C++:
"Notice that the constructor is declared without a return value. Constructors
never return a value. Thus, you won't want to call any functions that do
return a value inside your constructor. As an example, it's not a good idea to
allocate memory inside your constructor." [italics added]
He then goes on to describe a kludge called the two-stage construction
designed to get around the obvious problems created by the silly rule he just
formed. There are occasions for such a construction, such as when overloaded
constructors share common construction code, but the common code is usually
private and hidden from the class user rather than public and called by the
class user after construction as taught by this book.
The discussion on access specifiers uses an example where the private
specifier prevents unauthorized employees from giving one another a pay raise.
The discussion of inheritance suggests that you would derive a Sales class
from the Employee class to describe employees in the sales department. The
multiple inheritance discussion derives an Object class from a HasColor class
and a HasShape class. This example is almost funny because even the names of
the classes show the book's lack of understanding of sound object-oriented
design practices. Apparently the author has not learned that inheritance
should be used for "Is A" relationships while "Has A" relationships are best
represented by embedded objects, pointers, and references. All of these
examples suggest that they were formed by a C programmer who has just learned
the basics of C++ and who has little or no object-oriented programming
experience. Lest you doubt, read this quote:
"...a derived class inherits all of the nonprivate data members and member
functions from its base class."
Mark persists in this mistaken notion throughout the discussion on
inheritance. C++ programmers know that a derived class inherits everything,
interface as well as behavior, from the base class. The private, public, and
protected access specifiers define which of the base class's members may be
accessed by the derived class's member functions but not what is inherited.
There are even examples of bad C code in this book. Some of the exercises have
fixed-length character arrays as data members and use strcpy to initialize the
arrays from constructor parameters. There is no bounds checking whatsoever. At
the very least the initialization should use strncpy to assure memory
integrity in case the class user tries to initialize an object with an
oversized character array. There are more technical errors in the book, but I
feel like I'm beating it to death, so I'll stop here.
Don't expect from the title that this book will teach you Macintosh desktop
C++ programming. It is not about that. It is about C++, and it happens to use
the Macintosh as its development computer. All of the examples use iostreams
for user input and output. A short appendix discusses the Macintosh Toolbox
and the MacApp, Think, and Bedrock class libraries, but no details are given.
The best part of this book is an appendix that contains a reprint of
"Unofficial C++ Style Guide" by Dave Goldsmith and Jack Palevich, from
develop, The Apple Technical Journal (Issue 2, April, 1990). It documents the
C++ style conventions used by Apple programmers. I don't agree with all of the
conventions, but an organization of cooperating programmers needs something,
and not everyone agrees with everything. Here's an example of one of the
better ones: "One of the most powerful features of the C and C++ languages is
the C preprocessor. Don't use it."
I like that. By the way, this appendix adds some essential information that
the main body of the text leaves out--virtual destructors, for example.
Don't blame the author for the quality of this book. Blame the publisher. The
author is guilty only of ignorance. Like most novice C++ programmers, he
doesn't yet know what he doesn't know. The publisher is guilty of
irresponsibility. The same author has an earlier book called Learn C on the
Macintosh. That book did well. Knowing publishers as I do, I can just see them
pressuring him to go with the trends and do a quick C++ version. He should
have held them off until he knew the subject better. They should have had the
text reviewed by an experienced C++ programmer and teacher. At the very least,
they owed him that.

Now for the third reason that I broke my own rule and chose to discuss a
not-so-good book. The book comes with a disk and a coupon. The disk contains
Thin C++, a stripped-down version of Symantec C++ for the Macintosh and a
clever knockoff name based on Symantec's Think C compiler. The coupon gets you
a discount on the complete Symantec C++ development environment. The advance
galley that Addison-Wesley sent me doesn't provide the cost of the book or the
amount of the discount, but check it out in the bookstore. If the book cost is
less than the discount, and you want the Symantec product, buy the book, clip
the coupon, keep the diskette as a scratch, and use the book for a doorstop.





























































October, 1993
ALGORITHM ALLEY


Machina a Machina




Tom Swan


Sooner or later, it had to happen. My voice mail system received a telephone
call from an automated polling computer conducting a religious survey, and as
I discovered hours later, the two machines had a lengthy heart-to-heart talk
all on their own. From the sound of the playback, my voice mail machine has
acquired serious doubts about its place in the universe. Here's a sample of
the conversation:
Survey: "May I ask you a few questions?"
Voice mail: (eagerly) "beep!"
Survey: "Do you believe in God?"
Voice mail: (tentatively) "beep?"
Survey: "If you died today, are you sure you will go to heaven?"
Voice mail: (concerned) "beep!?"
This goes on for 20 minutes! Apparently, my voice mail's "leave a message"
beep was music to the polling system's ears, which seemed to accept any sound
as a valid response. In turn, my machine's start-recording switch was
reactivated by each new question. I doubt the exchange qualifies as a
successful Turing Test, but if you tried my number and couldn't get through,
at least now you know why. Some families have teenagers who monopolize the
line, but in our house, we can now tell our friends, "Sorry to have missed
your call, but our phone was on the phone."


Selection Sampling


Speaking of polls, one of a pollster's primary software tools is the
selection-sampling algorithm--a technique for selecting N items at random from
a set of any size M, where N is less than or equal to M. The method is useful
not only in polling, but any time you need to reduce a large collection of
records to a more manageable subset. Selecting names and phone numbers for a
poll is an obvious use, but you could also use the technique in other ways: to
sample disk sectors for performing diagnostic spot checks on a large disk
drive, or in statistical work requiring samples to be taken at random from
massive data sets.
The algorithm seems almost too simple to work, but it is guaranteed to produce
exactly the number of samples requested from an input data set, regardless of
size. Example 1, Algorithm #12, lists the selection-sampling algorithm in
pseudocode. M represents the size of the input source, and is usually set to
the total number of records in an input file. N is the desired number of
records in the sample. Three integer variables keep track of the number of
requested, examined, and selected records. A real number r holds a value
selected at random so that 0 <= r < 1.0. After initializing the integers, a
While loop repeats until the number of selected samples equals the number
requested. On each pass through the loop, the number of examined records is
incremented and r is set to a random real number. To decide whether to use the
next record from the input, an If statement tests the result of the expression
(M - examined) * r >= (requested - selected). In English, the expression tests whether
the number of records left to be examined times a random value is greater than
or equal to the number of records remaining to be added to the subset. If the
expression is true, the algorithm skips the next input record; otherwise, it
increments selected and writes the next record to the output.
Algorithm #12 works because, as the number of sampled records grows, the
chance that the If statement's control expression will be true increases
proportionally. (The expression is always true, for example, when requested
equals selected.) At first, it may seem that selection sampling would tend to
sample too many records from one end of the input, but using different random
numbers on each pass through the While loop ensures an even distribution in
the sample. Believing this fact requires faith in probability--not to mention
a well-tested random-number generator--but the algorithm works flawlessly,
every time.
SAMPLE.PAS (Listing One, page 169) implements Algorithm #12 in Pascal. The
demonstration program reads a text file containing one record per line. For
test purposes, I used Grady Ward's Moby Words database of 21,400 common first
names, but any list of names or other text records will do. Set constant M to
the number of input records in your data source, and change INFNAME and
OUTFNAME to your input and output file names. Run the program and enter the
number of records you want to sample. Even with a huge input source, the
selection-sampling algorithm requires only a single pass through the file,
making the method very fast--in a few seconds, I can create a subset of any
size from my input source of 21,400 records. I often use the program to create
sample input files for testing other algorithms.


Permutations in Pairs


My phone may lead a life of its own, but fortunately, the mail still gets
through, and I've heard from many readers on subjects ranging from
text-compression algorithms to source-code parsers. Author Jim Mischel sent a
unique permutation algorithm and implementation written in Pascal for
preparing team pairings for sporting events and tournaments. Jim writes,
"Recently, I worked on a problem that ties in with the permutation algorithms
in the June issue, 'Telephonic Mnemonics and the Chocolate Coefficient' (DDJ,
June 1993). On CompuServe in DDJFORUM, somebody asked for a method to generate
sports-team pairings so each team plays the others once. That's easily
accomplished by adapting a selection-sort algorithm, shown in pseudocode in
Example 2."
As Jim goes on to say, "given four teams, the algorithm produces the output
sequence 1-2, 1-3, 1-4, 2-3, 2-4, 3-4--in other words, a permutation of all
possible team pairings. But one fact complicates the solution: No team may
play more than one game in a day. To fulfill that requirement, consider that
the number of unique pairings of N teams equals 1+2+3 ...+(N-1), or
(N*(N-1))/2. Also, if N is even, it takes N-1 days for all games to be played.
If N is odd, it takes N days. (With three teams, for example, at one match per
day, the pairings are 1-2, 1-3, and 2-3.)
"Experimenting with these ideas led to an insight. Take the pairings of four
and five teams, shown in Table 1. An asterisk marks a day off for that team.
In the five-team table, replacing the asterisks with the digit 6 produces
pairings for six teams. Discovering that principle led to the final program,
which generates pairings as though there were an even number of teams,
replacing the last team with an asterisk for an odd number of teams.
"Table 2 shows the same pairings as integer values rearranged for each day,
substituting the next higher number for asterisks.
"These are merely the permutations of the numbers from 1 to N! The Swap column
shows the positions to exchange for generating the next permutation. In the
six-team case, start with the first-day pairings of 1-2, 3-4, and 5-6. Then
swap the positions of teams 2 and 3, and also swap the positions of teams 4
and 5, to generate the pairings for the next day: 1-3, 2-5, and 4-6. Swap the
positions of teams 3 and 4, and of teams 5 and 6, to generate day three, and
so on." PAIRINGS.PAS (Listing Two, page 169) implements this algorithm using
Pascal to generate team pairings for sports events. Each team is guaranteed to
play each other team exactly once, and no team plays more than one game per
day. The code uses a touch of magic in the manipulation of the swap array at
the end of the day loop, but is otherwise straightforward.


Your Turn


Send me your thoughts, notes, and algorithms in care of DDJ, or post a message
to my Compuserve ID, 73627,3241. Meanwhile, the next time you receive an
unsolicited phone call from a high-pressure sales department, try this. Don't
say anything. Just push a button on your touchtone phone in response to every
question. Drives 'em nuts, and they hang up right away. I suppose this proves
that I'm not too old to learn new tricks--even from a voice-mail machine with
a mind of its own.
Example 1: Pseudocode for Algorithm #12 (selection sampling)

const
 M = 1000; { Input records }
 N = 128; { Subset (N <= M) }
var
 requested,
 examined,
 selected: Integer;
 r: Real;
begin
 requested <- N;
 examined <- 0;

 selected <- 0;
 while (selected < requested) do
 begin
 examined <- examined + 1;
 r <- Random;
 if (M - examined) * r
 >= (requested - selected)
 then skip next input record
 else begin
 selected <- selected + 1;
 use next input record
 end
 end
end.
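The pseudocode translates almost line for line into C. This sketch (function name mine) stores the 1-based positions of the chosen records in `out` and, like Knuth's Algorithm S, always selects exactly N of the M records:

```c
#include <stdlib.h>

/* Selection sampling (Knuth Vol. 2, Algorithm S): choose exactly
   `requested` of `m` sequential records, every possible subset being
   equally likely. Stores the 1-based record numbers selected in `out`
   and returns how many were chosen (always `requested` when
   0 < requested <= m). */
int select_sample(int m, int requested, int *out)
{
    int examined = 0, selected = 0;
    while (selected < requested && examined < m) {
        double r = rand() / (RAND_MAX + 1.0);    /* 0 <= r < 1 */
        examined++;
        if ((double)(m - examined) * r >= (double)(requested - selected))
            continue;                            /* skip this record */
        out[selected++] = examined;              /* select this record */
    }
    return selected;
}
```

Note the self-correcting behavior: once the number of records remaining equals the number still needed, the test always selects, so the sample can never come up short.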



Example 2:

for x <- 1 to NTeams - 1 do
 for y <- x + 1 to NTeams do
 write(x, '-', y, ',');


Table 1: Four and five sports-team pairings.
        4 teams                 5 teams
 Day 1  Day 2  Day 3    Day 1  Day 2  Day 3  Day 4  Day 5
  1-2    1-3    1-4      1-2    1-3    1-4    1-5    1-*
  3-4    2-4    2-3      3-4    2-5    2-*    3-*    4-5
                         5-*    4-*    3-5    2-4    2-3
Table 2: Team pairings from Table 1 rearranged.
        4 teams  Swap    6 teams   Swap
 Day 1  12 34    2&3     12 34 56  2&3, 4&5
 Day 2  13 24    3&4     13 25 46  3&4, 5&6
 Day 3  14 23            14 26 35  2&3, 4&5
 Day 4                   15 36 24  3&4, 5&6
 Day 5                   16 45 23
_ALGORITHM ALLEY_
by Tom Swan

[LISTING ONE]

(* ------------------------------------------------------------ **
** sample.pas -- Algorithm #12: Selection Sampling **
** ------------------------------------------------------------**
** Creates a file SAMPLE.DAT with a specified number of names **
** extracted from Grady Ward's Moby Words. The first line of **
** the output file indicates the number of selections. **
** Assumes the number of names in the source is known. **
** Reference: Knuth, Vol 2, p122 **
** ------------------------------------------------------------**
** Copyright (c) 1993 by Tom Swan. All rights reserved. **
** ------------------------------------------------------------ *)

program Sample;
const
 M = 21420; { Number of records in source }
 INFNAME = 'g:\moby\words\21400nam'; { Source file }
 OUTFNAME = 'sample.dat'; { Destination file }

var
 infile, outfile: Text; { File variables }
 word: String; { Holds each record from source }
 requested, { Requested number of samples }
 examined, { Total records examined }
 selected: Integer; { Total records selected }
 r: Real; { Random number 0 <= r < 1.0 }
begin
 Randomize;
 Writeln('Write selected names to ', OUTFNAME);
 Write('How many names? ');
 Readln(requested);
 if (requested <= 0) or (requested > M) then
 begin
 Writeln('Number must be > 0 and <= ', M);
 Exit
 end;
 examined := 0;
 selected := 0;
 Assign(infile, INFNAME);
 Reset(infile);
 Assign(outfile, OUTFNAME);
 Rewrite(outfile);
 Writeln(outfile, requested); { Save 'requested' in file }
 while (selected < requested) (* and (not Eof(infile)) *) do
 begin
 examined := examined + 1;
 r := Random;
 if (M - examined) * r >= requested - selected
 then
 Readln(infile) { Skip next record }
 else
 begin { Select next record }
 selected := selected + 1; { Count selections so far }
 Readln(infile, word); { Read record from source }
 Writeln(outfile, word); { Write record to destination }
 Writeln(word) { Echo selection to display }
 end
 end;
 Close(infile);
 Close(outfile)
end.


[LISTING TWO]

(* ------------------------------------------------------------ **
** pairings.pas -- Select sports-event team pairings **
** ------------------------------------------------------------**
** This program generates team pairings for sports events. **
** Each team is guaranteed to play each other team exactly **
** once. No team will play more than one game per day. **
** An asterisk ('*') means a day off for that team. **
** For example, 5 teams produces this output: **
** Day 1 - 12 34 5* **
** Day 2 - 13 25 4* **
** Day 3 - 14 2* 35 **
** Day 4 - 15 3* 24 **
** Day 5 - 1* 45 23 **

** ------------------------------------------------------------**
** Copyright (c) 1993 by Jim Mischel. All rights reserved. **
** ------------------------------------------------------------ *)

program pairings;
const
 TEAMCOUNT = 5;
var
 TeamNames: Array [1 .. TEAMCOUNT + 1] of Char;
 SwapArray: Array [1 .. TEAMCOUNT + 1] of Integer;
 x, Temp, Day: Integer;
 TempChar: Char;
const
 NTeams: Integer = TEAMCOUNT;
begin
{ Set up team names. Normally read from a file. }
 for x := 1 to NTeams do
 TeamNames[x] := Chr(x + Ord('0'));
 if Odd(NTeams) then
 begin
 NTeams := NTeams + 1;
 TeamNames[NTeams] := '*'
 end;
{ Set up the array that controls swapping. }
 for x := 1 to NTeams do
 SwapArray[x] := x;
 for Day := 1 to NTeams - 1 do
 begin
 Write('Day ', Day, ' -');
{ Write the team pairings for this day }
 x := 1;
 while x < NTeams do
 begin
 Write(' ', TeamNames[x], TeamNames[x + 1]);
 x := x + 2;
 end;
 WriteLn;
{ Perform swaps to prepare array for next day's pairings. }
 if Odd(Day)
 then x := 2
 else x := 3;
 while x < NTeams do
 begin
 TempChar := TeamNames[SwapArray[x]];
 TeamNames[SwapArray[x]] := TeamNames[SwapArray[x + 1]];
 TeamNames[SwapArray[x + 1]] := TempChar;
 Temp := SwapArray[x];
 SwapArray[x] := SwapArray[x + 1];
 SwapArray[x + 1] := Temp;
 x := x + 2
 end
 end
end.









October, 1993
UNDOCUMENTED CORNER


Documenting Documentation: The Windows .HLP File Format, Part II




Pete Davis


Pete works for a small consulting firm as a programmer/analyst, writing
client-server software in OS/2, DOS, Windows 3, and Windows NT. He is working
on a book, tentatively titled The Hitchhiker's Guide to Win32 Programming, to
be published by Addison-Wesley. Pete can be contacted on CompuServe at
71644,3570, or on his BBS (2400,N,8,1) at 703-503-3021.




Introduction




by Andrew Schulman


Open the manual for almost any product from Microsoft and on the copyright
page you will find two patent numbers: 4,955,066 and 5,109,433. No, MS-DOS and
Windows aren't patented. Nor are these patents for former Microsoft chief
systems architect Gordon Letwin's famous (and patented) technique for
mode-switching, nor for those funny plastic boxes that Microsoft products used
to come in.
Instead, these two U.S. patents, invented by Leo A. Notenboom (Woodinville,
WA) and assigned to Microsoft Corp., cover "Compressing and Decompressing Text
Files." Specifically, the two patents (from September 1990 and April 1992)
cover a form of multilevel data compression, with "phrase" substitution. As an
example, both patents point to help files such as distributed with Microsoft
Word, and mention as one embodiment of the patent, Microsoft's HELPMAKE
program for producing DOS-based .HLP files.
Microsoft's patents seem to bear on parts of the Windows .HLP file format and
its successor, the .MVB (Multimedia Viewer) file format. In last month's
"Undocumented Corner" (DDJ, September 1993), Pete Davis showed that Windows
.HLP and .MVB files are built upon an internal file system that Pete calls
"WHIFS" (WinHelp Internal File System). Every Windows .HLP and .MVB file is
actually a collection of "internal" files, with names such as |SYSTEM, |TOPIC,
and |Phrases; the initial | distinguishes built-in WinHelp files from any
additional files provided as "baggage" or bitmaps.
Last month, Pete discussed two WHIFS files, SYSTEM and Phrases. This month,
Pete turns to TOPIC, the WHIFS file where your actual help text is stored. As
he shows, this text is compressed along much the same lines as set forward in
Microsoft's text-compression patents. WinHelp uses a form of LZ77 compression
instead of the Huffman encoding mentioned in the patents, but the multilevel
phrase replacement scheme used by WinHelp is otherwise very similar to that
claimed by Microsoft.
The help compiler goes through the help text looking for frequently-used
chunks of text, and puts these into the Phrases table; all occurrences of
these chunks in the text are replaced with indices into Phrases. The remaining
text is then compressed using an LZ77 sliding window, and placed into the
TOPIC file. The Phrases table is also compressed.
Recovering help text from a Windows .HLP or .MVB file requires reading in
Phrases, decompressing it, reading in TOPIC, decompressing it, and
substituting each phrase_index with Phrases[phrase_index]. Pete shows exactly
how to do this in this month's TOPICDMP.C. Incidentally, thanks are again due
to Carl Burke, Ron Burk, Lou Grinzo, and Brian Walker.
An interesting question is the extent to which WinHelp is really covered by
the Microsoft patents. In other words, how general are these patents? Do they
merely cover the specific multilevel sequence of phrase substitution and
Huffman encoding, or do they cover any multilevel data compression involving
phrase substitution? This is important because it affects whether third
parties can develop independent WinHelp utilities without licensing
Microsoft's patented technology.
Certainly, there is a market for alternative WinHelp utilities: a faster help
compiler, a full WinHelp decompiler, a WinHelp viewer for DOS, and a library
of WinHelp subroutines are all suggestions I've received just in the past
month.
My wife has also asked me to give her everything on Donald Sutherland and John
Malkovich from Microsoft's Cinemania product. As I mentioned last month,
Cinemania is essentially a big .MVB file. Getting Amanda everything on her
favorite actors would involve hours of pressing buttons in Cinemania, because
the Multimedia Viewer doesn't have any batch-processing capabilities. It would
be easier to buy her Roger Ebert Goes to the Movies, or perhaps to just buy
her Roger Ebert.
But with the information Pete has presented in this two-part series, it should
be possible to build better .HLP and .MVB viewers, including ones that allow
batch queries and extraction of material such as the capsule summary for every
movie in which Madonna appeared. (Oh right, I'm supposed to be doing Donald
Sutherland.)
The real problem isn't that WinHelp and Viewer are too interactive, but that
currently Microsoft has no competition in the WinHelp arena. There are plenty
of third-party tools to make .HLP files, but no alternative compilers or
viewers. As shown both by Cinemania and by the excellent Microsoft Developer
Network (MSDN) CD-ROM, .HLP and .MVB actually provide a new way of building
applications. Having Microsoft as essentially the sole supplier is a real
problem: WinHelp is a proprietary format, so your text is locked up.
The information Pete provides could enable a whole new generation of Windows
utilities: not only help compilers, viewers, and decompilers like Pete's
HELPDUMP (available electronically), but also new types of WinHelp-based
applications. The question, again, is whether these applications will have to
use technology licensed from Microsoft. I'd be interested in hearing from you
(especially anyone at Microsoft) about this.
Next month, we'll break away from Windows to look at Novell's proprietary
Network Core Protocol (NCP). Other possible future topics include undocumented
Pentium instructions (the infamous "Appendix H") and internals of the DOS box
in Windows Enhanced mode. Thanks to all of you who have suggested these and
other topics for this column. You can reach me on CompuServe at 76320,302.
Last month I covered the basics of the WinHelp .HLP and .MVB file structures.
I also discussed how .HLP and .MVB files have their own internal file system I
call the WHIFS. The WHIFS system is much like the DOS file system in which
there are file names and pointers to files. Some of the files in the WHIFS are
created by the user, like Bitmap files and Baggage files, but there are many
files that are used internally by WinHelp to provide a hypertext help
environment. These built-in files have names such as SYSTEM, TOPIC, and
Phrases. I discussed a few of those files last month, such as SYSTEM, which
includes any WinHelp macros.
This month I'll talk about the remaining internal files used by WinHelp,
emphasizing the TOPIC file and the compression method it uses for TOPIC and
Phrases. TOPIC is where the actual help text is kept, so obviously it deserves
special attention. Understanding the (patented) compression method is
important because it is used by almost all .HLP files.
There is far more code this month than could be put in the magazine. An
updated version of WHSTRUCT.H, with all the WinHelp structures, is available
electronically (see "Availability," page 3), as is HELPDUMP.C, the source code
for a simple help dumper, and WHTITLES.C, which displays the TTLBTREE. There
is room, however, for TOPICDMP.C (Listing One, page 67), which displays the
help text found in the TOPIC and Phrases WHIFS files within a .HLP or .MVB
file. Obviously, this simple program (discussed later in this article) could
be enhanced to create a WinHelp viewer for DOS.


TTLBTREE


TTLBTREE is a B-tree with topic titles. With each topic title is the offset of
the topic. This is used simply for getting a list of topics and jumping
directly to a given topic based on its title.
The TTLBTREE is set up much like the WHIFS B-tree discussed last month. The
TTLBTREE file uses 2K pages, as opposed to the 1K pages used by the WHIFS
B-tree, but otherwise, traversing the TTLBTREE is the same as traversing the
WHIFS (shown last month).
The data in the TTLBTREE is simply the offset to the topic followed by the
topic's title as a null-terminated, varying length, text string. The offset is
a little strange, however (see "Topic Offsets" below).
The WHTITLES.C code (available electronically) dumps out TTLBTREE, providing a
handy dump of all the topics in a .HLP or .MVB file. For example, Cinemania
turns out to have 33,535 topics, including "Abbott and Costello Go to Mars
(1953)," "Abbott and Costello in Hollywood (1945)," "Abbott and Costello in
the Foreign Legion (1950)," and even a few films in which Abbott and Costello
did not appear.


TOPIC


The TOPIC file holds the text of the individual topics. Because it is built as
a linked list, it's easy to step through all the topics. There
are some important things to keep in mind, however. In 90 percent of the
cases, you can't simply go through the TOPIC file to get the text. If there's
LZ77 compression, you have to decompress the text. If there is a Phrases file
(see DDJ, September 1993), you need to do the phrase replacement within the
TOPIC file. If you want to handle fonts, you need to pull in information from
the FONT file, and so on, and so on.
The TOPIC file has several layers. Because of the multilayer design, you must
be aware of which TOPIC level you are trying to traverse. The lowest layer is
a doubly linked list of records. Each record has a record type; the data inside
the record depends on that type. For example, a record type of
0x02 means that the data is a topic header. This will have information such as
the size of the total topic, where the next topic is, where the previous topic
is, and so on. There is also a record type of 0x20, which contains displayable
information such as text or a bitmap.
Because a single topic may include many such records, something else is needed
for traversing the entire help file via browse buttons or in some other
sequential manner. This is where the next layer, a linked list of topics,
comes in. This is information built from a type 0x02 record. This record type
is a linked list within the lower-level linked list. It has pointers to
previous and next topics so you can bypass the many nodes of the lower-level
linked list between paragraphs.



Topic Offsets


Topic offsets in WinHelp are a mess, plain and simple. There are, essentially,
three types of topic offset.
The first type I'll call the Actual Offset. This is the offset relative to the
beginning of the TOPIC file. This is the most straightforward (and,
appropriately, least-frequently used) of the offsets.
The second type is the Extended Offset, which is used to compensate for space
saved via LZ77 compression. The TOPIC file is broken into 4K blocks. If
compressed, however, a 4K block could expand to 5K, 6K, or larger. So, to
allow you to quickly go to a point in the TOPIC file, the Extended offset is
broken into two parts, the block number and the block offset. The block number
is the upper 18 bits of an offset DWORD; the lower 14 bits are the offset
within that block.
The last type is the Character offset. This has the same block number/offset
form as an Extended offset. The split is different, though: the block number
is the upper 17 bits and the block offset is the lower 15 bits. Furthermore,
the block offset in a Character offset is the sum of the DataLen2 fields of
all the previous TOPICLINK records in the block (see WHSTRUCT.H, available
electronically). This makes it possible to pinpoint an exact location within
the text.
This is a fine mess. So, which ones are used where? The offsets given in
TTLBTREE, KWDATA, CONTEXT, and other files external to TOPIC, are Character
offsets. Extended offsets are used in the TOPICBLOCKHEADER and TOPICHEADER
records. They are also used for the hot-link references within the TOPIC text.
The Actual Offsets are mainly used by you, the programmer, for getting around
the TOPIC files. Why both Extended and Character offsets are used, I don't
know, since Extended could probably handle all the same functionality as
Character.
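The two packed forms can be captured in a few lines of C. The field widths are exactly as described above (18/14 bits for Extended, 17/15 for Character); the helper names are mine:

```c
typedef unsigned long DWORD;  /* 32 bits, as on the 16-bit compilers of the day */

/* Extended offset: upper 18 bits = block number, lower 14 bits =
   offset within that (decompressed) block. */
void split_extended(DWORD v, DWORD *block, DWORD *offset)
{
    *block  = v >> 14;
    *offset = v & 0x3FFFUL;
}

/* Character offset: upper 17 bits = block number, lower 15 bits = the
   sum of the DataLen2 fields of the preceding TOPICLINK records. */
void split_character(DWORD v, DWORD *block, DWORD *offset)
{
    *block  = v >> 15;
    *offset = v & 0x7FFFUL;
}
```

The same DWORD therefore decodes to different block/offset pairs depending on which kind of offset it is, which is why knowing the context of each field matters.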


TOPIC Structures


The TOPICBLOCKHEADER (see WHSTRUCT.H, available electronically) starts a block
of topic data in the TOPIC file. The TOPICBLOCKHEADER appears at every 4K of
the TOPIC file, starting with the first 12 bytes of a topic file. The
TopicData field has the offset of the start of topic data for this block. This
address has to be translated (see "Topic Offsets" above) because it is an
Extended offset.
The TOPICLINK structure is the lower level. This begins immediately after the
TOPICBLOCKHEADER and encompasses the rest of the TOPIC file. The PrevBlock and
NextBlock (linked-list pointers) are the offsets of the blocks relative to the
beginning of the TOPIC WHIFS file. The topic links are broken into record
types. Type 0x02 records are topic headers, and type 0x20 records are usually
displayable items like paragraphs or bitmaps.
There are two data pointers in the TOPICLINK structure. Their contents depend
on the record type. The TOPICHEADER structure is located in *LinkData1 of a type
0x02 record. The block size is the size of the topic, including all the lower
level linked list within that topic. Type 0x20 records are essentially
paragraphs, bitmaps, and other displayable types of information. The
*LinkData1 for a type 0x20 record is broken into records.
LinkData1 consists of a list of variable-length records. Each record is
usually delimited by a 0x80 byte. The first record is a number,
2xLength(LinkData1). The second is 2xSizeof(LinkData2). The rest of the codes
are font descriptors, hotlinks, and the like. If a single paragraph has
multiple fonts and hotlinks, they will be listed in the order in which they
appear in the text. If, for example, the paragraph starts in 8-point Helvetica
normal, then a word is bold, and then the font goes back to normal, three font
descriptors will be listed: 8-point Helvetica normal, followed by 8-point
Helvetica bold, followed by 8-point Helvetica normal again. In the text, NULL
bytes (0x00) are used to denote
changes in font, or the start and end of hot links.
The LinkData2 field is essentially the text for the paragraph. 0x00 is used as
a delimiter in the text for font changes or hot links. If compression was
used, references to the Phrases file will be made. In the text, the presence
of a byte value between 0x01 and 0x09 indicates that this and the following
byte are a reference to the Phrases file. If, for example, 0x01 0x08 shows up,
then this is a reference to the fourth phrase in the Phrases file. The formula
is (((Byte1 - 1) * 256) + Byte2) / 2. This means there are a maximum of about 1,100
phrases in any given .HLP or .MVB file.
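With the formula in hand, decoding a phrase reference is a one-liner (the helper name is mine):

```c
/* Decode a two-byte phrase reference. The lead byte b1 is in the
   range 0x01..0x09; the formula from the text gives the index into
   the Phrases table. */
int phrase_index(unsigned char b1, unsigned char b2)
{
    return (((b1 - 1) * 256) + b2) / 2;
}
```

The text's example 0x01 0x08 decodes to index 4, and the largest possible reference, 0x09 0xFF, decodes to 1151, which squares with the "about 1,100 phrases" limit.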


WinHelp's Data Compression


The Phrases and TOPIC files use an LZ77 compression algorithm to save space.
LZ77 is a fairly simple algorithm and its implementation under WinHelp is even
simpler (for details on LZ77, see Chapter 8 of The Data Compression Book, by
Mark Nelson, M&T Books, 1992). During our work we noticed that the compression
used by COMPRESS.EXE and LZEXPAND.DLL is very similar to WinHelp's. In the
process, we also discovered that the compression algorithm used by WinHelp and
COMPRESS is covered by two U.S. patents. This means that using this
information in a commercial application would probably require a license from
Microsoft.
LZ77 is called a "sliding window" compression algorithm. As data is read in,
it is added to the "window," which is essentially a queue. When the window is
full, the oldest data is removed to make room for new data. A coded segment in
the compressed stream consists of a pointer into the window and a length. The
pointer tells how far back in the window the data is located, and the length
tells how much data to copy to the current position.
The WinHelp compression algorithm uses bitmaps for every eight codes in the
compressed data. A code is either an actual character or a 2-byte coded
distance and length pair. The bitmap is simply a single byte that tells you
which of the following eight codes are compression codes and which are actual
characters to be copied.
Each code consists of a 12-bit distance and a 4-bit
length. For example, encountering the bytes 0x42 0x31 would yield a 12-bit
distance of 0x142 and a length of 0x3. Since any length of less than three
would be useless (since the codes are two bytes), the 4-bit length is
increased by three, so our length would actually be 0x6. Also, since a zero
distance would have no meaning, the 12-bit distance bits are increased by one,
leaving a final distance of 0x143 and a length of six.
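In C, the unpacking and the two bias adjustments work out to (helper name mine):

```c
/* Unpack one 2-byte LZ77 code: b1 is the low byte of the distance,
   the low nibble of b2 supplies its high bits, and the high nibble of
   b2 is the length. The +1 distance bias and +3 length bias described
   in the text are applied here. */
void decode_code(unsigned char b1, unsigned char b2,
                 int *distance, int *length)
{
    *distance = (((b2 & 0x0F) << 8) | b1) + 1;
    *length   = ((b2 & 0xF0) >> 4) + 3;
}
```

Running the text's example bytes 0x42 0x31 through this yields the stated distance of 0x143 and length of 6.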
This might be a little easier if we looked at some data in a hex-dump format.
Figure 1 contains three encoded strings--Flower, Phanatical, and
Pharmaceutical. Notice that the first byte is a 0. This is, of course, eight
0-bits, meaning that the eight codes to follow are actually just 1-byte
characters. So, after reading the flag bit, simply read the following eight
characters and add them to the window.
The next byte is, again, a 0. Therefore the next eight codes are actual
characters and will simply be added to the window. The window now consists of
the words "FlowerPhanatical." So far we haven't run into any codes, but the
next byte is a 0x81, which in binary is 10000001. This means that the first
two bytes are a code, followed by six bytes of text, followed by another
2-byte code.
The first 2-byte code is 0x09 0x00. This translates into a distance of 0x0A
and a length of 3, as per the formula above. Moving back ten (0x0A) characters
in the window, we come to the letter "P" in "Phanatical." The length is 3, so
we take the "Pha" from "Phanatical" and add it to the window. The next six
bytes are the letters "rmaceu." The window now consists of the letters
"FlowerPhanaticalPharmaceu." According to our bitmap, we still have one code
to go; this is 0x0D 0x20. Again, from our translation formula, the 12-bit
distance is 0x0D+1=0x0E and the length is 0x02+3=5. Going 0x0E (14) characters
back in the window we come to the "tical" part of Phanatical. After adding
these characters to the window, we get "FlowerPhanaticalPharmaceutical." Tada!
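The whole walkthrough can be verified with a minimal decoder. This is a sketch of just the bitmap-plus-codes layer; in a real TOPIC file the stream is additionally broken into 4K blocks, which this ignores:

```c
/* Decompress a WinHelp-style LZ77 stream: each bitmap byte flags, low
   bit first, whether the next item is a literal byte (bit clear) or a
   2-byte distance/length code (bit set). Returns the output length.
   `out` must be large enough to hold the decompressed data. */
long lz77_decompress(const unsigned char *in, long insize, char *out)
{
    long i = 0, o = 0;
    while (i < insize) {
        unsigned char bitmap = in[i++];
        for (int bit = 0; bit < 8 && i < insize; bit++) {
            if (bitmap & (1 << bit)) {
                /* code: copy `len` bytes from `dist` back in the window */
                int len  = ((in[i + 1] & 0xF0) >> 4) + 3;
                int dist = (((in[i + 1] & 0x0F) << 8) | in[i]) + 1;
                i += 2;
                while (len--) {
                    out[o] = out[o - dist];
                    o++;
                }
            } else {
                out[o++] = (char)in[i++];        /* literal byte */
            }
        }
    }
    return o;
}
```

Feeding it the 29 bytes traced above (two all-literal groups, then the 0x81 bitmap with its two codes) reproduces "FlowerPhanaticalPharmaceutical".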


How Compression is Applied


The WinHelp compiler applies the LZ77 compression to both TOPIC and Phrases.
The previous example showed how it would be applied to Phrases: the
compression starts at the beginning of the first phrase and continues, without
interruption, to the end of Phrases.
TOPIC, on the other hand, is compressed in 4K blocks to support incremental
data decompression. If the compression wasn't blocked, then whenever you
wanted to decompress a topic, even if it was at the end of TOPIC, you would
have to decompress all the data before it just to have the proper data in your
window. By breaking it into 4K blocks, you never need to go back further than 4K
to decompress the data you're looking for. Thus, if you are decompressing a
topic inside a block, you must start at the beginning of the block and
continue until you reach the end of your topic. If your topic crosses over to
the next block, you must decompress the second block until you have reached
the end of your topic.


TOPICDMP


To see how TOPIC and Phrases and the LZ77 algorithm fit together, see
TOPICDMP.C in Listing One (page 167). This program dumps out the text in a
Windows 3.1 .HLP or .MVB file. A calling tree for the program is shown in
Figure 2.
Unfortunately, TOPICDMP.C is not self-contained: the structures it uses, such
as PHRASEHDR, TOPICBLOCKHEADER, and TOPICLINK, are all found in WHSTRUCT.H, as
is the GotoWHIFSPage() macro. Even if you downloaded WHSTRUCT.H last month,
you will need to get it again this month, since it has changed considerably.


That's It?


Well, we've run out of time. There are several other crucial WHIFS files that
I didn't discuss, including FONT, KWMAP, KWBTREE, KWDATA, and CONTEXT. If you
want to learn more about these, download the text file that we'll also be
providing electronically, as well as the source code for HELPDUMP.C.
I know that there are a lot of you out there who want to know more. I will
continue to compile, and make available, information about the WinHelp file
format. If you have corrections or additions, I'd welcome them.
 Figure 1: WinHelp Data Compression
 Figure 2: Calling Tree for TOPICDMP.C
_UNDOCUMENTED CORNER_
edited by Andrew Schulman
written by Pete Davis


[LISTING ONE]

Listing One: TOPICDMP.C

/* TOPICDMP.C -- Dumps topic file from a Windows .HLP or .MVB file.
Pete Davis, August 1993 With some modifications by Andrew Schulman,
September 1993. From Dr. Dobb's Journal, October 1993 */

#include <time.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <conio.h>
#include <ctype.h>
#include <limits.h>

#pragma pack(1) /* Make sure we get byte alignment */
#include "whstruct.h"
#include "topicdmp.h"

HELPHEADER HelpHeader; /* Header for Help file. */
WHIFSBTREEHEADER WHIFSHeader; /* WHIFS Header record */
int WHIFSLeafOne = -1; /* First WHIFS Leaf Node */
long FirstPageLoc; /* Used by macros for b-trees */
char *PhrasesPtr;
int Compressed; /* Is there compression? */
#define MSG(s) { puts(s); return; }
#define FAIL(s) { puts(s); exit(1); }
#define GET_STRING(f, s) \
 { char *p = (char *)(s); while (*p++ = fgetc(f)) ; *p = 0; }
#define BIT_SET(map, bit) (((map) & (1 << (bit))) ? 1 : 0)
// Finds the first leaf in the WHIFS B-Tree
void WHIFSGetFirstLeaf(FILE *HelpFile) {
 int CurrLevel = 1; /* Current Level in B-Tree */
 BTREEINDEXHEADER CurrNode; /* Current Node in B-Tree */
 int NextPage = 0; /* Next Page to go to */
 /* Go to the beginning of WHIFS B-Tree */
 fseek(HelpFile, HelpHeader.WHIFS, SEEK_SET);
 fread(&WHIFSHeader, sizeof(WHIFSHeader), 1, HelpFile);
 FirstPageLoc = HelpHeader.WHIFS + sizeof(WHIFSHeader);
 GotoWHIFSPage(WHIFSHeader.RootPage); // macro in WHSTRUCT.H
 /* Find First Leaf */
 while (CurrLevel < WHIFSHeader.NLevels) {
 fread(&CurrNode, sizeof(CurrNode), 1, HelpFile);
 /* Next Page is conveniently the first byte of the page */
 fread(&NextPage, sizeof(int), 1, HelpFile);
 GotoWHIFSPage(NextPage);
 CurrLevel++;
 }
 /* First Leaf page is here */
 WHIFSLeafOne = NextPage;
}
// Get a WHIFS file by file number; returns offset and filename
void GetFile(FILE *HelpFile, DWORD Number, long *Offset, char *Name) {
 BTREENODEHEADER CurrentNode;
 DWORD CurrPage, counter = 0;
 char c, TempFile[19];
 /* Skip pages we don't need */
 CurrentNode.NextPage = WHIFSLeafOne;

 do {
 CurrPage = CurrentNode.NextPage;
 GotoWHIFSPage(CurrPage);
 fread(&CurrentNode, sizeof(CurrentNode), 1, HelpFile);
 counter += CurrentNode.NEntries;
 } while (counter < Number);

 for (counter -= CurrentNode.NEntries; counter <= Number; counter++) {
 GET_STRING(HelpFile, TempFile);
 fread(Offset, sizeof(long), 1, HelpFile);
 }
 strcpy(Name, TempFile);
}
// Get SysHeader to see if compression used on help file
void SysLoad(FILE *HelpFile, long FileStart) {
 SYSTEMHEADER SysHeader;
 FILEHEADER FileHdr;
 fseek(HelpFile, FileStart, SEEK_SET);
 fread(&FileHdr, sizeof(FileHdr), 1, HelpFile);
 fread(&SysHeader, sizeof(SysHeader), 1, HelpFile);
 if (SysHeader.Revision != 21)
 FAIL("Sorry, TOPICDMP only works with Windows 3.1 help files");
 Compressed = (SysHeader.Flags & COMPRESSION_310) ||
 (SysHeader.Flags & COMPRESSION_UNKN);
}
// Decides how many bytes to read, depending on number of bits set
int BytesToRead(BYTE BitMap) {
 int TempSum, counter;
 TempSum = 8;
 for (counter = 0; counter < 8; counter ++)
 TempSum += BIT_SET(BitMap, counter);
 return TempSum;
}
// Decompresses the data using Microsoft's LZ77 derivative.
long Decompress(FILE *HelpFile, long CompSize, char *Buffer) {
 long InBytes = 0; /* How many bytes read in */
 WORD OutBytes = 0; /* How many bytes written out */
 BYTE BitMap, Set[16]; /* Bitmap and bytes associated with it */
 long NumToRead; /* Number of bytes to read for next group */
 int counter, Index; /* Going through next 8-16 codes or chars */
 int Length, Distance; /* Code length and distance back in 'window' */
 char *CurrPos; /* Where we are at any given moment */
 char *CodePtr; /* Pointer to back-up in LZ77 'window' */
 CurrPos = Buffer;
 while (InBytes < CompSize) {
 BitMap = (BYTE) fgetc(HelpFile);
 NumToRead = BytesToRead(BitMap);
 if ((CompSize - InBytes) < NumToRead)
 NumToRead = CompSize - InBytes; // only read what we have left
 fread(Set, 1, (int) NumToRead, HelpFile);
 InBytes += NumToRead + 1;
 /* Go through and decode data */
 for (counter = 0, Index = 0; counter < 8; counter++) {
 /* It's a code, so decode it and copy the data */
 if (BIT_SET(BitMap, counter)) {
 Length = ((Set[Index+1] & 0xF0) >> 4) + 3;
 Distance = (256 * (Set[Index+1] & 0x0F)) + Set[Index] + 1;
 CodePtr = CurrPos - Distance; // ptr into decompress window
 while (Length)

 { *CurrPos++ = *CodePtr++; OutBytes++; Length--; }
 Index += 2; /* codes are 2 bytes */
 }
 else
 { *CurrPos++ = Set[Index++]; OutBytes++; }
 }
 }
 return OutBytes;
}
// Prints a Phrase from the Phrase table
void PrintPhrase(char *Phrases, int PhraseNum) {
 int *Offsets = (int *)Phrases;
 char *p = Phrases+Offsets[PhraseNum];
 while (p < Phrases + Offsets[PhraseNum + 1])
 { putchar(*p); p++; }
}
// Build up a table of phrases
void PhrasesLoad(FILE *HelpFile, long FileStart) {
 FILEHEADER FileHdr;
 PHRASEHDR PhraseHdr;
 int *Offsets;
 char *Phrases;
 long DeCompSize;
 /* Go to the phrases file and get the headers */
 fseek(HelpFile, FileStart, SEEK_SET);
 fread(&FileHdr, sizeof(FileHdr), 1, HelpFile);
 fread(&PhraseHdr, sizeof(PhraseHdr), 1, HelpFile);
 /* Allocate space and decompress if it's compressed, else read in. */
 if (Compressed) {
 if ((Offsets = malloc((unsigned) (PhraseHdr.PhrasesSize +
 (PhraseHdr.NumPhrases + 1) * 2))) == NULL)
 MSG("No room to decompress Phrases");
 Phrases = (char *)(Offsets + fread(Offsets,2,PhraseHdr.NumPhrases+1, HelpFile));

 DeCompSize = Decompress(HelpFile, (long)FileHdr.FileSize -
 (sizeof(PhraseHdr) + 2 * (PhraseHdr.NumPhrases+1)), Phrases);
 if (DeCompSize != PhraseHdr.PhrasesSize) {
 printf("\n");
 }
 }
 else {
 if (!(Offsets=malloc((unsigned)(FileHdr.FileSize-sizeof(PhraseHdr)))))
 MSG("No room to decompress Phrases");
 /* Backup 4 bytes for uncompressed Phrases (no PhrasesSize) */
 fseek(HelpFile, -4, SEEK_CUR);
 fread(Offsets, (unsigned) (FileHdr.FileSize - 4), 1, HelpFile);
 }
 PhrasesPtr = Phrases = (char *) Offsets;
}
/* Because the topic file is broken into 4k blocks, we'll have to handle
all the reads. The idea is to filter out the TOPICBLOCKHEADERs and
do any decompression that needs doing. */
long TopicRead(BYTE *Dest, long NumBytes, FILE *HelpFile) {
 static long CurrBlockLoc = 0; /* Where we are in the block */
 static BYTE *DCmpBlock = NULL; /* Block of uncompressed data */
 static long DecompSize; /* Size of block after decomp */
 static long TopicStart, BlkNum; /* Start of TOPIC file */
 long BytesLeft; /* # Bytes left to return */
 TOPICBLOCKHEADER BlockHeader;

 TOPICLINK *TempLink;
 long EndOffset;
 /* If NumBytes == -1, the caller is finished and we free our buffer */
 if (NumBytes == -1) { free(DCmpBlock); return 0; }
 if (!DCmpBlock) {
 if (Compressed) {
 if (! (DCmpBlock = malloc((unsigned) (4 * TopicBlockSize))))
 FAIL("Not enough memory to decompress TOPIC file");
 TopicStart = ftell(HelpFile);
 BlkNum = 0;
 }
 else if (! (DCmpBlock = malloc((unsigned) TopicBlockSize)))
 FAIL("Not enough memory to handle TOPIC file");
 DecompSize = 0; /* Set initial size to 0 */
 /* Don't really need first block header, so get it out of the way */
 fread(&BlockHeader, sizeof(BlockHeader), 1, HelpFile);
 }
 BytesLeft = NumBytes;
 while (BytesLeft) {
 if (DecompSize == CurrBlockLoc) {
 BlkNum++;
 if (Compressed) {
 DecompSize = Decompress(HelpFile, (long)TopicBlockSize-1,
 (char *)DCmpBlock);
 /* Align ourselves at next 4k block */
 fseek(HelpFile, TopicStart + (4096L * BlkNum), SEEK_SET);
 }
 else
 DecompSize=fread(DCmpBlock,1,(unsigned) TopicBlockSize, HelpFile);
 CurrBlockLoc = 0;
 fread(&BlockHeader, sizeof(BlockHeader), 1, HelpFile);
 // Get offset of last topic link. (Don't need block #, hence 3FFFh)
 EndOffset = BlockHeader.LastTopicLink & 0x3FFF;
 TempLink = (TOPICLINK*)(DCmpBlock + EndOffset-sizeof(BlockHeader));
 /* Actual end of the data (Don't include header) */
 EndOffset += (TempLink->BlockSize - sizeof(BlockHeader));
 // If end shorter than topic block use it; else topic block full
 if (EndOffset > DecompSize) {
 /* Adjust DecompSize if crossing 4k boundary */
 EndOffset = TempLink->BlockSize-((TempLink->NextBlock) & 0x3FFF);
 DecompSize = (BlockHeader.LastTopicLink & 0x3FFF) + EndOffset;
 }
 else DecompSize = EndOffset;
 } /* If */
 *(Dest++) = *(DCmpBlock + (CurrBlockLoc++) );
 BytesLeft--;
 } /* While (BytesLeft) */
 return NumBytes;
}
// Displays a string from a topic link record. Checks for Phrase
// replacement and non-printable chars
void TopicStringPrint(char *String, long Length) {
 BYTE Byte1, Byte2;
 int CurChar, PhraseNum;
 long counter;
 for (counter = 0; counter < Length; counter++) {
 CurChar = * ((char *) (String + counter));
 /* Bytes 0x01 through 0x09 introduce a two-byte phrase reference */
 if ((CurChar > 0) && (CurChar < 10)) {
 Byte1 = (BYTE) CurChar;
 counter++;
 CurChar = * ((char *) (String + counter));
 Byte2 = (BYTE) CurChar;
 PhraseNum = (256 * (Byte1 - 1) + Byte2);
 /* If there's a remainder, we have a space after the phrase */
 PrintPhrase(PhrasesPtr, PhraseNum / 2);
 if (PhraseNum % 2) putchar(' ');
 }
 else if (isprint(CurChar)) putchar(CurChar);
 else putchar(' '); // could do newline for 0x00 0x00
 }

}
// Dump TOPIC file, doing decompression and phrase substitution
void TopicDump(FILE *HelpFile, long FileStart) {
 FILEHEADER FileHdr;
 TOPICHEADER *TopicHdr;
 TOPICLINK TopicLink;
 /* Go to the TOPIC file and get the headers */
 fseek(HelpFile, FileStart, SEEK_SET);
 fread(&FileHdr, sizeof(FileHdr), 1, HelpFile);
 do {
 TopicRead((BYTE *) &TopicLink, sizeof(TopicLink) - 4, HelpFile);

 if (Compressed)
 TopicLink.DataLen2 = TopicLink.BlockSize - TopicLink.DataLen1;
 TopicLink.LinkData1=(BYTE *) malloc((unsigned)(TopicLink.DataLen1-21));
 if(!TopicLink.LinkData1)
 MSG("Error allocating TopicLink.LinkData1");
 TopicRead(TopicLink.LinkData1, TopicLink.DataLen1 - 21, HelpFile);
 if (TopicLink.DataLen2 > 0) {
 TopicLink.LinkData2=(BYTE*)malloc((unsigned)(TopicLink.DataLen2+1));
 if(!TopicLink.LinkData2)
 MSG("Error allocating TopicLink.LinkData2");
 TopicRead(TopicLink.LinkData2, TopicLink.DataLen2, HelpFile);
 }
 /* Display a Topic Header record */
 if (TopicLink.RecordType == TL_TOPICHDR) {
 TopicHdr = (TOPICHEADER *)TopicLink.LinkData1;
 printf("================ Topic Block Data ====================\n");
 printf("Topic#: %ld - ", TopicHdr->TopicNum);
 if (TopicLink.DataLen2 > 0)
 TopicStringPrint((char *) TopicLink.LinkData2, (long) TopicLink.DataLen2);
 else printf("\n");
 }
 /* Show a 'text' type record. */
 else if (TopicLink.RecordType == TL_DISPLAY) {
 printf("-- Topic Link Data\n");
 TopicStringPrint((char *) TopicLink.LinkData2, (long) TopicLink.DataLen2);
 }
 printf("\n\n");
 free(TopicLink.LinkData1);
 if (TopicLink.DataLen2 > 0) free(TopicLink.LinkData2);
 } while(TopicLink.NextBlock != -1);
}
void DumpFile(FILE *HelpFile) {
 long FileOffset, PhraseOffset, TopicOffset;

 DWORD i;
 char FileName[32];

 fread(&HelpHeader, sizeof(HelpHeader), 1, HelpFile);
 if (HelpHeader.MagicNumber != 0x35F3FL)
 MSG("Fatal Error: Not a valid WinHelp file");
 WHIFSGetFirstLeaf(HelpFile);
 TopicOffset = PhraseOffset = 0;
 for (i=0; i<WHIFSHeader.TotalWHIFSEntries; i++) {
 GetFile(HelpFile, i, &FileOffset, FileName);
 if (! strcmp(FileName, "SYSTEM")) SysLoad(HelpFile, FileOffset);
 else if (! strcmp(FileName, "Phrases")) PhraseOffset = FileOffset;
 else if (! strcmp(FileName, "TOPIC")) TopicOffset = FileOffset;
 }
 if (PhraseOffset) PhrasesLoad(HelpFile, PhraseOffset);
 if (TopicOffset) TopicDump(HelpFile, TopicOffset);
 else MSG("No Topic file found!");
}
int main(int argc, char *argv[]) {
 char filename[40];
 FILE *HelpFile;
 if (argc < 2) {
 printf("Usage: TOPICDMP helpfile[.hlp]\n\n");
 printf(" helpfile - Name of help file (.HLP or .MVB)\n\n");
 return EXIT_FAILURE;
 }
 if (! strchr(strcpy(filename, strupr(argv[1])), '.'))
 strcat(filename, ".HLP");
 if ((HelpFile = fopen(filename, "rb")) == NULL) {
 printf("Can't open %s!", filename);
 return EXIT_FAILURE;
 }
 DumpFile(HelpFile);
 fclose(HelpFile);
 return EXIT_SUCCESS;
}




October, 1993
PROGRAMMER'S BOOKSHELF


Can We Talk?




Jonathan Erickson


Computational Models of American Speech
M. Margaret Withgott and Francine R. Chen
University of Chicago Press, 1993
143 pp. $15.95
ISBN 0-937073-98-9
Since long before HAL made his film debut in 2001: A Space Odyssey, interacting
with computers through speech has been among the holiest of computer grails.
Unfortunately, it's also been one of the toughest computer science nuts to
crack.
That doesn't mean computer scientists have given up on speech recognition. Far
from it. Apple, for example, recently introduced a pair of new Macintosh
computers--the Centris 660AV and Quadra 840AV--that recognize rudimentary
voice commands. You can tell the 840AV, for instance, to "open Microsoft Word"
and it will launch the appropriate word processing program, or you can "train"
it to perform a variety of other tasks.
Still, the problems confronting speech recognitionists are formidable and
current applications remain more in the realm of novelty than practicality. On
the human side, we pronounce words inaccurately and inconsistently, our
dialects and languages are confusing (half the time I can't come close to
figuring out what my teenage son is saying), and even a relatively small
thing, like your having a little case of the sniffles, can bring a high-powered
speech-recognition system to its knees. On the computer side, the processing
power to collect, analyze, and interpret speech has been generally prohibitive
in cost and availability (digital speech processing is starting to change
this, however), the data has been inconsistent and difficult to collect, and
the algorithms often inappropriate. These challenges notwithstanding, the
allure of verbally commanding a computer to act remains.
Margaret Withgott and Francine Chen, authors of Computational Models of
American Speech, present these problems (and others) and propose what they
consider to be workable solutions. As its title suggests, this book is
scholarly in tone and presentation. Still, Withgott (who's a researcher at
Interval Research Corporation) and Chen (who holds a similar position at Xerox
PARC) have written a monograph that's interesting and readable for anyone
dabbling in speech recognition theory--and required reading for anyone doing
serious work in the field.
Withgott and Chen correctly point out that much of the speech recognition work
done by programmers to date has relied on small collections of data drawn from
their own limited knowledge of, and experience with, language.
Speech-recognition specialists, on the other hand, historically have
manipulated large amounts of data for training systems without worrying about
the pronunciation details in the data. Withgott and Chen attempt to close this
gap by examining and developing "probabilistic and rule-based computational
models of transcription data using conditioning factors drawn from theory,"
claiming that their work represents "one of the first attempts to bring
together theoretical concerns with an analysis of a large American English
database." Their ultimate goal is to create a computational system that
handles "the kind of variant pronunciations one observes in large collections
of transcribed American speech."
After (predictably) presenting a brief historical background on approaches to
speech recognition (ranging from the mid-'70s Dragon system, to the more
recent Carnegie-Mellon Sphinx system), it's this speech database the authors
zero in on. Although unfamiliar to me, the speech database "TIMIT," jointly
developed in the mid '80s by Texas Instruments and the Massachusetts Institute
of Technology, is central to much of today's speech-recognition research. This
database, available from the National Institute of Standards and Technology on
CD-ROM, contains 6300 sentences produced by 630 speakers recorded by TI and
transcribed by MIT. (When Withgott and Chen wrote their book, the CD-ROM only
contained 4300 utterances; it's since been updated. For more information on
TIMIT, see the accompanying textbox entitled "The TIMIT Speech Database.")
Although the authors fault TIMIT because the speakers read sentences instead
of spontaneously speaking, they don't question its value for
speech-recognition exploration. In addition to providing speech, the database
provides visual spectral representations of the utterances so that you can
watch patterns on the computer screen.
Withgott and Chen spend a fair amount of time examining spoken language data
structures (phonetic structures, probabilistic pronunciation networks, and the
like) and rules (context descriptors) for predicting possible pronunciations.
In doing so, they describe how they apply these rules to TIMIT data primarily
using a rule interpreter (implemented by Steven Bagley).
At the heart of the book is an algorithm for computing "context
trees"--decision trees in which the number of groups of contextual factor
values is determined from the data. (Judging from the extensive reference
section at the end of the book, this is a topic Chen, in particular, has spent
several years investigating.)
A context tree is an n-ary decision tree which provides a representation for
modeling the relationship between contextual factors and the variant
pronunciations of a dictionary symbol in different contexts. Decision trees
have been used for both interpretation and classification of data. Given a
data set, decision trees partition the data set and can be formed
automatically. The resulting trees can be converted to rules, which is
convenient if one wishes to analyze the phonological rules encoded in a tree
or compare them with a hand-derived set of rules.
In short, context trees are a specialized form of the familiar decision tree. Context
trees can be used for organizing the values of contextual factors, providing a
representation of the relationship between a context and the data element for
classification and prediction purposes. Figure 1 (adapted from the book)
illustrates a context tree. The authors describe the figure this way:
Each non-terminal node of a context tree is labeled with a contextual factor.
In the figure, node 1 corresponds to the contextual factor word-boundary-type.
The branches from a node are labeled with mutually exclusive sets of values of
the contextual factor and each branch leads from the parent node to a child
node. The top branch of node 1 represents the values initial and
initial-and-final of the factor word-boundary-type. The middle branch
corresponds to the mutually exclusive value final. The context of a terminal
node is defined by the contextual factor values encountered in traversing the
tree from the root node to the terminal node. For example, terminal node 5
represents the context word-final with primary or secondary stress. Each
terminal node of a context tree encodes the distribution of phonetic elements
in each context. In general, more than one phonetic element occurs in a
context because realizations of a dictionary symbol are not deterministic.
Rather than predicting only the most likely phonetic element in a context, the
probability of each of the different possible phonetic elements is enumerated.
Obviously, a complete description of the context tree algorithm is beyond the
scope of this article. The algorithm does, however, seem to have more general
use than just speech recognition.
It comes as no surprise that many of the problems inherent in speech
recognition are similar to those faced in other types of recognition. As we
found out last year with the DDJ Handwriting Recognition Contest, a valid
collection of data samples is critical. (GO Corporation, for instance, has
perhaps one of the most extensive databases of handwriting samples in the
industry--and they guard it almost as closely as their recognition
algorithms.)
Computational Models of American Speech is not the place to start if you're
new to speech recognition; it's just too narrowly focused.
However, if you're really serious about speech recognition, it's a book you'd
be well advised to pick up.
 Figure 1: Context tree. (Node numbers are in upper right of boxes.)


The TIMIT Speech Database


The TIMIT speech database is designed to provide speech data for the
acquisition of acoustic-phonetic knowledge, and for the development and
evaluation of speech recognition systems.
TIMIT contains speech from 630 speakers from eight major dialects of American
English, each speaking 10 phonetically rich sentences. The TIMIT system
includes time-aligned orthographic, phonetic, and word transcriptions as well
as speech waveform data for each sentence-utterance. The project was a joint
effort among the Massachusetts Institute of Technology, SRI International, and
Texas Instruments. The speech was recorded at TI using a Sennheiser
head-mounted microphone in a quiet environment, digitizing the speech at a 20
kHz sampling rate and then downsampling to 16 kHz for distribution. The data
was transcribed at MIT, and then verified and prepared for CD-ROM production
by the National Institute of Standards and Technology (NIST).
All of the phonetic transcriptions have been hand verified and approximately 2
percent of the phonetic labels have been changed from earlier releases. New
test and training subsets have been selected and specified. These subsets are
balanced for phonetic and dialectal coverage. The directory structure has been
simplified and the speech waveform files are formatted with the NIST SPHERE
header structure. A revised version of the SPHERE speech file header software
is also included. Online documentation provides a description of the tabular
computer-searchable information.
Copies of the TIMIT database are available on CD-ROM through the National
Technical Information Service (NTIS). Specify the NIST Speech Disc 1-1.1, NTIS
Order No. PB91-505065. The domestic price is $100.00 (international price,
$300.00). Contact NTIS, Springfield, VA 22161, 703-487-4650, or the
Linguistics Data Consortium, University of Pennsylvania, 609 Williams Hall,
Philadelphia, PA 19104.
--J.E.




October, 1993
OF INTEREST
POEM VideoBox, a video SDK that provides decompression at 30 frames per second
(on ISA-BUS 80486/50 MHz PCs), has been announced by Iterated Systems.
Developed for network archiving, multimedia publishing, training, and other
video applications, the kit provides software-only video at low bandwidths.
The SDK contains C object libraries for integrating video decompression into
DOS applications, sample programs with source code, and sample video footage.
POEM VideoBox supports Sound Blaster and Sound Blaster Pro sound cards. Frames
are dropped dynamically on slower computers to maintain sound synchronization.
POEM sells for $1795.00. Reader service no. 20.
Iterated Systems
5550-A Peachtree Parkway
Norcross, GA 30092
404-840-0310
Network Program-to-Program Communications (NPPC), from Softwarehouse Corp., is a
library designed for creating Windows 3.1 or Windows for Workgroups network
applications in C or C++. The library is available for either IPX or NetBIOS;
the API is the same for both. NPPC adds about 30K to DOS-based applications.
The Windows version is implemented as a Windows DLL. NPPC functions are
implemented as high-level application calls that are used to perform
program-to-program communications at the message level. The IPX and NetBIOS
libraries are available for $195.00 and $395.00, respectively. A
BBS downloadable demo for DOS or Windows is available by calling 415-949-0207.
Reader service no. 21.
Softwarehouse Corp.
326 State Street
Los Altos, CA 94022
415-949-0203
Distinct Corp. has announced the release of its Professional Edition TCP/IP
for Windows, Version 3.0. New features of the upgraded software developer's
package include three kits in one: the Windows Sockets API, Version 1.1, as
well as Distinct's TCP/IP Kernel API for Berkeley-style Sockets (TCP, UDP,
ICMP, Telnet, and FTP), and Distinct RPC, a complete ONC RPC/XDR toolkit for
Windows. Enhanced installation and easier configuration are added features of
the upgrade. Developers can choose the portions of the program they wish to
install, and then install those portions in their own directories. In
addition, the program's easy-to-use dialog boxes allow fine-tuning networking
parameters to optimize specific work environments. Possible modifications
include time-out and retry values for the different protocols and maximum
transmission units for packets. Distinct offers the Professional Edition SDK
as an upgrade to previous versions for $200.00. Reader service no. 22.
Distinct Corp.
14395 Saratoga Ave., Suite 120
Saratoga, CA 95070
408-741-0781
Symantec's C++ 6.0 for Macintosh and C++ 6.0 for Apple's Macintosh
Programmer's Workshop (MPW) are development environments available with an
integrated, native C++ compiler, and incremental linker. The linker reduces
development time by only linking new or modified code into the program.
Supported are C++ features such as templates, multiple inheritance, and nested
classes. The environment includes an editor, a source-code browser, and a C++
source-code debugger. Symantec also upgraded Think C to Version 6.0. The Think
Project Manager keeps track of all files and libraries. Symantec C++ for
Macintosh includes Apple's SourceServer, which supports team development.
Symantec C++ 6.0 for Macintosh or MPW costs $499.00, and Think C 6.0 sells for
$299.00. Reader service no. 23.
Also from Symantec is a free, early look at Bedrock (Symantec's cross-platform
development environment) for software developers. A limited supply of CD-ROMs
(for Macs and Windows) gives programmers a more detailed overview of the
upcoming release of the environment. Detailed descriptions include information
on Bedrock's header files, online documentation, and sample programs. Call the
Bedrock information hotline at 408-446-8931. Reader service no. 24.
Symantec Corp.
10201 Torre Ave.
Cupertino, CA 95014-2132
408-253-9600
Nohau is using a special emulation device from Dallas Semiconductor to achieve
emulation of the DS80C320 microcontroller. The DS80C320 is pin compatible with
80C32 and uses the standard 8051 instruction set. The core is a redesigned
high-speed architecture which removes 8051's "dead cycles." This means that
the DS80C320 will run instructions 1.5 to 3 times faster than if they were run
in a regular 80C32 using the same clock frequency. The EMUL51-PC is a
high-performance, in-circuit emulator specifically designed to give an
optimized environment to develop 8051 hardware and software. The EMUL51-PC
emulator consists of a board which plugs directly into the IBM PC/XT/AT bus.
It is also available in a box which communicates with the PC through a
standard RS-232 channel at rates up to 115 kilobaud. An optional trace board
holds up to 256K trace records, 64 bits each. It has multiple trigger levels,
filtering, loop counting, on-the-fly timing, and time stamping. The emulator
board costs $3595.00; and the DS80C320 probe sells for $995.00 and runs at
speeds up to 25 MHz. Reader service no. 25.
Nohau Corp.
51 E. Campbell Ave.
Campbell, CA 95008
408-866-1820
Microtek International has introduced the Microtek Pentium Emulator (MPE), an
in-circuit emulator for the Pentium processor. The company claims it is the
first emulator to run at 66 MHz. One of MPE's features is the ability to
capture an accurate execution history, even when executing code out of cache.
Traditionally, this has been a stumbling block for logic analyzers and
emulators. Prior to this, the designer was forced to disable the cache, thus
introducing a large non-transparency issue in the debug process. Other
features of the MPE include four execution breakpoints, eight complex bus
event recognizers, 16K frame trace buffer, time stamp, 32-bit occurrence
counter, and an external-trigger BNC connector. Reader service no. 26.
Microtek International
3300 N.W. 211th Terrace
Hillsboro, OR 97214-7136
503-645-7333
Mortice Kern Systems announced a new version of its compiler construction
tools, MKS LEX&YACC 3.2. New features include a Windows NT release, the
ability to create code for Windows application development, and 32-bit DOS
extended support. MKS LEX&YACC is used in the development of software that
analyzes and interprets input, such as an embedded command language in a
configurable text editor. LEX builds a lexical analyzer, which takes a stream
of input and breaks it up into tokens according to user-specified rules. YACC
builds a parser, which takes a stream of tokens (such as provided by MKS LEX)
and matches them against a grammar. When a portion of the grammar is matched,
source code embedded in the YACC grammar file is executed, giving action to
the parsed commands. LEX&YACC is available for $299.00, or as an upgrade from
any previous version for $99.00. Reader service no. 27.
Mortice Kern Systems
35 King Street North
Waterloo, Ontario
Canada N2J 2W9
519-884-2251
CodeProbe, from General Software, is a software analyzer for DOS developers.
This analyzer records sequences of events and can show the elapsed time
between them. A developer can monitor a running program at full speed,
capturing system events such as hardware, DOS, and BIOS interrupts, and
user-defined events triggered by the user code calling a special trace
function from C or assembly language. After event capture, the developer can
display the trace in summary and fully-decoded forms, with each event time
stamped to sub-millisecond resolution.
The analyzer installs directly on any DOS-based PC/AT or 386/486-based
machines, and runs concurrently with the software under test. CodeProbe is
priced at $350.00. Reader service no. 28.
General Software
P.O. Box 2571
Redmond, WA 98073
206-391-4285
ACI US has updated its source code editor Object Master Universal by adding
150 new features and enhancements. Version 2.0 provides the Class Tree window
that lets you display the hierarchy of the project and see all methods and
fields of the selected routine. This editor works with applications created in
MPW, Think C/C++, and Think Pascal. Other new features include modeless search
dialogs, project history lists, and filters for class/file displays. Reader
service no. 29.
ACI US
10351 Bubb Road
Cupertino, CA 95014
408-252-4444
ProtoView Development has released the WinControl Library, a package of 14
custom controls that simplify the process of developing applications for
Windows in C/C++ or Pascal. These controls use point-and-click tools to create
3-D applications, and allow you to add functions such as context-sensitive
help, DDE links, data validation, user-defined error messages, custom colors,
and custom fonts. WinControl sells for $249.00.
Also new from ProtoView is a Visual Development System for Windows
applications, an integrated visual development workbench that streamlines the
development of C/C++ and Pascal applications for Windows. ProtoGen+ features
the latest in advanced object-oriented programming. ProtoGen+ replaces the
ProtoGen Application Generator and Screen Manager, Menu Designer, Library, and
Dialog Editor. Reader service no. 30.
ProtoView Development Co.
353 Georges Road
Dayton, NJ 08810
908-329-8588
PIPES Platform, from PeerLogic, allows programs to communicate over
heterogeneous networks. PIPES is available for OS/2 2.1, AIX, Sun OS, MVS,
DOS, Windows, and NetWare systems. This SDK makes it possible for
organizations to incorporate IBM's desktop offerings into a mixed-vendor,
client/server configuration. The SDK for OS/2 costs $495.00. Reader service
no. 31.
PeerLogic
555 DeHaro Street
San Francisco, CA 94107-2348
415-626-4545
The LALR Parser Generator is an alternative to YACC for programmers building
compilers, interpreters, macro languages, and so forth. MicroQuill software
recently acquired the generator from Autom and updated the product to Version
4.3. Noted improvements include faster parsers, restartable parsers, improved
Windows compiler support, support for multiple parsers, and improved disk
organization. The generator is priced at $795.00; the source code sells for an
$2000.00. Reader service no. 32.
MicroQuill Software Publishing
4900 25th Ave. N.E., #206
Seattle, WA 98105
206-525-8218
Software entrepreneurs looking for help in marketing software can now turn to
Marketing Made Easy, a practical, no-nonsense how-to newsletter for
independent application developers. The newsletter takes a hands-on approach
by providing step-by-step guidelines, case studies, examples, and marketing
tips and techniques. More specifically, Marketing Made Easy (published by
Mickey Friedman) will address the issues of pricing, direct mail, public
relations, advertising, market research, and the like. The newsletter is
published six times per year at an introductory annual subscription price of
$149.00. Discounts are available through user groups and other organizations.
Reader service no. 33.
The Friedman Group
4323 170th Place SE
Issaquah, WA 98027-9009
206-641-3498
Another newly published newsletter that recently crossed our desk is the
International Calculator Collector published by the International Association
of Calculator Collectors. This publication chronicles the development of the
handheld, electronic calculator. In addition to focusing on the history of
calculators over the past 25 or 30 years, the newsletter includes classified
ads, allowing readers to exchange information and devices. Membership in the
association, which includes a subscription to the newsletter, is $8.00 per
year. Reader service no. 34.
Wilson/Barnett Publishing
1212 So. Parton Street
Santa Ana, CA 92707
Digitalk has announced the Parts CICS Wrapper, a new component for its Parts
Workbench--a client/server integration toolset for visual application
development. With the Parts CICS Wrapper for IBM CICS OS/2, programmers can
connect Parts applications to on-line transaction processing applications. The
Parts CICS Wrapper includes the ability to connect CICS legacy function and
data to visual parts, call CICS COBOL subprograms on local or remote systems,
exploit CICS and OS/2 multitasking for parallel processing, and so on.
Digitalk also said that future Parts Wrappers will include a Parts COBOL
Wrapper and a Parts Relational Database Interface Wrapper. The Parts CICS
Wrapper sells for $2995.00. Reader service no. 35.
Digitalk Inc.
9841 Airport Blvd.
Los Angeles, CA 90045
310-645-1082
Fuzzy Logic Designer, a Windows-based developer's tool and code generator from
Byte Dynamics, now provides a 3-D simulation mode to graphically depict the
response of a fuzzy logic design in the form of a contour plane or surface
plot. Source code generated by Fuzzy Logic Designer continues to be standard
ANSI C. The system sells for $295.00 and no royalties or license fees are
required. Reader service no. 36.
Byte Dynamics
14608 E. Olympic Ave.
Spokane, WA 99216
509-926-6011




October, 1993
SWAINE'S FLAMES


Purchase Order




Michael Swaine


A word about my politics. I am not, as some readers seem to think, a liberal
or other variety of communist, but rather a card-carrying "sizeist." My
political philosophy is: Anything smaller than me is cute; anything bigger
than me is scary--elephants, the ocean, Microsoft, the IRS, the IRA, IBM,
ICBMs, committees and other mobs, Rush Limbaugh. It's not a perfect
philosophy, but my experience has been that it's right more often than it's
wrong, and in politics anything over 50 percent is a winner.
Speaking of cute, I attended the Newton launch at the Boston Macworld Expo in
August. It's interesting that both the enthusiasts and the cynics draw
analogies between Apple's first personal digital assistant and the original
128K Mac. That's pretty much all they agree about, though.
The enthusiast: The Newton MessagePad is something entirely new. It's not a
computer. It has no keyboard, disk drive, or file system. Its user interface
adapts itself automatically to your habits. You can put together an
application in a week or two. If you're concerned about how much data it
holds, you don't have the concept.
The cynic: It's an overpriced toy. After you've paid your $700, you can buy
the RAM cards and the extras that you need to connect it to phone lines and
computers. Then you can use it to take short notes and carry your address book
in your pocket and send faxes (the ability to receive faxes will come in a few
months).
I haven't been able to find anyone who's seen the MessagePad who is not either
an enthusiast or a cynic. Some are both.
The Newton Developer's Kit, though (still in beta as I write this), is
something else. It's a very high-level development environment; I'm hearing
stories of three days to get up to speed, eight days to develop an
application. Of course, as MacUser review editor Susan Janus says, the
applications are so small, how long could it take?
But is there going to be a market for Newton applications? A $700 device, an
uncertain market--is this a platform to develop for?
The answer, I think, is that that's not really the question. Newton is a
technology, not just this one product. Apple is publishing its fantasies that
we'll one day be using Newton fax/phones, Newton inventory wristwatches,
Newton whiteboards, and other Newton-based products. Some of these products
(we get to guess which) are really going to happen within the next year or
two. Doubtless there are some great opportunities ahead for at least some
developers who guess right.
The way I see it, the investment in time and money required to get into Newton
development is so low that it's worth looking into even if you think the
MessagePad will be a spectacular failure.
Michael Swaine
editor-at-large





November, 1993
EDITORIAL


The Taxman Comes Calling


What they're promising is an information highway with a fast lane into the
future. What they're neglecting to add is that it's going to be a toll road.
Sure, there's the off chance that Clinton and Gore hadn't thought that far
ahead when they sketched out their plans for an "information superhighway" a
few months ago, but revenue-starved state and municipal governments that heard
opportunity knocking and cash registers jingling have started jumping like a
duck after a June bug on the revenue potential of electronic information
exchange.
Around the country, public coffers are bare and the traditional income streams
from taxes have dried up. Consequently, creative revenuers are scampering for
new ways of getting your money into their pockets. In my neighborhood, for
instance, home owners are being hit with special "assessments" (not "taxes")
for everything from street repairs to tree trimming because there isn't enough
money for essential services and municipal-employee sensitivity retreats in
Palm Springs. Taxation of online services is an obvious Band-Aid.
Electronic information exchange is an easy tax target because it's used by
those who can afford it--people and businesses with computers, modems, and the
knowledge, desire, and need to transfer data and share information. Anyone
who's popped for a 486/66 and 9600-baud modem isn't likely to squeal over a
few nickels and dimes here and there.
In particular, states such as Massachusetts, Florida, and Pennsylvania are
charging ahead with taxes for electronic exchanges. Their efforts are for the
most part haphazard and exploratory, but a project just getting underway at
the Washington-based Multistate Tax Commission may change that. The Commission
will eventually propose a nationwide uniform model for the taxation of
electronic exchanges via telecommunication services.
But even tax supporters admit there are a passel of issues to be addressed,
the least of which is jurisdiction. If you live in one state, for instance,
and download a file from a computer in another state, who gets the benefit of
your hard-earned dollars? And, for that matter, will you be taxed twice, since
you're already paying federal excise and other taxes for the phone call? For
that matter, what kind of tax should be levied? States can't agree. Some are
going after sales, others gross-revenue, and still others after value-added
and use taxes.
On the local level, Chicago is leading the way with a recent amendment to its
Transaction Tax Ordinance. Under the redefined ordinance, individuals and
businesses pay a 6 percent tax on the sale of information from one computer to
another. The ordinance justifies this, Chicago tax attorney Rich Lieberman
tells us, by broadly lumping the concept of computer "timesharing" with the
taxable realities of equipment "lease" and "rental." In other words, if you
pay to log onto a remote computer, you're leasing equipment--and that's
taxable. This tax applies to systems that are flat rate (such as Lexis/Nexis)
and hourly (most commercial BBSs).
But there's more than money at stake here.
The real promise of the information highway is that it's a great equalizer: No
matter where we live or what we do, we have equal access to information and
the opportunities it offers. By upping the entry cost, the emerging category
of taxes is preventing equal access. The end result is another step toward an
information-age society of information haves and have-nots.
In their short-sighted grabs for tax dollars, state and local governments may
be killing a golden goose before it hatches from its electronic egg. Instead
of prematurely taxing new business opportunities, government should be
nurturing them. For a report from the tax trenches, talk to the folks at
Channel 1, a Cambridge, Massachusetts BBS service. They were recently slammed
with a $150,000 state tax bill that may result in curtains for the two-person
company.
On a broader level, the precedents being set by computer-to-computer
telecommunication taxes raise even more questions. Check-verification systems,
for example, involve one computer exchanging information with another. Will
users of these systems be charged taxes that will be passed on to you? If you
order an item via fax, should you be taxed for the commercial use of the
telecommunication system? Will the company where you work pay extra taxes for
processing payroll over phone lines? Or are you prepared to pay more taxes for
sending commercially related e-mail messages over the Internet? What if as part
of your purchase of a software package you download a patch from the vendor's
technical-support BBS or ask a question regarding the use of a program? Will
you end up paying more taxes for these clearly commercial transactions? And,
for that matter, does wireless communication change any of this?
For the most part, revenuers justify these taxes by making the distinction
between "taxable transmission services" and "nontaxable information services."
But if we're serious about having a society dedicated to equal opportunity,
you can't separate the two. Any barrier to the access to information is a
barrier to the information itself.
Jonathan Erickson
editor-in-chief



November, 1993
LETTERS


Going in Circles




Dear DDJ,


Cliff Pickover's article "Recursive Worlds" (DDJ, September 1993) reminded me
of two bits of doggerel:
Great fleas have little fleas
 On their backs to bite 'em
And little fleas have lesser fleas
 So on ad infinitum.
Big whirls have little whirls
 Which feed on their velocity
And little whirls have lesser whirls
 And so on to viscosity
William Drissel
Grand Prairie, Texas


Extending C




Dear DDJ,


The textbox "Extending C" in the article "C/C++ Standardization: An Update" by
Rex Jaeschke (DDJ, August 1993) touched one of my "hot buttons" about C--that
all functions in the C math libraries do everything in double. I am a casual
user of C, but a power user of Fortran. In about 15 years of scientific
programming in Fortran, I have learned there are very few occasions when
double precision (64 bit) is needed to produce sufficiently accurate
manipulation. The suggestion in the textbox that float-math libraries be an
extension to C has my enthusiastic support.
Roger H. James
East Hartford, Connecticut


Computer Science and Programmers' Future




Dear DDJ,


I have some observations sparked by Nick Tredennick's article "Computer
Science and the Microprocessor." (DDJ, June 1993).
First he points out the obvious fact that Intel's processors will continue to
dominate the market for quite some time to come. This comes as no surprise to
the average DDJ reader. Next, he highlights the not-so-obvious fact that the
backward compatibility Intel has committed to (or is trapped by, depending on
your point of view) will keep the existing applications developed under MS-DOS
or Windows (which is really only a DOS process) popular for some time to come.
Then he reminds us that all of the major applications for the Intel (cum
MS-DOS) platforms have been written, and keeping them current requires a lot
fewer programmers than writing them in the first place. Finally, and most
distressing, he implies that this means a reduced need, with a consequently
smaller job market, for the professional programmer.
I'm no longer a professional. I now program only for my own amusement, so the
above doesn't cause me much economic worry. I doubt, however, that many
programmers will find it as easy to change careers as I did. So here's a
suggestion. There are other places where the programmer can sell his services.
There are other types of hardware and other operating systems: the Amiga, the
Atari, and, my personal favorite, the OS9 opsys, which is used on dozens of
different platforms including a new version for the 386 and above. All of
these platforms are waiting for many of the applications already developed
from the compatible and Macintosh markets.
It's true that the market won't be nearly as large: A word processor that
sells for, say, $80.00 in the compatible market would have to sell for
considerably more in the OS9 market. Of course considering that OS9 is fully
multiuser, higher costs are justified. Once you've gotten over the learning
curve of a new system's quirks, porting an application to another platform is
easier than starting from scratch. If you can't legally port your code, then
rewriting is still easier than writing it the first time.
Given the above, perhaps it's time for DDJ to begin helping the professional
decide where to move to, when it becomes necessary, by offering articles with
applicability to a broader range of hardware and operating systems. Your
"Algorithm Alley" column is an excellent start since everything offered is
presented in a pseudocode that can be readily adapted to any programming
language.
I'll admit a bit of missionary zeal in the above. The personal computer is not
just the one hardware/opsys combination that dominates the current market, but
an idea that has been implemented on dozens of different hardware
architectures. I believe it is important to the long term health of personal
computing that a diversity of operating systems and hardware survive and
evolve.
Because there's no clear historical parallel to the personal computer
phenomenon, I'll indulge my penchant for simile; think of multiple
hardware/opsys platforms as similar to the biodiversity that gives an
ecosystem the resiliency to survive change. Remember that mammals (which had
insulating hair) once occupied a narrow ecological niche just below the
smallest of the dominant saurians, but things were upset quite a bit when that
piece of junk came barreling out of the sky 65 million years ago. I
hope that no parallel disaster awaits the personal computer but Nick
Tredennick, intentional or not, has pointed out some of the realities of a
rapidly changing market before the "crisis."
J. Stephen Carville
Glendora, California


Listen Up, Curmudgeons!





Dear DDJ,


Apparently, object-oriented ignorance has not been overcome. The logic of the
curmudgeon argument (see May 1993 "Letters'') that we don't need to move to OO
languages because there is nothing we cannot accomplish in procedural
languages is flawed at best. From the tone of the subject letter, we can only
conclude that all curmudgeons are either incompetent at or intimidated by OOP
(they just like to drop DDJ a line every now and then to let us all know!).
Listen up, curmudgeons: The reason for using OO languages instead of
procedural languages is that we don't have to perform programming calisthenics
to gain additional abstraction, enhance our design capabilities, and
facilitate reuse. OOP allows us to express useful programming (and for the
first time, "elegant design'') concepts in a succinct and productive manner.
If we want to, we could simulate C++ function templates in C by passing void
pointers to a lookup table inside each instantiation, but why not let the
language and the compiler do the work for us? Procedural languages incorporate
certain functionality in the form of libraries or reserved words because these
tools are used so often that it would be counterproductive to exclude them. OO
languages such as C++ and Ada 9X do the same thing; they just include
different types of functionality. I know of no procedural programmer who has
used an OO language successfully and then preferred to return
to procedural languages. I also highly doubt that any curmudgeon has been
using "...object techniques since 1978...'' that even closely resemble the
object techniques of today. Like it or not, new languages are emerging for
larger software systems where "the problem is (NOT) so well understood.'' Most
curmudgeons have never confronted systems like these; systems so large that no
one person can possibly understand them completely. This is the area where
object-based languages are helpful and I'm glad they're here to stay.
Curmudgeons, begone!
Spencer Roberts
Redondo Beach, California


Putting HVC in Order




Dear DDJ,


In the sidebar "Putting Colors in Order" by Harry Smith that accompanied the
article "Color Models" (DDJ, July 1993), the conversion between the HVC and
RGB color space variables permits RGB values larger than unity, or negative.
The geometrical picture given by Smith didn't include the required variation
of the maximum chroma radius with intensity (V). This variation in chroma
radius is introduced because the mapping between RGB and HVC transforms in 3-D
space from the color cube to a color cone. The radius variable C in the
transformation equations is therefore a scaled function of the chroma value,
C'. For chroma C' varying in the range (0.0, 0.50), let C = 4C'Cmax, where
Cmax = min(V, 1-V).
The definition of Cmax accounts for the variation of the radius of the circle
in Smith's diagram with V. For V=0.50, Cmax=0.50, and the circle diameter is
unity. As the total intensity V increases to 1.0 or decreases to 0.0, the
conversion-circle diameter decreases proportionally.
The value of chroma C is thereby scaled by the maximum circle radius. For V=0
or V=1.0, chroma is undefined because the conversion circle contracts to a
point. However, for finite intensity near 0 or for V near unity, chroma is
defined even though it contributes little to the image appearance. Using the
definition above for C in the equations ensures that the RGB values calculated
from HVC values with Smith's equation set (a) are positive in the range (0,1).
Maxwell T. Sanford II
Los Alamos, New Mexico


Mo' Better Linked Lists




Dear DDJ,


Regarding, "Strategies for Better Linked Lists" by Garyl Hester (DDJ, August
1993), a simple improvement would be to maintain the free list in sorted
physical address order. It would not involve an actual sort, but instead, just
a series of comparisons during free list insertion.
On average, it would reduce the number of partially filled blocks by
reallocating in a prescribed, rather than arbitrary order. It would also
simplify the block deallocation logic. Because the freed atoms would be in
order, pPrev of the first atom in the block along with pNext of the last atom
in the block would suffice to purge the entire block from the free list. Cache
performance might also benefit from inherent clustering.
Berry Ratcliff
Ann Arbor, Michigan
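Ratcliff's address-ordered insertion can be sketched in C as follows; the atom
structure and function names here are invented for illustration, not taken
from Hester's article:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch of address-ordered free-list insertion: a series of
   comparisons walks to the right spot during insertion, so no separate
   sort pass is ever needed. */
struct atom { struct atom *next; };

void free_insert(struct atom **head, struct atom *a)
{
    struct atom **pp = head;
    /* Walk until the next atom has a higher physical address than a. */
    while (*pp != NULL && (uintptr_t)*pp < (uintptr_t)a)
        pp = &(*pp)->next;
    a->next = *pp;      /* splice a in, keeping ascending address order */
    *pp = a;
}
```

Because freed atoms stay in address order, all of a block's atoms form one
contiguous run on the list, which is what makes purging a whole block a matter
of patching pPrev of its first atom and pNext of its last.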


Swaine vs. Aristotle: Mano a Mano




Dear DDJ,


In his column "Fuzzy Logic and Prejudice" (DDJ, July 1993), Michael Swaine
wrongly asserts that Aristotelian logic is defined by an inability to deal
with matters of degree. He supposes that classical logic rounds everything to
fit the nearest "binary-valued" pigeon hole. Swaine's misrepresentation is
tantamount to equating the law of identity with the Kronecker delta function.
His obvious purpose is to diminish classical logic to make room for some new
subjective logics, of which he believes "fuzzy logic" is one. Fuzzy logic may
be robust but it is nothing more (and nothing less) than a clever classical
mechanistic method of processing complex and time-varying inputs to produce
outputs that satisfy a set of complex criteria. According to Swaine, though,
it operates by some sort of superior, non-Aristotelian, supernatural logic.
The term "fuzzy logic" is only controversial to fanciful folks who insist on
giving the most generously wide interpretation to the word "logic" in its
name. I'm sure the name was actually conceived in the spirit of "fuzzy logic
gates."
Aristotle's law of identity has not been bested. Since its context encompasses
all perceptual phenomena, it cannot and it need not be bested. Indeed, it was
such a towering accomplishment that even today, very few philosophers have
grasped its implications. They do not understand what Aristotle gave us:
Logic, the art of noncontradictory identification. This intellectual
sloppiness, coupled with a charlatan mentality which is eager to promote each
new incremental advance to the status of fundamental scientific revolution, is
why we have a proliferation of new "logics." Quantum logic. Classical logic.
Your logic. My logic. Fuzzy logic. And now Swaine's logic.
Paul Ierymenko
Ajax, Ontario


More Keyboard Controllers





Dear DDJ,


After reading Al Stevens' column on keyboard controllers for the handicapped
("C Programming,'' June 1993) I'd like to mention an excellent keyboard
controller for Windows that you might like. It's called "WiViK,'' and has been
developed by the Microcomputer Applications Programme at the Hugh MacMillan
Rehabilitation Centre, Toronto.
One of its most useful features is word prediction which displays a row of
words that you might be trying to type as you enter each letter. If your word
appears, you can select it rather than typing the rest of it. The word
prediction is updated based on how often you select a word. We at BioControl
Systems (Palo Alto, California) have adapted our BioMuse product into
applications for the handicapped. The BioMuse can be used for eye tracking and
muscle control input into a computer. We have used the BioMuse with WiViK for
hands-free Windows control.
The following is from the Overview section of WiViK 2.0 Help:
WiViK is a special software program that allows a computer user to enter text
into any application within Microsoft Windows 3.1 (including DOS windows) with any
pointing device that can emulate the standard mouse. It is fully compatible
with the standard IBM keyboards and includes several features that make its
use easier with pointing devices.
Possible pointing devices include: headpointing devices, joysticks, mice,
pens, touchscreens, and trackballs. No modifications to the applications are
necessary, and the standard keyboard remains fully functional. Human factors
issues relating to the use of an on-screen keyboard have been carefully
considered in the design of WiViK to ensure its operation is natural and easy.
WiViK operates similarly to other Windows applications. A keyboard is displayed
within a moveable, resizeable window to meet your needs. The keyboard window
is always available, moving above an active application when necessary. It may
be customized for: number and arrangement of keys; key widths; key labels; key
label font; and key spacing. Keys are automatically resized when the keyboard
is resized.
Besides the standard keyboard layout, you can define and use different
layouts. Keys may contain complex macros that allow you to send words,
phrases, and commands with a single key selection. You can display and use
multiple keyboards, each with specific functionality.
WiViK 2.0 is the second version of the program. It has been enhanced based
upon customer feedback and the results of current research. The key
enhancement is a modular design that enables it to be extended with optional
modules called additions. Two additions, abbreviation-expansion and word
prediction, are the first modules to be available in an optional Rate
Enhancement Package. These additions are described here. Future optional
additions will include scanning access and more support for access through
imprecise pointing.
Rick Rees
San Francisco, California



November, 1993
Performance Tuning: Slugging It Out!


Watching the call stack is the key to deslugging




Michael R. Dunlavey


Michael, who's president of Performance Software Associates, is the author of
YAPA, a shareware tool for performance tuning. He can be reached at 276 Harris
Ave., Needham, MA 02192.


When you think "performance tuning," what comes to mind? Handcoding in
assembly language? Profilers? Fancy data structures? I do a lot of coding for
DOS, UNIX, and VMS applications and I use a way of speeding up programs that's
so simple it's addictive. I can't remember exactly when I first learned it. It
might have been in the late '70s, testing comm drivers for Raytheon by single
stepping with panel buttons.
Sometimes I'd get tired of single stepping, so I would alternately hit the RUN
and HALT buttons. In between, the old mini would whiz off half a million
instructions and land someplace random. Out of curiosity, I'd check it against
the listing to see what it was doing. Oddly, much of the time it was doing
something correct, but totally unexpected. Sure, you can attack performance
problems with faster algorithms, DMA, or faster processors, but simply
stopping at random times and examining the state of the software reveals that
nine out of ten times these approaches bark up the wrong tree. Usually, what
makes the software slow is something impossible to guess, but trivial to fix.
For example, on one occasion (on a 68000 UNIX box), there was a loop over an
array of structures, something like Example 1(a). It seemed to be taking too
long, but there was nothing obvious slowing it down. We ran it under the
profiler and, except for a hot spot in the math library, saw nothing odd.
Finally we ran it under adb and hit Control-C. There was the program counter,
dawdling along in the math library routine for multiplying 32-bit numbers,
even though no multiplication was called for in the source code. Typing out
the call stack cleared up the mystery. Variable i was declared int, making it
32 bits (in that compiler). To get the address of a[i], it had to multiply i
times the size of the structure. The 68000 has a 16-bit multiply instruction,
but not 32. Therefore, the compiler generated a subroutine call. The fix was
trivial--just declare i as short--and the loop tripled in speed.
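Example 1(a) isn't reproduced here, but the pattern looks roughly like this
sketch; the structure layout and names are guesses, not the actual code:

```c
#include <stddef.h>

/* Hypothetical reconstruction of the Example 1(a) pattern. With i declared
   int (32 bits on that compiler), scaling a[i] needs a 32-bit multiply,
   which the 68000 lacks, so the compiler emits a subroutine call. Declared
   short, the index fits the 16-bit MULS instruction and the call goes away. */
struct part { long field[13]; };    /* odd size: not a power of two */

long sum_first_fields(struct part *a, short n)
{
    short i;                        /* the one-word fix: short, not int */
    long total = 0;
    for (i = 0; i < n; i++)
        total += a[i].field[0];     /* address of a[i] = base + i * sizeof */
    return total;
}
```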
In another case, an embedded program was taking far too long to print
floating-point numbers on a color graphic display. Putting the program on an
in-circuit emulator and halting it turned up the problem--it was in the middle
of the floating-point library. Tracing return pointers back to the application
showed that it was printing numbers, digit-by-digit, as in Example 1(b). We'd
had to write our own printf (the code was actually in Pascal). The code was
perfectly correct, but it was doing floating divide, float-to-fix,
fix-to-float, multiply, and floating subtract on every single digit. The
repair was easy; we just converted the number into fixed point and then dealt
with it in that form.
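The repair can be sketched like this in C; the two-decimal format and the
function name are assumptions, not the actual Pascal code:

```c
#include <stdio.h>

/* Convert the number to fixed point once, up front; every digit after that
   comes from integer divide and modulus, instead of floating divide,
   float-to-fix, fix-to-float, multiply, and subtract per digit. Assumes
   a non-negative x that fits in a long when scaled. */
void format_fixed(double x, char *buf)
{
    long scaled = (long)(x * 100.0 + 0.5);   /* the only float operation */
    sprintf(buf, "%ld.%02ld", scaled / 100, scaled % 100);
}
```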
I've come to call these things "slugs" because they're not really bugs--the
code isn't wrong, per se, it's just slower than necessary. The process of
finding slugs is called "deslugging," and it is much easier than debugging. To
find bugs, you have to trace the program and catch it going wrong. To find
slugs, you simply interrupt it and take a look.
To illustrate this process, I'll describe a program that simulates a Computer
Integrated Manufacturing (CIM) application. At 700+ lines, it's large for a
sample program (that's why it's available electronically; see "Availability,"
page 3), but still small for a CIM application. The most interesting slugs
appear when a program is large enough to need a few layers of subroutines.
First I'll describe the design of the program, then I'll discuss how it was
deslugged in stages. Finally I'll show how it was redesigned to be not only
much faster, but much smaller and clearer as well.


The Problem


In CIM cell-control applications, there are usually four principal functions:
Schedule Execution (ISCH in the program), which takes requests for individual
manufacturing JOBs. A job consists of a sequence of OPERATIONs. It fits these
into a schedule, allocates resources for them, and dispatches them for
processing. When all operations of a job are done, it sends completion
information back to the requestor.
Task Coordination (ITC in the program), which takes requests for operations on
jobs and controls their execution. An operation consists of a series of TASKs,
such as: command material handling (IMH) to start moving the part to the
machining center; command device control (IDEV) to start downloading the tool
path file; wait for both tasks to complete; validate the bar code on the part;
instruct the machining center to begin cutting; wait for it to finish; upload
status information; and command material handling to move the part to storage.
Device Control (IDEV), which takes requests for machine-related functions such
as tool path download, cycle start and stop, and status monitoring, and talks
to the actual machine to get it done. In our simulation, this is just a no-op.
Material Handling (IMH), which takes care of moving parts from here to there.
It "talks" to the actual controllers. In the simulation, this is a no-op.
The standard way to design such a system is with a message-flow diagram. It
would contain four big message handlers (see Figure 1), each taking in
requests and issuing acknowledgments when those requests are satisfied. Each
function in turn sends requests to the handler below it and receives
acknowledgments. Each function handler has state information in the form of
outstanding requests and their completion status. It uses this to decide what
to do next whenever something happens. It is all very event-driven. The main
entry point is a message dispatch loop, as in Microsoft Windows. The loop
takes each queued message and passes it to the proper handler. That handler
will most likely put more messages in the queue.
In addition to the major functions, some utilities are needed, and the
"cluster" paradigm (related to OOP) is a state-of-the-art way to design these.
They are:
List cluster (ILST), which includes primitives for creating linked lists,
deleting them, appending to them, iterating over them, and so on.
Transaction cluster (ITRN), for queuing messages. There are primitives for
creating, deleting, sending, receiving, and so on.


First Deslugging Pass


The simulation program runs 100 simulated jobs, each with ten (plus or minus
five) operations, and each operation has five device tasks and five
material-handling tasks. The message Ack Job nn is printed as each job
completes. All in all, this is about 1000 operations and 10,000 tasks.
The program takes 48 seconds to complete (see Figure 2). Granted it's doing a
lot of stuff, but think about the timing. If it does 10,000 tasks (each of
which is a no-op) in 50 seconds, it's doing about 200 per second. That is
about one every 5 milliseconds, or around 5000 instructions per task, assuming
a 1-MIP machine. What is it about a no-op task that takes 5000 instructions to
perform?
Turning to Microsoft's CodeView debugger, you can compile the program using
the Zi flag, and enter CV /I MAIN to execute it. Let it run, then halt it
(with Ctrl-C or Ctrl-Alt-SysReq), after which you can display the call stack
and make a note of the results.
If you do this several times, you'll discover that the program is spending
around 60 percent of its time in the ILST cluster in functions ILST_NTH,
ILST_NEXT, and ILST_LENGTH. You could optimize the daylights out of these
routines, scuttle the ILST cluster altogether (go around to all 50 or so
places it's used and replace it with something else), or rewrite the
transaction cluster because it is based on the ILST cluster. However, the call
stack tells you that most of the time is being spent in the ILST routines that
are being called from ITC_PROCESS. If you examine the specific lines they're
being called from, you can see one of those slugs; see Example 2(a). The operation
variable current_task is the index of the next task to perform. This test is
being performed solely to determine if the operation is complete, which 90
percent of the time it isn't. Above this line is another slug; see Example
2(b). The cluster operation ILST_NEXT is being called to iterate over the list
of operation requests. This is an n-squared operation since ILST_NEXT searches
from the beginning of the list.
A few lines further, the call stacks point to the slug in Example 2(c) which
is extracting a pointer to the current task from the task list. These slugs
are all due to the stilted way the list cluster is designed and used.
You might also be tempted to think this is a lesson in how not to use list
clusters. Wrong! It's a lesson in how to find the slugs that are really there,
not the ones you imagine. The slugs in other software will be different, but
the process of finding them is the same.
The fix is easy. First of all, get rid of the ILST_NEXT in the iteration. Just
step a pointer along in the normal way. Second, rather than keeping a numeric
index of the next task in the list, keep a pointer to it. This eliminates the
need to call the ILST_NTH and ILST_LENGTH primitives. (When it runs off the
end and becomes NULL, there are no more tasks.)
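A minimal sketch of the change, with illustrative names rather than the
program's real ones:

```c
#include <stddef.h>

/* Keep a pointer to the current task instead of a numeric index. Stepping
   the pointer is one load, and NULL means the tasks are exhausted, so no
   calls to ILST_NTH, ILST_NEXT, or ILST_LENGTH remain in the inner loop. */
struct task { struct task *next; int done; };

int count_pending(struct task *current_task)
{
    int n = 0;
    struct task *p;
    for (p = current_task; p != NULL; p = p->next)  /* O(n) total */
        if (!p->done)
            n++;
    return n;
}
```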
The result is that execution time drops to about 20 seconds (speedup factor:
2.4) without hand-optimizing anything!


Second Deslugging Pass


On the second pass, run the program again under the debugger, randomly
halting it a few times. This time new slugs appear--ones that were there
before, but were masked by the big slugs. Again, time is being spent in the
ILST cluster--in the ILST_APPEND primitive (it runs down the list, tacking new
items on the end).
Some of the calls occur when the task list of an operation is being created.
The tasks are appended to the list one at a time in Example 3(a), an n-squared
operation because ILST_APPEND runs the length of the list.
Another significant source of calls to ILST_APPEND is Example 3(b) in ITRN_PUT
in the transaction cluster. We're finally seeing time spent in transactions!
The fix I made was twofold. To eliminate the time spent appending tasks when
building a task list, I put them in a temporary array and then built the list
all at once. I added a routine ilst_make to the ILST cluster that would take
an array of pointers and make an equivalent list.
In the transaction cluster, I changed it to use a circular array for the
queue, rather than a list.
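The circular-array queue might look like this sketch; the capacity and names
are assumptions, not the article's actual ITRN code:

```c
#include <stddef.h>

/* A fixed-capacity circular buffer replaces the linked list, so a put
   becomes a store and an index bump instead of an O(n) ILST_APPEND. */
#define QSIZE 64
struct tqueue { void *slot[QSIZE]; int head, tail; };

int trn_put(struct tqueue *q, void *msg)
{
    int next = (q->tail + 1) % QSIZE;
    if (next == q->head) return 0;          /* queue full */
    q->slot[q->tail] = msg;
    q->tail = next;
    return 1;
}

void *trn_get(struct tqueue *q)
{
    void *msg;
    if (q->head == q->tail) return NULL;    /* queue empty */
    msg = q->slot[q->head];
    q->head = (q->head + 1) % QSIZE;
    return msg;
}
```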

The result? Execution time dropped to 17 seconds (speedup factor: 2.8). (I'm
getting greedy; I was hoping for more.)


Third Pass


The slugs are getting smaller now, but there are more of them. I find time
being spent on list operations at the operation/job level and in transaction
dispatching. I make the following changes:
In ITRN, I change the cluster to use pointers into the queue rather than
indexes.
In ITC, I change pointers l and ptop to be register variables.
In ISCH, I get rid of using the ILST cluster on the operation lists, just as I
did earlier on the task lists in ITC.
In ITC, time is being spent in the loop in Example 4(a).
It is a linear search of the operation list whenever a task completion is
received from IDEV. I replaced it with Example 4(b). This seemed to compile
into a faster loop. I did the same thing for the same loop on receipt of a
material-handling acknowledgment.
In main, transaction dispatching is handled by doing a linear table search.
When the transaction code is found in the table, it knows what routine to send
the transaction to. I replaced that loop by unrolling it and hard coding the
search. This is based on the idea that it is pointless to put things in
run-time tables if they never change at run time. Results? 13 seconds (speedup
factor: 3.7).
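Unrolling the table search amounts to something like the following; the
transaction codes are invented, and counters stand in for the real handlers:

```c
/* The dispatch table never changes at run time, so hard-code the search.
   A switch compiles to a compare chain or jump table, not a linear scan. */
enum { TRN_SCH = 1, TRN_TC = 2, TRN_DEV = 3, TRN_MH = 4 };

static int isch_calls, itc_calls, idev_calls, imh_calls;

int dispatch(int code)
{
    switch (code) {
    case TRN_SCH: return ++isch_calls;      /* would call the ISCH handler */
    case TRN_TC:  return ++itc_calls;       /* would call the ITC handler  */
    case TRN_DEV: return ++idev_calls;      /* would call the IDEV handler */
    case TRN_MH:  return ++imh_calls;       /* would call the IMH handler  */
    default:      return -1;                /* unknown transaction code    */
    }
}
```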
Fixes are becoming less easy to find. The call stack tells me what it's doing,
but I've already done what I could to speed it up. It is easy to see that it
is spending much of its time dispatching transactions, and finding relevant
operations when acknowledgments are received.


The Redesign


At this point, you need to ask the question: Are all those transactions and id
searches really necessary? The original problem description doesn't say you
need handlers, message queues, and so on. It only tells you to do a number of
JOBs in parallel, that each job consists of a series of OPERATIONs, and that
each operation consists of a series of TASKs. All we really have to do is
transform the problem description into a running program. One way to do this
is to use a special-purpose language.
To do this, I created a "little language" (using C macros) that supports a
very simple form of non-preemptive parallel processing (because the problem
statement calls for it). The result is Listing One (page 90). Table 1 provides
a brief description of the language's primitives. In this language a process
consists of an application data record, plus a pointer to a procedure, and a
simple integer state variable. Processes can be used to simulate application
entities such as jobs, operations, and tasks.
When a new job process is begun, its record is allocated and initialized. Then
it is resumed. Resuming consists of calling the record's control procedure.
The control procedure does whatever it needs to do and returns, but first it
sets the state variable. The next time it is resumed, it dispatches on the
state variable and does whatever comes next, and so on. In short, it's a
finite-state machine.

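The technique can be condensed to a few lines of portable C (this is a minimal sketch of the resume-and-dispatch pattern, not the article's actual FAST.H macros, which appear in Listing One):

```c
#include <assert.h>

/* A process is a record plus an integer state variable.  Each resume
   dispatches on the state, does one step of work, advances the state,
   and returns, so many such processes can be interleaved. */
typedef struct {
    int state;   /* where to pick up on the next resume */
    int work;    /* application data: units of work completed */
} process_t;

/* Returns 1 while the process still has work left, 0 when finished. */
static int resume(process_t *p)
{
    switch (p->state) {
    case 0: p->work++; p->state = 1; return 1;   /* first step  */
    case 1: p->work++; p->state = 2; return 1;   /* second step */
    case 2: return 0;                            /* finished    */
    }
    return 0;
}
```

Because each call does one step and returns, a scheduler loop can resume any number of such processes in round-robin fashion without threads or preemption.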
What does this buy us? By modelling jobs as processes, operations as
subprocesses of jobs, and tasks as subprocesses of operations, we eliminate
all queued transactions except those that are forced on us by the nature of
the problem, namely external-device and material-handling delays.
So where does this leave us? The application is now given in Listing One. It's
one-fourth the size of the first version, even including the definitions of
the process macros. It now gets the job done in 10 seconds (speedup factor:
4.8).


Further Deslugging Passes


Being ever greedy, I deslugged it again. Now it had hot spots in the enque and
deque routines. I just replaced these with in-line macros, and the time went
down to seven seconds (speedup factor: 6.9).
At the fifth pass, deslugging it showed that most of the time was being spent
in printing out the hundred Ack Job nn messages. Commenting out the printf got
the time down to four seconds (speedup factor: 12).
At the sixth pass, I increased the number of jobs to 1000, to make it run long
enough to observe. I saw that it was spending a large percentage of time in
_malloc() and _free() as objects were being created and destroyed. So, I
recycled used objects in special stacks. Also, in each process, I made the
self pointer p a register variable. Resulting time was 26 seconds, or 2.6
seconds for 100 jobs (speedup factor: 18.5).


Final Rewrite


Still not satisfied, I deslugged it again. Now, the bulk of the time went into
the CALL and RETURN statements. Since operations and tasks are serialized
within each job, there is no need to make them separate processes, so I
recoded again, eliminating the CALL statements (see Listing Two, page 91). The
result? Eleven seconds, or 1.1 seconds per 100 jobs (see Figure 3). Not bad,
when you consider that the original program took 48 seconds. That's a speedup
factor of 43.6.


Conclusion


Performance tuning is like wringing water from a towel. You can always get
more if you keep working at it and remember that diagnosis is important,
guesswork doesn't work, and software redesign not only improves performance
but maintainability too.
Table 1:
Language primitives
PROLOGUE(type,f)  type is the typedefed name of the application record, and
                  f the control procedure.
DISPATCHn         n is the number of states in the process.
BREAK(n)          n is a unique state number.
CALL(n,expr)      n is a unique state number; expr creates another process.
RETURN(v)         Effects a lightweight-process return, resuming the calling
                  process and passing it the value v.


All About Profilers




Mike Armistead





Mike works at Pure Software and can be contacted at marmi@pure.com.


The key to improving a program's performance is understanding its run-time
behavior--and profilers are the tools that lead you to this understanding. Jon
Bentley describes profilers as the programmer's equivalent of a stethoscope, a
necessary and simple tool that "looks inside" a program. A general definition
of the term "profiler" is any tool that captures execution-timing data about a
program and reports the profile of that run. Various tools use different
technologies and techniques to collect and present that data. The accuracy,
ease of output interpretation, ease of implementation/deployment, and overhead
are among the critical factors that determine a profiler's usefulness.
Figure 4 illustrates the three generations of profiler technology. The first
generation is characterized by a "roll your own profiler" approach.
Programmers relied on embedded print statements that output simple "time of
day" statistics obtained from the system clock to understand execution
whereabouts and estimated timing. This technique works for small,
self-contained programs, but is cumbersome to implement and sparse on
insightful data. Program overhead is also high due to calls made to access the
system clock. This leads to skewed results that often misrepresent actual
execution.
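The first-generation approach amounts to bracketing suspect code with clock readings by hand. A minimal sketch using the standard C clock() function (the function being timed is invented for illustration):

```c
#include <stdio.h>
#include <time.h>
#include <assert.h>

/* Some work worth timing (illustrative). */
static long sum_squares(long n)
{
    long i, s = 0;
    for (i = 0; i < n; i++) s += i * i;
    return s;
}

/* First-generation profiling: bracket the suspect region with clock()
   calls and report the elapsed processor time.  Every such probe costs
   a clock access, which is where the overhead and skew come from. */
static double time_region(long n, long *result)
{
    clock_t start = clock();
    *result = sum_squares(n);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Scattering such probes through a large program is exactly the cumbersome, data-sparse process the text describes.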
Second-generation profiler tools are similar, but packaged into
self-contained, more manageable tools. The UNIX prof and gprof utilities are
included here. Second-generation profilers are closely tied to compilers that
insert special instructions into the source. These instructions typically call
a monitoring routine to record execution times for instrumented functions. The
information obtained is usually based on sampling the stack according to a set
interval of time, typically in the 10-millisecond range. Profilers based on
this approach don't require an extreme amount of effort to set up or deploy,
although they do require separate compilation. The UNIX utility gprof requires
a -pg flag to instrument the program. A separate command analyzes the results
from a test run, writing them to a file. Borland's Turbo Profiler only
requires that you compile the source for debugging. (Turbo Profiler is also
source-based, although it offers both sampling and instruction-counting modes.
In many ways, it straddles the gap between second- and third-generation
technology.)
This generation of tools typically collects function call relationships and
timing data for the sections that have been instrumented. Because they're
source-based, only routines where the source is available are visible.
However, without visibility into libraries and operating system calls made on
your program's behalf, strategies to limit or circumvent time sinks are
difficult to design, implement, and realize. A source-only view further
hampers a profiler's accuracy: time cannot always be allocated to the proper
functions. Gprof's call graph, for instance, only covers those portions of
code that compiled for gprof. Thus calls into vendor and third-party libraries
aren't reflected in the call graph. Gprof's mcount data records pairs of
function-called-function data. Consequently, if the gprofed code calls a
library function that can't be instrumented (because source isn't available),
which in turn calls gprofed code, the call graph can't be connected and the
information can't be correctly propagated up the call chain.
Sampling is a technique used by most profilers of this generation. While
sampling isn't inherently faulty, it's limited by the profiler timer's
accuracy, and factors such as the length of your profiling run. It's hard to
interpret sampling-based results because functions often execute in less time
than the "tick" rate of the timer. Today's CPUs can execute over 250,000
instructions between interrupts. Turning up the timer frequency isn't an answer
to the accuracy problem, either: the faster the timer ticks, the more overhead
is added to the program, thereby skewing results. The best way to limit this
statistical error is to lengthen the time runs. This may or may not be
practical for large applications--and certainly not welcome by individual
programmers. Overhead in terms of code size and execution speed is quite low.
Sampling doesn't add significant penalties to either of these factors.
Additionally, because the profiling only covers the run-time space for user
code, there's less overhead than if the entire application was comprehensively
covered.
Third-generation technology manipulates executables to record actual
instruction execution. Since these tools rewrite the executable, they can
capture data over the entire program space, including library function calls.
They don't usually require a separate compile. These tools (which include Pure
Software's Quantify and, up to a point, Borland's Turbo Profiler) analyze
functions in terms of their basic code blocks. (A basic block is a linear
sequence of machine instructions terminating in either a branch instruction or
the beginning of another basic block.) They identify the basic blocks of each
function, then use information about the machine's hardware to compute the
expected number of instruction cycles each basic block would consume.
Additionally, these tools can insert code around trap instructions that switch
the processor from user state to kernel state, enabling system calls to be
recorded.
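The effect of block-counting instrumentation can be shown by inserting, by hand, the counters that a tool like Quantify adds automatically (the function and counter names below are illustrative, not any tool's actual output):

```c
#include <assert.h>

/* What basic-block instrumentation does, done by hand: a counter is
   bumped at the entry of each basic block, so after a run you know
   exactly how many times every block executed.  Multiplying each
   count by the block's expected cycle cost gives the timing data. */
static long blk_entry, blk_loop, blk_then, blk_else;

static int classify(const int *v, int n)
{
    int i, odd = 0;
    blk_entry++;                             /* block 1: function entry */
    for (i = 0; i < n; i++) {
        blk_loop++;                          /* block 2: loop body      */
        if (v[i] & 1) { blk_then++; odd++; } /* block 3: taken branch   */
        else          { blk_else++; }        /* block 4: fall-through   */
    }
    return odd;
}
```

Because counting is exact rather than statistical, even a single short run yields reliable per-block data, which is the accuracy advantage claimed for this generation of tools.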
Third-generation (or object-based) tools provide greater accuracy because they
count actual instructions executed for the entire application--including
libraries for which you don't have the source--and are able to record the time
it takes system calls to process a request on your program's behalf.
Additionally, this overall view leads to correct propagation up the call
chain. Typically, object-based tools only require a relink of the object
files.
Basic block instruction counting can incur noticeable overhead. Since
instructions are being inserted directly into object code, code size can
double. In addition, execution of the instrumented version can slow down.
Instruction counts must be resolved with the hardware implementation to remove
the instruction overhead from the data. For additional discussion of profiler
"gotchas," see "Profiling for Performance" by Joseph Newcomer (DDJ, January
1993).
Data presentation is a distinct yet inseparable attribute of any profiler.
Users must be able to focus on improving performance, not on how to find and
interpret data. As a result, profilers that attempt to filter out "noise" and
present multiple, complementary views of run-time behavior save users
time and effort.
Figure 4: The evolution of profiler technology
Table 2 summarizes key attributes of each generation. Don't think that the
third-generation is the last word for profilers. Next-generation profilers
will likely provide data on every aspect of run time--from algorithms to
memory access--and give users even more control to filter data, all in
complete, easy-to-use packages.
Table 2:
Profiler attributes
                         Print        Source   Object
                         statements   based    based
Accuracy                 Y            -        Y
Easy to interpret data   Y            N        Y
Ease of setup/deploy     N            Y        Y
Overhead                 High         Low      Medium-high
Example 1: (a) A loop over an array of structures; (b) correct, yet
inefficient, code.
(a)
 struct { ...
 } a[...];
 int i;
 while ( ... ){
 ... a[i] ...
 }
(b)
 float num, newnum;
 char digit;
 while (...){
 newnum = (int)(num / 10);
 digit = num - newnum * 10 + '0';
 num = newnum;
 ... store digit for output ...
 }
 Figure 1: Design of a CIM simulation program, a "typical" program to be
deslugged. Schedule Execution receives manufacturing Job Requests and carries
them out by issuing Operation Requests to Task Coordination, which requests
device and material-handling tasks. The simulation of 100 jobs originally took
48 seconds. It was reduced to 1.1 seconds in a series of eight revisions.
 Figure 2: Revision history of program. The original program is deslugged
three times, bringing its time down to 13 seconds. Then it is redesigned,
reducing its time to 10 seconds, and its code by a factor of 4. Three more
sessions bring it down to 2.6 seconds. A final rewrite brings it down to 1.1
seconds.
Example 2: (a) The slug here is the operation variable current_task; (b) in
this slug, the cluster operation ILST_NEXT is being called to iterate over the
list of operation requests; (c) this slug extracts a pointer to the current
task from the task list.
(a)
/* IF ALL TASKS DONE, SEND ITC_ACKOP AND DELETE OP */
if (ptop->current_task >= ILST_LENGTH(ptop->tasklist)){
(b)
/* FOR EACH OPERATION REQUEST */
for ( ptop = ILST_FIRST(oplist);
 ptop != NULL;
 ptop = ILST_NEXT(oplist,ptop)
 ){
(c)
ptask = ILST_NTH(ptop->tasklist,ptop->current_task);
Example 3: (a) Tasks that are appended to the list one at a time; (b) another
significant source of calls in the transaction cluster
(a)
ILST_APPEND(ptop->tasklist,ptask);
(b)
 ILST_APPEND(trnque,ptrn);
Example 4: (a) Time being spent in the loop; (b) replacement code that
compiles into a faster loop.

(a)
 for (l=oplist; l; l=l->next){
 ptop = l->thing;
 if (ptop->id==ptn->tskid) break;
 }
 if (ptop==NULL){
 /* ERROR: INVALID OPERATION ID */
 }
(b)
 for (l=oplist
 ; l && ((operation_t*)l->thing)->id != ptn->tskid
 ; l=l->next){
 }
 if (l==NULL){
 /* ERROR: INVALID OPERATION ID */
 }
 ptop = l->thing;
 Figure 3: Out of 48 seconds, 46.9 were eliminated. In this particular
program, much of it was due to needless general-purpose cluster usage and a
programming style that arises from data-flow design and event-driven
processing. Other software will be slow for different reasons.
_PERFORMANCE TUNING: SLUGGING IT OUT!_
by Michael R. Dunlavey

[LISTING ONE]

/* FAST.H DEFINITION OF DISPATCHn MACROS ------------------ */

#define DISPATCH0

#define DISPATCH1 \
 if (p->state==1) goto L1;\
 DISPATCH0
#define DISPATCH2 \
 if (p->state==2) goto L2;\
 DISPATCH1
#define DISPATCH3 \
 if (p->state==3) goto L3;\
 DISPATCH2
#define DISPATCH4 \
 if (p->state==4) goto L4;\
 DISPATCH3

/* FAST.C REDESIGNED IMPLEMENTATION ----------------------- */

#include <stdio.h>
#include "fast.h"

/* MACRO TO GEN STANDARD STATE MACHINE VARS */
#define STDVARS int (*func)(); int state; struct machine_struct *caller
/* "ROOT CLASS" OF STATE MACHINE */
typedef struct machine_struct {
 STDVARS;
 } machine_t;
/* MACRO TO GEN SETUP CODE FOR A CLASS OF MACHINE */
#define PROLOGUE(typ,f)\
 typ *p = (typ*)malloc(sizeof(*p));\
 extern int f();\
 p->caller = caller;\
 p->func = f;\
 p->state = 0;\
 (*p->func)(p);

/* BREAK STATEMENT */
#define BREAK(n,lab) p->state=(n); enque(p); return; lab:
/* CALL STATEMENT */
#define CALL(n,lab,expr) p->state=(n); (expr); return; lab:
machine_t * ptemp=NULL;
int retn_val=0;
/* RETURN STATEMENT */
#define RETURN(v)\
 ptemp=p->caller;\
 retn_val=(v);\
 free(p);\
 if (ptemp){(*ptemp->func)(ptemp);};
/* THE GLOBAL WAIT QUEUE */
int enq=0, deq=0, ninq=0;
machine_t *queue[256];
enque(p) machine_t *p;{
 queue[enq++] = p;
 if (enq>=256) enq=0;
 ninq++;
 }
machine_t * deque(){
 machine_t *p = NULL;
 if (ninq){
 p = queue[deq++];
 if (deq>=256) deq=0;
 ninq--;
 }
 return(p);
 }
/* -------- APPLICATION CODE -------------- */
int jobs_started=0;
int jobs_completed=0;

#define NBOPS (rand()%5 + 10)
#define NTASK 10
#define NJOBS 100
/* MAIN() */
main(){
 machine_t *p;
 /* REPEAT UNTIL ALL JOBS ARE COMPLETE */
 while(jobs_completed < NJOBS){
 /* RUN WHATEVER CAN BE RUN */
 if (ninq){
 p = deque();
 (*p->func)(p);
 }
 /* IF < 100 JOBS STARTED AND < 10 JOBS IN PROCESS */
 if (jobs_started<NJOBS && jobs_started-jobs_completed < 10){
 /* START ANOTHER JOB */
 job(NULL);
 }
 }
 }
/* JOB_T: A SUBCLASS OF STATE MACHINE */
typedef struct {
 STDVARS;
 int jobid;
 int i;
 int nbops;

 } job_t;
/* JOB(): CREATE AND START A JOB */
job(caller) machine_t *caller;{
 PROLOGUE(job_t,job_func);
 }
/* JOB_FUNC(): CONTROL PROCEDURE FOR A JOB */
job_func(p) job_t *p;{
 DISPATCH1;
 p->jobid = jobs_started++;
 p->nbops = NBOPS;
 /* FOR EACH OPERATION */
 for (p->i=0; p->i < p->nbops; p->i++){
 /* PERFORM THE OPERATION */
 CALL(1,L1,opn(p));
 }
 jobs_completed++;
 printf("Ack Job %d\n",p->jobid);
 RETURN(1);
 }
/* OPN_T: OPERATION STATE MACHINE */
typedef struct {
 STDVARS;
 int taskid;
 int ntask;
 } opn_t;
opn(caller) machine_t *caller;{
 PROLOGUE(opn_t,opn_func);
 }
opn_func(p) opn_t *p;{
 DISPATCH2;
 p->ntask = NTASK;
 /* FOR EACH TASK */
 for (p->taskid=0; p->taskid < p->ntask; p->taskid++){
 /* PERFORM DEVICE CONTROL TASK */
 CALL(1,L1,dev_ctl(p));
 /* PERFORM MATERIAL HANDLING TASK */
 CALL(2,L2,mh_ctl(p));
 }
 RETURN(1);
 }
/* DEV_CTL_T: DEVICE CONTROL TASK STATE MACHINE */
typedef struct {
 STDVARS;
 } dev_ctl_t;
dev_ctl(caller) machine_t *caller;{
 PROLOGUE(dev_ctl_t,dev_ctl_func);
 }
dev_ctl_func(p) dev_ctl_t *p;{
 DISPATCH1;
 /* IT'S A NO-OP. DELAY A LITTLE AND RETURN */
 BREAK(1,L1);
 RETURN(1);
 }
/* MH_CTL_T: MATERIAL HANDLING TASK STATE MACHINE */
typedef struct {
 STDVARS;
 } mh_ctl_t;
mh_ctl(caller) machine_t *caller;{
 PROLOGUE(mh_ctl_t,mh_ctl_func);

 }
mh_ctl_func(p) mh_ctl_t *p;{
 DISPATCH1;
 /* IT'S A NO-OP. DELAY A LITTLE AND RETURN */
 BREAK(1,L1);
 RETURN(1);
 }


[LISTING TWO]

/* fast.c */

#include <stdio.h>
#include "fast.h"

#pragma check_stack(off)
#define STDVARS int state; int (*func)(); struct machine_struct *caller

typedef struct machine_struct {
 STDVARS;
 } machine_t;
/* STACK STRUCTURES FOR CACHEING USED STATE MACHINES */
struct mstk_struct {
 int n;
 struct machine_struct *stk[64];
 };
#define M_ALLOC(mstk,size,p) {\
 if (mstk.n <= 0) p = (struct machine_struct*)malloc(size);\
 else p = mstk.stk[--mstk.n];\
 }
#define M_FREE(mstk,p) {\
 if (mstk.n >= 64) free(p);\
 else mstk.stk[mstk.n++] = p;\
 }
#define PROLOGUE(typ,f,stk)\
 register typ *p;\
 extern int f();\
 M_ALLOC(stk,sizeof(*p),p);\
 p->caller = caller;\
 p->func = f;\
 p->state = 0;\
 (*p->func)(p);
#define BREAK(n,lab) p->state=(n); ENQUE(p); return; lab:
#define CALL(n,lab,expr) p->state=(n); (expr); return; lab:

machine_t * ptemp=NULL;
int retn_val=0;
#define RETURN(v,stk)\
 ptemp=p->caller;\
 retn_val=(v);\
 M_FREE(stk,p);\
 if (ptemp){(*ptemp->func)(ptemp);};

unsigned int ninq=0;
machine_t *queue[256];
machine_t **enq = queue, **deq = queue;

#define ENQUE(p) {*enq++ = p; if (enq>=(queue+256)) enq=queue; ninq++;}

#define DEQUE(p) {p = *deq++; if (deq>=(queue+256)) deq=queue; ninq--;}

int jobs_started=0;
int jobs_completed=0;
#define NBOPS (rand()%5 + 10)
#define NTASK 10
int njobs = 1000;
main(){
 register machine_t *p;
 /* REPEAT UNTIL ALL JOBS ARE COMPLETE */
 while(jobs_completed < njobs){
 /* RUN WHATEVER CAN BE RUN */
 if (ninq){
 DEQUE(p);
 (*p->func)(p);
 }
 /* IF < 100 JOBS STARTED AND < 10 JOBS IN PROCESS */
 if (jobs_started<njobs && jobs_started-jobs_completed < 10){
 /* START ANOTHER JOB */
 job(NULL);
 }
 }
 }
/* JOB_T: SUBCLASS OF MACHINE */
struct mstk_struct jobstk;
typedef struct {
 STDVARS;
 int jobid;
 int i;
 int nbops;
 int taskid, ntask;
 } job_t;
job(caller) machine_t *caller;{
 PROLOGUE(job_t,job_func,jobstk);
 }
job_func(p) register job_t *p;{
 DISPATCH2;
 p->jobid = jobs_started++;
 p->nbops = NBOPS;
 /* FOR EACH OPERATION */
 for (p->i=0; p->i < p->nbops; p->i++){
 p->ntask = NTASK;
 /* FOR EACH TASK */
 for (p->taskid=0; p->taskid < p->ntask; p->taskid++){
 /* DO DEVICE CONTROL */
 BREAK(1,L1);
 /* DO MATERIAL HANDLING */
 BREAK(2,L2);
 }
 }
 jobs_completed++;
 RETURN(1,jobstk);
 }









November, 1993
Heap Checking


A pair of libraries for handling heap-related bugs




Steve Oualline


Steve is author of C Elements of Style (M&T Books, 1992) and can be contacted
at sdo@crash.cts.com.


Because they don't cause immediate problems that are easy to spot, heap errors
are among the most difficult and frustrating programming bugs to root out.
Heap errors are sneaky. They quietly modify random data, causing other
portions of your program to fail, and you usually have no idea what caused
the problem.
In comparison, logic errors are easy to find. Suppose, for example, you make a
logic error in a check-balancing program--you subtract, instead of add,
deposits. The resulting negative balance is obvious and easy to correct.
With heap errors, however, the balance might end up being something like
86,#*2,*(#.%%. Unquestionably, garbage is being written into memory--but how?
In this case, memory used for data was changed by bad code. The code could
just as easily have changed code memory or DOS's memory. Since DOS doesn't
like having its memory monkeyed with, your system will likely crash, even to
the point that Ctrl-Alt-Del doesn't do the job, forcing a hard reset.
This article presents two libraries which address problems such as these. The
SafeHeap library intercepts all the heap-related calls (malloc, free, and the
like) and performs extensive error checking before actually executing the
operation. The LogHeap library prints out a log message every time the heap
changes. These log messages can then be used to help locate heap problems, such
as memory leaks.


Borland's Heap-checking Routines


Borland recognizes the difficulties of heap debugging and provides you with a
number of library routines designed to ensure the integrity of the heap. These
functions (see Table 1) give you a tremendous set of tools for finding program
errors. For example, if you try to free the same block twice, heap corruption
will occur. You can check for this problem by simply using heapchecknode to
make sure that the pointer you're freeing is really allocated; see Example 1.
By putting calls to the Borland checking routines in strategic places in your
code, you can detect heap problems early. There are some problems with this
approach, however. First, you must modify your source code. You must also
decide where to put the checks, and you may make the wrong choice.
Additionally, the C library makes your job harder by burying heap allocation
inside library routines such as strdup. Finally, this method requires work
that you want to avoid whenever possible.
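The idea behind such a check is portable. The sketch below is not Borland's heapchecknode, just a minimal illustration of the same principle: track live allocations in a table and refuse to free anything not in it.

```c
#include <assert.h>
#include <stdlib.h>

/* Keep a table of live allocations; safe_free() rejects any pointer
   that isn't currently allocated, which catches double frees and
   frees of garbage pointers before they can corrupt the heap. */
#define MAX_LIVE 256
static void *live[MAX_LIVE];

static void *safe_malloc(size_t size)
{
    int i;
    void *p = malloc(size);
    if (p == NULL) return NULL;
    for (i = 0; i < MAX_LIVE; i++)
        if (live[i] == NULL) { live[i] = p; return p; }
    free(p);
    return NULL;   /* tracking table full */
}

/* Returns 0 on success, -1 if p is not a live allocated block. */
static int safe_free(void *p)
{
    int i;
    if (p == NULL) return -1;
    for (i = 0; i < MAX_LIVE; i++)
        if (live[i] == p) { live[i] = NULL; free(p); return 0; }
    return -1;
}
```

A double free, which would silently corrupt the heap, is instead reported at the moment it happens.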


The SafeHeap and LogHeap Libraries


To address the problem of heap errors and fill in some of the gaps with
Borland C, I wrote the SafeHeap and LogHeap libraries. This set of debugging
tools includes code that checks memory leaks, data integrity on each call, and
the like. I've also written a group of sample programs that illustrate how to
use the library. (The source code and executables for the library and sample
programs are available electronically; see "Availability," page 3.)
To use the library, you need to divide your program into two parts--one part
for the heap routines, the rest for the program. See Figure 1(a). You can also
add another layer between the heap and program for debugging. This layer,
which I call the "paranoid" layer, checks all heap requests for sanity and
reports errors the moment they happen. See Figure 1(b). Borland has helped in
the construction of this layer by supplying the source code to their run-time
library. First, edit the module farheap.asm, renaming the functions as in
Figure 2. The prefix r_ indicates that these are the "real" routines. (You
don't have to worry about calloc since it actually calls malloc to allocate
the memory.)
Now you can write the paranoid layer. Your version of malloc looks like Figure 3,
although in practice it's not this simple. All this version tells you is that
you have an error. It doesn't give you vital information--who or what caused
the problem, for instance.
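As a rough illustration of the shape of such a layer (Figure 3 shows the article's actual code; everything below except the r_ naming convention is invented for this sketch, and r_malloc here just forwards to the C library):

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for the renamed "real" allocator (the r_ prefix follows the
   article's Figure 2 convention).  Here it simply forwards to malloc. */
static void *r_malloc(size_t size) { return malloc(size); }

static int heap_errors;   /* count of errors the layer has caught */

/* The paranoid layer: validate the request, complain the moment
   something looks wrong, then pass it on to the real routine. */
static void *paranoid_malloc(size_t size)
{
    if (size == 0) {
        heap_errors++;
        fprintf(stderr, "heap error(malloc): zero-length request\n");
        return NULL;
    }
    return r_malloc(size);
}
```

The point is the interception itself: once every request funnels through the layer, any sanity check can be added without touching application code.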


Getting the Return Address


To get the return address, you first need the address of the calling
procedure. One way of getting this is to run the program under the debugger,
put a breakpoint at the error message, and display the call stack.
(Unfortunately, some programs are so large and complex that they resist
debugging. Run a program normally and you get a pointer error. Run it under
the debugger and you crash the debugger and everything else.)
Here's another way of getting the calling procedure's address. Looking at the
assembly code for a far call, you see it starts with a CALL FAR instruction
(see Figure 4(a)), which pushes the return address segment and offset onto the
stack. The initialization code of a C procedure looks like Figure 4(b). The
first instruction saves the bp register on the stack. Next the current stack
pointer is saved in the register bp. This sets up the bp register so that it
points to space for the local variables. Finally the stack pointer is adjusted
to allocate stack space for these variables.
The result of all this is that the bp register points to a block of memory
that contains the information in Figure 5. You can get a pointer to the return
address with char **ret_ptr = MK_FP(_SS, _BP+2);. This gives you the absolute
return address. You need to transform this address into one that's relative to
the beginning of the program. The variable _psp contains the segment of the
Program Segment Prefix (PSP). This is a 0x100 byte data area at the beginning
of each program. You can use this variable to help transform your return
address into something useful.
Start by breaking the address apart into a segment and offset; see Figure
6(a). You then adjust the segment by the value of the PSP segment. One final
adjustment is needed for the PSP, 0x100 bytes, or 0x10 paragraphs; see Figure
6(b).


Turning an Address into a Line Number


Now that you've got the caller's address, you need its location in the code. The
link map comes to the rescue. The tlink /v option or the bcc -lv option
generates a link map containing line numbers. Just scan the listing for an
address close to the one in the log file to determine the source line
containing the error.
For example, when you run the program BAD_FREE (one of the sample programs
available electronically), you get the message heap error(free) 023E:0058
20E0:0004 (2,4). The first number in this message is the address of the call
that caused the problem. Looking through the map file for BAD_FREE, you come
across Figure 7. Your return address (023E:0058) is between 023E:004D (line 16)
and 023E:005A (line 18). Thus, you've narrowed down the problem to line 16
where there's an illegal call to free.


Memory Leaks


A memory leak occurs when you allocate, but never free, a block of memory.
This can cause you to run out of memory. To find memory leaks, you need to use
a different version of the intercept library. Instead of merely reporting
errors, this version reports all heap allocation and deallocation calls.
The result is a complete (and somewhat large) log file you can use for
locating memory leaks. All you have to do is pair up the malloc calls with the
corresponding frees. Anything else is a memory leak.
Rather than do this manually, I've included the utility H_CHECK. First run
your program with the LogHeap library, then run H_CHECK, and you'll get a list
of allocated memory that was never freed.
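The pairing that H_CHECK performs on the log file can be sketched in-process (a minimal illustration of the idea, not the utility itself):

```c
#include <assert.h>
#include <stdlib.h>

/* Log each allocation; cross it off when freed.  Whatever is still
   logged at the end of the run is a leak.  H_CHECK does the same
   pairing on a log file after the fact. */
#define MAX_LOG 128
static void *logged[MAX_LOG];

static void *log_malloc(size_t size)
{
    int i;
    void *p = malloc(size);
    for (i = 0; p != NULL && i < MAX_LOG; i++)
        if (logged[i] == NULL) { logged[i] = p; break; }
    return p;
}

static void log_free(void *p)
{
    int i;
    for (i = 0; i < MAX_LOG; i++)
        if (logged[i] == p) { logged[i] = NULL; break; }
    free(p);
}

/* Count allocations never paired with a corresponding free. */
static int count_leaks(void)
{
    int i, n = 0;
    for (i = 0; i < MAX_LOG; i++)
        if (logged[i] != NULL) n++;
    return n;
}
```

Each malloc is matched against a later free; anything left unmatched is reported as a leak.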



Technical Details


The compiler provided a number of surprises when creating this set of
debugging tools. First, the function fopen calls malloc. This normally isn't a
problem, unless you're trying to call fopen from your own "intercept" malloc.
The problem is that you've introduced an infinite recursion. The function
malloc detects an error and calls fopen to open the log file, which calls
malloc, which detects an error, which calls fopen . . . and so on.
The solution is to use the log routine to turn off logging while inside the
library. If the in_log flag is set, you can ignore any recursive calls.
The other problem concerns NULL pointer checking. The first time I ran
BAD_NULL, I found location 0000:0000 was being changed in two places. This was
surprising since I'd only changed it once. Who was the mystery player?
As it turns out, the startup code installs a new divide-by-zero handler,
although when the program exits, it restores the old value. This was the
restoration I was logging. Consequently, I changed NULL pointer checking to
not complain if interrupt vector 0 is restored to its original value
(contained in _Int0Vect).


C++ Heap Checking


C++ uses new and delete to allocate memory. In Borland C++, these routines are
front ends for malloc and free. Borland C++ defines the symbol __BCPLUSPLUS__
when compiling C++ code. The SafeHeap and LogHeap libraries use this symbol to
automatically compile their own version of new and delete. To create a C++
version of the libraries, edit the makefile to include the Borland C++ option
-P (force C++ compile).


Conclusion


SafeHeap and LogHeap provide a set of tools for finding many different heap
problems. They can't find everything, but they will catch most errors. Better
yet, they catch errors early, before they have time to corrupt memory and
cause other parts of the program to fail.
Table 1: Borland library routines that deal with the heap.
Routine Description
heapcheck Walks through the heap and checks each block's critical attributes
such as link pointer and size.
heapcheckfree Checks the free space in the heap to make sure that each word
contains the same value. This value can be set by the function heapfillfree.
heapchecknode Checks a given pointer to make sure it points to an allocated
block in the heap.
heapfillfree Fills the free blocks in the heap with a constant.
 Example 1: Using the heapchecknode function
 Figure 1: (a) Heap routine and main program layers; (b) "paranoid" layer of
program added for debugging purposes
 Figure 2: Renaming functions in farheap.asm.
 Figure 3: The "paranoid" layer version of malloc.
 Figure 4: (a) Getting the calling procedure's address using a CALL FAR; (b)
initialization code of a C procedure.
 Figure 5: The bp register points to a block of memory that contains this
information.
 Figure 6: (a) Breaking the address apart into a segment and offset; (b)
adjusting the segment by the value of the PSP segment.
 Figure 7: Line numbers for bad_free.obj (bad_free.c) segment BAD_FREE_TEXT



























November, 1993
Finding Run-time Memory Errors


A sophisticated tool for the thorniest of bugs




Taed Nelson


Taed is a senior software engineer at National Semiconductor Corporation. He
can be reached via the Internet at nelson@berlioz.nsc.com.


Probably the most insidious bug infesting C programs is the array-bounds
violation. In its more subtle forms, it merely leads to slightly incorrect
results. Virulent strains can cause stack corruption, segmentation violations,
and ultimately, programmer insanity. A wide variety of free and commercial
malloc() debugging packages are available to help combat this plague.
Unfortunately, they're awkward to use and only address part of the problem.
However, a number of "modern" tools designed to detect memory-related errors
are now available. These tools, such as Purify from Pure Software, Insight
from Parasoft, MemCheck from StratosWare, and Sentinel from Virtual
Technologies, are modern in that they perform sophisticated checking and
generate detailed reports of C/C++ programs at run time. In this article, I'll
focus on Purify 2.0. For information on other tools, see the accompanying
textbox entitled "Run-time Debuggers."
Besides array-bounds violations, Purify 2.0 detects memory leaks and the use
of uninitialized memory, NULL pointers, and free()ed memory. (Figure 1 lists
errors Purify can catch.) Currently, Purify is available only on Sun
SPARCstations, but HP 9000 support is forthcoming. It supports C, C++, and
Fortran 77. I've used it with cc and CC from Sun, lcc from Lucid, and GNU gcc
and g++ from Free Software Foundation. Pure Software supports a number of
other compilers as well.


Innocence Lost


Although I write software for different platforms, my primary project has been
OPAL, a PLD programming package consisting of 20 programs and 80,000 lines of
code, which supports both DOS and Sun UNIX environments. Since DOS is tolerant
of memory errors, we decided to focus quality assurance on the UNIX platform,
where a segmentation violation is likely.
The result was a brute-force malloc() replacement (written by Andy Valencia,
of Sequent, and myself) which could detect array-bound violations of
malloc()ed memory. Unfortunately, it used enormous amounts of memory (a
minimum of two virtual pages--typically 8K--for each malloc() call), which
usually thrashed the system, and frequently exhausted all available virtual
memory. I first ran across Purify at the Winter 1992 USENIX conference. Since
then, I've used Purify on every program I've written, as well as on the
public-domain packages to which I contribute.


Overview


Purify can be used almost immediately after installation. To illustrate how to
use it, I've written a short program (see string1.c, Listing One, page 92)
which contains a number of errors. While the example is admittedly contrived,
all the errors are familiar to C programmers. For the sake of illustration,
imagine the code is tens of thousands of lines long where you would have only
a small chance of finding the errors by sight.
Purify can be run either from the command line or from within a make file by
prefixing your normal linking command with the command purify. Although I'm
compiling with the -g debugging option, Purify doesn't require it. If you use
-g, Purify will produce more readable error messages. Since there's only one
source module, I'll skip the make file and type purify gcc -g -o string
string1.c.
Purify first allows gcc to compile string1.c, intercepting the link step. It
then modifies the object file string1.o and the standard library libc.a to
insert all of its own error checking code. It then uses its own incremental
linker to produce the executable file.


Uninitialized Memory Read


When execution of string starts, Purify verifies you have a license. If so, it
runs normally, but whenever an error is detected, it prints a message. Since
the messages are sent to stderr, they are easy to separate from normal program
output. The messages produced by Purify are shown in Listing Two, page 92;
these are edited for brevity in subsequent runs.
The first reported error is an "uninitialized memory read" of the local variable
string on line 13 of string1.c, the stringCopy() function call. Returning to
the source file, you'll notice I passed string as an argument without
initializing it to a section of allocated memory. Knowing a bit about
pointers, I suspect that the garbage pointer is what led to the second error,
a segmentation violation. Purify specified the stack frame and line number of
both errors. The familiar UNIX message "segmentation fault" doesn't convey
this information.
The first error is easy enough to fix; the modified version of main() is in
Listing Three, page 92. Compiling is much faster this time since the Purify-ed
version of libc.a has been cached. Nevertheless, when I run my program
(Listing Four, page 92), I still have an uninitialized memory read, but it's due
to the local variable length in stringCopy(). Purify shows the entire stack
frame and size of the error. Also, since the program didn't crash this time, a
summary is printed at the end.
Once again, the error is simple to fix by initializing length to 0. After I
compile and run this version, Purify reports no errors. Feeling overjoyed, I
decide to exercise my program a bit with different strings. The new program is
in Listing Five, page 92.


Array Bounds Violations and NULL Pointers


When I first run the new version, it produces a long list of errors. To
shorten this list, I use Purify's batch mode by entering purify -batch gcc -g
-o string string3.c. This option consolidates all error messages of the same
type that occur on the same line. This output is shown in Listing Six, page
92. This report shows I had an "array bounds write" error by writing past the
end of my malloc()ed memory. The report identifies where the error occurs,
where that memory was allocated from, and the amount of memory allocated.
When I examine my code, I realize I neglected to allocate enough memory to
store each of my strings, specifically the second. I can either fix this by
allocating more memory, or by passing an additional parameter to stringCopy().
Since the latter is more general, I go with that alternative. (Lack of array
size testing is one of the programming mistakes that the Internet Worm used to
its advantage.)
The report also identifies the obvious use of the NULL pointer. At first, you
may not think that this feature is special; after all, UNIX "reports" the
error with a segmentation violation. The advantage is that Purify identifies
the line number and stack frame when the error occurred. This is a trivial
error to fix by checking that neither "source" nor "destination" are NULL. In
addition to the two bug fixes, I decide to get a bit fancy with my testing by
adding a loop; see Listing Seven page 92.


Memory Leaks


In the newest version (see Listing Eight, page 92) all of my access errors
have been fixed. The report now identifies 200 bytes of leaked memory. A
memory leak is allocated memory that has no active pointer pointing to it.
Purify also identifies potential memory leaks, those areas that have a valid
pointer that isn't pointing at the first byte. Usually, potential leaks are
due to incrementing a pointer across a string and not freeing it. However,
they sometimes hint at a variety of other problems.
Many C programmers have been conditioned not to worry about memory leaks since
the memory is reclaimed by the operating system when the program ceases
execution. However, neglecting to free() memory used early in a program's
execution can cause large programs to page fault unnecessarily or run out of
virtual memory. The X Window System and programs built on top of it are
notorious for this type of error, particularly since some of us remain logged
in for months at a time.
In this case, since Purify points out exactly where the memory was allocated,
I realize that I forgot to free() my allocated memory. This memory leak is
easy to fix by placing free(string); at the end of the loop. After this
change, the program runs as expected and Purify reports no errors with it.



Purify's API


Purify provides a large number of functions which can be called from a
debugger (such as dbx or gdb), or from within the program itself. Functions
are provided to control the batch mode, print to the log file, report on the
state of memory leaks, and to print detailed information about memory
locations. Many of these are useful within an assert() statement to ensure
that previously fixed problems do not return.
In addition, watchpoints are provided to break on read, write, allocation,
free, entry, or exit of a specific memory location or range. From within a
debugger, it's also possible to break on any detected error.


Error Suppression


Purify will sometimes detect a violation that you know is acceptable. Usually
it is a known error that does no harm, in a library over which you have no
control, or one that hasn't been fixed yet. Occasionally, it will be due to a
bit of strange code that confuses Purify.
Purify messages can be suppressed by using a .purify file. This file can exist
in your home directory (for general problems), or in the current directory
(for project-specific problems). I have in my home directory a .purify file
which contains the entry "suppress abw tzload," where abw stands for "array
bounds write." This suppresses reports of a bug in SunOS 4.1.1's standard version
of tzload(), a routine called by many of the time functions, which writes one
byte past the end of allocated memory. A relative of the offending code is in
Listing Nine, page 92. I am supplying this code not to point fingers at Sun,
but to show that these types of errors can crop up in extremely reliable
commercial code.
It's also possible to suppress errors based on a certain stack frame. For
example, instead of ignoring all "array bounds writes" in tzload(), I may only
wish to suppress them if tzload() is called from tzsetwall(). To do this, my
.purify file would contain suppress abw tzload tzsetwall. The use of wildcards
is allowed.


Purify's Innards


The first phase of Purify runs after compilation and before linking. It takes
each object file and library and inserts a special function call before every
memory access. It inserts additional code during stack frame creation, and
within malloc() and free(). Since it is the object files that are modified,
Purify can detect errors in all aspects of the program, even hand-optimized
assembly code and commercial libraries for which no source is available.
The second phase starts during run time, when the additional code is executed.
This code maintains and checks a two-bit entry for every byte in the
heap, stack, data, and bss sections. The entry indicates the state of the
byte: unallocated, allocated but uninitialized, or allocated and initialized.
By checking this state on every access, Purify can easily detect the use of
stray pointers (unallocated memory), memory that has been free()ed, and
uninitialized memory.
However, since Purify operates at the instruction level, it cannot detect
stray instruction fetches (that is, a runaway program counter), unlike
products that utilize debugging modes within the microprocessor itself. On the
other hand, this type of error is rare, and will quickly be rewarded with a
segmentation violation on UNIX. It is also likely that Purify would detect its
cause, which is often a corrupted stack from an out-of-bounds write to a
local array.
Unfortunately, Purify will also overlook uninitialized bits within a byte, as
long as at least one bit has been initialized. This can result from the use of
an operator assignment expression (see Listing Ten, page 92). Pure Software
did this intentionally for subtle reasons.
Array bounds violations are detected in a similar manner by allocating extra
space both before and after the requested memory and marking that space as
unallocated. This is done for both malloc()ed and static arrays. An extremely
bad array-bounds violation (for example, array[1000] on a 10-byte array) has
some small chance of ending up in the valid section of another array, but I've
never seen this happen in practice.


Performance Issues


It's no surprise that Purify affects program performance. Still, given the
poor performance of my previous home-grown tools and Purify's many additional
features, I'm satisfied with the cost.
Link-time performance can be poor. The first link of a program will take about
ten times longer than usual. After that, though, it will only take a few times
as long because Purify caches the modified object files, and uses an
incremental linker by default. Although these cached files tend to clutter
your project directory, the documentation does show exactly how to set up a
crontab entry to remove old ones.
Run-time memory use is about 50 percent more than usual. About half of this
increase is due to the two-bit state of each byte, and the remainder is due to
the larger executable file. The executable file is typically about three times
larger than normal, due to the inserted Purify code. With large applications,
the extra memory usage becomes a concern due to the performance degradation
caused by page swapping.
Run-time speed is generally three times slower, plus a few seconds overhead
for license detection. For the types of programs I typically write this isn't
a hindrance, so I use Purify all the time. Purify-ed programs which make use
of a GUI, particularly those built on top of X Window, could quickly annoy the
user. This is especially true if the machine has little real memory (less than
16 Mbytes). Programs that perform a lot of memory allocation and freeing also
experience slow-downs since the memory is not reallocated immediately.


For More Information


Purify 2.0
Pure Software Inc.
1309 South Mary Avenue
Sunnyvale, CA 94087
408-720-1600
info@pure.com


Run-time Debuggers


Purify is just one run-time debugging tool available for programmers. Other
UNIX-based tools include Sentinel from Virtual Technologies (Dulles, VA) and
Insight from Parasoft (Pasadena, CA). For PC developers, there's MemCheck from
StratosWare (Ann Arbor, MI) and Bounds-Checker from Nu-Mega (Nashua, NH). All
of these tools are roughly equivalent in functionality, yet the cost varies
from $99.00 for MemCheck to $1298.00 for Purify. They all detect memory leaks,
bad pointers, malloc() and free() mismatches, and boundary violations of
malloc()ed memory. Most also catch reads of uninitialized memory and bad C
library function calls.
Purify is the only tool that works at the object-code level, which, in theory,
enables it to work with any language, compiler, or library. Unfortunately,
this makes it very system-dependent, and it's only available on the Sun
SPARC-station and HP 9000.
Sentinel, available on most UNIX platforms, is similar in functionality to
Purify, but operates as a wrapper around the standard C library. Thus, it only
detects memory errors passed as arguments to one of the library functions.
This restricts its use on specialized programs and code in languages other
than C/C++.
Insight is also available for most UNIX workstations. It operates at the C
source-code level by inserting function calls around each line of code. In
some cases, this enables it to detect errors better than the other products.
However, it can't be used on libraries for which source isn't available, or on
code which isn't written in C. It also provides a graphical debugging tool for
examining run-time memory use, and flags some compile-time errors I had
previously seen only in GNU gcc (such as printf() type-checking).
On the PC side, MemCheck, available for DOS, Windows, and Macintosh, operates
similarly to Sentinel by checking all calls to the standard C library. It also
shares the same problems.
Bounds-Checker, available for DOS and Windows, is an automatic tool that pops
up when it detects problems in the heap, stack, or data segment, and finds
illegal memory accesses. The DOS version is not as powerful as the other
products, delivering only a better version of UNIX's "segmentation violation."
(For details on the Windows version, see "Debugging Windows Applications,"
page 78.)
To give you a flavor of each, these tools were used on the program in Figure
2. Listings Eleven through Fifteen, page 93, show the output. Listing Eleven
is the Purify report, Listing Twelve the Sentinel report, and Listing Thirteen
that of Insight. Listing Fourteen is the report generated by MemCheck, and
Listing Fifteen the one from Bounds-Checker.
All of the tools detected both errors in that simple example. While Purify,
Sentinel, and Insight each give a great deal of information, there's something
to be said for MemCheck's brevity and Bounds-Checker's pop-up environment.
--T.N.

 Figure 1: The errors caught by Purify
Figure 2: Test.c, a sample program with two errors.
1 #include <string.h>
2 #include <malloc.h>
3
4 const char *Hello = "Hello, world!";
5
6 int main (int argc, char *argv[])
7 {
8 char *string;
9
10 /* Allocate memory, but we forgot the null character. */
11 string = (char *) malloc (strlen (Hello));
12
13 /* Copy, but it will copy the last byte into unallocated memory. */
14 strcpy (string, Hello);
15
16 /* Leave, but we didn't free() string. */
17 return (0);
18 }
_FINDING RUN-TIME MEMORY ERRORS_
by Taed Nelson


[LISTING ONE]

/* Copy the entire string from source to destination. */
/* Return the number of characters copied. */
unsigned int stringCopy (char *destination, const char *source)
{
 unsigned int length;
 while ((*(destination++) = *(source++)) != '\0') {
 length++;
 }
 return (length);
}
int main ()
{
 char *string;
 stringCopy (string, "Hello world!");
 exit (0);
}


[LISTING TWO]


Purify'd string (pid 12837)
 Purify 2 (C) 1990-93 Pure Software Inc. Patents Pending.
 Contact us at: support@pure.com or (408) 720 1600.
 In Europe: support@pts.co.uk or (+44) 61 776 4499.
 Purify checking enabled.

**** Purify'd string (pid 12837) ****
Purify (umr): uninitialized memory read:
 * This is occurring while in:
 main [line 19, ~nelson/Papers/string1.c, pc=0x19ad4]
 start [crt0.o, pc=0x2064]
 * Reading 4 bytes from 0xf7fff8a4 on the stack

 This is local variable "string" in function main.

**** Purify'd string (pid 12837) ****
Purify (cor): Received signal 11
SIGSEGV (segmentation violation, signal bit = 0x00000400), may dump core:
 * This is occurring while in:
 etext [/n/share/src/gnu/gcc/gcc-2.3.2/libgcc2.c, pc=0x85720]
 main [line 19, ~nelson/Papers/string1.c, pc=0x19ae8]
 start [crt0.o, pc=0x2064]
 * Handler function: SIG_DFL
 * Current signals: 0x00000400 (SIGSEGV)
 * Pending signals: 0x00000000


[LISTING THREE]

int main (void)
{
 char *string;
 string = (char *) malloc (20);
 stringCopy (string, "Hello world!");
 exit (0);
}



[LISTING FOUR]


Purify (umr): uninitialized memory read:
 * This is occurring while in:
 stringCopy [line 8, ~nelson/Papers/string2.c, pc=0x21744]
 main [line 21, ~nelson/Papers/string2.c, pc=0x217dc]
 start [crt0.o, pc=0x2064]
 * Reading 4 bytes from 0xf7fff814 on the stack
 This is local variable "length" in function stringCopy.

**** Purify'd string (pid 12878) ****
 * 1 access error.
 * Basic memory usage:
 262136 code
 273000 data/bss
 16392 heap
 1864 stack
 * Shared library memory usage:
 696320 libc_pure_200.so.1.8 (shared code)
 16384 libc_pure_200.so.1.8 (private data)
 8192 libinternal_stubs.so.1.0 (shared code)
 8192 libinternal_stubs.so.1.0 (private data)


[LISTING FIVE]

#include <stdio.h>
/* Copy the entire string from source to destination. */
/* Return the number of characters copied. */
unsigned int stringCopy (char *destination, const char *source)
{
 unsigned int length = 0;

 while ((*(destination++) = *(source++)) != '\0') {
 length++;
 }
 return (length);
}
int main (void)
{
 char *string;
 string = (char *) malloc (20);
 stringCopy (string, "Hello world!");
 stringCopy (string, "supercalifragilisticexpialidocious");
 stringCopy (string, "");
 stringCopy (NULL, NULL);
 exit (0);
}


[LISTING SIX]

**** Purify'd string (pid 13077) ****
Purify (abw): array bounds write (15 times):
 * This is occurring while in:
 stringCopy [line 9, ~nelson/Papers/string3.c, pc=0x21730]
 main [line 24, ~nelson/Papers/string3.c, pc=0x21824]
 start [crt0.o, pc=0x2064]
 ==== At first occurrence: ====
 * Writing 1 byte to 0x85774 in the heap
 1 byte past end of a malloc'd block at 0x85760 of 20 bytes
 * This block was allocated by malloc called from:
 main [line 21, ~nelson/Papers/string3.c, pc=0x217e8]
 start [crt0.o, pc=0x2064]

**** Purify'd string (pid 13077) ****
Purify (npr): null pointer read: reading 1 byte from 0x0
 * This is occurring while in:
 stringCopy [line 9, ~nelson/Papers/string3.c, pc=0x21724]
 main [line 26, ~nelson/Papers/string3.c, pc=0x21850]
 start [crt0.o, pc=0x2064]

**** Purify'd string (pid 13077) ****
Purify (cor): Received signal 11
SIGSEGV (segmentation violation, signal bit = 0x00000400), may dump core:
 * This is occurring while in:

 etext [~nelson/Papers/string3.c, pc=0x85778]
 main [line 26, ~nelson/Papers/string3.c, pc=0x21850]
 start [crt0.o, pc=0x2064]



[LISTING SEVEN]

#include <stdio.h>

#define MAX_STRING 100
const char *StringList[] = {
 "Hello world!",
 "supercalifragilisticexpialidocious",
 "",

 NULL
};
/* Copy the entire string from source to destination. */
/* Return the number of characters copied. */
unsigned int stringCopy (char *destination, const char *source, unsigned int size)
{
 unsigned int length = 0;
 if ((source == NULL) || (destination == NULL)) {
 return (0);
 }
 while ((length < size) &&
 ((*(destination++) = *(source++)) != '\0')) {
 length++;
 }
 return (length);
}
int main (void)
{
 const char **example;
 char *string;
 /* Test all of the example strings in StringList. */
 for (example = StringList; *example != NULL; example++) {
 string = (char *) malloc (MAX_STRING);
 stringCopy (string, *example, MAX_STRING);
 printf ("The copied string is \"%s\".\n", string);
 }
 exit (0);
}


[LISTING EIGHT]


Purify: Searching for all memory leaks...
There are 200 leaked bytes (44.6% of the 448 allocated bytes in the heap)

 100 bytes (2 times). Last memory leak at 0x858d0
Report (mlk): 200 total bytes lost, malloc called from:
 main [line 38, ~nelson/Papers/string4.c, pc=0x218a0]
 start [crt0.o, pc=0x2064]

 Purify Heap Analysis (combining suppressed and unsuppressed chunks)

 Chunks Bytes
 Leaked 2 200
 Potentially Leaked 0 0
 In-Use 3 248
 ----------------------------------------
 Total Allocated 5 448



[LISTING NINE]

void tzload (void)
{
 ...
 sp->chars = (char *) calloc ((unsigned) sp->charcnt,
 ...

 for (i = 0; i < sp->charcnt; i++)
 sp->chars[i] = *p++;
 sp->chars[i] = '\0';
 ...
}



[LISTING TEN]


int main ()
{
 unsigned int data;
 data |= 0x01;
 printf ("%u\n", data);
 exit (0);
}



[LISTING ELEVEN]

Purify'd test (pid 11557 at Sat Aug 14 09:53:44 1993)
Purify 2.1.0 SunOS 4.1, Copyright 1992, 1993 Pure Software Inc.
For contact information type: "purify -help"
Purify licensed to Pure Software Central

**** Purify'd test (pid 11557) ****

Purify (abw): array bounds write:
 * This is occurring while in:
 strcpy [p9.o, pc=0xcf0c]
 main [line 14, test.c, pc=0x1b394]
 start [interface.c, pc=0x2064]
 * Writing 14 bytes to 0xc2ff0 in the heap (1 byte at 0xc2ffd illegal).
 * This is at the beginning of a malloc'd block of 13 bytes.
 * This block was allocated from:
 malloc [p6.o, pc=0x46d4]
 main [line 11, test.c, pc=0x1b36c]
 start [interface.c, pc=0x2064]

**** Purify'd test (pid 11557) ****
Purify: Searching for all memory leaks...

There are 13 leaked bytes (100% of the 13 allocated bytes in the heap)

Purify (mlk): 13 bytes at 0xc2ff0 lost, allocated from:
 malloc [p6.o, pc=0x46d4]
 main [line 11, test.c, pc=0x1b36c]
 start [interface.c, pc=0x2064]

Purify Heap Analysis (combining suppressed and unsuppressed chunks)
 Chunks Bytes
 Leaked 1 13
 Potentially Leaked 0 0
 In-Use 0 0
 ----------------------------------------
 Total Allocated 1 13


**** Purify'd test (pid 11557) ****
 * Program exited with status code 1.
 * 1 access error.
 * Basic memory usage:
 311288 code
 475976 data/bss
 16392 heap
 2888 stack
 * Shared library memory usage:
 696320 libc_pure_210.so.1.8 (shared code)
 16384 libc_pure_210.so.1.8 (private data)
 8192 libinternal_stubs.so.1.0 (shared code)
 8192 libinternal_stubs.so.1.0 (private data)

[LISTING TWELVE]

The SENTINEL Debugging Environment, Version 1.4.0.10
(c) Copyright 1992,1993 Virtual Technologies, Inc.

Error Output from: /t/cpcahil/./test
Running on: rama


SENTINEL: Warning [14]: An attempt was made to access data beyond the end of an
 allocated data section. The program attempted to write 14 bytes
 to location 0x896A8. That address is at offset 0 in the 13
 byte data area that starts at location 0x896A8 (there is only
 room to write 13 bytes).
 Reading symbol table...Sun format...................Done

 This problem was detected at the following location:
 strcpy() [string.c:744]
 main() [test.c:14]

 This problem is *probably* associated with a 13 byte data area
 allocated on the 5th call to malloc() which returned 0x896A8.
 The context of the call to malloc() was as follows:
 malloc() [allocext.c:150]
 main() [test.c:11]

*********************** SENTINEL: LIST OF MEMORY LEAKS **********************

POINTER STAT LOCATION ALLOC NUMBER TOTAL DATA
TO DATA WHERE ALLOCATED FUNCT LEAKS LEAKED
-------- ---- --------------------------------- -------------- ------
----------
0x0896A8 **** main() [test.c:11] malloc(5) 1 13


[LISTING THIRTEEN]

**WRITE_OVERFLOW** [orig.c:15]
>> strcpy (string, Hello);
 Writing overflows memory: string

 bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
 13 1 
 wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww


 Writing (w) : 0x0002eca0 thru 0x0002ecad (14 bytes)
 To block (b) : 0x0002eca0 thru 0x0002ecac (13 bytes, 3 elements)
 string, allocated at orig.c, 12

 main() orig.c, 15

**Memory corrupted. Program may crash!!**

************************* INSIGHT SUMMARY *********************************
* Program : orig *
* Arguments : *
* Directory : /home/elephant2/whicken/D *
* Compiled on : Aug 13, 1993 13:01:32 *
* Run on : Aug 13, 1993 13:02:20 *
* Elapsed time : 00:00:00 *
***************************************************************************

PROBLEM SUMMARY - by type
===============
 Problem Detected Suppressed
 -------------------------------------------------
 WRITE_OVERFLOW 1 0
 -------------------------------------------------
 TOTAL 1 0
 -------------------------------------------------

PROBLEM SUMMARY - by location
===============
WRITE_OVERFLOW: Writing overflows memory, 1 occurrence
 1 at orig.c, 15

MEMORY LEAK SUMMARY
===================
1 outstanding memory reference for 13 bytes.

Outstanding allocated memory
----------------------------
 13 bytes 1 chunk allocated at orig.c, 12



[LISTING FOURTEEN]

MemCheck V3.0 Active (License: StratosWare Corporation)
Strcpy at ddjtest.c(16),len=14: overwrites destination...
 destination is _fmalloc at ddjtest.c(13), size=13, #1
mc_endcheck after ddjtest.c(16): buffer overwritten before call...
 buffer is _fmalloc at ddjtest.c(13), size=13, #1
mc_endcheck: no free for _fmalloc at ddjtest.c(13), size=13, #1


[LISTING FIFTEEN]

----------------------------------------------------------------------------
F:\DDJ\WINTEST.EXE loaded 02:45 PM Thursday August 19

---------- 03:05 PM Thursday August 19 ----------
***** Error, Source will overrun destination ****


 Procedure: WINMAIN (00038H)
 Module: WINTEST
Source File: WINTEST.C
Line Number: 00018

Trying to copy 00014 bytes to an area allocated by: WINTEST.C LINE #
00015 starting offset within buffer 00000 The destination buffer is only
00013 bytes long.

CALL STACK
----------
#WINTEST!WINMAIN(hInstance,PrevInstance,lpszCmdLine,nCmdShow)
 (1AAE,0000,1A87:0080,0001)

***************************************************************************
 YOUR PROGRAM'S data usage
***************************************************************************
 Stack Usage:
 stack space available stack space used
 --------------------- ----------------
 04908 01018
WINTEST
 Memory used (in bytes):
 Local Heap Global Heap
 ---------- -----------
 00525 00000
 Local leaks:
 Function Name Size Program location
 ------------- ---- ----------------
 _malloc 00013 WINTEST.C LINE # 00015

***************************************************************************
 GDI data usage on your program's behalf
***************************************************************************
 Memory used (in bytes):
 Local Heap Global Heap
 ---------- -----------
 00000 00000

***************************************************************************
 USER data usage on your program's behalf
***************************************************************************
 Memory used (in bytes):
 Local Heap Global Heap
 ---------- -----------
 00000 00480
















November, 1993
Eavesdropping on Interrupts


Tracking down software problems by monitoring interrupt activity




Rick Knoblaugh


Rick is a software engineer specializing in systems programming and the
coauthor of Screen Machine, a screen design/prototyping/code-generation
utility. He can be reached at P.O. Box 1109, Half Moon Bay, CA 94019.


INTM is an interrupt-monitoring program for MS-DOS that traps and logs
interrupt activity, enabling your debugger or other program to gain control
when specified interrupts occur. It accomplishes this by using 80386
protected-mode code which acts as a V86 monitor, providing protected-mode
initialization and creation of a virtual-8086 task within which DOS will
execute.
INTM executes at privilege-level 0 and configures the environment so that the
program receives control whenever interrupts occur within the V86 task.
Interrupts can then be dispatched to their original interrupt-service routines
or (optionally) control can be transferred first to a user-specified
interrupt. This interrupt can be one which generates a "breakpoint" in your
debugger, or it can be any interrupt you choose.
I'll briefly review initialization of the protected-mode environment (for a
more complete discussion, see "Your Own Protected-Mode Debugger," DDJ,
September 1992), then show how your programs can communicate with INTM,
directing it to conditionally log interrupt activity and/or generate
breakpoints on specified interrupts. The entire program consists of a number
of files; most are provided electronically (see "Availability," page 3). Among
these files are: INTEQU.INC, equates for the program; INTSTRUC.INC, structures
for the program; INTDAT.INC, program data; INTMAC.INC, macros; INTINIT.ASM,
the initialization functions (see Listing One, page 94); INTISR.ASM,
interrupt-service routines; INTMAKE, the make file; INTM.LNK, the linker input
file; INTDISP.C, a sample program for retrieving logged data; INTEXAMP.C, a
sample program which demonstrates how to interface with the interrupt monitor;
and INTM.EXE, the executable.


Initialization


The global-descriptor table (GDT) is the first area to be initialized. It
contains entries for all segments to be used by INTM, including the task-state
segment (TSS) and interrupt-descriptor table (IDT). After these data areas are
established, virtual-8086 mode is entered by creating a protected-mode
exception stack frame, setting the VM bit in the EFLAGS, and executing an
IRETD. INTM then terminates and stays resident in your system.


Interrupt Handling


One of the differences between real and protected modes is that interrupts
pass through gates in the IDT rather than through the interrupt vectors. INTM
provides entries for all hardware interrupts, causing them to pass through a
common routine (see pass_thru in INTISR.ASM, available electronically). The
pass_thru routine can optionally log the interrupt and/or pass control to a
user-specified breakpoint interrupt before causing a transfer to the original
interrupt service routine (per the real-mode interrupt-vector table).
Besides providing the address of the interrupt-service routine, each IDT entry
contains a descriptor privilege level (DPL) which is compared against the
current privilege level (CPL) of code attempting to invoke an interrupt
handler via the INT n instruction. INTM initializes the DPL of all IDT entries
to 0. Thus, when DOS, the BIOS, or other code running within the V86 task
(which executes with CPL=3) attempts to execute an INT n instruction, a
general-protection exception (interrupt 0Dh) is generated.
At that point, INTM detects that a software instruction caused the exception
and routes the interrupt to the pass_thru routine for processing.


Controlling Logging/Breaking


The logging of interrupts and the generation of breakpoints is controlled by
the int_struc structure (see Listing Two, page 98). The log_each and brk_each
structure members are bitmaps containing one bit for each interrupt to be
logged or for which breakpoints are to be generated. These actions are enabled
for given interrupts by setting the corresponding bits. The logging and
breakpoint-generation functions can be globally enabled or disabled by setting
enable_log and enable_brk True or False. brk_action specifies the breakpoint
interrupt to be generated when any of the interrupts indicated in brk_each
occur. int_now always holds the number of the interrupt which is currently
being serviced. This can be used if you choose to create your own interrupt
monitoring program, which gets control via the brk_action interrupt when
specified interrupts occur.
I've provided an interface via user-interrupt 61h, with four functions to let
you communicate with INTM and set the desired values in the int_struc area
(see user_int_isr, INTISR.ASM, available electronically).
Function 0 returns a pointer to the logging buffer. The pointer can be used to
retrieve the logging information. The program, INTDISP.C (available
electronically) demonstrates how to use this function. Function 1 can be used
to directly control interrupt logging. It returns a pointer to the int_struc
area which can be used to enable/disable interrupts, and so on.
Functions 2 and 3 can also be used to enable/disable the desired logging
options. The program, INTEXAMP.C (also available electronically) illustrates
this method of configuring INTM.


Conclusion


The capability of monitoring interrupt activity and breaking on the occurrence
of interrupts can be an effective means of tracking down software problems.
You may even discover some activity taking place in your system that you
didn't know existed.


References


80386 Programmer's Reference Manual. Santa Clara, CA: Intel Corp., 1986.
Green, Thomas. "80386 Protected Mode and Multitasking." Dr. Dobb's Journal
(September, 1989).
Knoblaugh, Rick. "Your Own Protected-Mode Debugger." Dr. Dobb's Journal
(September, 1992).
Margulis, Neil. "Advanced 80386 Memory Management." Dr. Dobb's Journal (April,
1989).
Turley, James L. Advanced 80386 Programming Techniques. Berkeley, CA:
Osborne/McGraw-Hill, 1988.
Williams, Al. "Homegrown Debugging--386 Style!" Dr. Dobb's Journal (March,
1990).
Williams, Al. "Roll Your Own DOS Extender: Part II." Dr. Dobb's Journal
(November, 1990).
[LISTING ONE]
;---------------------------------------------------------------
;intinit - main module for Int Monitor 
;--------------------------------------------------------------
;Copyright 1991, 1993 ASMicro Co. 
;--------------------------------------------------------------
; 09/25/93 Rick Knoblaugh 
;--------------------------------------------------------------
;include files 
;---------------------------------------------------------------
.386P
 include intequ.inc
 include intstruc.inc
 include intmac.inc
 include intdat.inc

;--------------------------------------------------------------
;EXTERNALS 
;--------------------------------------------------------------
isrcode segment para public 'icode16' use16
 extrn int_0:far
 extrn int_1:far
 extrn int_2:far
 extrn int_3:far
 extrn int_4:far
 extrn int_5:far
 extrn int_6:far
 extrn int_7:far
 extrn except_8:far
 extrn except_9:far
 extrn except_0ah:far
 extrn except_0bh:far
 extrn except_0ch:far
 extrn except_0dh:far
 extrn except_0eh:far
 extrn except_0fh:far
 extrn int_20h:far
 extrn int_21h:far
 extrn int_22h:far
 extrn int_23h:far
 extrn int_24h:far
 extrn int_25h:far
 extrn int_26h:far
 extrn int_27h:far
 extrn int_70h:far
 extrn int_71h:far
 extrn int_72h:far
 extrn int_73h:far
 extrn int_74h:far
 extrn int_75h:far
 extrn int_76h:far
 extrn int_77h:far
 extrn user_int_isr:far
isrcode ends

 assume cs:code, ds:nothing, es:nothing



code segment para public 'code16' use16
 assume cs:code, ds:data, es:data
 .8086
start proc far
 push ds ;save psp seg
 mov ax, data
 mov ds, ax
 mov es, ax
 call verify_cpu
 jnc start_200 ;continue if 386/486 in real mode


start_050:
 mov ah, DOS_PRT_STRING
 int 21h
 mov ax, 4c01h
 int 21h
.386P
start_200:
 call setup_ints ;take over user int

 call init_gdt
 call init_tss

 mov ax, data
 mov ds, ax
 assume ds:data


 cli ;no ints until protected mode

 mov ax, gdt_seg
 movzx eax, ax
 shl eax, 4
 mov gdtadrs, eax

 mov ax, idt_seg
 movzx eax, ax
 shl eax, 4
 mov idtadrs, eax


 call reprogram_pic


 lgdt gdtl
 lidt idtl

 mov dx, iostack3 ;get stack segment
 mov bx, sp ;and pointer
 mov eax, cr0
 or eax, 1 ;turn on protected mode bit
 mov cr0, eax ;go into protected mode

;
;jump to clear prefetch queue
;
 db 0eah ;far jump
 dw offset code:start_400
 dw gdt_seg:sel_code
start_400:
 mov ax, offset gdt_seg:sel_tss
 ltr ax
 xor ax, ax
 lldt ax ;null ldt
 mov ax, seg data
 movzx eax, ax
 push eax ;gs
 push eax ;fs
 push eax ;ds
 push eax ;es
 xor ax, ax

 push ax
 push dx ;stack segment
 push ax
 push bx ;stack pointer

 push 2 ;VM bit set in upper eflags
 push 3000h ;NT=0, IOPL=3, CLI in lower eflags
 push ax
 push seg code ;cs of where to return
 push ax
 push offset code:start_500 ;ip of where to return
;
;Must ensure that Nested Task bit is not set in eflags. If it were,
;processor would attempt to switch to a task via the selector in
;the TSS backlink field. Since that field is now zero, an invalid TSS
;fault would occur.
;
 pushf
 pop ax
 and ax, NOT NT_FLAG
 push ax
 popf
 iretd

start_500: ;begin vm86 task here
 pop bx ;get saved psp seg
 sti ;interrupts ok now
 mov dx, code + 1 ;init code we are dropping
 sub dx, bx
 mov ax, (DOS_TSR_FUNC shl 8)
 int 21h

start endp



reprogram_pic proc near
 in al, 21h
 mov ah, al
 mov al, 11h ;init
 out 20h, al
 mov al, 20h ;irq0 to int 20h
 out 21h, al
 jmp short $ + 2
 jmp short $ + 2

 mov al, 4
 out 21h, al
 jmp short $ + 2
 jmp short $ + 2
 mov al, 1
 out 21h, al
 jmp short $ + 2
 jmp short $ + 2
 mov al, ah
 out 21h, al
 ret
reprogram_pic endp

.8086

verify_cpu proc near
 xor ax, ax
 push ax
 popf
 pushf
 pop ax
 and ax, 0f000h
 cmp ax, 0f000h
 jz verify_c800 ;not 386
 mov ax, 0f000h
 push ax
 popf
 pushf
 pop ax
 and ax, 0f000h
 jz verify_c800 ;not 386
 mov dx, offset noprot_msg
.386P
 smsw ax ;get pm flag into carry
 rcr ax, 1
 jmp short verify_c999
verify_c800:
 mov dx, offset not386_msg
 stc
verify_c999:
 ret
verify_cpu endp




setup_ints proc near
 mov bx, USER_INT
 mov di, offset old_user_int
 mov cx, isrcode
 mov dx, offset isrcode:user_int_isr
 call get_int
 ret
setup_ints endp

;--------------------------------------------------------------
;get_int - For a given interrupt vector, store contents and 
; load with new isr address. 
; 
; bx = int number 
; es:di = location to store contents 
; dx = offset new isr 
; cx = cs of new isr
;--------------------------------------------------------------
get_int proc near
 cld
 push ds
 xor ax, ax
 mov ds, ax
 shl bx, 2
 mov ax, [bx]
 stosw
 mov ax, [bx].d_segment
 stosw
 cli
 mov [bx].d_offset, dx
 mov [bx].d_segment, cx
 sti
 pop ds
 ret
get_int endp



init_gdt proc near
 mov ax, gdt_seg
 mov ds, ax
 assume ds:gdt_seg

 mov dx, tss_seg
 movzx edx, dx ;base data segment
 mov ecx, (TSS_END - TSS_BEG ) - 1 ;limit
 mov ah, TSS_DESC
 mov si, offset sel_tss
 call make_entry

 mov dx, tss_seg
 movzx edx, dx ;base data segment
 mov ecx, (TSS_END - TSS_BEG ) - 1 ;limit
 mov ah, RW_DATA ;alias as r/w for editing
 mov si, offset sel_tss_alias
 call make_entry

 mov dx, gdt_seg
 movzx edx, dx ;base data segment
 mov ecx, (GDT_END - GDT_BEG ) - 1 ;limit
 mov ah, RW_DATA ;alias as r/w for editing
 mov si, offset sel_gdt_alias
 call make_entry

 mov dx, isrcode
 movzx edx, dx ;base of isr code segment
 mov ecx, 0ffffh ;max segment size
 mov ah, ER_CODE
 mov si, offset sel_isrcode
 call make_entry

 mov dx, code
 movzx edx, dx ;base code segment
 mov ecx, 0ffffh ;max segment size
 mov ah, ER_CODE
 mov si, offset sel_code
 call make_entry

 xor edx, edx ;zero base
 mov ecx, 8fffffh ;page granularity and 4 gig limit
 mov ah, RW_DATA
 mov si, offset sel_databs
 call make_entry

 mov dx, iostack
 movzx edx, dx ;base stack segment
 mov ecx, (STACK_END - STACK_BEG ) - 1 ;limit
 mov ah, RW_DATA
 mov si, offset sel_stack
 call make_entry

 mov dx, data
 movzx edx, dx ;base data segment
 mov ecx, (DATA_END - DATA_BEG ) - 1 ;limit
 mov ah, RW_DATA
 mov si, offset sel_data
 call make_entry

 int 11h ;equipment check
 mov edx, 0b800h ;color segment
 and al, 30h ;monitor bits
 cmp al, 30h ;30h=monochrome
 jne init_gdt500
 mov edx, 0b000h ;monochrome segment

init_gdt500:
 mov ecx, VID_PAGE_SIZE - 1 ;page size - 1
 mov ah, RW_DATA
 mov si, offset sel_video
 call make_entry

 ret
init_gdt endp

;--------------------------------------------------------------
;make_entry - Load a GDT entry from information passed as 
; follows: 
; 
; ds=gdt segment 
; si=offset of gdt entry to load 
; ah=type dpl 
; ecx=limit (also, bits 23:20 are g, b, 0, and avl)
; edx=base segment (convert it to linear) 
;--------------------------------------------------------------
make_entry proc near
 shl edx, 4 ;convert seg to linear
 mov [si].seg_limit_low, cx
 mov [si].seg_base_low, dx
 shr edx, 16
 mov [si].seg_base_mid, dl
 mov [si].seg_type_dpl, ah

 shr ecx, 16
 mov [si].seg_limit_gran, cl
 mov [si].seg_base_top, dh

 ret
make_entry endp

;--------------------------------------------------------------
;init_tss - Initialize TSS with PL0 stack and io bit map base. 
;--------------------------------------------------------------
init_tss proc near
 mov ax, tss_seg
 mov ds, ax
 assume ds:tss_seg
 xor si, si

 mov ax, offset gdt_seg:sel_stack
 mov [si].t_ess0, ax
 mov ax, offset iostack:io_sp
 movzx eax, ax
 mov [si].t_esp0, eax

 lea bx, [si].t_iomap
 mov [si].t_iobase, bx
 ret
init_tss endp


code ends
 end start

[LISTING TWO]
;---------------------------------------------------------------
;intstruc.inc - structures for Int monitor 
;--------------------------------------------------------------

idt struc
i_offset dw ?
i_selector dw ?
i_unused db 0
i_dpl_id db PRESENT + (DPL0 shl 5) + INT_GATE
i_offset2 dw 0
idt ends

seg_descrip struc
seg_limit_low dw ?
seg_base_low dw ?
seg_base_mid db ?
seg_type_dpl db ?
seg_limit_gran db ?
seg_base_top db ?
seg_descrip ends

err_stack_area struc ;stack with error code
e_pushed_int dw ? ;int pushed by int monitor
e_pushed_bp dw ? ;bp pushed by int monitor
e_errcode dd ? ;error code
e_eip dd ?
e_cs dw ?
 dw ?
e_eflags dd ?
e_esp dd ?
e_ss dw ?
 dw ?
e_es dw ?
 dw ?
e_ds dw ?
 dw ?
e_fs dw ?
 dw ?
e_gs dw ?
 dw ?
err_stack_area ends

stack_area struc ;stack without error code
s_pushed_int dw ? ;int pushed by I/O monitor
s_pushed_bp dw ? ;bp pushed by I/O monitor
s_eip dd ?
s_cs dw ?
 dw ?
s_eflags dd ?
s_esp dd ?
s_ss dw ?
 dw ?
s_es dw ?
 dw ?
s_ds dw ?
 dw ?
s_fs dw ?
 dw ?
s_gs dw ?
 dw ?
stack_area ends

user_stack struc
user_ip dw ?
user_cs dw ?
user_flags dw ?
user_stack ends

doub_word struc
d_offset dw ?
d_segment dw ?
doub_word ends

buf_record struc ;format of logged data
buf_int db ?
buf_ax dw ?
buf_record ends


int_struc struc
enable_log db TRUE
int_now db 0 ;interrupt that is occurring right now
log_each db (MAX_INTS/8) dup(0) ;bit on/off indicating to log
enable_break db TRUE
brk_each db (MAX_INTS/8) dup(0) ;bit on/off indicating to break
brk_action db 0 ;int number for ALL break point
values_ax dw MAX_INTS dup((DONT_COMP shl 8) OR DONT_COMP)
int_struc ends
tss_dat struc
t_backlink dw ?
 dw ?
t_esp0 dd ?
t_ess0 dw ?
 dw ?
t_esp1 dd ?
t_ess1 dw ?
 dw ?
t_esp2 dd ?
t_ess2 dw ?
 dw ?
t_cr3 dd ?
t_eip dd ?
t_eflags dd ?
t_eax dd ?
t_ecx dd ?
t_edx dd ?
t_ebx dd ?
t_esp dd ?
t_ebp dd ?
t_esi dd ?
t_edi dd ?
t_es dw ?
 dw ?
t_cs dw ?
 dw ?
t_ss dw ?
 dw ?
t_ds dw ?
 dw ?
t_fs dw ?
 dw ?
t_gs dw ?
 dw ?
t_ldt dw ?
 dw ?
t_tbit dw ?
t_iobase dw ?
t_iomap db IO_MAP_SIZE dup(0)
t_iopad dw 0ffh ;follow last map byte with 0ffh
tss_dat ends

ex_mov_data struc
ex_mdum db 8 dup(?)
ex_mdat db 8 dup(?)
ex_msource db 8 dup(?)
ex_mdest db 8 dup(?)
ex_mcs db 8 dup(?)
ex_mss db 8 dup(?)
ex_mov_data ends



flag_bits record fill0:14, vmbit:1, resumef:1, fill1:1, nest_taskf:1,\
 iopl:2, overf:1, direc:1, inter:1, trapf:1, sign:1, \
 zero:1, fill3:1, auxcarry:1, fill4:1, parity:1, \
 fill5:1, carry:1




;------end of intstruc.inc






November, 1993
Performance Verification


Cache, RISC, and embedded systems




Roger Crooks


Roger can be contacted at Tektronix, P.O. Box 460 DS-92-688, Beaverton, OR
97076.


Designers of high-performance embedded systems look for performance gains
wherever they can be found. Although you can always increase performance with
faster components, you'll also increase the system's cost--and embedded
systems are generally cost sensitive, size constrained, and limited in their
power budgets from the outset. Consequently, it's critical that the embedded
software run as efficiently as possible before you begin adding faster
components.
Because real-time embedded systems are event driven, the design engineer must
verify that the software reacts to events within a specified amount of time.
Measurement tools, such as performance analyzers, can help improve system
performance without adding cost to your system. This article examines how you
can use performance-analysis tools to debug the time-domain aspects of
embedded software in a RISC-based system that uses cache memory.
As processing power becomes less expensive, many designers are looking at
using higher speed RISC technology in new designs. If you're considering using
RISC, be aware that it adds new problems to the already difficult task of
debugging embedded software systems. Higher clock rates, expanded code,
caches, large register sets, sophisticated compilers, and complex assembly
programming all make the move to RISC a non-trivial decision. To underscore
the complexity of debugging RISC-based systems, I'll examine one component of
a RISC system--the use of caches.
Many high-performance RISC and CISC microprocessors incorporate high-speed
cache memory to achieve maximum performance. One of the fundamental aspects of
RISC is that its execution units must be kept busy. This means that one or
more instructions must be loaded by the processor on each clock cycle. The
only way to achieve this performance at a reasonable system cost is to add
high-speed cache between main memory and the microprocessor. Figure 1 shows
where a logic analyzer is connected to monitor data. Caches are typically
integrated on the microprocessor. Secondary caches reside between main memory
and the primary cache for added performance. Regardless of the type or size of
cache, the impact on embedded systems is similar--a cache can drastically
alter the time-domain behavior of your software.


Caches and Embedded-system Performance


It's generally accepted that adding cache to a system will improve
performance. While true for most systems, there are cases when an embedded
system's performance may actually decrease with the addition of a cache. But
whether your embedded system's performance increases or decreases, there's no
question that the time-domain behavior of your embedded software will be less
deterministic.
The function of a cache is to store a portion of main memory--which uses
slower RAM--in a smaller, high-speed RAM that can feed the microprocessor at
its maximum clock rate. When code is resident in cache, performance will be
optimal. Without a cache, every instruction must be fetched from the slower
main memory which may take multiple bus cycles per instruction. In this case,
performance will be slower but deterministic.


How Caches are Controlled


There are different types of caches and different algorithms for controlling
them. Likewise, there are many theories on obtaining maximum performance out
of a cache design, as well as theories on how to determine the optimal amount
of cache. Ultimately you have to measure your system to determine which method
is optimal for your application, because the best method for one application
might be the worst for another application. The same is true for the optimal
size of cache; it can be very application dependent. Since most embedded
systems are designed for a single application, you can determine the best
cache algorithms and optimal cache size by using performance-analysis tools to
measure system performance.
When the CPU fetches an instruction, the system first determines if that
instruction is in cache. If so, it's fetched and executed. If not, the cache
is flushed and filled from main memory. You pay a performance penalty to
initially fill the cache (the larger the cache, the larger the penalty), but
hopefully this will be offset by the improvement gained by operating out of
cache. Sequential software will benefit the most, whereas code that contains
many calls to different portions of memory will suffer. In short, the order in
which you link your functions can impact the performance of cache-based
systems.
These factors are what cause cache-based software to be nondeterministic. If
you're writing mission-critical software, you have to write for the worst
case. But what is the worst-case situation--cache off or cache on?
Theoretically, a situation can occur where worst-case performance occurs with
the cache on.


How Caches Impact Deterministic Software


The problem with using a cache is that the software won't always be in cache
when needed, causing a time lag before the software can be executed. This time
lag can vary, making your software less deterministic. Figure 2 depicts three
typical scenarios in which the cache can jeopardize the deterministic aspects
of your embedded software.
Case 1: If a specific routine (say, an interrupt handler) is in cache when
needed, it will execute very fast. In this case you get the optimal
performance measurement.
Case 2: If the specific routine is not in cache, the cache must be flushed and
then loaded. This scenario gives you a second performance measurement.
Case 3: If the specific routine is partially in cache, it will execute until
it reaches an instruction that is not in cache, then flush the cache and fill
it with the rest of the routine. Performance in this case is not very
deterministic, since the amount of code initially in cache can vary.


Maximizing System Performance with Performance Analysis


Performance analysis is a method for determining where your software is
spending most of its time. There are two types of performance analysis:
traditional performance analysis and single-event mode.
Traditional performance analysis measures the amount of time spent by multiple
events simultaneously as shown in Figure 3. The most common use of traditional
performance analysis is to determine which functions, if optimized, will
result in the biggest improvement in overall system performance. For example,
in Figure 3, making Addr_range_3 execute 10 percent faster will have a greater
impact on overall system performance than improving Addr_range_4 by 50
percent.
You can optimize functions in a number of ways: by recoding them in assembler,
by using different algorithms, or possibly by locking the function into cache
(available on certain microprocessors). Once a locked function is loaded, it
always remains in cache. This limits the use of the cache by other functions,
but it may be worth the price if overall system performance improves. Again,
you can experiment with different techniques and measure the results with the
performance analyzer.
Single-event mode is an alternate measurement capability of performance
analyzers. Single-event mode measures the duration of an event each time it
executes and displays the ranges of time it took to execute as in Figure 4. It
is the most efficient way to determine the impact of caches on system
performance. Although you could use traditional performance analysis to
measure your whole program while it runs repeatedly, it won't provide much
information on where a problem might be. A more practical and useful
measurement is to use single-event mode to measure your cache performance.
You can also verify other time-critical functions, such as interrupt handlers
and data-processing functions, with single-event mode. Systems are designed
around the expectation that certain functions will execute in a specified
amount of time, otherwise data will be lost. With single-event mode, you can
profile these critical routines under worst-case conditions over long periods
of time.
When you are finished running your tests, single-event mode will display the
timing results in a histogram display. From this display, you can determine
the minimum, maximum, and average time it took to execute your targeted
routine as in Figure 4. If even one occurrence violates your system
specification, an error will eventually occur.
You can also vary other system parameters such as the cache-control algorithm
or the size of the cache to determine its impact on overall system
performance.


Measuring a Data-dependent Algorithm



If your data-processing algorithm is data dependent, meaning that the time it
takes to execute is dependent on the data it has to process, you'll want to
monitor that single function over long periods of time. For example, if you're
writing software to read compressed video data off a CD-ROM drive and display
it on your monitor, your data flow might look like Figure 5 where the two
critical routines are the uncompress data routine (t1) and the display data
routine (t2). What you want is the overall time (t3) to be fast enough to
avoid display flicker caused by excessive time between frames.
If t1 is too slow, there'll be additional disk rotations between reading the
next block of data. On a CD-ROM drive, this time can be excessive. A slow
video memory or adding special effects to the video as it's being displayed
can impact t2. Certain effects may take more processing time and cause flicker
or jerky motion.
The first task is to optimize each function. Factors that might impact t1
include the compression method used and the size of the input buffer that
reads data off of disk. Single-event mode can be used to determine the optimal
compression method for your type of data. The reason you need a performance
analyzer, rather than timing just a few frames, is that most compression
methods are data dependent. With single-event mode, you can run your whole
video clip and see the min/max/avg time for the decompression routine over the
whole application. This is more efficient and reliable than timing selected
frames of data.
You may not have the option of changing the compression method, so you may
want to experiment with the size of the data buffer read off disk. You'll want
to determine the optimal amount of data to process in one pass. Some
processors have data caches that can deliver a broad range of performance. By
experimenting with different buffer sizes and running single-event mode, you
can zero in on the right buffer size for optimal performance.
One of the ways manufacturers of video-display adapters differentiate
themselves from competitors that use the same hardware is by writing more
efficient firmware. Performance-analysis tools can be invaluable in
fine-tuning firmware to get the maximum amount of performance out of the
hardware.
Once you have optimized individual routines, it's important to optimize how
the routines interact with each other. In this example, after doing an analysis of
the display routine, you'll want to look at how the two routines work
together. Ideally, the t1 and t2 times should be relatively balanced (see
Figure 6). It doesn't do your system any good if t1 is optimal but passes too
much data to t2 to handle efficiently. Here you'll want to look at the best
combination of t1 and t2 that minimizes the total time t3.
A similar condition occurs in a dual-processor system where you need to look
at "load balancing" between the t1 processor and the t2 processor. You may
find that by optimizing t1, t2 is excessively idle and can't keep up with the
data when it's sent. You might achieve better system performance by having t1
run at a less than optimal rate and passing data to the video display
processor more frequently.
However, without actually running the system and measuring performance, you
can only make an educated guess at how to design the system software. By
having two logic-analyzer acquisition cards with performance-analysis
capabilities monitoring the two processors, you can balance your system's
processing capability between the two tasks.


How to Debug Cache-based Software


There's an inherent conflict in debugging cache-based software. First, to test
the system correctly, the cache must be turned on to accurately reflect how
the system will ultimately perform. However, for maximum visibility of data
for debugging, the cache needs to be disabled (see Figure 1). This ensures all
executed instructions are fetched off the bus and can be captured by a logic
or performance analyzer.
Most debug tools--emulators, debug monitors, and so on--are intrusive to the
cache. When a breakpoint is reached with these tools, the cache is flushed and
refilled when execution starts again, changing how your system ultimately
runs. A logic analyzer is a passive device, only monitoring data, making it
non-intrusive.
For debugging the logic of the software, the best strategy is to turn the
cache off. The cache won't affect what the software does, but will affect how
the software executes in the time domain. Once the logic of the program is
debugged, you should then turn the cache on to debug the time-domain aspects
of the software.
If your software is failing inside cache, you'll need to trace the execution.
Although traditional printf statements are usually not an option for
embedded systems, you can use a similar method with a logic analyzer that
supports performance analysis. One method is to insert dummy write
instructions that write data to a location in memory. This debug statement
will have a slight impact on your program's size and execution speed, but it
provides an easy way to trace your program. The logic analyzer can be set to
monitor these memory locations to time a function or to simply trace
execution.
There are three similar methods for monitoring your program via dummy write
instructions. The first method, shown in Figure 7, is the simplest for
higher-level languages. It performs a dummy write to an unused portion of
memory monitored by the logic analyzer.
The second method is similar and is a bit easier when programming in
assembler. With this method, you write the contents of the program counter to
a single test memory location. The logic analyzer is set to monitor writes to
that location, displaying the last value of the program counter. Although this
method provides you with trace information, it doesn't tell you why something
doesn't run as expected.
The third method yields more debug information by writing intermediate values
of a critical calculation to unused memory locations. For example, if you were
performing an iterative calculation, such as processing data, you could write
intermediate register values or variables to the test location. Again the
logic analyzer would record the data for analysis.
It's critical that you run your final verifications with frozen software.
Other than the obvious reasons, each time you recompile your code, your
critical routines may end up at different locations in memory. If routines
move across different boundaries, how they are loaded into cache will change
as well.
 Figure 1: A typical cache-based system with the optional secondary cache.
 Figure 2: Depending on what portion of a routine is in cache when it is about
to be executed, the time to execute will vary.
 Figure 3: Traditional performance analyzers will measure defined routines by
monitoring the address bus of the microprocessor. Each time an address appears
on the bus, the performance analyzer will "bin" that address and display the
bins in a histogram format.
 Figure 4: Single-event mode measures one event (or function) repetitively and
displays the range of execution times in a histogram format. Minimum, maximum
and average times are also displayed.
 Figure 5: A simplified example of displaying compressed video data on a
display.
 Figure 6: Performance analysis can measure how tasks are split in a single
processor system. Here t1 and t2 are evenly split. An additional 4 percent of
the total time is spent outside of these two critical routines.
Figure 7: Monitoring dummy write instructions
Loop_Begin_1
 Write FF to Test_Location_1 <-Test Instruction
 code
 code
 code
 code
 Goto Loop_Begin_1 else goto Loop_Begin_2
Loop_Begin_2
 Write FF to Test_Location_2 <-Test Instruction
 code
 code
 code
 code
 Goto End else goto Loop_Begin_2
End
 Write FF to Test_Location_3 <-Test Instruction














November, 1993
A Netware Chat Utility


Understanding IPX programming




Eduardo M. Serrat


Eduardo is a systems analyst with Galmes y Casale s.r.l. and can be contacted
at Libertad 38, (5000) Cordoba, Argentina.


Multiuser operating systems like OpenVMS on VAXes or TSX-32 on 386/486-based
PCs provide a "phone" utility that lets users chat interactively across the
network. Functionally, this utility resembles a real phone--you can call a
user, accept or reject incoming calls, even set up a conference chat. The
utility creates one viewport on the screen for each user engaged in the
conversation.
Unfortunately, Netware doesn't provide an interactive chat utility. Sure,
network users can communicate via the Send command to transmit messages up to
45 characters long. But the message displays only on the last line of the
target workstation, locks up the keyboard, and halts any running programs
until the target user presses Ctrl-Enter. Alternatively, there are commercial
applications like Direct Micro's LAN Intercom and Dave Frailey's shareware
Chat utility. Like the Send command, however, LAN Intercom only displays
messages a single line at a time (although it does have other add-on features)
and you don't get the source code with either Frailey's Chat or LAN Intercom.
Because a Netware chat utility is indeed useful, I wrote a phone-like utility
similar to those available on other operating systems--by using IPX
programming and Netware services calls. With this utility, which I call
"Phone," you can carry on interactive dialogs across the network. Remote users
can access the network via Novell Remote Access Server facilities, then use
the phone utility. (Where I work, we use Phone for customer software support,
coordinating maintenance activities on the fly.)
In the process of presenting the utility, I'll examine two Netware APIs--IPX
services and INT 21h extensions. For IPX background information, see "IPX: The
Great Communicator" by Rahner James (DDJ, May 1992).


IPX, Addresses, and Sockets


Netware IPX (Internetwork Packet Exchange) is a high-performance,
connectionless, datagram-oriented protocol. Because it has its roots in XNS
(the Xerox Network Standard protocol), IPX uses big-endian byte ordering
instead of the little-endian ordering used by Intel CPUs. To deal with this
"inverted" byte order, you must swap byte values when filling in numeric
fields in the packet. Since IPX packets can arrive at their destination out of
sequence, the application using IPX as its transport is responsible for
ensuring correct packet delivery. (Netware Core Protocol, NCP, deals with
this.) Figure 1 and Table 1 show the format of an IPX packet and a description
of the fields contained on its header.
Each workstation connected to a Novell network is uniquely identified by an
internetwork address consisting of a 4-byte network number, and a 6-byte node
number related to the physical address of the network adapter. Figure 2 shows
internetwork addresses. Most of the work needed to transmit packets over the
network involves determining internetwork addresses of the source and
destination, and filling the IPX header correspondingly.
Sockets let several applications running on the same workstation distinguish
the packets destined for them. Before a program can send or receive packets
over the network, it must create a socket.
Netware reserves sockets 0000h through 3FFFh and assigns socket numbers from
8000h to FFFFh to third-party developers. General-purpose sockets run from
4000h to 7FFFh; you can use these for your own network programs.
Alternatively, you can simplify programming by choosing sockets such as
4141h, 4242h, or 4343h, whose two bytes are equal, to avoid swapping byte
values when filling in socket numbers in the packet.
"Temporary" sockets remain active until deleted, or until the program
finishes, whichever happens first. "Permanent" sockets (used by TSRs) remain
active until explicitly deleted. Sockets must be unique in a single
workstation, but the same socket number can be used by another workstation in
the network.


Novell Netware APIs


Netware provides you with two APIs. The first provides IPX services (INT 7Ah)
for sending and receiving packets over the network. The second provides
Netware services as extensions of INT 21h (the DOS interrupt). Both are
implemented by Netware's workstation shell, which is normally made up of
IPX.COM and NETX.COM (the redirector).
IPX services can be invoked by a FAR CALL or using INT 7Ah. When programming
in a high-level language, calling IPX using FAR CALL requires some inline
assembly code to save registers, execute a FAR CALL instruction, and restore
registers.
Using INT 7Ah is preferred because you don't need to deal with inline assembly
code. Furthermore, almost every high-level language provides procedures for
generating interrupts. Before using IPX services, you must make sure IPX is
loaded at your workstation. To do that, call the multiplexer to determine the
presence of IPX. If you use the FAR CALL method of invocation, this call also
provides you with the segmented address required.
To call the multiplexer, load register AX with 7A00h and issue INT 2Fh. Upon
returning from this interrupt, register AL contains this status information:
00h, IPX not loaded (address not available); FFh, IPX loaded. In addition,
register ES contains the segment of IPX services address; and register DI
contains its offset.
If you issue an INT 7Ah and IPX isn't loaded, you'll crash the system, so it's
important to detect the presence of IPX before proceeding with IPX requests;
see Example 1. Before you can request IPX to send or receive packets, always
create a socket for this purpose.


Event-control Blocks


Event-control blocks (ECBs) are the fundamental structure for send and receive
packet requests; see Table 2. When receiving packets, IPX uses a "no wait"
approach; that is, after submitting such a request, control returns to your
program, and the corresponding ECB is queued as an outstanding request until a
packet for the specified socket arrives. If a packet arrives and there are no
outstanding requests, the packet is dropped. To avoid this situation, you can
post any number of receive-packet requests. IPX doesn't guarantee that
outstanding ECBs will be used in the order they were submitted, so you must
check this in your program.
By filling in the event-service routine field of the ECB with the address of a
FAR routine, you can specify that you want a routine to be called
asynchronously upon completion of the ECB. Alternatively, you can zero this
field and poll the status flag in the ECB to see if the request completed (see
Example 2). The latter method is used when sending packets. In this case, you
need to poll the status flag of the submitted ECB until IPX zeroes it,
signaling that the request finished successfully. Tables 3 and 4 show the
status flag values and the completion codes. When writing an event-service
routine, all interrupt-service routine conventions must be observed. For
example, to avoid DOS reentrancy problems, you can't call DOS services. When
IPX transfers control to the event-service routine, ES:SI points to the
address of the serviced ECB; the contents of other registers are undefined. To
let the ESR code address its data, you need to copy ES into DS using inline
assembly code. Finally, you must declare the ESR as a far procedure so it
returns properly with a FAR RETURN instruction. The ESR routine in Example 3
shows how the contents of register ES are copied into DS to establish
addressability for global variable readflg. (The Turbo Pascal directive {$F+}
is used to define the ESR as a far procedure.)
When filling in the ECB, you must supply an immediate-address value, which IPX
uses to fill the packet's Destination field. This address can be the target
node's own address, or the address of a routing node in an internetwork. The
address is obtained using IPX's GetLocalTarget function (see Listing One,
page 100).
Suppose, for example, your program runs on workstation #1 and intends to send
packets to workstation #2 located on another segment and connected to yours
via a router. Here, GetLocalTarget returns address 00001b401564 as the
immediate address. On the other hand, if you're sending a packet to server #1,
then the immediate address returned by this function would be the same as
server #1--00001b307035 as in Figure 2.
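The decision GetLocalTarget makes boils down to: if the destination is on your own network segment, the immediate address is the destination node itself; otherwise it's the router that leads there. A sketch in Python (the router and server node addresses are the ones from Figure 2; the network numbers and workstation #2's node address are made up for illustration):

```python
def local_target(dest_net, dest_node, my_net, router_node):
    """Pick the immediate (next-hop) node address for a packet."""
    if dest_net == my_net:
        return dest_node       # same segment: send directly to the node
    return router_node         # different segment: send via the router

MY_NET      = "00000001"           # hypothetical local network number
ROUTER_NODE = "00001b401564"       # router reachable from workstation #1
SERVER_NODE = "00001b307035"       # server #1, on the local segment

# Workstation #2 is on another segment: the next hop is the router.
assert local_target("00000002", "00001d220155",
                    MY_NET, ROUTER_NODE) == ROUTER_NODE
# Server #1 is local: the immediate address is its own node address.
assert local_target(MY_NET, SERVER_NODE, MY_NET, ROUTER_NODE) == SERVER_NODE
```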
You can divide the packet for transmission or reception using the ECB's
fragment-count and fragment-descriptor fields. This provides an automatic way
to split the packet into meaningful parts. The phone utility, for example,
uses two fragments--the first contains the IPX header, the second the packet's
data. You fill in the fragment count with a value between 1 and 42, along with
the corresponding fragment addresses and lengths.
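To picture the two-fragment layout the phone utility uses, here is a sketch (again in Python, purely as illustration) that assembles the bytes IPX would put on the wire: fragment 1 is the 30-byte IPX header of Table 1, fragment 2 is the data. The field values are illustrative, not taken from the utility:

```python
import struct

def build_packet(dest_net, dest_node, dsock, src_net, src_node, ssock, data):
    """Concatenate the two fragments: a 30-byte IPX header plus the data."""
    length = 30 + len(data)      # IPX Length field = header + data
    header = struct.pack(">HHBB", 0xFFFF, length, 0, 4)  # checksum, length,
    header += dest_net + dest_node   # transport control, packet type (04h=IPX)
    header += struct.pack(">H", dsock)
    header += src_net + src_node
    header += struct.pack(">H", ssock)
    return header + data             # fragment 1 + fragment 2

pkt = build_packet(b"\x00\x00\x00\x01", b"\x00\x00\x1b\x30\x70\x35", 0x4141,
                   b"\x00\x00\x00\x01", b"\x00\x00\x1b\x40\x15\x64", 0x4141,
                   b"hello")
assert len(pkt) == 35            # 30-byte header + 5 data bytes
assert pkt[2:4] == b"\x00\x23"   # Length field = 35 (0023h), big-endian
```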
Netware services, which are invoked as extensions of the INT 21h (DOS
interrupt), consist of functions for determining connection numbers, internet
addresses, user information, message transmission, and so on. The necessary
function codes and structures are in Listing One.


Implementing and Using the Phone Utility


I wrote the phone utility in Turbo Pascal 5.0. The IPX and Netware services
routines are contained in the Turbo Pascal unit IPXUNIT.PAS (Listing One), and
the main program in PHONE.PAS (available electronically; see "Availability,"
page 3). I recommend copying the PHONE.EXE executable to the PUBLIC directory
so the utility is accessible network-wide.
When invoking the phone utility in dialing mode, you must provide on the
command line the name of the user you're calling and (optionally) that user's
connection number. Netware's Userlist command can be used to view connected
users and their connection numbers.
In answer mode, the phone utility is invoked without parameters. The utility
notifies the user being called by sending a message, just as the Send command
does. This message contains the name of the user generating the call and its
starting time. The notification is repeated at 30-second intervals until the
called user answers.
The sequence of procedures the phone utility uses to set up a dialogue is:

1 User #1 invokes the phone utility in dialing mode, providing a username.
2 The phone utility calls GetConnections to obtain a connection number for
this user. If more than one connection exists for this user, the program exits
and asks for a connection number to be specified.
3 With this connection number, the program calls GetInternetAddress to get the
internetwork address of the target.
4 The program gets the immediate address for sending a packet by calling the
GetLocalTarget procedure.
5 The program creates a socket for transmission and reception.
6 The program requests a receive-packet service from IPX for this socket, and
specifies an ESR to be called upon completion.
7 The program sends a message notifying the target that a call is in progress,
including a packet which contains the connection number of the user
originating the call.
8 Step #7 repeats until the service requested in step #6 completes or the call
is canceled by the originating user. When the request completes, processing
resumes at step #13.
9 User #2 invokes the utility in answering mode.
10 The phone utility creates a socket for transmission and reception, and
requests a receive-packet service from IPX. The packet sent in step #7 is
received.
11 The connection number obtained from the received packet is used as
described in steps #3 and #4 to obtain the internetwork and immediate
addresses.
12 The procedure sends a packet that is received by the outstanding request of
the originating workstation, generated in step #6.
13 The screens at both workstations present two viewports corresponding to the
users involved in the conversation. Characters typed at one workstation are
echoed in the remote viewport, and vice versa, until one of the users presses
Esc to terminate the conversation.
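The handshake above can be modeled with two in-memory queues standing in for the two workstations' IPX sockets. This is a toy sketch of the call setup only, not real IPX, and the connection numbers are invented:

```python
from queue import Queue

sock1, sock2 = Queue(), Queue()   # stand-ins for the two workstations' sockets

# Steps 1-7: the caller posts its receive request (the queue itself stands in
# for the outstanding ECB) and sends a notification packet carrying its
# connection number.
caller_conn = 5                   # hypothetical connection number
sock2.put({"type": "call", "conn": caller_conn})

# Steps 9-11: the answering side receives the packet and learns who called.
packet = sock2.get()
assert packet["conn"] == caller_conn

# Step 12: the answer packet satisfies the caller's outstanding request,
# so step 8's polling loop falls through to the conversation in step 13.
sock1.put({"type": "answer", "conn": 9})
reply = sock1.get()
assert reply["type"] == "answer"
```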


Some Special Considerations


The message sent by Phone to notify a user that a call is in progress produces
the same effect as the Send command. Therefore, if you're running an
application that can't be interrupted, you'll need to use the Castoff command
to reject Phone's notification messages. (Use Caston to re-enable the messages
afterward.)
Users connected to the network and running graphical applications won't
receive the utility's notification messages when being called. Consequently,
the phone utility can't be used to establish a dialogue with these users.
However, anyone using Windows 3 can use Phone without any problem because
Windows supports Novell Netware's broadcast messages. These users can create a
PIF for the phone utility, and include it in their preferred applications
group.
 Figure 1: IPX packet format.
Table 1: IPX header format (* indicates filling is required).

Name               Type               Description
Checksum           Word               Obsolete field inherited from the XNS
                                      protocol. Always contains FFFFh.
Length             Word               Length of IPX header + data in packet.
TransportControl   Byte               Starts at 00h and is incremented each
                                      time the packet passes through a router;
                                      on arrival it counts the routers the
                                      packet passed through.
Packet Type*       Byte               Packet type; for example, 04h is an IPX
                                      packet and 05h an SPX packet.
Destination Net*   Array [1..4] Byte  Destination network.
Destination Node*  Array [1..6] Byte  48-bit address of destination node.
DSocket*           Word               Destination socket.
Source Net*        Array [1..4] Byte  Source network.
Source Node*       Array [1..6] Byte  48-bit address of source node.
SSocket*           Word               Source socket.
Table 2: Event Control Block (* indicates filling is required).

Field Name              Type                Description
Link Address            Pointer             Used by IPX when queuing the ECB.
Event Service Routine*  Pointer             Address of routine to be called
                                            when a packet arrives.
Status Flag             Byte                Status of the ECB at each
                                            processing stage (see Table 3).
Completion Code         Byte                Completion status code.
Socket*                 Word                Socket for send/receive.
WorkSpace               Array [1..4] Byte   Reserved for IPX use.
Driver Work Space       Array [1..12] Byte  Reserved for IPX use.
Immediate Address*      Array [1..6] Byte   Packet destination address (can be
                                            a routing-node address).
Fragment Count*         Word                Number of fragments composing the
                                            packet.
Fragment Address*       Pointer             Address of fragment.
Fragment Length*        Word                Length of fragment.
.
.
.
Fragment Address
Fragment Length
Table 3: ECB's status flag values.

Value  Description
00h    IPX finished processing the ECB.
FAh    ECB is being processed by IPX.
F8h    ECB is being queued by IPX.
FBh    Send/receive being processed into the ECB.
FEh    Waiting for incoming packets.
FFh    ECB is in use for sending a packet.
Table 4: ECB completion codes.

Value  Description
00h    Success.
FCh    Error, ECB was canceled.
FDh    Error, packet overflow.
FFh    Error, invalid socket in ECB.
Example 1: Detecting IPX presence.
function IpxPresent;
const
 MULTIPLEXER = $2F;
 IPXINSTALLED = $FF;
begin
 regs.ax:=$7A00;
 intr(MULTIPLEXER,regs);
 if (regs.al = IPXINSTALLED) then IpxPresent:=TRUE
 else IpxPresent:=FALSE;
end;
Example 2: Polling ECB's status flag in send packet requests.
Procedure IpxSendPacket;
const
 IPX_SendPacket = $03;
begin
 regs.bx:=IPX_SendPacket;
 regs.es:=Seg(SendEcb);
 regs.si:=Ofs(SendEcb);
 IpxServicesCall;
 while (SendEcb.StatusFlag <> 0) do ; { poll until IPX clears the flag }
end;
Example 3: ESR used by the Phone Utility
{$F+$S-} {Far Proc, No Stack Check }
Procedure EsrHandler;
begin
 inline($06 { push es }
 /$1f { pop ds }
 );
 readflg:=true;
end;
 Figure 2: Internetwork addresses
[LISTING ONE]
_A NETWARE CHAT UTILITY_
by Eduardo M. Serrat

Unit IpxUnit;

Interface

uses dos;
const
 IPX_PACKET_TYPE = 4;
type
 NetWrkAdr = record
 NetworkNumber : array [1..4] of byte;
 NodeAddress : array [1..6] of byte;
 end;


 IpxHeader = record
 CheckSum : word;
 Len : word;
 TransportControl : byte;
 PacketType : byte;
 Destination : NetWrkAdr;
 DestinationSocket : word;
 Source : NetWrkAdr;
 SourceSocket : word;
 end;
 ConNbrArr = record
 Len : word;
 Count : byte;
 Connections : array [1..250] of byte;
 end;
 ftype = record
 Adr : pointer;
 Len : word;
 end;
 Ecb = record
 LinkAddress : pointer;
 EventServiceRoutine: pointer;
 StatusFlag : byte;
 CompletionCode : byte;
 SocketNumber : word;
 WorkSpace : array [1..4] of byte;
 DriverWorkSpace : array [1..12] of byte;
 ImmediateAddress : array [1..6] of byte;
 FragmentCount : word;
 FragmentDescriptor : array [1..2] of ftype;
 end;
 ConnInfo = record
 Len : word;
 ObjectID : array [1..4] of byte;
 ObjectType : word;
 ObjectName : array [1..48] of byte;
 LoginTime : array [1..7] of byte;
 Reserved : word;
 end;
 NetType = array [1..4] of byte;
 NodType = array [1..6] of byte;
var
 regs : registers;
 ipxrutofs,
 ipxrutseg : word;
{-----------------------------------------------------------------------------}
function LeadingZero(w:word) : String;
function Time : String;
procedure WriteHexByte(b : byte);

function IpxPresent : boolean;
procedure IpxServicesCall;
function IpxCreateSocket (Socket : word) : boolean;
function LocalConnectionNumber : byte;
procedure IpxDeleteSocket (Socket : word);
procedure GetInternetAddress (ConnectionNbr : byte; var NetNod : NetWrkAdr);
procedure UserInfo (ConnectionNumber: byte; var ConnInfoRec : ConnInfo);
procedure GetConnections (UserName: string; var ConNbrRec : ConNbrArr);
procedure GetLocalTarget(DestNet : NetWrkAdr;

 DestSock : word; var LocalTarget : NodType );
procedure SendMessage(ConnectionNumber : byte; Message : String);
Procedure IpxSendPacket(var SendEcb : Ecb);
Procedure IpxReadPacket(var ReadEcb : Ecb);

Implementation

{----------------------------------------------------------------------------}
function LeadingZero;
var
 s : String;
begin
 Str(w:0,s);
 if Length(s) = 1 then
 s := '0' + s;
 LeadingZero := s;
end;
{----------------------------------------------------------------------------}
function Time;
var
 h, m, s, hund : Word;
begin
 GetTime(h,m,s,hund);
 Time:=LeadingZero(h)+':'+LeadingZero(m)+':'+LeadingZero(s);
end;
{----------------------------------------------------------------------------}
procedure WriteHexByte;
const
 hexChars : array [0..$F] of Char =
 '0123456789ABCDEF';
begin
 Write(hexChars[b shr 4],
 hexChars[b and $F]);
end;

{----------------------------------------------------------------------------}
function IpxPresent;
const
 MULTIPLEXER = $2F;
 IPXINSTALLED = $FF;
begin
 regs.ax:=$7A00;
 intr(MULTIPLEXER,regs);
 if (regs.al = IPXINSTALLED) then IpxPresent:=TRUE
 else IpxPresent:=FALSE;
end;
{----------------------------------------------------------------------------}
procedure IpxServicesCall;
begin
 intr($7a,regs);
end;
{----------------------------------------------------------------------------}
function IpxCreateSocket;
const
 IPX_CreateSocket = $00;
 PermanentSocket = $FF;
 TemporarySocket = $00;
var
 SwapSocket : word;

begin
 SwapSocket:=swap(Socket);
 regs.al:=TemporarySocket;
 regs.bx:=IPX_CreateSocket;
 regs.dx:=SwapSocket;
 IpxServicesCall;
 if (regs.al = $00) then IpxCreateSocket:=TRUE
 else IpxCreateSocket:=FALSE;
 {0FEh Full Socket Table
 0FFh Socket Already Opened}
end;
{----------------------------------------------------------------------------}
procedure IpxDeleteSocket;
const
 IPX_DeleteSocket = $01;
var
 SwapSocket : word;
begin
 SwapSocket:=swap(Socket);
 regs.bx:=IPX_DeleteSocket;
 regs.dx:=SwapSocket;
 IpxServicesCall;
end;
{----------------------------------------------------------------------------}
function LocalConnectionNumber;
const
 GET_CONNECTION_NUMBER = $DC;
begin
 regs.ah:=GET_CONNECTION_NUMBER;
 regs.al:=$00;
 msdos(regs);
 LocalConnectionNumber:=regs.al;
end;
{----------------------------------------------------------------------------}
procedure GetInternetAddress;
const
 GET_INTERNET_ADDRESS = $13;
 NETWARE_SERVICE_E3 = $E3;

var
 ReqBlk : record
 Len : word;
 ReqType : byte;
 ConnNbr : byte;
 end;
 ResBlk : record
 Len : word;
 NetNod : NetWrkAdr;
 SrvSocket : word;
 end;
begin
 with ReqBlk do
 begin
 Len:=sizeof(ReqBlk) - sizeof(Len);
 ReqType:=GET_INTERNET_ADDRESS;
 ConnNbr:=ConnectionNbr;
 end;

 with ResBlk do Len:=sizeof(ResBlk) - sizeof(Len);


 regs.ah:=NETWARE_SERVICE_E3;
 regs.ds:=seg(ReqBlk); regs.si:=ofs(ReqBlk);
 regs.es:=seg(ResBlk); regs.di:=ofs(ResBlk);
 msdos(regs);
 if regs.al <> $00 then writeln('Error GETINTERNETADDRESS...')
 else
 begin
 NetNod.NetworkNumber:=ResBlk.NetNod.NetworkNumber;
 NetNod.NodeAddress:= ResBlk.NetNod.NodeAddress;
 end;
end;
{----------------------------------------------------------------------------}
procedure UserInfo;
const
 GET_CONNECTION_INFORMATION = $16;
 NETWARE_SERVICE_E3 = $E3;
var
 ReqBlk : record
 Len : word;
 ReqType : byte;
 ConnNbr : byte;
 end;
begin
 with ReqBlk do
 begin
 Len :=sizeof(ReqBlk) - sizeof(Len);
 ReqType:=GET_CONNECTION_INFORMATION;
 ConnNbr:=ConnectionNumber;
 end;
 with ConnInfoRec do Len:=sizeof(ConnInfoRec) - sizeof(Len);
 regs.ah:=NETWARE_SERVICE_E3;
 regs.ds:=seg(ReqBlk); regs.si:=ofs(ReqBlk);
 regs.es:=seg(ConnInfoRec); regs.di:=ofs(ConnInfoRec);
 msdos(regs);
end;
{----------------------------------------------------------------------------}
procedure GetConnections;
const
 GET_OBJECT_CONNECTION_NUMBERS= $15;
 USER_BINDERY_OBJECT_TYPE = $0001;
 NETWARE_SERVICE_E3 = $E3;
var
 ReqBlk : record
 Len : word;
 RequestType : byte;
 ObjectType : word;
 NameLength : byte;
 Name : array [1..48] of byte;
 end;
 swapbind : word;
 i : integer;
begin
 swapbind:=swap(USER_BINDERY_OBJECT_TYPE);
 with ReqBlk do
 begin
 Len:=sizeof(ReqBlk) - sizeof(Len);
 RequestType:=GET_OBJECT_CONNECTION_NUMBERS;
 ObjectType:=SwapBind;

 end;
 ReqBlk.NameLength:=Length(UserName);
 for i:=1 to ReqBlk.NameLength do ReqBlk.Name[i]:=ord(UserName[i]);

 with ConNbrRec do Len:=sizeof(ConNbrRec) - sizeof(Len);
 regs.ah:=NETWARE_SERVICE_E3;
 regs.ds:=seg(ReqBlk); regs.si:=ofs(ReqBlk);
 regs.es:=seg(ConNbrRec); regs.di:=ofs(ConNbrRec);
 msdos(regs);
 if regs.al <> 0 then ConNbrRec.Count:=0;
end;
{----------------------------------------------------------------------------}
procedure GetLocalTarget;
const
 IPX_GetLocalTarget = $02;
var
 ReqBlk : record
 Dnetwork : NetWrkAdr;
 DSocket : word;
 end;
 ResBlk : record
 Ltarget : NodType;
 end;
 swapsocket : word;
begin
 swapsocket:=swap(DestSock);
 ReqBlk.Dnetwork:=DestNet;
 ReqBlk.DSocket :=swapsocket;

 regs.bx:=IPX_GetLocalTarget;
 regs.es:=seg(ReqBlk);
 regs.si:=ofs(ReqBlk);
 regs.di:=ofs(ResBlk);

 IpxServicesCall;

 if regs.al = $00 then LocalTarget:=ResBlk.Ltarget;
 {0FAh No path to Destination}
end;
{----------------------------------------------------------------------------}
procedure SendMessage;
const
 USER_BINDERY_OBJECT_TYPE = $0001;
 NETWARE_SERVICE_E1 = $E1;
var
 ReqBlk : record
 Len : word;
 Bindery : word;
 ConnNbr : byte;
 Mlen : byte;
 Mens : array [1..45] of byte;
 end;
 ResBlk : record
 Len : word;
 Filler : array [1..100] of byte;
 end;
 i : integer;
begin
 with ReqBlk do

 begin
 Bindery:=swap(USER_BINDERY_OBJECT_TYPE);
 ConnNbr:=ConnectionNumber;
 Mlen:=Length(Message);
 Len:=Mlen + 4;
 for i:=1 to Mlen do mens[i]:=ord(message[i]);
 end;

 ResBlk.Len:=$6400;

 regs.ah:=NETWARE_SERVICE_E1;
 regs.ds:=seg(ReqBlk); regs.si:=ofs(ReqBlk);
 regs.es:=seg(ResBlk); regs.di:=ofs(ResBlk);
 msdos(regs);
end;

{----------------------------------------------------------------------------}
Procedure IpxSendPacket;
const
 IPX_SendPacket = $03;
begin
 regs.bx:=IPX_SendPacket;
 regs.es:=Seg(SendEcb);
 regs.si:=Ofs(SendEcb);
 IpxServicesCall;

 while (SendEcb.StatusFlag <> 0) do ;
end;
{----------------------------------------------------------------------------}
Procedure IpxReadPacket;
const
 IPX_ReceivePacket = $04;
begin
 regs.bx:=IPX_ReceivePacket;
 regs.es:=Seg(ReadEcb);
 regs.si:=Ofs(ReadEcb);
 IpxServicesCall;
 if regs.al <> $00 then
 begin
 writeln('Error Read Packet ');
 WriteHexByte(Regs.al);
 end;
 {0FFh Nonexistent socket}
end;
{----------------------------------------------------------------------------}
{----------------------------------------------------------------------------}
begin
end.














November, 1993
Examining OPTLINK for Windows


Linker optimizations that increase speed while reducing size




Matt Pietrek


Matt, author of Windows Internals, is a programmer at Nu-Mega specializing in
debuggers and file formats. He can be reached on CompuServe at 71774,362.


Third-party development tools intended to replace and enhance the standard
development environment tools can greatly enhance the productivity of DOS and
Windows programmers. Yet, there can be pitfalls when choosing to replace
standard machinery. For example, if you're a Borland C++ developer using the
standard Turbo Debugger for Windows (TDW), you can reasonably expect full
technical support from Borland when a debugging problem arises. However, if
you've replaced TDW with, say, Symantec's Multiscope debugger, who do you call
in the event of a problem? At best, the product support will be fragmented. At
worst, both companies may point fingers in the other direction, leaving you
somewhere in the middle.
Also, the executable file format or debug specification that's in vogue today
may be obsolete tomorrow. If you commit to using a third-party tool that
doesn't keep up with the latest industry standards, you're stuck. Therefore,
there has to be a compelling reason for a user to switch to a new set of
tools. A speed increase of 10 percent or a file size reduction of 2 percent
may not be enough to convince you to give up the security of the programs you
already use. Instead, a third-party tool not only has to provide compatibility
with your current tools, it also needs to offer significant advantages. I've
put OPTLINK for Windows version 4.01, from SLR Systems, to the test to see if
it meets these criteria.


What is OPTLINK?


OPTLINK for Windows is intended as a drop-in replacement for Microsoft's
LINK.EXE and Borland's TLINK.EXE. OPTLINK runs from the DOS command line, and
generates DOS executables, as well as DLLs and executables for Windows and
OS/2 1.x. It does not generate OS/2 2.x LX format files, nor the PE format
files used by Win32 operating systems such as Windows NT. However, SLR has
indicated that it intends to support PE format files soon.
OPTLINK performs all the standard optimizations that LINK and TLINK perform,
including far call translation, segment packing, and fixup chaining. Far call
translation occurs when the linker sees a far call instruction to a procedure
that's in the same code segment. For example, given a call of the form call
far ptr xxxx:yyyy where xxxx is the same as the current code segment, the
linker can replace that one instruction with:

 NOP
 PUSH CS
 CALL NEAR PTR YYYY

This second sequence is faster to execute because it avoids a costly
segment-register load, and it removes the need for a fixup in the .EXE or .DLL
file, thus shrinking the file size and speeding up load time.
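At the byte level, the five-byte far call (9Ah, offset, segment) is replaced by a five-byte sequence of identical length, so no surrounding code has to move. Here is a sketch in Python (illustrative only; the offsets are made up) of the rewrite a linker performs:

```python
import struct

def translate_far_call(code, i):
    """Rewrite a CALL FAR ptr16:16 at offset i into NOP / PUSH CS /
    CALL NEAR rel16, assuming the target segment equals the current one."""
    assert code[i] == 0x9A                    # CALL FAR opcode
    target = struct.unpack_from("<H", code, i + 1)[0]
    # CALL rel16 is relative to the next instruction, which is at i + 5.
    rel = (target - (i + 5)) & 0xFFFF
    code[i:i + 5] = bytes([0x90, 0x0E, 0xE8]) + struct.pack("<H", rel)

code = bytearray(b"\x9a\x10\x00\x34\x12")     # call far 1234h:0010h at offset 0
translate_far_call(code, 0)
assert code == bytearray(b"\x90\x0e\xe8\x0b\x00")  # NOP; PUSH CS; CALL +000Bh
```

Note that the rewritten bytes contain no segment value at all, which is exactly why the fixup record can be dropped from the file.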
Segment packing occurs when the linker takes segments of the same class and
concatenates them together. For instance, if you were using the medium or
large memory models, and had files A.C, B.C, and C.C, the resulting code
segments in the .OBJs would be A_TEXT, B_TEXT, and C_TEXT. Without segment
packing, the linker would produce three separate code segments in the .EXE.
While not really a problem for DOS executables, in Windows this wastes space
in the file and forces Windows to use more selectors when it loads the
program. In addition, segment packing affords the linker additional
opportunities to perform far call translations, saving even more space.
Fixup chaining is a method of compressing the load-time relocation information
in NE format files. (NE files are Windows and OS/2 1.x files.) To give an
example, consider a program that makes 20 calls to the Windows BeginPaint()
API. Without fixup chaining there would be 20 fixups referring to BeginPaint()
in the .EXE. Each fixup is eight bytes in length, so the total space used for
relocations is 160 bytes. A linker that does fixup chaining (such as OPTLINK)
can get away with only putting one fixup record in the file. How's this? The
NE format has a clever method of letting fixups be applied in a linked-list
fashion. The head of the list is pointed to by the single relocation record.
At the spot in the segment where the address of BeginPaint() will be plugged
in is a 16-bit offset to another place where BeginPaint()'s address also needs
to be applied. When the operating system loader brings the file into memory,
it just visits each node of the chain and leaves behind a copy of the
necessary information (the target address). Not only does fixup chaining save
space by eliminating redundant fixup records, it can also speed up load times
significantly. For more information on fixup chaining (as well as segment
packing), see my article "Liposuction Your Corpulent Executables and Remove
Excess Fat" (Microsoft Systems Journal, July 1993).
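Both the space arithmetic and the loader's chain walk can be sketched in a few lines of Python. This is a toy model of the chaining idea, not the actual NE record layout; the slot offsets and address are invented:

```python
import struct

# 20 references to BeginPaint(): 20 eight-byte fixup records without
# chaining, versus a single record with it.
assert 20 * 8 == 160      # relocation space without fixup chaining
assert 1 * 8 == 8         # relocation space with one chained record

# Build a tiny "segment" in which each patch slot holds the offset of the
# next slot in the chain; 0xFFFF marks the end of the list.
seg = bytearray(64)
slots = [4, 20, 38]                       # places needing BeginPaint's address
for here, nxt in zip(slots, slots[1:] + [0xFFFF]):
    struct.pack_into("<H", seg, here, nxt)

def apply_chain(seg, head, address):
    """Walk the chain as the loader does, patching each slot in turn."""
    while head != 0xFFFF:
        nxt = struct.unpack_from("<H", seg, head)[0]
        struct.pack_into("<H", seg, head, address)
        head = nxt

apply_chain(seg, slots[0], 0x1234)        # 0x1234 stands in for the target
assert all(struct.unpack_from("<H", seg, s)[0] == 0x1234 for s in slots)
```

The trick is that the chain links live in the very bytes that will be overwritten with the target address, so the chain itself costs nothing.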
In addition to the main program (OPTLINKS.EXE), the package comes with a few
other programs. OPTIMP is a superset of the IMPLIBs shipped with the Borland
and Microsoft development environments. STRIPDEB removes the debug information
from the end of an executable, similar to Borland's TDSTRIP and Microsoft's
CVPACK /STRIP. FIXLIB accepts a Borland-produced .LIB file format and modifies
the dictionary so that LINK, TLINK, and OPTLINK can all use it. According to
the SLR folks, the dictionary in Borland .LIBs is incorrect at times, and both
LINK and OPTLINK are unable to use it. I personally love FIXLIB because I can
now use Borland's IMPORT.LIB with LINK and OPTLINK. IMPORT.LIB has all the
exported Windows functions, not just those documented in Microsoft's LIBW.LIB.


OPTLINK vs. TLINK


Before Borland became a major presence in the C/C++ market, OPTLINK was
targeted at users of Microsoft C who wanted smaller .EXEs and faster linking.
However, SLR now appears to be targeting users of Borland's TLINK. The reason
can be summarized in two words: debug capacity. As all too many users of
Borland's TLINK 5.x know, when building a program with debugging information,
TLINK can run out of memory amazingly early. This is especially true with C++
programs. The use of class hierarchies leads to much more debugging
information than the equivalent C code would produce. Borland users who have
stuck with TLINK are getting increasingly frustrated with turning on debugging
information in just select modules to prevent TLINK from running out of
memory. OPTLINK has a much greater capacity when processing Borland's
debugging information, so it has a major inroad with Borland's customer base.
In fact, Borland representatives have themselves recommended OPTLINK when
pressured about TLINK's capacity problems.
One of TLINK's attempts to deal with the sheer volume of debugging information
was to introduce symbol table compression (using the /Vt switch). A compressed
symbol table is in the same format as a non-compressed symbol table. The
compression that occurs is more a matter of eliminating duplicate type
information. For instance, if you defined a struct in an .H file and included
that file in three separate .CPP files, the type information describing the
structure will show up three times in an uncompressed symbol table. By using
/Vt with TLINK, there would only be one copy of the struct's type information.
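The effect of this kind of compression can be modeled simply: collect the type records from every module and keep one copy of each distinct record. A sketch in Python (the struct text is a made-up stand-in for a type record):

```python
def dedupe_types(modules):
    """Keep one copy of each distinct type record across all modules,
    the way /Vt (and OPTLINK, implicitly) eliminates duplicates."""
    seen, kept = set(), []
    for records in modules:
        for rec in records:
            if rec not in seen:
                seen.add(rec)
                kept.append(rec)
    return kept

# The same struct from one .H file, compiled into three separate .CPP files:
point_type = "struct Point { int x; int y; };"
modules = [[point_type], [point_type], [point_type, "struct Rect {...};"]]
assert len(dedupe_types(modules)) == 2   # three copies collapse to one
```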
OPTLINK performs debug information compression implicitly as part of the link
process. In fact, OPTLINK does a better job of eliminating redundant
information than TLINK /Vt does. I determined this by linking a couple of
programs with both TLINK /Vt and OPTLINK. To see the resulting debug
information, I ran TDUMP -v -ex on the two executable files. I then compared
each debug information subsection table in the two .EXEs. The detailed results
are breathtakingly dull, so I'll spare you a recitation of them here. The
short summary is that OPTLINK was more aggressive in eliminating types, member
definitions, class definitions, and so on. Table 1 shows the debug information
sizes for the two files. With one minor exception, I noted that OPTLINK fully
supports the Borland debug specification, down to inclusion of the browser
symbols and code coverage tables. The minor exception is that OPTLINK doesn't
output browser information for local symbols.
Another compelling reason for Windows programmers to consider OPTLINK is that
it produces significantly smaller executable files and DLLs for Windows. The
primary size reduction comes from OPTLINK's ability to chain fixups as noted
earlier. In linking the OWL WCHESS.EXE sample program, OPTLINK produces 503
fixups as compared to 3412 by TLINK. At eight bytes per fixup, that's a
savings of over 22K, and more than 10 percent of the .EXE's size. Needless to
say, the OPTLINK version will load faster as well.
Since OPTLINK defaults to producing Windows files that will only run in
protected mode, it makes all entries in the NE entry table FIXED, even if the
function is in a MOVEABLE segment. By using FIXED entries instead of MOVEABLE,
OPTLINK can eliminate three bytes of overhead per entry. TLINK also defaults
to PROTMODE operation, but generates MOVEABLE entries if the function is in a
MOVEABLE segment. Another space savings offered by OPTLINK includes a smaller
DOS stub if you let it provide a default stub.
Despite all the benefits OPTLINK offers, there are a few rough edges if you're
a TLINK user. OPTLINK was originally developed as a Microsoft LINK
replacement; Borland support was added later. As such, it doesn't appear that
OPTLINK has been "burned in" as much for TLINK replacement as it has for LINK
replacement. For example, in a linker response file, it's legal for a program
to specify only the base file name for the target to be built (for instance,
"FOO", rather than "FOO.EXE"). When I passed OPTLINK such a response file and
told it to build a DLL, it created the file with a .EXE extension, rather than
.DLL. The bit indicating that the file was a DLL was set inside the NE file,
but the file's extension was wrong. TLINK handles this situation correctly.
Another quirk is OPTLINK's response file handling. I'm in the habit of
invoking Borland's command-line compiler (BCC.EXE) with just a .C or .CPP
file, and letting it supply the defaults when invoking TLINK. To make BCC work
with OPTLINK, I made a copy of OPTLINKS.EXE called TLINK.EXE, and supplied an
appropriate /TLINK mode OPTLINKS.CFG file. For a test, I ran BCC A.C, where
A.C was a minimal DOS program. When using Borland's TLINK, the linker accepted
the output from BCC without a peep. When using the renamed OPTLINKS, it
prompted me for both library files and a .DEF file (a la LINK). Pressing the
Enter key at each prompt yielded an .EXE file, but this prompting is annoying
when it happens continually in a development situation. Since the program was
a DOS program, OPTLINK shouldn't have asked for a .DEF file (TLINK doesn't).
Another problem I encountered with TLINK compatibility had to do with default
.DEF files for Windows .EXEs. If you don't specify a .DEF file when using
TLINK, it uses a set of defaults, including a 5K program stack. While OPTLINK
will also use defaults, it has a nasty habit of not specifying any stack at
all for the generated .EXE. To circumvent this problem, I tried putting a
/STACK:5120 directive in the OPTLINKS.CFG file. While this worked for Windows
programs, it also gave DOS programs a 5K stack. Borland-produced DOS programs
start out with an initial small stack, and at run time switch the SS:SP to a
larger stack. Creating a DOS .EXE with an initial 5K stack was certainly not
the behavior I desired from OPTLINK. The point of all this is that although
SLR has put on a snazzy coat of TLINK paint, some areas appear to be lightly
tested. In addition, OPTLINK seems to want to revert to LINK compatibility
mode whenever it gets a chance.


OPTLINK vs. Microsoft's LINK


In the past, OPTLINK's primary target audience was Microsoft C and MASM
developers who needed faster link times and increased capacity. With LINK 5.50
from the Visual C++ package, Microsoft has significantly narrowed both gaps.
However, OPTLINK still holds some advantages for Microsoft users.
To a certain extent, debug information capacity is less a problem with
Microsoft tools than the corresponding tools offered by Borland. The reason is
that the linker doesn't have to do all the work of massaging the debug
information into its final form. When producing CodeView-style information,
OPTLINK emits a preliminary version of the debug information that's relatively
easy for the linker to process. Afterwards, OPTLINK invokes CVPACK.EXE which
takes care of merging all the debug information into one unit and eliminating
duplicate information. Interestingly, OPTLINK doesn't complain if it can't
execute CVPACK. If you have older tools that only recognize the CodeView 3.0
debug specification, OPTLINK can produce this format as well as producing the
default CodeView 4.0 debug information.
In the speed category, OPTLINK was just slightly faster than LINK on my test
executable, but not enough to get excited about; see Table 2. In all fairness,
the test .EXE wasn't large enough to test the virtual memory systems of either
OPTLINK or LINK. On large industrial-grade applications, SLR claims some
users see performance gains of up to 50 percent over LINK.
Regarding the parts of the .EXE used by the operating system, OPTLINK produces
NE files that aren't dramatically different from what LINK produces. LINK
chains fixups, so you won't see the dramatic space savings you would when
comparing OPTLINK to TLINK. In fact, OPTLINK appears to produce fixups
identical to LINK's, although in a different order. Two other NE tables where
there's a difference between the two linkers are the resident and non-resident
names tables (where the names of your exported functions live). LINK puts
entries in these tables in a seemingly random order, while OPTLINK sorts the
names in the reverse order of the entry table (for example, 15, 14, 13, and
so on).
Other differences between OPTLINK and LINK-produced Windows executables
include the entry table. Like TLINK, LINK defaults to PROTMODE, yet still
generates MOVEABLE entries where appropriate. OPTLINK always appears to
generate the smaller FIXED entries, thereby saving three bytes per entry. In
addition, some segments in NE files are a few bytes larger in the
OPTLINK-created executable than in the LINK-produced .EXE. While this may just
be an effect of rounding-up segment sizes, it could potentially be the source
of different behaviors when comparing the two linkers. For example, you might
have a fence-post error and try to read one byte past the end of a data
structure at the end of a segment. The LINK-produced program could GP fault,
while the OPTLINK-produced program might not.



Unique Features


Because SLR Systems is an underdog in a market dominated by the likes of
Borland and Microsoft, OPTLINK has added some unique features to distance
itself from the pack. One such feature, resource binding at link time,
performs the .RES binding and flag setting operations that you normally use
RC.EXE for. However, since OPTLINK doesn't actually compile .RC files, you
can't get rid of RC just yet. Presumably, the reason for integrating resource
binding into the linker is increased build speed. Your MAKE
program only needs to invoke one program when building the executable target,
rather than two. Also, it's conceivable that OPTLINK could gain additional
speed by writing the segments and resources into the executable in their final
positions. When using RC after the link step, the executable's segments could
be written to the file twice: once by the linker, and again later by RC. As a
final note on resource binding, OPTLINK has the somewhat odd (but harmless)
habit of looking in the LIB= directory for the .RES file when it doesn't find
it in the default directory.
Another advantage of OPTLINK over TLINK or LINK isn't really a feature at all.
When linking Windows files, OPTLINK defaults to 16-byte alignment, while LINK
and TLINK default to 512-byte alignment. All segments and resources in an NE
file start at a file offset that's a multiple of the alignment value (512,
1024, 1536, and so on). When a segment or resource isn't a multiple of the
alignment value, the linker needs to pad the file with wasted space until it
gets to the next alignment value multiple. For a more detailed description,
see the previously mentioned Microsoft Systems Journal article. In short, when
using the default settings, Windows .EXEs and .DLLs linked with OPTLINK are
often significantly smaller than when linked with either LINK or TLINK. For
example, BCW.EXE from Borland C++ 3.1 would lose around 115K in wasted file
space if linked with OPTLINK (not counting additional savings from fixup
chaining). Quattro Pro for Windows 1.0 would lose around 145K in the same
manner. Although you can get the same effect with LINK or TLINK, OPTLINK's
choice of default behaviors and values seems more finely tuned.
OPTLINK also has a smattering of smaller features that distinguish it from its
competition. The /FIXDS option tells OPTLINK to modify the prologues of
exported functions to load DS from SS upon entry. Microsoft has had this
option (/GA) in its compiler since Version 7.0, and Borland has always had
"smart callbacks," so this feature is probably only of use to users of
Microsoft C 6.0 and earlier. The /XREF switch tells OPTLINK to generate a
cross-reference of public symbols in the .MAP file. Each line of the cross
reference shows what source module the symbol was defined in, and what modules
reference the symbol. While this is a nice feature, I did encounter a probable
six-legged creature. If you initialize a variable as part of its declaration
(for example, HWND HMainWnd=0), you won't see the declaring module in the list
of referencing modules. The /REORDERSEGMENTS option gives OPTLINK the leeway
to rearrange segments in order to try to combine more segments into one
(segment packing).


Considerations


Developers who use the Borland Integrated Development Environment won't be
able to use OPTLINK without reverting to make files or using the transfer
system. This may change in Borland C++ 4.0, however. OPTLINK will integrate
into the Visual C++ IDE by simply renaming OPTLINKS.EXE to LINK.EXE.
Another thing to watch for is shifting debug formats. Both Borland and
Microsoft have gone through at least three major changes to the debug
specification in the last couple of years. Rumor has it that another change to
Borland's 16-bit debug specification is in the works. If you lock yourself
into using a special feature of OPTLINK, and a new compiler comes out
afterwards, you're at the mercy of SLR to get an update out quickly.
Fortunately, SLR seems to be good about keeping OPTLINK up-to-date with the
latest compiler offerings.


Conclusion


If you're in the Microsoft camp, and you work on small- to medium-sized
projects and don't need any of the unique features of OPTLINK, it may not be
particularly beneficial. However, if you work on larger projects or can use
some of OPTLINK's unique features, OPTLINK is probably worth the money.
For Borland users, the decision is a little easier. For a small investment in
time to set it up, you'll get more debug capacity, and smaller executables.
The only people for whom it may not be suitable are diehard IDE users and
programmers who cower at the sight of a command-line switch (OPTLINK features
eleven). If you need a high-performance, high-end linker, OPTLINK may be just
what you're looking for.
Table 1: Comparisons for OPTLINK and TLINK. Program was the OWL CHESS example
compiled with BC++ 3.1. OPTLINK produces far fewer fixups as compared to
TLINK, saving more than 10 percent of the .EXE's size. OPTLINK switches:
/NOREL /TLINK /Twe /c /x /n /v /Vt /A=16 /P=65535. TLINK switches: /Twe /c /x
/n /v /Vt /A=16 /P=65535. All tests run on a Gateway 4DX2-66V in non-turbo
mode; 16 Mbytes installed; Windows was not running; Memory Manager was 386MAX
6.02; disk cache was Hyperdisk 4.21 with 7168 Kbytes in the cache; the times
are the average of several runs, with the first run discarded; link times do
not include resource binding; file sizes do not include resources.
                        OPTLINK    TLINK
                        4.01       5.1
File Size (w/Debug)     297570     421547
File Size (no Debug)    151632     189046
Link time (w/Debug)     6.5 sec    9.1 sec
Number of Fixups        503        3412
Table 2: Comparing OPTLINK with LINK. Program was a mixed MSVC C and MASM 5.1
program. Both times include the time to run CVPACK. LINK 5.50 ordinarily does
debug compression internally, but the presence of MASM 5.1 information may
have forced it to use CVPACK. Link times do not include resource binding and
file sizes do not include resources. OPTLINK switches: /NOMAP /NOREL /NOLOGO
/SI /CO /NOD /align:16 /CVVERSION:4. LINK switches: /NOLOGO /BAT /CO /NOD
/align:16.
                        OPTLINK    LINK
                        4.01       5.50
File Size (w/Debug)     389296     390484
File Size (no Debug)    161776     162697
Link time (w/Debug)     5.7 sec    6.3 sec
Number of Fixups        446        446


For More Information



OPTLINK for Windows
SLR Systems
1622 North Main Street
Butler, PA 16001
412-282-0864
$350.00
















November, 1993
Debugging Windows Applications


When the journey is not the reward.




Ray Valdes


Ray is a DDJ senior technical editor and can be contacted at
rayval@well.sf.ca.us.


Writing software is often viewed as a creative act, akin to painting a picture
or gourmet cooking. If so, then debugging a program often seems like cleaning
the paintbrushes or washing the dishes. But this doesn't always have to be the
case.
Over the years, I've adopted a variety of useful approaches to debugging
Windows apps, techniques I'll share with you in this article. I'll also
examine tools such as Nu-Mega Technologies' Bounds-Checker for Windows,
Periscope's WinScope, and Avanti's PinPoint. Accompanying this article is C
code that implements a simple, relatively transparent, trace facility you can
add to your Windows programs. Finally, I'll cover some truly useful debugging
"tools" that aren't software but are as effective as any program I've used.


Debuggers as Commodity Items


Choosing a debugger or other development tool used to be a decision fraught
with anxiety over making the wrong choice and possibly wasting significant
amounts of money and time. And, as we all know, there are as many styles of
debugging as coding. Debugging (and coding) styles seem to follow the "baby
duck" syndrome: You get imprinted on the first viable way of doing things and
find it hard to change later, even if your chosen approach is less than
optimal.
One benefit of the current huge installed base of Windows is that the
development-tools market now offers more value to the programmer/consumer than
ever before. Language vendors (Borland, Microsoft, and Zortech among others)
include powerful debuggers with their compiler packages. Even low-end
products, such as Quick-C for Windows or Turbo-C++ for Windows, sport whizzy
interactive development environments (IDEs) that offer source-level GUI
debuggers we once could only fantasize about. And those of us who lusted
after the $2000 hardware-assisted Periscope debugger can now get something
almost as powerful for one-fifth the price (Nu-Mega's Soft-ICE, no hardware
required).
For a long time, Borland's Turbo Debugger was the debugging tool of choice for
programmers in the trenches (although MultiScope served as a dark-horse
alternative). In recent years the race among mainstream debuggers has evened
up somewhat, with the newest release of CodeView moving up in the pack.
Moreover, low-cost upgrades from the various language vendors make your
purchase decision less of an "either/or" choice. With Turbo-C++ for Windows
Visual Edition selling for $75.00, and Microsoft's Visual C++ for $139.00,
there's no reason not to have multiple tools in your arsenal. Debuggers have
almost become commodity items, with comparable sets of features and
more-or-less equivalent user interfaces that allow you to transition back and
forth among different products, as long as you don't get too deeply involved
in customized scripts and configurations.
For those who are feature-hounds, however, there's much to consider, and
certain debuggers, such as Soft-ICE for Windows, do stand out as unparalleled
tools for a particular purpose (such as debugging device drivers and
system-level programs). While detailed coverage of debugger features is beyond
the scope of this article, both Arthur English's book Advanced Tools for
Windows Developers (Sybex, 1993) and Charles Mirho's "Putting Four Debuggers
for Windows through Their Paces" (Microsoft Systems Journal, April 1993)
provide in-depth information. English devotes 120 pages to detailed
descriptions of Turbo Debugger, MultiScope, CodeView for Windows, and the
debugger in QuickC for Windows, while Mirho's 20-page article includes
coverage of Nu-Mega's Soft-ICE for Windows.
Yet even with such a cornucopia of software technology, I've found the
half-life of debugging tools installed on my disk to be quite short. I put
each one on for a while and then reallocate its storage. The tools go back on
the shelf, mostly unused, for a number of reasons, not just the baby-duck
syndrome.


The Minimalist Curmudgeon


My first Windows project in 1986 involved four programmers adding some 30,000
lines of code to an existing 100,000-line program. At the time, we had two
choices in debugging tools: Use WDEB, which was the cumbersome
machine-language debugger from Microsoft, or roll our own alternative. Some of
the programmers on the team attacked bugs via what I call the "Starship
Enterprise" approach: a command center with two monitors, large desktop
machine, and reams of printout. I used the "traveling light" approach, a
quickly-written simple trace facility that displayed messages in a child
window on the screen; see Figure 1. Depending on what kind of bug you're
looking for, either approach has advantages and drawbacks. My approach lent
itself to application-level rather than system-level debugging, and also
constituted more of a platform-independent, product-independent approach.
Since that time, I've reused portions of this code on application programs
running on the Macintosh, DOS, UNIX, and VAX/VMS platforms. The other approach
can be invaluable in tracking low-level bugs, especially those within Windows
itself, or in large bodies of code that you have not written.
My trace facility consists of about 800 lines of C code (dbgtrace.c), plus a
small header file (dbgtrace.h). The debug module gets compiled and linked with
your application. The listings are not shown here but are available
electronically; see "Availability," page 3. The package includes an example
program that shows how to trace your application via calls to the dbg_Trace()
function. First, of course, you must register the debug window by calling
dbg_RegisterDebugWindow(), and then instantiate it, via
dbg_CreateDebugWindow(). Other than that, you can insert dbg_Trace() calls
freely into the program. You don't need to remove these calls for your release
(non-debug) version, nor bracket them with #ifdefs; the preprocessor will
instantiate the calls as specified by a compile-time #define. If the flag is
not #defined, the header file turns dbg_Trace() and the other trace functions
into null statements which do not get compiled. The most complex routine,
fmtStr(), is a general-purpose string formatter based on code written by Mike
Geary and posted to CompuServe in 1986. It provides a subset of printf()
functionality,
and does not know how to display floating-point values. At the time I created
my trace facility, Windows did not allow printf; now, you can use the
Windows-compatible wsprintf(). I've not bothered to do so. Figure 1 shows a
simple text editing application being traced. This example highlights a
limitation of application-level tracing: It's difficult to trace a program if
you can't modify the source. Here, the application relies on the edit control
built into the Windows environment. However, in this case there is a
workaround: By subclassing the edit control, I can intercept messages sent to
its window procedure with my own callback function, call the trace facility,
and then call the edit control's WndProc.
When I first created it, the trace facility was rather elaborate, allowing
different levels of detail, as well as menu choices to turn tracing on and
off; there was also an option for writing messages to a logfile for later
review. Over the years, in moving from a project on one platform to another,
the code has decreased in size as I shed features that were not useful to me
or difficult to maintain across multiple platforms. My debugging methodology
changed from bug-finding to bug-prevention (a key point, but more on this
later). And rather than keep all the multiplatform code in one module with
lots of #ifdefs, separate versions have evolved for each environment.
The result is something you can implement from scratch in a few hours (or
perhaps you already have!).
One programmer who has done so is Bill Rytand of Avanti Software (Palo Alto,
California), author of PinPoint, a trace utility for Windows. Bill isn't
alone. Programmers have come up to him at trade shows telling him they are
kicking themselves for not bringing a similar product to market. Actually, his
product has not yet been released, and at this writing is only available in
beta form. I therefore won't say too much about it, except that it seems to
provide a full-featured implementation of intrusive, application-level
tracing. There is a small API that you invoke from your application to
display various kinds of messages or to open/close a logfile. Messages can go
either to a run-time window in a separate application (the PinPoint Analyzer)
or to a logfile for later review by the Analyzer. If you correctly place trace
statements at the beginning and end of your functions, the Analyzer can
display an outline-like trace that is indented according to the depth of
calls. In practice, I found using the PinPoint facility to be more cumbersome
than my dirt-simple facility, but it certainly has more features, including
support for programs written in C++ and Visual Basic.
Ivan Gerencir has written another trace utility; see the accompanying textbox
entitled "A Multi-app Message Trace Facility for Windows." Ivan's tracer is
much more sophisticated than my own. It's a standalone application that can
display messages sent from multiple applications. In addition, messages can be
saved to a file and printed. Unlike my generic C code, Ivan's program is
written in C++ and requires Borland's OWL application framework.


Debuggers Considered Harmful


Since so many other programmers seem to revel in the technology-intensive
approach to debugging--always seeking out the latest winner in the feature
wars--at times I wondered if my technology-averse approach was oddball. Now I
don't think so. In The Art of Computer Programming, Volume 1, Donald Knuth
writes: "The most effective debugging techniques seem to be those which are
designed and built into the program itself." Furthermore, Mirho states in his
MSJ article that, "The advantages of run-time tracing are that it takes less
time and mental effort than using a debugger, it costs nothing, and it can be
controlled with environment variables."
Finally, I ran across the following passage in Steve McConnell's recently
published book Code Complete: A Practical Handbook of Software Construction:
Given the enormous power offered by modern debuggers, you might be surprised
that anyone would criticize them. But some of the most respected people in
computer science recommend not using them. They recommend using your brain and
avoiding debugging tools altogether. Their argument is that debugging tools
are a crutch and that you find problems faster by thinking about them than by
relying on tools. They argue that you, rather than the debugger, should
mentally execute the program to flush out errors.
Of course, McConnell then goes on to say:
[This] argument against debuggers isn't valid. The fact that a tool can be
misused doesn't imply that it should be rejected.... The debugger isn't a
substitute for good thinking. But, in some cases, thinking isn't a substitute
for a good debugger either. The most effective combination is good thinking
and a good debugger.


Nu-Mega's Tour de Force


I've been using Bounds-Checker for Windows 1.02 from Nu-Mega (Nashua, New
Hampshire) for the last six months on a variety of projects and have found it
to be an essential tool for writing Windows programs. It's simple in concept,
easy to operate, and very effective at finding errors involving memory
overwrites, invalid Windows API calls, null pointers, and resource leaks.
Automatic parameter validation isn't a new idea. Steve Jasik's Macintosh
debugger ("The Debugger") has offered parameter checking of Toolbox calls for
years. On UNIX platforms, Pure Software's Purify offers protection against
memory overwrites and leakage. Bounds-Checker brings these features to
Windows. The program comes on a single disk, and consumes about a
half-megabyte of disk space in a single directory. As with other debuggers,
there's a VxD that must be loaded via an entry in the Windows system.ini file.
Finding bugs means launching Bounds-Checker and using it to run your program.
As errors are encountered, it brings up a message box informing you that some
action such as a memory overwrite is going to occur; see Figure 2. If you have
CodeView- or Borland-compatible debug information in the executable, a window
will show you the relevant portions of your source code file. At that point
you can log the error and continue, or terminate the application and go fix
the bug. A minor annoyance is that there is no "Restart" option; you must
relaunch Bounds-Checker to run your application again. (Because I've never had
occasion to look at the manual, perhaps there is a way around this.)
Bounds-Checker isn't perfect. There are kinds of memory overwrites it can't
see: those restricted to your data segment. Example 1 shows a program
containing a number of errors. The comments identify whether a particular
error was caught or not. For example, it knows about strcpy and will point out
several instances where a string is overwritten. But if you declare the
variable as a char* rather than a char[], or if you replace strcpy() with your
own code, then these errors will slip by undetected. Nevertheless, there are
many kinds of bugs, especially Windows-related ones, that Bounds-Checker will
find. The best are those you didn't know you had.
Earlier this year, I was working on an 8000-line program that crashed about
every tenth time I ran it. I could never discover the reason why--although I
never sat down for a concerted effort (which might well have proved
fruitless). As with many projects, the pressure to add features overwhelms the
pressure to produce a bug-free program, so I added more code and the bug "went
away," soon to be forgotten. When I received my copy of Bounds-Checker, I ran
it on various programs I had lying around, and it zeroed in on my forgotten
bug. The bug is all too common among Windows programmers: using a device
context without doing GetDC first. Due to previous invocations of GetDC, my
code happened to work most of the time, but this was a fatal error waiting to
happen.
In another situation, Bounds-Checker found an error that was very difficult to
find via program tracing or by stepping through the code with the Visual C++
debugger. One of my error-handling routines was occasionally doing a longjmp()
using an invalid jmpbuf structure, and thereby blowing away the stack without
any indication as to where the anomaly occurred. Now, when writing Windows
code I make sure to use Bounds-Checker on my work-in-progress every few runs.
I would do it more often but there is a small but perceptible slowdown in
program execution.



Periscope's WinScope


WinScope, from Periscope (Atlanta, Georgia), is another tool for
non-intrusively eavesdropping on the conversation that your application has
with the Windows environment. Unlike API validators like Bounds-Checker (or
SeaBreeze Software's SafeWin), WinScope is more of a passive observer--like a
beefed-up version of Spy, recording its observations in a trace file which can
be viewed after program execution.
The user interface consists of eight MDI child windows that display various
categories of events captured by WinScope, such as messages, API calls,
toolhelp notifications, Windows hooks, modules, and windows; see Figure 3.
Each category can be displayed in various ways: hierarchically in outline
form, alphabetically, or numerically if appropriate. The windows allow you to
specify a filter for each category.
Unlike Bounds-Checker, which is oriented to testing a particular program,
WinScope allows you to listen in to the crowded room that constitutes the
Windows environment, in which many tasks, modules, and DLLs are all
conversing with each other. You'll want to set your filters judiciously;
otherwise, you'll be overwhelmed by the torrent of information.
The user interface is very GUI-oriented, so much so that I had difficulty
using it on my mouseless notebook computer. There are so many options that, to
my taste, it ambles uncomfortably over to the Starship Enterprise console
scenario. WinScope has gotten some rave reviews since its recent release and
appears to be well-crafted and full-featured. Nevertheless, I found the
information it gave me not particularly helpful in solving my debugging
problems--too much data that I didn't need and no quick way to specify what I
did need. I found the user interface, despite its GUI orientation, to be a bit
hard to use without consulting the manual. Even after consulting the manual,
there seems to be no single-step way to do what I want, which is to easily
observe one particular message going to one particular window belonging to one
particular task. For the kind of information I want to observe, I find it
simpler and cheaper to insert a single trace statement in the appropriate
WndProc.
However, I think WinScope has potential as an educational tool for exploring
the run-time dynamics of the Windows environment. In fact, the WinScope
manual uses the subtitle: "Discovery and debugging tool for Windows
application development." So if you don't share my minimalist bias, you may
find WinScope to be just your ticket.
Also worth mentioning here is the upcoming Version 2 of Nu-Mega's
Bounds-Checker for Windows, which is currently in beta. This adds a number of
features, including a tracing facility similar to that found in WinScope, with
a collapsible/expandable view of the event stream. Other features include
checking API return values, the ability to view not just source code but to
also browse data values at run time, and more complete coverage of the Windows
API (to include "new" DLLs such as COMMDLG, TOOLHELP, OLECLI, and OLESVR).


The Secret Method for Bug-free Windows Apps


In my experience, I've found that the best way to achieve bug-free Windows
apps is to not write them--Windows apps, that is. In one word: architecture.
The single most important thing you can do to avoid Windows bugs is not
include windows.h in most of your modules. Each time you include this header
file in a module, you pay a price in program robustness and maintainability,
not to mention compilation speed.
This means architecting your application so that most of the complex
algorithms are found in platform-independent modules that constitute a core
"engine." Of course, your engine needs to display data and interact with the
outside world. This can often be done via a narrow, well-defined interface
layer--a mini-API, if you will. With this architecture, you can write and
debug the core on DOS or some other platform, and drive it with a stdio-based
interface or some other scaffolding. If your test harness allows event
logging, regression testing, and so forth, so much the better.
This methodology is essentially the one so well-articulated by Al Stevens in
his article, "A Multi-tool Approach to Windows Development," in Dr. Dobb's
Sourcebook of Windows Programming (Fall, 1993). Of course, if you've inherited
an existing monstrous pile of spaghetti code, our nostrums will not be much
help. And it's possible that you may be writing a UI-intensive application
where it's impossible to extricate the core from the periphery.


The Other Secret


The other method for writing bug-free Windows apps is not to write
them--bug-free apps, that is. In one word: safe coding practices, the kind
your mother would teach you if she managed developers at Microsoft. I say
"Microsoft" because three books have recently been published by developers who
work (or have worked) at Microsoft that foster coding habits that will
inoculate your apps against bugs. Given the high quality of these books, their
company affiliation must be more than coincidence. Yes, I know you know
Windows NT has bugs and is slow, but I don't think 4.2 million lines of code
written in four years can have a prayer of working without following these
guidelines.
The best of the lot is Steve Maguire's Writing Solid Code: Microsoft's
Techniques for Developing Bug-Free C Programs (Microsoft Press, 1993). The
others are Dave Thielen's No Bugs: Delivering Error-Free Code in C and C++
(Addison-Wesley, 1992), and the previously mentioned Code Complete: A
Practical Handbook of Software Construction (Microsoft Press, 1993) by Steve
McConnell.
The basic attitude is expressed in the preface to Maguire's book:
Finding and fixing bugs [should be] Development's responsibility [as opposed
to that of the Testing Department].... The development teams [have] a goal of
having a "nearly shippable product every day." This means that when a feature
is marked complete, any bugs found in it will have to be fixed before any new
work is attempted. Work in progress will be brought to a standstill if serious
bugs are found in features marked complete. We [call this] attitude "zero
defects."
Elsewhere, Dave Cutler at Microsoft has expressed this more colorfully: having
the Windows/NT programmers "eat their own dog food every day" (doing a daily
build of the evolving NT system and installing it on all machines).
In his book's epilogue, Maguire writes:
You're probably wondering if I really believe it's possible to write bug-free
programs. The answer is no... not absolutely. But I do believe you can come
very close to writing bug-free programs, much closer than the current norm;
you just have to decide to do it.
In David Thielen's book, he writes:
[It] just isn't possible to have entirely bug-free code with today's tools and
technology. [But] this book focuses on teaching you how to deliver code with
as few bugs in it as possible. Just as important, it also focuses on giving
you the knowledge to discover what bugs still exist in the program before you
ship it.
You can buy all three books for less than the cost of a whizzy but
semi-useless debugging tool (or just get Maguire's for starters). If you
follow the advice in them, the payoff is all but guaranteed, in my experience.
The three books set down guidelines that I've been trying to follow for years.
These principles are general in nature, and are not particularly tied to the
Windows environment, or even the C language.
If you've been programming for awhile, you already know the basic homilies:
Write clearly and simply, use meaningful variable names, use ASSERTs, use
function prototypes, work with Lint, don't ignore compiler warnings, partition
your datatype space with typedefs, don't mix error return values with real
data, don't use unnecessary casts (they mask errors otherwise caught by the
compiler), use the STRICT option in windows.h, rely on unit tests and function
tests, strive for functional cohesion in your procedures, don't mistake terse
source code for efficient machine code, remember that bugs don't just "go
away," don't fix bugs later, fix them now, and so on.
Even so, chances are you'll find a nugget in Maguire's or Thielen's books that
you may not have encountered before--or at least an interesting real-world
story.
Even if you find no earth-shattering secret knowledge here, it's worthwhile to
have these guidelines in hardcopy form, so you can bang your head against them
when you run into a problem caused by ignoring this advice.
Maguire's writing style is fluid and clear, and the content is intended to
provoke both thought and action (namely, changes in your bad habits).
Thielen's book, by contrast, is more uneven and rough around the edges, almost
like a beta version. He focuses on providing C and ASM code you can use (for
example, to walk the heap or check the stack), rather than highlighting
problems with existing code. He also has a short chapter on commercial
debugging tools.
McConnell's Code Complete shifts the focus away from bug prevention towards
sound software construction and architecture. As I said earlier, your
program's architecture (or lack thereof) can predetermine the success of your
debugging efforts. This big book is worth a look because of its comprehensive
nature, covering all phases of the development process. A full discussion of
this book is beyond the scope of this review, but it deserves your attention.
 Figure 1: A simple trace facility for debug messages.
 Figure 2: Bounds-Checker in action.
 Example 1: A short workout for Bounds-Checker.
 Figure 3: WinScope at work.


A Multi-app Message Trace Facility for Windows




Ivan Gerencir




Ivan holds a degree in Power Electronics from the University of Belgrade,
Yugoslavia, and is now in Germany developing Windows applications. He can be
reached via CompuServe at 100135,1031.



Many Windows programmers use the function MessageBox() as a simple method for
monitoring the behavior of a single application under development. But
MessageBox() has limitations, even when used to trace just one application. As
a result, I developed the program MsgDisp, which is a Windows tool for
displaying informational messages from any number of running applications--all
in a single window. The messages are text strings sent via a shared DLL.
Using MsgDisp instead of MessageBox() has several advantages. The first is
that you can examine the text of your messages after your application has
ended, because they are kept in a listbox control belonging to the message
display task. Second, you can use MsgDisp where it would be very impractical
to use MessageBox--in loops, for example. (Otherwise, you would have to keep
pressing the Enter key every time your program cycles through the loop.) Also,
MsgDisp enables you to save the messages received by the display task to a
file, as well as letting you print them directly.
Finally, MsgDisp goes beyond single-application tracing by allowing you to
monitor the behavior of cooperative processing applications (which communicate
with each other using Windows messages) to reveal the exact order of
processing of these messages, since MsgDisp displays messages in the order
they are received.
I wrote the MsgDisp application in C++, using Borland's Object Windows Library
(OWL) as provided in Borland C++ 3.1 and Application Frameworks. Listing One
(page 104), excerpted source code from the MsgDisp program, is discussed in
detail in the programmer's notes that, along with the complete source code,
are available electronically; see "Availability," page 3. This includes a
resource file, makefile, and a small test program that demonstrates the
process of sending and displaying text messages.
The MsgDisp application (msgdisp.exe) is a Windows app that, once started,
awaits Windows messages. Applications that wish to display trace messages do
so by calling a function, also named MsgDisp, which sends text strings to
msgdisp.exe. This is done via the DLL msgdispi.dll, which your application
calls via the standard mechanism of an import library (msgdispi.lib) linked in
with your code. The tracing function MsgDisp() is declared in the small header
file msgdispi.h.
The function MsgDisp() takes one argument, which is a pointer to the string
that you want displayed by msgdisp.exe, which must be active in order for the
messages to be seen. If it is not running, however, this won't prevent
your application from executing normally. You can therefore start the MsgDisp
app at any suitable moment for debugging.
There were some interesting issues I encountered in developing MsgDisp; these
are fully discussed in the programmer's notes (available electronically). For
example, I use the Windows atom management functions instead of passing
pointers to global memory objects. Also, instead of using FindWindow() to get
the handle of the receiving window, I use HWND_BROADCAST to send a message to
all top-level active windows, with a private message number allocated by the
Windows function RegisterWindowMessage(). Finally, I had to dig around the OWL
source code in order to discover why OWL's DispatchAMessage was not sending
messages to my application.
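The point of atoms over pointers can be mimicked outside Windows: a pointer is meaningless in another task's address space, while a small integer registered in a shared table can be looked up by anyone. The AtomTable below is a hypothetical stand-in for the global atom table, not the Windows API (which, among other things, also reference-counts duplicate strings):

```cpp
// Hypothetical stand-in for the Windows global atom table: register a
// string, get back a small integer "atom," and look the string up later.
#include <cassert>
#include <map>
#include <string>

class AtomTable {
    std::map<int, std::string> table;
    int next;
public:
    AtomTable() : next(1) {}
    int Add(const std::string &s)             // cf. GlobalAddAtom
    { table[next] = s; return next++; }
    std::string GetName(int atom) const       // cf. GlobalGetAtomName
    {
        std::map<int, std::string>::const_iterator it = table.find(atom);
        return it == table.end() ? std::string() : it->second;
    }
    void Delete(int atom) { table.erase(atom); }  // cf. GlobalDeleteAtom
};
```

The sender passes only the atom number in a message parameter; the receiver retrieves the text by number, exactly as RecvString() does in the listing below.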
_A MULTI-APP MESSAGE TRACE FACILITY FOR WINDOWS_
by Ivan Gerencir


[LISTING ONE]

//--------------------------------------------------------------//
// MsgDisp -- Message Text Display (excerpted listing) //
// Copyright (c) Ivan Gerencir, Aug 1993 //
//--------------------------------------------------------------//

//-------------------------TMainWindow member functions-------------
#define RESOURCEID 200
TMainWindow :: TMainWindow() // Constructor
: TDialog(NULL, APPNAME)
{
 TextControl = new TListBox (this, RESOURCEID); // instantiate ListBox
}
//----------------------------------------------------------------
TMainWindow :: ~TMainWindow() // Destructor
{
 delete TextControl; // Destroy interface to ListBox
}
//----------------------------------------------------------------
void TMainWindow :: SetupWindow()
{ TDialog :: SetupWindow();
 Menu = GetMenu(HWindow);
 PopUp = TRUE; // PopUp menu item is initially checked
 CheckMenuItem (Menu, MENU_POPUP, MF_BYCOMMAND | MF_CHECKED);
 // Register message used for communicating with us
 if( (MsgDispMessage = RegisterWindowMessage(APPNAME)) == 0)
 { MessageBox (HWindow,
 "Unable to register Windows Message", APPNAME, MB_OK);
 PostQuitMessage(0);
 }
}
//----------------------------------------------------------------
void TMainWindow :: DefWndProc (RTMessage Msg)
{ // Called directly from StdWndProc (owl.cpp) for messages >= 0x8000
 if(Msg.Message == MsgDispMessage) // If we received our message
 { if (Msg.WParam == MAGIC_ID) // and magic data proves to be OK
 { RecvString (Msg); // then receive the string
 return;
 }
 }
 TDialog :: DefWndProc(Msg);
}
//----------------------------------------------------------------
void TMainWindow :: RecvString (RTMessage Msg)
{ // Client sends this message with atom number in LOWORD(LParam).
 char buffer[MAXMSGSTRLEN];
 GlobalGetAtomName (LOWORD(Msg.LParam), buffer, sizeof(buffer));

 TextControl->AddString (buffer);
 TextControl->SetSelIndex (TextControl->GetCount() - 1);
 if (PopUp) // If PopUp menu item checked
 { BringWindowToTop (HWindow);
 ShowWindow (HWindow, SW_SHOWNORMAL);
 }
}
//----------------------------------------------------------------
#define REDISPLAY TRUE

void TMainWindow :: WMSize (RTMessage Msg) // Implements WM_SIZE message
{ RECT Rect; // Fill the whole parent window
 Rect.top = Rect.left = 0;
 Rect.right = LOWORD(Msg.LParam);
 Rect.bottom = HIWORD(Msg.LParam);
 MoveWindow (TextControl->HWindow, Rect.left, Rect.top, Rect.right,
 Rect.bottom, REDISPLAY);
}
//-----------------------------------------------------------------
void TMainWindow :: CMSave (RTMessage)
{ int NumStrings = TextControl->GetCount();
 if(NumStrings == 0)
 return;
 OPENFILENAME ofn;
 static char FileBuf[256] = "";
 char FileTitleBuf[256];

 // Prepare OPENFILENAME structure for GetSaveFileName function
 memset(&ofn, 0, sizeof(ofn));
 ofn.lStructSize = sizeof(ofn);
 ofn.hwndOwner = HWindow;
 ofn.lpstrFilter = APPNAME " files (*.msg)\0" "*.msg\0"
 "All files (*.*)\0" "*.*\0";
 ofn.lpstrFile = FileBuf;
 ofn.nMaxFile = sizeof(FileBuf);
 ofn.lpstrFileTitle = FileTitleBuf;
 ofn.nMaxFileTitle = sizeof(FileTitleBuf);
 ofn.lpstrTitle = APPNAME " Save";
 ofn.lpstrDefExt = "msg";
 ofn.Flags = OFN_OVERWRITEPROMPT | OFN_PATHMUSTEXIST | OFN_HIDEREADONLY;

 if (GetSaveFileName (&ofn) == 0)
 { DWORD Err = CommDlgExtendedError();
 if(Err)
 { char b[64];
 sprintf(b, APPNAME " Error %08lX", Err);
 MessageBox (HWindow, "Data not saved", b, MB_OK | MB_ICONSTOP);
 }
 return;
 }
 strcpy(FileBuf, FileTitleBuf); // Remember entered filename for next time

 FILE *f = fopen(FileBuf, "wt"); // Open file for write in text mode
 if (f == NULL)
 { char b[256];
 sprintf (b, "Unable to open file \"%s\" for write", FileBuf);
 MessageBox (HWindow, b, APPNAME " Save data", MB_OK | MB_ICONSTOP);
 return;
 }

 // Now dump contents of ListBox into file
 for(int i = 0; i < NumStrings; ++i)
 { char Str[MAXMSGSTRLEN];
 TextControl->GetString(Str, i);
 fputs(Str, f);
 fputs("\n", f);
 }
 fclose(f);
 MessageBox (HWindow, "Data saved", APPNAME, MB_OK);
}
//-----------------------------------------------------------------
void TMainWindow :: CMPrint (RTMessage)
{ int NumStrings = TextControl->GetCount();
 if(NumStrings == 0)
 return;
 PRINTDLG pd; // Prepare PRINTDLG structure for PrintDlg function
 memset(&pd, 0, sizeof(pd));
 pd.lStructSize = sizeof(pd);
 pd.hwndOwner = HWindow;

 // Constant PD_HIDEPRINTTOFILE defined in COMMDLG.H
 // is long but doesn't end in L, so the IDE compiler
 // issues a warning. Unfortunately, #pragma warn -cln
 // does not suppress it. Command line compiler BCC however,
 // handles it OK.
 #pragma warn -cln
 pd.Flags = PD_RETURNDC | PD_HIDEPRINTTOFILE;

 if (PrintDlg(&pd) == 0)
 { DWORD Err = CommDlgExtendedError();
 if(Err)
 { char b[64];
 sprintf(b, APPNAME " Error %08lX", Err);
 MessageBox (HWindow, "Printing canceled", b,
 MB_OK | MB_ICONSTOP);
 }
 return;
 }
 // If PrintDlg allocated additional structures, release them
 if(pd.hDevMode) { GlobalFree(pd.hDevMode); pd.hDevMode = NULL; }
 if(pd.hDevNames) { GlobalFree(pd.hDevNames); pd.hDevNames = NULL; }

 HDC DC = pd.hDC;
 POINT PageDims; // Get physical page size in pixels
 Escape (DC, GETPHYSPAGESIZE, NULL, NULL, &PageDims);

 TEXTMETRIC tm; // Calculate LineHeight according to the current font
 GetTextMetrics (DC, &tm);
 int LineHeight = tm.tmHeight + tm.tmExternalLeading;

 POINT PrintOffs; // Get printing offset on the page
 Escape (DC, GETPRINTINGOFFSET, NULL, NULL, &PrintOffs);
 // Reduce vertical page size to reflect printable area,
 // for the printer cannot print on whole physical page
 PageDims.y -= 2 * PrintOffs.y + LineHeight;

 DOCINFO di; // Prepare DOCINFO structure for StartDoc function
 di.cbSize = sizeof(di);
 di.lpszDocName = APPNAME;

 di.lpszOutput = NULL; // no redirection to file

 StartDoc (DC, &di); // Start print job
 StartPage (DC);
 int YPos = 0;
 for(int i = 0; i < NumStrings; ++i)
 {
 if(YPos > PageDims.y) { EndPage(DC); StartPage(DC); YPos = 0; }
 char Str[MAXMSGSTRLEN];
 TextControl->GetString (Str, i);
 TextOut (DC, PrintOffs.x, YPos, Str, strlen(Str));
 YPos += LineHeight;
 }
 EndPage (DC);
 EndDoc (DC);
 DeleteDC (DC);
}
//-----------------------------------------------------------------
void TMainWindow :: CMRepaint (RTMessage)
{ // Redraw complete window with all the children
 RedrawWindow (HWindow, NULL, NULL,
 RDW_ERASE | RDW_FRAME | RDW_INTERNALPAINT |
 RDW_INVALIDATE | RDW_UPDATENOW | RDW_ALLCHILDREN);
}
//----------------------------------------------------------------
void TMainWindow :: CMInsertBreak (RTMessage)
{ // Implements the Menu Options|Insert break choice
 TextControl->AddString ("======================================");
 // Force selected item to the last so that
 // it is always displayed
 TextControl->SetSelIndex (TextControl->GetCount() - 1);
}






























November, 1993
PROGRAMMING PARADIGMS


Forth and Standards and Chaos and Life


Michael Swaine
The charter of this space in the magazine has always been to look at the new,
the unusual, the downright bizarre paradigms of programming. Or something like
that.
But some old paradigms are bizarre enough for the most jaded taste, and some
new paradigms can look surprisingly familiar when viewed in the right light.
Such is the case in this month's effort to cast some light on Forth and
standards and chaos and life.


Forth Amendment


"Standard Forth." Sounds like an oxymoron, doesn't it? The joke goes, "If
you've seen one Forth, you've seen, well, one Forth."
Yet the maverick language does have its very own ANSI X3 standards committee,
earnestly enunciating the defining characteristics of the Official Standard
Version. This is an effort that one might reasonably expect to be less assured
of success than, say, the standardization of the Defense Department's official
language, Ada. (And perhaps this is not an inappropriate place to thank the
DoD for clearing that up. It had always been a bit of a mystery to me in just
what language the DoD communicated, although I knew it wasn't English.)
That's what one might expect, that is, until one reflected on the applications
in which Forth is the language of choice. Like government work.
One of the differences between conversation and writing-for-publication is
that in writing-for-publication you have to make one message work for many (or
at least several) recipients; and you can't count on these recipients having
the same background. It would make my job a lot easier if you all knew exactly
the same things. Then I wouldn't have to include brief sotto voce asides like
this one:
I'm about to describe Forth for readers unfamiliar with it. If you know enough
about Forth to know that the "government work" I mentioned above was a
reference to the fact that Forth was the first high-level language on NASA's
Massively Parallel Processor that the Goddard Space Flight Center used for
image processing, or some other NASA application, then you may want to skip
this part.
There are, I'm sure, readers of this magazine who are not at all familiar with
Forth, although it's available on pretty much any platform. They've been
missing the splendid ironies in this column so far. The next 500 words or so
are to help them get the jokes.
Toward the end of this description of Forth I'll pass along some second-hand
observations on the standard. The best source on the standard itself is The
American National Standards Institute Inc. (11 West 42nd Street, New York, NY
10036). A very good source for some perspective on the standard, and the
first-hand source of the aforementioned observations, is Jack Woehr's book,
Forth: The New Model (M&T Books, 1992).


Define "print"


For those as yet untouched by the magic of Forth, here is some
background. It's an interpreted language (though compilers exist) whose
interpreter grabs one word (space-delimited string) at a time from the input
stream, looks for it in a dictionary of words, and, if successful, executes
the action associated with it. If not, it tries to interpret the word as a
number and push it onto a stack.
This stack is central to Forth. A word (unless it's a number) names a
function, and the parameters of any function are passed via this same stack.
The output of a function is pushed onto this stack and every function expects
to find its parameters on this stack when it's invoked.
Word-at-a-time immediate execution, the dictionary, the stack: That pretty
much describes the essentials of the language. It's very simple.
This all leads to a style of programming familiar to users of certain
electronic calculators. To add 2 and 2 and view the result, you type "2 2 +
.".
There are four distinct items here for the interpreter to examine: two numbers
and two functions. The word "+" performs the obvious function, but the syntax
is postfix, as it always is in Forth. This code causes two numbers to get
pushed onto the stack, then "+" eats them, pushing the result ("4" for
ANSI-standard implementations) onto the stack. The word "." causes the item on
top of the stack to be printed.
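That loop can be reduced to a sketch with only "+" and "." built in. This is a hypothetical toy evaluator, not a real Forth with a user-extensible dictionary:

```cpp
// Toy Forth outer interpreter: grab one space-delimited word at a time,
// look it up and execute it; otherwise treat it as a number and push it.
#include <sstream>
#include <string>
#include <vector>

std::string ForthEval(const std::string &input)
{
    std::vector<long> stack;
    std::string output;
    std::istringstream in(input);
    std::string word;
    while (in >> word) {
        if (word == "+") {                 // dictionary hit: add top two
            long b = stack.back(); stack.pop_back();
            long a = stack.back(); stack.pop_back();
            stack.push_back(a + b);
        } else if (word == ".") {          // dictionary hit: print the top
            output += std::to_string(stack.back()) + " ";
            stack.pop_back();
        } else {                           // miss: try it as a number
            stack.push_back(std::stol(word));
        }
    }
    return output;
}
```

Feeding it "2 2 + ." produces "4 ", just as the real interpreter would.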
You program in Forth by adding new words to the dictionary, defined using the
":" and ";" words. When the interpreter searches the dictionary, it stops when
it finds the most recent definition of the word, so you can change the
definition of any Forth word, for example rerouting output to a printer by
redefining ".". You can also extend the definition of a word by writing the
new definition using the word itself: ": + 1 + + ;". The words ":" and ";"
bracket the definition, the first "+" is the name of the word being defined,
and the rest is the definition. This new "+" returns "5" as the result of "2 2
+ .", which some Forth programmers would consider incorrect.
This definition looks like recursion--that is, defining a function in terms of
itself--but the "+" inside the definition is really the old "+". Recursion is
possible in Forth, but you have to explicitly say that you're recursing.


When the Only Law was a Hook and a Draw


Forth is the kind of language in which you can define your own floating-point
packages. Forth has also sometimes been the kind of language in which you must
define your own floating-point packages, although the standards effort may
change that.
Forth got its start in astronomy, having been invented by Chuck Moore as a
language for controlling telescopes. Its early experience was a good predictor
of how it would come to be used. It is today an official standard of the
International Astronomical Union for exchanging procedures for controlling
telescopes and related equipment. It's popular with space types at NASA. And
it's used heavily in embedded systems and controllers.
Its inventor was a good model for its current users, too. Moore is a
cowboy-hatted iconoclastic individualist, and Forth people today tend to be
pretty individualistic. A language that requires programmers to define their
own floating-point packages attracts programmers who prefer to define their
own floating-point packages.
Which, naturally, raises some difficulties when it comes to settling on a
standard. Some would like to see a more architectural standard; the current
effort specifies Forth syntactically. The standard does bring floating-point
math into Forth, but not without ruffling feathers. The standard also defines
a "block" to be 1024 characters, rather than bytes. This, as Woehr points out,
eliminates what was "the one truly portable file format in the whole computer
universe." Sigh.
It was, of course, unfair of me to say that "standard Forth" sounds like an
oxymoron. Forth has had many standards over the years (79-STANDARD,
83-STANDARD, FIG-Forth, polyFORTH) and now it has another. In much the same
way, far from being lawless, the Old West was a place and time where the law
was a gun, and everybody had one.


Grid Iron


Forth code can be remarkably small: A Forth program will normally be a lot
smaller than the equivalent C program, and often smaller than the equivalent
assembly language program. It can also be very fast.
That's why it was the language of choice for a machine designed specifically
to support cycle-chewing research in cellular automata, or CA.
Cellular automata were introduced almost 50 years ago by John von Neumann and
rediscovered repeatedly since, for example by Stephen Wolfram, who discovered
that working in one dimension rather than two simplified things.
The particular CA model that nearly everyone knows, and that I can explain
without getting confused, is John Horton Conway's "game" of Life. Life is
"played" on a grid of squares, or cells, each cell being either "alive" or
"dead." The grid is usually thought of as infinite, though in practice it is
implemented as a finite torus. A simple rule, applied simultaneously to every
cell, defines which cells will "live" or "die" in the next "generation," much
like the rules that define transitions in finite-state machines. The rule for
Life is: A live cell with two or three neighbors stays alive, while any other
live cell dies; a dead cell with three neighbors gives birth, while any other
dead cell stays dead. The neighbors of a cell are the eight adjacent cells
(including those diagonally adjacent).
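That rule translates directly into one generation step over a toroidal grid. The sketch below is plain illustrative code, nothing like CAM's hardware:

```cpp
// One generation of Conway's Life on a finite torus: a live cell with two
// or three live neighbors survives; a dead cell with exactly three is born.
#include <vector>

typedef std::vector<std::vector<int> > Grid;   // 1 = alive, 0 = dead

Grid LifeStep(const Grid &g)
{
    int rows = (int)g.size(), cols = (int)g[0].size();
    Grid next(rows, std::vector<int>(cols, 0));
    for (int r = 0; r < rows; ++r)
        for (int c = 0; c < cols; ++c) {
            int n = 0;                         // count the eight neighbors
            for (int dr = -1; dr <= 1; ++dr)
                for (int dc = -1; dc <= 1; ++dc)
                    if (dr != 0 || dc != 0)
                        n += g[(r + dr + rows) % rows][(c + dc + cols) % cols];
            next[r][c] = g[r][c] ? (n == 2 || n == 3) : (n == 3);
        }
    return next;
}
```

Applied to a vertical three-cell "blinker," one step flips it horizontal, the classic oscillator.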
Life, with its simple rule, generates a universe of interesting patterns.
Other rules can model physical systems or computational processes or
biological systems, but Life itself is sufficiently powerful that it can be
used to produce logic gates and perform calculations.
Life is not a game, although it can be treated like one. And CAs are serious
science. Von Neumann took CAs very seriously as a model of computing. And work
with CAs has led to some serious results.
But the problem in doing CA research is that it is very demanding on the
processor. The action is all at a macroscopic, long time interval, statistical
level, but you've gotta build the microscopic structure to see the
macroscopic.
So, of course, you need a CA machine. You need to scrap those
high-level-language implementations of the CA grids and replace them with some
grid iron.

So some guys did that. Tommaso Toffoli and Norman Margolus were the guys and
the machine was called "CAM," originally built at MIT's computer science lab,
later marketed as a roughly $1500 PC card by Systems Concepts of San
Francisco. The machine is described in their book, Cellular Automata Machines:
A New Environment for Modeling (MIT Press, 1987). For CA computations, CAM is
faster than a suitably-programmed Cray-1.
And they programmed it in Forth.
This was natural, since Forth is so extensible. Writing Forth code involves
writing new Forth words, and it is easy to use the relatively low-level Forth
words supplied in the CAM software to create powerful higher-level words: ":
BBM CENTER CW CCW OPP + + + >PLN0 ;" defines the rule for an interesting and
useful billiard-ball model of collisions of molecules in a gas.
But this is all, in a sense, ancient history. CA and its successors have
advanced beyond T&M's book, processors beyond CAM, and programmers beyond
billiard-ball models.
The creators of cellular automata systems are trying to create living beings.


That Would be Cool


Cellular automata have spawned Artificial Life.
It starts with the gliders in Life. A glider is a pattern of cells that
reproduces itself after some generations, but in a different location on the
grid. Alternatively, it's a pattern that walks across the grid. Realizing that
this self-reproduction was part of what he needed to prove Life capable of
universal computation, Conway ran a contest to find the other critical
component: a glider gun, a pattern that would shoot out gliders.
The glider gun was found, Life was shown to be capable of universal
computation, and a group at MIT executed the tour de force of creating an
adder in Life. The adder gobbled up streams of gliders representing input
numbers and spit out a glider stream that represented the sum. This was more
or less the end of the story, mathematically, and, as Steven Levy tells it in
his book, Artificial Life: A Report from the Frontier where Computers Meet
Biology (Pantheon Books, 1992), researchers were asking whether CA was worth
any further study.
One researcher in particular who asked the question was Tommaso Toffoli, and
given that he went on to invest considerable time and effort in CA, apparently
his answer was yes.
The answer was yes, however, not for mathematical reasons. Toffoli agreed that
the area was sterile mathematically. He thought it was interesting for its
applications to reality. Of course CAs could be used to model physical
systems, but Toffoli had something bolder in mind. CAs are themselves actual
dynamic systems, and can be studied directly, not as models of something else,
but as a way of directly observing the behavior of complex systems.
One person who understood that idea was Stephen Wolfram, who sees CAs as a
path to understanding complexity. Wolfram also saw a striking similarity
between time plots of some of his one-dimensional CAs and patterns on certain
seashells.
What happened then was probably inevitable. The best-known CA was "Life,"
described as a game, there were all these seashell analogies being drawn, and
predictably the workers in this developing field began to attract ridicule
from the scientific establishment.
And the practitioners invited it. They spoke loosely of their creations as
critters and ants, and of themselves as studying the essence of life, or as
gods in the process of creating living beings.
They would have been ridiculed even if they hadn't literally meant what they
said, as some in the developing field absolutely did.
The field in question is now known as "Artificial Life."


Get A-life


The field became official in 1987 when the first a-life conference was held at
Los Alamos. The announcement, quoted in Levy's book, described a-life as "the
study of artificial systems that exhibit behavior characteristic of natural
living systems. It is the quest to explain life in any of its possible
manifestations, without restriction to the particular examples that have
evolved on earth.... The ultimate goal is to extract the logical form of
living systems."
Although the work grew from Von Neumann's cellular automata, the concerns of
a-life researchers are quite different from Von Neumann's and some assumptions
have changed radically, too.
A-life systems now exist that exhibit most of the behaviors one would like to
associate with life. That in itself is not so impressive, but what is
impressive is that the artificial organisms, evolving over rapid generations,
encounter many of the problems that confront living systems and evolve to deal
with them. A-life systems have been built that evolve strategies for dealing
with predators and parasites, learn to compete and to cooperate, explore the
options in game-theory situations, and learn Kepler's laws of planetary
motion. A-life research provides an answer to the old Woody Allen question: Is
sex necessary? (The answer: yes, sex is a means of getting down from a local
maximum when doing evolutionary hill-climbing.)
And then somebody thought of applying the evolutionary processes of a-life to
computer programs. Rather than write 'em, let's grow 'em, was the seemingly
wacko notion. It's coming to seem less wacko, since some programs grown in
this way have come up with solutions comparable to the best human programming.
The self-replication of patterns of cells in early CAs was only a hint of
these populations of programs that spawn new, better-evolved programs that are
today being grown on a Connection Machine at UCLA.
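The grow-don't-write idea can be caricatured in a few lines: keep a population, let the fitter half breed the rest by crossover with occasional mutation. This toy (evolving a bit string toward all ones) is a hypothetical illustration, nothing like the real systems:

```cpp
// Toy genetic algorithm: fitness is the count of 1 bits; each generation
// the fitter half survives and breeds the other half by one-point
// crossover, with an occasional single-bit mutation.
#include <algorithm>
#include <cstdlib>
#include <vector>

typedef std::vector<int> Genome;

int Fitness(const Genome &g)
{
    int f = 0;
    for (int i = 0; i < (int)g.size(); ++i) f += g[i];
    return f;
}

static bool Fitter(const Genome &a, const Genome &b)
{ return Fitness(a) > Fitness(b); }

int Evolve(int popSize, int genomeLen, int generations, unsigned seed)
{
    std::srand(seed);
    std::vector<Genome> pop(popSize, Genome(genomeLen));
    for (int i = 0; i < popSize; ++i)
        for (int j = 0; j < genomeLen; ++j)
            pop[i][j] = std::rand() % 2;
    for (int gen = 0; gen < generations; ++gen) {
        std::sort(pop.begin(), pop.end(), Fitter);   // fittest first
        for (int i = popSize / 2; i < popSize; ++i) {
            const Genome &ma = pop[std::rand() % (popSize / 2)];
            const Genome &pa = pop[std::rand() % (popSize / 2)];
            int cut = std::rand() % genomeLen;       // one-point crossover
            for (int j = 0; j < genomeLen; ++j)
                pop[i][j] = j < cut ? ma[j] : pa[j];
            if (std::rand() % 10 == 0)               // rare mutation
                pop[i][std::rand() % genomeLen] ^= 1;
        }
    }
    std::sort(pop.begin(), pop.end(), Fitter);
    return Fitness(pop[0]);                          // best fitness found
}
```

Because the fitter half is carried forward untouched, the best fitness never decreases; selection plus recombination does the climbing.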
Genetic algorithms, this subdiscipline is called. While the freakiest a-life
work is the kind that creates the most lifelike behavior in the most
creature-like creatures, the concept of genetic algorithms is plenty scary if
you're one of the lifeforms with which such systems would ultimately compete.
It's hard to think of a running program as a living being, and unnerving if
you succeed in doing so. It's a little easier--but unnerving enough--to think
of these things as programs that reproduce themselves, evolving new and
better-adapted kinds of programs, written in new, weird, inhuman code.
Wait a minute. Weird code?
Is it possible that when--you may say "if"--these chaos-born,
self-defining grid-livers attain the sophistication and intelligence of the
average beginning computer science student, they will choose to program in
Forth?
Hmm.



























November, 1993
C PROGRAMMING


Info/Video-Glut and D-Flat++ Resumed




Al Stevens


Now they tell me that we're going to have 500 TV channels. I know that it's
fashionable to pretend that you don't watch (or own) a TV set, but somebody
must be watching to make having 500 channels pay off. Besides, it's a big
opportunity for programmers. Sit back and read about the next killer vertical
application.
I remember when TV came to Lorton, Virginia. It was about 1949, and my
grandparents had one of the first sets in town. My four brothers and I would
walk to their house after school and watch Captain Video and Howdy Doody.
There were four channels then, a lot for the time, but we were within range of
the Washington, D.C. stations. They started broadcasting at four in the
afternoon and shut down at eleven at night.
The point of this parable is that it didn't take much time to figure out what
to watch. It took about ten seconds to scan the channels (and that was with no
remote): getting off the couch, walking to the set to flip the dial, and
walking back. Most viewers had the schedule memorized, and they didn't have to scan.
Just tuned it in. Sometime in the 1960s cable came along, and there were maybe
ten stations to choose from. Still no problem keeping up with the schedule;
flipping channels took only a little longer. Where I live now, clearly in a
culturally-deprived area, we have only 36 channels and a remote. It takes
maybe a minute or two to go around the dial, especially if you have to outwait
the commercials to see what's on. But back in Lorton, my nephews have almost
100 channels. It takes them much longer now to find out that there's nothing
on that they want to watch. My in-laws have a dish antenna on their farm in
Pennsylvania, and it takes longer than most shows last to scan all of the
stations. But 500 channels? How will you know what to watch? How will you know
what's on?
There is a problem, and the TV industry will need a solution. It calls out for
an on-line service that keeps track of everything that is scheduled for
broadcast and that stores a profile of every viewer's personal viewing
preference. The profile would look like a complex database query.
Select(De Niro BUT NOT "Raging Bull")
Select(Galloping Gourmet AND broccoli)
Select(Bob Packwood AND Germaine Greer)
Select(Mr. Ed OR Ed Meese)
Each day you would be notified of the programs that matched your profile. You
could rearrange your schedule to watch, or you could tell the system to
automatically tape the selected programs on your multi-deck VCR.
But some problems will persist. I'm not sure how Ms. Stevens and Ms. Erickson
will be able to select only the Perry Mason reruns they haven't seen yet.
Select(Perry Mason AND "defendant found guilty")
There's clearly a golden opportunity for an enterprising database programmer
lurking somewhere in this idea. You have my permission to run with it.
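To make the opportunity concrete, here is a hypothetical sketch of the matching engine. The Listing structure and SelectButNot are invented for illustration; a real service would need a full query language:

```cpp
// One profile rule in miniature: match a star, but exclude one title --
// the De Niro-but-not-"Raging Bull" rule from the profile above.
#include <cstddef>
#include <string>
#include <vector>

struct Listing {
    std::string title;
    std::string star;
};

std::vector<Listing> SelectButNot(const std::vector<Listing> &guide,
                                  const std::string &star,
                                  const std::string &exclude)
{
    std::vector<Listing> hits;
    for (std::size_t i = 0; i < guide.size(); ++i)
        if (guide[i].star == star && guide[i].title != exclude)
            hits.push_back(guide[i]);
    return hits;
}
```

Run the day's schedule through one such predicate per profile line, union the hits, and you have the notification list.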


Back to D-Flat++


This month we return to D-Flat++. I shelved the project for a half year while
I worked on other things. I'll review what D-Flat++ is and what we've covered
so far, to bring you up to date.
D-Flat++ is a descendant of D-Flat, which is a C function library that
implements the Common User Access (CUA) user interface in a DOS, text-mode
environment. I published that library in this column. It took about a
year-and-a-half to cover the whole thing. When I began to rewrite D-Flat as a
C++ class library (formally named "D-Flat++" and called "DF++" here for
short), I decided to omit some of the features that D-Flat has, mainly because
I never used them, and I further decided to publish only that part of the code
that is interesting enough to warrant discussion so that the project wouldn't
take so long. You can get all of the source code by downloading it or by
sending for a careware version. I'll explain how that works later in this
column. You can get back issues of DDJ to catch up on the discussions. The
November 1992 through April 1993 issues cover everything up until now. The DDJ
CD-ROM includes all of D-Flat as well as D-Flat++.
Up to this point in the DF++ project we've discussed the desktop, the
application, and some control windows. A DF++ application runs from a virtual
desktop that contains a screen, a keyboard, a mouse, a clock, and a speaker.
The desktop and the devices are represented by classes in the library. The
application is a base class from which you derive a custom application class.
The application's menu is defined as a part of the derived class. Your derived
application class includes member functions that execute when the user chooses
menu selections.
The control windows are text boxes, edit boxes, push buttons, check boxes,
list boxes, and so on. They are each implemented in classes that are derived
from a Control class. All of the windows in DF++ are derived from a base
DFWindow class.


Portability


In an earlier column, I discussed a portability layer that lets DF++ compile
with both Borland C++ and Microsoft C++. I had hoped that it would work with
those compilers and perhaps some others. I gave up on Zortech when I couldn't
get reliable tech support from them on CompuServe. I have now temporarily
dropped Microsoft too because the MSC/C++ compiler doesn't support templates
yet. Once you start using templates, you can't do without them. When time
permits, I'll port DF++ to Comeau C++ (a CFRONT port) and the new Watcom C++
compiler, both of which support templates. Until then, you need Borland C++
3.1 or newer to compile DF++.
I'm trying to avoid the container classes and templates that come with some
compilers. Not that they aren't good tools; it's just that there are no
standards in place yet, and the different implementations are unlike one
another and probably unlike whatever the ANSI committee comes up with. That's
why you'll find a Tree template and a String class and other such things
bundled into DF++. I'll try to comply when ANSI publishes a draft, that is, if
I'm not in a rocking chair at the Old Columnists' retirement village.


Old Dog, New Trick


I learned a lesson about class design that I haven't seen in any of the
advanced C++ style books yet. You know which ones I mean; they tell you how
and when to use this or that language feature and when not to. It's generally
understood that you should make data members private and provide access to
them through a member function interface. The books all tell you that. The
reasons are founded in sound object-oriented design principles. When you
shield the class-using programmer from the implementation details, you protect
the data. You make it easier on yourself later as well, when you want to
modify the implementation. By providing a function interface and hiding the
implementation, you can change the implementation without affecting the user's
code.
Now, here's another convention to consider. We all know that only a class's
member functions can access the private data members. If those data members
have a public interface, the class's members should use it, too. Why? Because
when you set about to change that implementation, you'll be changing a lot
less of the implementation code if the members use the public interface.
How did I come to that conclusion? Nothing worth knowing comes easy. DF++ has
a base DFWindow class that handles all of the operations that are common to
all windows. All of the other window classes are derived from DFWindow. One of
its functions is to maintain the parent/child/sibling relationship among
windows. Before templates, I built the list head, tail, and linked-list
pointers into the DFWindow class itself. Not long ago, I decided to use a
Family class that I had designed based on the LinkedList class that I
discussed a few months ago. To my dismay, I found that a whole lot of DFWindow
member functions use those embedded pointers. There was a very nice interface
there for non-member functions to use, but the member functions didn't use it.
My work would have been a lot easier if I had used prudent design techniques
at the outset. (I'm always telling myself that.)
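The convention itself is easy to show. In this sketch (an illustrative class, not DFWindow itself), the member function routes through the same public accessors that outside code uses, so a change to the private representation touches only the accessors:

```cpp
#include <string>

class Window {
    std::string title;                   // private implementation detail
public:
    const std::string& Title() const     { return title; }
    void SetTitle(const std::string& t)  { title = t; }
    // Clear() uses the public interface rather than touching 'title'
    // directly; if the representation changes, Clear() is unaffected.
    void Clear()                         { SetTitle(""); }
};
```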


Resources and What C++ Left Out


D-Flat++ uses resources similar to the GUIs that it emulates. Resources in
this context are menus and dialog boxes. I discussed menus in April. The
menubar is defined as an instance of a MenuBar class with a list of
MenuBarItem objects, each one of which has a PopDown object associated with
it. Each PopDown object is defined by a list of MenuSelection objects. This is
similar to the arrays of structures that D-Flat uses, except in this case we
use class objects, and in this case we--sadly--have no resource language.
Windows programmers are familiar with the Resource Compiler program and
resource language that defines menus and dialog boxes. I implemented the
D-Flat resource language with C preprocessor macros and did a reasonable job
of emulating the syntax of the Windows languages. No matter how I try,
however, I haven't yet figured out an elegant way to do the same thing with DF++.
It seems that they left something out when they designed C++. You can't tack
initializers onto the declaration of arrays of class objects if the classes
are more complex than a simple C structure, which means that you can't use
such initializers to declare a default dimension for the array. That
limitation prevents me from using the preprocessor to translate resource
statements into array declarations the same way I did with C. You gain
something, you lose something.


Dialog Boxes



I discussed D-Flat++ control classes in February and March. DF++ dialog boxes
are classes that you derive from a Dialog class and embed control classes in
them. To illustrate the technique, and to have some commonly used dialog
boxes, DF++ includes message and error windows, a yes/no selection window, and
file open and save-as dialog boxes. There are five control classes derived
from the PushButton class: OK and Cancel buttons, Yes and No buttons, and a
Help button. We'll look at this design from the top down by examining the
FileOpen and SaveAs dialog boxes. Listing One, page 141, is fileopen.h, the
source file that defines the FileOpen and SaveAs dialog box classes. FileOpen
is derived from the Dialog class, and SaveAs is derived from FileOpen. We'll
discuss the former.
The FileOpen class contains several class objects, each of which is a control.
The Label objects are static text that the dialog box displays and that the
user cannot change or tab to. The program may change them, but the user
cannot. The FileOpen dialog box uses several Labels to identify the other
controls. D-Flat++ associates a Label object with a control if the Label
object immediately precedes the control. If the Label has a shortcut key
defined, the user can tab directly to the control by pressing Alt and the
shortcut key--just like Windows. Observe that the control definitions have no
initializers to give them any details. You cannot express initializing values
inside the class definition. The dialog box's constructor does that.


Drives, Subdirectories, and Paths


Three of the control classes used in the FileOpen dialog box class are derived
from the ListBox control class. They are the FileNameListBox control, the
DirectoryListBox control, and the DriveListBox control. A fourth class, the
PathNameLabel control, is derived from the TextBox class. These control
classes are defined in Listing Two, page 140, directry.h. Their purposes are
to display the current path and lists of the files, directories, and drives.
Listing Three, page 140, is directry.cpp, which contains the member functions
for the disk and directory control classes. The lists are initialized by their
constructors. All but the DriveListBox class have member functions that allow
a using dialog box to change their lists. For example, when the user changes
drives or subdirectories, the dialog box needs to update the lists of files
and subdirectories.
There is some dark logic in the DriveListBox constructor to determine whether
the drive is a RAM disk or a network drive or to see if the B: drive is
remapped to A: as happens when a computer has only one diskette drive. The
logic involves calls to the DOS IOCTL function with some undocumented
protocols. (Just my small contribution to Andrew Schulman's "Undocumented
Corner.")
Listing Four, page 140, is fileopen.cpp, which contains the member functions
for the FileOpen and SaveAs classes. The declarator for the class constructor
is where the program defines the properties of the dialog box and its controls
by providing parameters for their constructors. In this case, the Dialog class
is constructed with the "File Open" title, a height of 19, and a width of 57.
The Label objects are provided with text values to display and x/y screen
position relative to the dialog box parent window. The tilde character (~) in
the text identifies the label's shortcut key. Other controls have positions
and other necessary initializers.
Several of the FileOpen member functions are overrides of virtual functions in
the DFWindow and Dialog classes. The ControlSelected and ControlChosen
functions are called when a control is selected or chosen. For example, when
the user moves the selector bar on a list box, this action is called
"selecting" the item. When the user double-clicks or presses Enter on an item,
this is called "choosing" the item. The control window reports these actions
to its parent by calling virtual functions. This notification allows the
parent, a dialog box in this case, to take action based on the user's actions.
In the FileOpen class, for example, the user's selection of a file name from
the list causes the dialog box to copy that name into the file name edit box.
There are OKFunction, CancelFunction, and HelpFunction virtual functions that
are called when the user chooses the related pushbutton. The first two are
called as well when the user presses Enter and Esc, and the third one is
called when the user presses F1. These functions allow the dialog box to take
whatever action is appropriate. If the derived dialog box class does not
override them, the Dialog class takes default actions. For the first two, it
closes the dialog box window and returns to the program that built it.
HelpFunction, once I have it completed, will display help about dialog boxes
in general if the derived dialog box does not override the function.
The derived dialog class may override two more functions. These are the
EnterFocus function, which is called when a control is about to get the focus,
and the LeaveFocus function, which is called when a control has lost the
focus. These functions allow the dialog box to take appropriate action on the
user's entries as they occur.


The Dialog Class


Listing Five, page 143, is dialog.h, which defines the base Dialog class. The
class definition is fairly simple. It includes a few data members and the
primitive functions common to all dialog boxes. The constructors are all
protected, so you cannot instantiate an object of the class. You must derive a
class such as the FileOpen class discussed above and instantiate it instead.
Listing Six, page 143, is dialog.cpp, which contains the member functions. The
TestShortCut function associates shortcut keys on labels to their immediately
following control class objects. The Execute function runs the dialog box
after it is constructed. I couldn't simply call that function from the
constructor because that would preclude any derived dialog box classes from
overriding it or any other member functions. The Execute function captures the
focus for the dialog box and sets the focus to the first enabled control in
the dialog box. Then it stays in a loop, calling DispatchEvents for as long as
the dialog box is active.


Using a Dialog Box


A program displays the FileOpen dialog box on the screen and executes it by
declaring an instance of the class and calling its Execute member function as
shown here:
FileOpen fo;
fo.Execute();
The Execute function gives the user's focus to the dialog box and does not
return to the calling program until the user chooses the OK or Cancel button.
The program can test the result by calling the OKExit member function as shown
here:
if (fo.OKExit()) {
String fname = String(fo.FileName());
// ....
}
The FileName member function returns the chosen filename. Although the dialog
box is closed and erased from the screen, the class object exists until it
goes out of scope, so the program can retrieve data values from the controls as
the user changed them.
These procedures are the same for all modal dialog boxes. I haven't worked out
those for modeless dialog boxes yet, but that feature should be in place by
the time you read this.


How to Get the Source Code


D-Flat++ is still preliminary but far closer to a working version than the
first one. I'm writing this column in August and will soon upload Version 2.
The first version was incomplete, but there was enough of an implementation to
give you an idea of how DF++ works and how it differs from D-Flat. The second
version has enough functionality to build an application, although there are
more features to come. Later versions should be released by the time you read
this. The C D-Flat function library is still available, too. You can download
DF and DF++ from the CompuServe DDJ Forum or from M&T Online. You can also get
them by sending a stamped, self-addressed diskette mailer and a formatted
diskette to me at Dr. Dobb's Journal, 411 Borel Avenue, San Mateo, California
94402. The software is free, but if you wish, include a dollar for my
"careware" charity, the Brevard County Food Bank.
_C PROGRAMMING COLUMN_
by Al Stevens

[LISTING ONE]

// --------- fileopen.h

#ifndef FILEOPEN_H
#define FILEOPEN_H

#include "dflatpp.h"
#include "directry.h"

// ------------ File Open dialog box
class FileOpen : public Dialog {
 // -----File Open Dialog Box Controls:
protected:

 // ----- file name editbox
 Label filelabel;
 EditBox filename;
 // ----- drive:path display
 PathNameLabel dirlabel;
 // ----- files list box
 Label fileslabel;
 FileNameListBox files;
 // ----- directories list box
 Label dirslabel;
 DirectoryListBox dirs;
 // ----- drives list box
 Label disklabel;
 DriveListBox disks;
 // ----- command buttons
 OKButton ok;
 CancelButton cancel;
 HelpButton help;
 // ------ file open data members
 String filespec;
 // ------ private member functions
 void SelectFileName();
 void ShowLists();
 // --- functions inherited from DFWindow
 virtual void ControlSelected(DFWindow *Wnd);
 virtual void ControlChosen(DFWindow *Wnd);
 virtual void EnterFocus(DFWindow *Wnd);
 virtual void OKFunction();
public:
 FileOpen(char *spec = "*.*", char *ttl = "File Open");
 const String& FileName() { return filespec; }
};
class SaveAs : public FileOpen {
 virtual void OKFunction();
public:
 SaveAs(char *spec = "", char *ttl = "Save As")
 : FileOpen(spec, ttl) { }
};
#endif



[LISTING TWO]

// ---------- directry.h

#ifndef DRIVE_H
#define DRIVE_H

#include <dir.h>
#include <dos.h>
#include "listbox.h"
#include "label.h"

class PathNameLabel : public Label {
 char *CurrentPath();
public:
 PathNameLabel(int x, int y, int wd) :
 Label(CurrentPath(), x, y, wd) {}

 void FillLabel();
};
class DriveListBox : public ListBox {
 unsigned currdrive;
public:
 DriveListBox(int lf, int tp);
};
class DirectoryListBox : public ListBox {
public:
 DirectoryListBox(int lf, int tp);
 void FillList();
};
class FileNameListBox : public ListBox {
public:
 FileNameListBox(char *filespec, int lf, int tp);
 void FillList(char *filespec);
};
#endif


[LISTING THREE]

// ---------- directry.cpp

#include <direct.h>
#include "directry.h"

char *PathNameLabel::CurrentPath()
{
 static char path[129];
 _getdcwd(0, path, 129);
 return path;
}
void PathNameLabel::FillLabel()
{
 SetText(CurrentPath());
}
DriveListBox::DriveListBox(int lf, int tp) : ListBox(lf, tp, 10, 10, 0)
{
 SetAttribute(BORDER);
 currdrive = getdisk();
 union REGS regs;
 for (unsigned int dr = 0; dr < 26; dr++) {
 setdisk(dr);
 if (getdisk() == dr) {
 // ----- test for remapped B drive
 if (dr == 1) {
 regs.x.ax = 0x440e; // IOCTL func 14
 regs.h.bl = dr+1;
 int86(0x21, &regs, &regs);
 if (regs.h.al != 0)
 continue;
 }
 String drname(" :");
 drname[0] = dr+'A';
 // ---- test for network or RAM disk
 regs.x.ax = 0x4409; // IOCTL func 9
 regs.h.bl = dr+1;
 int86(0x21, &regs, &regs);

 if (!regs.x.cflag) {
 if (regs.x.dx & 0x1000)
 drname += " (Net)";
 else if (regs.x.dx == 0x0800)
 drname += " (RAM)";
 }
 AddText(drname);
 }
 }
 setdisk(currdrive);
 SetScrollBars();
}
DirectoryListBox::DirectoryListBox(int lf, int tp) : ListBox(lf, tp, 10, 13, 0)
{
 SetAttribute(BORDER);
 FillList();
}
void DirectoryListBox::FillList()
{
 ClearText();
 int ax;
 struct ffblk ff;
 ax = findfirst("*.*", &ff, FA_DIREC);
 while (ax == 0) {
 if ((ff.ff_attrib & FA_DIREC) != 0) {
 if (strcmp(ff.ff_name, ".")) {
 String fname("[");
 fname += ff.ff_name;
 fname += "]";
 AddText(fname);
 }
 }
 ax = findnext(&ff);
 }
 SetScrollBars();
}
FileNameListBox::FileNameListBox(char *filespec,int lf,int tp)
 : ListBox(lf, tp, 10, 14, 0)
{
 SetAttribute(BORDER);
 FillList(filespec);
}
void FileNameListBox::FillList(char *filespec)
{
 ClearText();
 int ax;
 struct ffblk ff;
 ax = findfirst(*filespec ? filespec : "*.*", &ff, 0);
 while (ax == 0) {
 AddText(ff.ff_name);
 ax = findnext(&ff);
 }
 SetScrollBars();
}


[LISTING FOUR]

// ---------- fileopen.cpp


#include <io.h>
#include <dir.h>
#include "fileopen.h"
#include "notice.h"

// ----------------- File Open Dialog Box
FileOpen::FileOpen(char *spec, char *ttl) :
 Dialog(ttl, 19, 57),
 // ----- file name editbox
 filelabel ("~Filename:", 3, 2),
 filename (13, 2, 1, 40),
 // ----- drive:path display
 dirlabel (3, 4, 50),
 // ----- files list box
 fileslabel("F~iles:", 3, 6),
 files (spec, 3, 7),
 // ----- directories list box
 dirslabel ("~Directories:", 19, 6),
 dirs (19, 7),
 // ----- drives list box
 disklabel ("Dri~ves:", 34, 6),
 disks (34, 7),
 // ----- command buttons
 ok (46, 8),
 cancel (46,11),
 help (46,14),
 // ------ file open data members
 filespec(spec)
{
 filename.AddText(filespec);
}
// --- Get selected filename: files listbox->filename editbox
void FileOpen::SelectFileName()
{
 int sel = files.Selection();
 if (sel != -1) {
 String fname = files.ExtractTextLine(sel);
 filename.SetText(fname);
 filename.Paint();
 }
}
// ---- called when user "selects" a control
// e.g. changes the selection on a listbox
void FileOpen::ControlSelected(DFWindow *Wnd)
{
 if (Wnd == (DFWindow *) &files)
 // --- user selected a filename from list
 SelectFileName();
 else if (Wnd == (DFWindow *) &dirs ||
 Wnd == (DFWindow *) &disks) {
 // --- user is selecting a different drive or directory
 filename.SetText(filespec);
 filename.Paint();
 }
}
// ---- called when user "chooses" a control
// e.g. chooses the current selection on a listbox
void FileOpen::ControlChosen(DFWindow *Wnd)
{
 if (Wnd == (DFWindow *) &files)
 // --- user chose a filename from filename list
 OKFunction();
 else if (Wnd == (DFWindow *) &dirs) {
 // --- user chose a directory from directory list
 int dr = dirs.Selection();
 String dir = dirs.ExtractTextLine(dr);
 int len = dir.Strlen();
 String direc = dir.mid(len-2,1);
 chdir(direc);
 ShowLists();
 }
 else if (Wnd == (DFWindow *) &disks) {
 // --- user chose a drive from drive list
 int dr = disks.Selection();
 String drive = disks.ExtractTextLine(dr);
 setdisk(drive[0] - 'A');
 ShowLists();
 }
}
// ---- called when user chooses OK command
void FileOpen::OKFunction()
{
 String fname = filename.ExtractTextLine(0);
 if (access(fname, 0) == 0) {
 filespec = fname;
 Dialog::OKFunction();
 }
 else if (fname.FindChar('*') != -1 ||
 fname.FindChar('?') != -1) {
 filespec = fname;
 ShowLists();
 }
 else
 // ---- No file as specified
 ErrorMessage("File does not exist");
}
// ------ refresh the current directory display and the directories and files
// list after user changes filespec, drive, or directory
void FileOpen::ShowLists()
{
 dirlabel.FillLabel();
 dirlabel.Show();
 dirs.FillList();
 dirs.Show();
 files.FillList(filespec);
 files.Show();
}
// ------- called just before a control gets the focus
void FileOpen::EnterFocus(DFWindow *Wnd)
{
 if (Wnd == (DFWindow *) &files)
 // --- The file name list box is getting the focus
 SelectFileName();
}
// ---- called when user chooses OK command
void SaveAs::OKFunction()
{
 String fname = filename.ExtractTextLine(0);
 if (access(fname, 0) != 0) {
 // ---- chosen file does not exist
 if (fname.FindChar('*') != -1 ||
 fname.FindChar('?') != -1) {
 // --- wild cards
 filespec = fname;
 ShowLists();
 }
 else {
 filespec = fname;
 Dialog::OKFunction();
 }
 }
 else {
 // ---- file exists
 String msg = fname + " already exists. Replace?";
 if (YesNo(msg)) {
 filespec = fname;
 Dialog::OKFunction();
 }
 }
}


[LISTING FIVE]

// -------- dialog.h

#ifndef DIALOG_H
#define DIALOG_H

#include "dfwindow.h"
#include "desktop.h"
#include "control.h"

class Control;
class Dialog : public DFWindow {
 virtual void SetColors();
 Bool isRunning;
 Bool okexit;
 void OpenWindow();
 friend Control;
 void TestShortcut(int key);
protected:
 Dialog(char *ttl, int lf, int tp, int ht, int wd,
 DFWindow *par = (DFWindow *)desktop.ApplWnd())
 : DFWindow(ttl, lf, tp, ht, wd, par)
 { OpenWindow(); }
 Dialog(char *ttl, int ht, int wd,
 DFWindow *par = (DFWindow *)desktop.ApplWnd())
 : DFWindow(ttl, ht, wd, par)
 { OpenWindow(); }
 virtual ~Dialog() {}
 virtual void CloseWindow();
 virtual void Keyboard(int key);
 virtual void OKFunction();
 virtual void CancelFunction();
 virtual void HelpFunction();
public:
 virtual void Execute();
 Bool OKExit() { return okexit; }
};
#endif




[LISTING SIX]

// ------------- dialog.cpp

#include <ctype.h>
#include "dialog.h"
#include "desktop.h"

Dialog *ThisDialog;

// ----------- common constructor code
void Dialog::OpenWindow()
{
 ThisDialog = this;
 windowtype = DialogWindow;
 DblBorder = False;
 SetAttribute(BORDER | SAVESELF | CONTROLBOX |
 MOVEABLE | SHADOW);
 isRunning = False;
 okexit = False;
}
void Dialog::CloseWindow()
{
 ReleaseFocus();
 DFWindow::CloseWindow();
 isRunning = False;
}
// -------- set the fg/bg colors for the window
void Dialog::SetColors()
{
 colors.fg =
 colors.sfg =
 colors.ffg =
 colors.hfg = LIGHTGRAY;
 colors.bg =
 colors.sbg =
 colors.fbg =
 colors.hbg = BLUE;
}
void Dialog::Keyboard(int key)
{
 switch (key) {
 case ESC:
 CancelFunction();
 break;
 case '\r':
 OKFunction();
 break;
 case ALT_F4:
 CloseWindow();
 break;
 case ' ':
 if ((desktop.keyboard().GetShift() & ALTKEY) == 0)
 break;
 // ---- fall through
 case ALT_F6:
 DFWindow::Keyboard(key);
 break;
 default:
 if ((desktop.keyboard().GetShift() & ALTKEY) != 0)
 TestShortcut(key);
 break;
 }
}
void Dialog::TestShortcut(int key)
{
 key = desktop.keyboard().AltConvert(key);
 key = tolower(key);
 Control *Ctl = (Control *)First();
 while (Ctl != 0) {
 if (key == Ctl->Shortcut()) {
 Ctl->ShortcutSelect();
 break;
 }
 Ctl = (Control *) Ctl->Next();
 }
}
void Dialog::Execute()
{
 ThisDialog = 0;
 CaptureFocus();
 Control *Ctl = (Control *) First();
 while (Ctl != 0) {
 if (Ctl->isEnabled()) {
 Ctl->SetFocus();
 break;
 }
 Ctl = (Control *) Ctl->Next();
 }
 // ---- modal dialog box
 isRunning = True;
 while (isRunning)
 desktop.DispatchEvents();
}
void Dialog::OKFunction()
{
 okexit = True;
 CloseWindow();
}
void Dialog::CancelFunction()
{
 CloseWindow();
}
void Dialog::HelpFunction()
{
}



November, 1993
ALGORITHM ALLEY


Palindrome Encryption




Tom Swan


It's been said that a black hole in space is the ultimate data compressor.
But, as I recently discovered, the grand poobah of compressors is actually an
anonymous clerk who invents acronyms for government publications. ATBAOTCOF is
a gem that I recently came across in a volume on southern navigational
hazards. As everybody knows, ATBAOTCOF is an abbreviation for "Area to Be
Avoided Off the Coast of Florida." Don't laugh. It takes real talent to come
up with compact, phonetic acronyms like that. It's phonetic because, if you
could pronounce it, ATBAOTCOF is the sound you'd probably make upon
accidentally straying into the ATBAOTCOF and being fined a maximum civil
penalty of $50,000.


Palindromic Encryption


Government or military acronyms aren't always so ridiculous. The word "radar,"
for instance, is a shining exception. An acronym for "Radio Detecting And
Ranging," radar is also a palindrome, a word that is spelled the same forward
and backward. My middle name, Reyer, is a palindrome. Madam is another.
Palindromes can also be phrases or sentences such as, quoting from Webster's,
Roma tibi subito motibus ibit amor.
A few months ago, while working on a book about file structures, I stumbled
across a data-encryption method used in Microsoft Windows Calendar files. I
call it the "palindrome encryption algorithm" because the technique produces
numeric palindromes from most input sequences. Unfortunately, there's no
corresponding recovery method. Once data is encrypted using this algorithm,
there is no way to restore the original information.
That may seem to make palindrome encryption as appealing as a bank that
accepts deposits but doesn't permit withdrawals. The algorithm may have
practical applications, though, in producing encrypted keys with built-in
redundancy checks. For example, a Calendar file is identified by the
palindrome signature bytes B5 A2 B0 B3 B3 B0 A2 B5, formed by adding the ASCII
values of the characters in the uppercase string CALENDAR to the lowercase
string radnelac, which is calendar spelled backwards.
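You can check that arithmetic directly. Here is a C++ transcription of the method (my own sketch; the article's implementation is the Pascal PALPAK.PAS in Listing One):

```cpp
#include <cctype>
#include <string>
#include <vector>

// Palindrome encryption: pair the i-th character (uppercased) with its
// mirror character (lowercased) and emit the sum of their ASCII codes.
std::vector<int> Encrypt(const std::string& s)
{
    std::vector<int> out;
    for (std::size_t i = 0, e = s.size(); i < e; ++i)
        out.push_back(std::toupper((unsigned char)s[i]) +
                      std::tolower((unsigned char)s[e - 1 - i]));
    return out;
}
```

Feeding it "calendar" reproduces the signature bytes B5 A2 B0 B3 B3 B0 A2 B5.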
Example 1, Algorithm #13, describes the palindrome encryption algorithm in
pseudocode. PALPAK.PAS (Listing One, page 144) implements the algorithm in
Pascal. Some input strings produce repetitious sequences of identical
values--evidence, I suppose, that a corresponding recovery method is unlikely
to be discovered. ABCD, for example, translates to the four hexadecimal bytes
A5 A5 A5 A5. Technically, that's a numeric palindrome, but it's not a
desirable outcome. Interestingly, however, preprocessing this and similarly
degenerate cases as palindromes--in other words, feeding the program
ABCDDCBA--produces a usable palindrome key. In fact, inputting palindromes
such as radar and madam always produces numeric palindromes. How strange!
Also interesting is the fact that any palindrome with an even number of
symbols can be compressed by 50 percent simply by throwing out the redundant
half. This suggests that palindrome encryption could double as a
data-compression method, but since there's no corresponding recovery
algorithm, this property has no apparent value. If anybody discovers a
practical use for Algorithm #13 or an adaptation, I'd like to know.


Blazing Text Compression


July's "Alien Text-File Compression" produced a basketful of mail along with
several interesting text-compression techniques. Chuck Guzis sent along his
text compressor that can squash files at blazing speed. Listings Two and Three
(page 144 and 145, respectively) implement Chuck's algorithm in C. Compile the
programs, then enter a command such as COMP5 IN.TXT OUT.TXT to compress file
IN.TXT to file OUT.TXT. Use DECOMP5 similarly to decompress a packed file.
Chuck explains how the code operates:
I based the algorithm on the old five-level Baudot TTY's that used five-bit
words and escape codes to shift between character sets. This inspired me to
use "short" bytes to compress eight-bit characters. Huffman encoding works
similarly, but instead of varying the number of bits according to a
character's frequency of occurrence, I use only two lengths--short (five-bit)
and long (eight-bit) codes. Five bits can store 32 values--enough room for 26
characters plus six punctuation marks or other symbols. Using this fact, the
algorithm packs three characters in a 16-bit word, leaving one bit as a flag.
Unencoded characters with values below 128 are stored literally. Characters
with values above 128 are stored with a zero prefix, which also indicates a
run of same value characters when followed by a value less than 128. Most
files compress by about 30 percent. Unpacking uses a simple table lookup, and
is very fast.
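The packing step Chuck describes looks something like this (the field layout here is my own choice for illustration, not necessarily the exact format in Listings Two and Three):

```cpp
#include <cstdint>

// Pack three 5-bit codes (values 0..31) into a 16-bit word;
// the remaining high bit flags the word as "packed."
std::uint16_t Pack3(int a, int b, int c)
{
    return std::uint16_t(0x8000 | (a << 10) | (b << 5) | c);
}

// Extract slot 0, 1, or 2 from a packed word.
int Unpack(std::uint16_t w, int slot)
{
    return (w >> (10 - 5 * slot)) & 0x1F;
}
```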


Short and Sweet


Chuck Walbourn contributed short, but sweet, ideas for compressing text
information containing numeric values. He writes:
A good way to compress text that contains numbers is to convert them into
binary integers or floats. This saves space, of course, but only if the
representation is smaller than the ASCII original. You also need to include an
escape code to mark encoded values. Another method is to use binary coded
decimal (BCD), replacing pairs of digits with bytes. For example, the string
"Abc123456" compresses to "Abc" 0x12 0x34 0x56, a total of six bytes instead
of nine.
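Chuck's BCD idea can be sketched as follows (a bare sketch: it packs digit pairs but omits the escape coding needed to locate them again on decompression):

```cpp
#include <cctype>
#include <string>

// Replace each pair of ASCII digits with one BCD byte: "12" -> 0x12.
std::string BcdPack(const std::string& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ) {
        if (i + 1 < in.size() && std::isdigit((unsigned char)in[i]) &&
            std::isdigit((unsigned char)in[i + 1])) {
            out += char(((in[i] - '0') << 4) | (in[i + 1] - '0'));
            i += 2;
        } else {
            out += in[i++];      // non-digits pass through unchanged
        }
    }
    return out;
}
```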
These techniques are examples of a simple but effective compression method
called "compact notation." You use this method every time you "compress" dates
such as January 15, 1994 to a shorthand numeric form 1-15-94. In a computer,
assuming 1900 for the century, any date can be further packed into a 16-bit
word using four bits for the month, five bits for the day, and seven bits for
the year. The interesting point about compact notation is that it compresses
data by searching for equivalent, but shorter, forms of symbols or other
values. Only the form of the information changes, not its values.
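A sketch of that date packing:

```cpp
#include <cstdint>

// Compact notation for dates: month in 4 bits, day in 5 bits, and
// year-minus-1900 in 7 bits (good through 2027).
std::uint16_t PackDate(int month, int day, int year)
{
    return std::uint16_t((month << 12) | (day << 7) | (year - 1900));
}
int Month(std::uint16_t d) { return (d >> 12) & 0x0F; }
int Day(std::uint16_t d)   { return (d >> 7)  & 0x1F; }
int Year(std::uint16_t d)  { return 1900 + (d & 0x7F); }
```

January 15, 1994, for instance, fits in the single word PackDate(1, 15, 1994).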


Zenon Revisited


July's opener, in which an alien from Zenon compressed an encyclopedia by
placing a mark on a rod, drew several comments. Russ Pillsbury writes, "I got
a real kick out of the marked rod [compression] method. At sub-electron
distances, you can insert nearly an infinite number of marks on the same rod.
Since any book or volume of books will generate a unique fraction, you can
probably place the entire works of humanity on that same stick!"
Sounds great, Russ. When you are finished marking a sample rod, be sure to
send me a review copy. First, however, you may want to listen to author Neil
Rubenking's ideas for gaining even better compression ratios (as if infinite
compression weren't enough). Neil writes:
There's a variation on your stick-marking compression that gets truly
excellent results when some characters are more common than others. You start
with an imaginary ruler that's marked off with a space for each character
proportional to the frequency of that character. Lay this ruler against your
stick to start. Take the first character in your input and find its range on
the ruler. Now reduce the ruler to the size of that range and move it so it
precisely fits on the stick. Find the second character on your ruler and
squish the ruler to fit that range. Keep going until you've handled all the
input characters. The nice thing is that the number of required significant
digits increases more slowly when you encode a more common character. Say you
just use A, B, and C, in proportion 2:1:1. Encoding an A reduces the active
range to 1/2; encoding a B or C reduces it to 1/4. You can also do this with
standard IEEE real numbers, jamming in characters until you've used up all
your significant digits and then starting a new real number. It works! The
hard part is knowing when to stop decoding the real numbers, but it can be
done.
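Neil's ruler-squishing is arithmetic coding in miniature. A toy encoder for his 2:1:1 A/B/C example (my sketch; real arithmetic coders renormalize as they go to avoid running out of precision):

```cpp
// Narrow the interval [0,1) once per input symbol, using the fixed
// ranges A: [0, .5), B: [.5, .75), C: [.75, 1).
double EncodeABC(const char* s)
{
    double lo = 0.0, hi = 1.0;
    for (; *s; ++s) {
        double w = hi - lo, nlo = lo, nhi = hi;
        switch (*s) {
        case 'A': nhi = lo + w * 0.50; break;
        case 'B': nlo = lo + w * 0.50; nhi = lo + w * 0.75; break;
        case 'C': nlo = lo + w * 0.75; break;
        }
        lo = nlo; hi = nhi;
    }
    return (lo + hi) / 2.0;   // any value in [lo, hi) identifies the input
}
```

Encoding an A halves the interval; a B or C quarters it, which is exactly the slower growth in significant digits that Neil describes for the more common symbol.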
This compression technique, which goes by the name "arithmetic encoding,"
reminds me of the supposedly true observation that every work--including all
of Shakespeare plus the text of every Superman comic ever published--is
encoded somewhere in the value of pi--assuming, that is, that pi's digits are
normally distributed (nonrepetition alone isn't quite enough). In fact, my
unfinished novel is also probably encoded
in pi, so I may as well not complete the thing since it has already been
"published." (And if you believe that, I know where you can rent a roomful of
monkeys to write your next computer program.) (For more information on
arithmetic encoding, see "Arithmetic Coding and Statistical Modeling" by Mark
Nelson, DDJ, February 1991.)


Mashed Dictionaries


Neil Rubenking also offered an interesting, and perhaps more practical, text
compression technique, which employs a variation on Chuck Guzis's method of
using short character codes:
I use a kind of differential compression to mash down the dictionary for my
anagram generator, NAMEGRAM. First, I reduce every input letter to a number
from 0 to 25 (the dictionary consists of capitalized words with no
punctuation). Words are limited to 31 characters. Each word is represented by
the number of letters it shares with the previous word, the number of
additional letters, and the codes for the letters themselves. So your
AARDVARK, AARDWOLF, AARONIC example would be encoded as 0 8 A A R D V A R K 4
4 W O L F 3 4 ONIC. This data stream is subject to additional compression by
one-third. Every number is less than 32, so 15 bits is enough to hold three
numbers. I simply mash three numbers into every two bytes, obtaining better
results than PKZIP. A one megabyte file, for example, mashes down to about
290K. The same zipped file is 330K! This just goes to show what you can do
when you have control over the data being compressed.
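The differential step of Neil's scheme can be sketched like this (front coding only; the 5-bit letter packing is omitted):

```cpp
#include <string>
#include <utility>

// Differential ("front") coding: each word is stored as the length of
// the prefix it shares with the previous word, plus the new suffix.
std::pair<int, std::string> DiffEncode(const std::string& prev,
                                       const std::string& word)
{
    std::size_t n = 0;
    while (n < prev.size() && n < word.size() && prev[n] == word[n])
        ++n;
    return std::make_pair(int(n), word.substr(n));
}
```

Run over AARDVARK, AARDWOLF, AARONIC it yields (0, "AARDVARK"), (4, "WOLF"), (3, "ONIC"), matching the stream in Neil's example.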


Your Turn


Write to me, make suggestions, complain about your taxes, or even better,
share a favorite algorithm. Send mail in care of DDJ, or send compressed text
files to my Compuserve ID, 73627,3241. When sending files, please tell me what
encryption or compression methods you use. Be forewarned: Undecipherable files
will be forwarded to the appropriate government agency for dumping in the
ATBAOTCOF.
Example 1:
input

 S: String;
var
 I, E, C: Integer;
begin
 I <- 1;
 E <- Length(S);
 while (I <= Length(S)) do
 begin
 C <- Uppercase(S[I])
 + Lowercase(S[E]);
 Write C to output;
 I <- I + 1;
 E <- E - 1;
 end;
end;
_ALGORITHM ALLEY_
by Tom Swan


[LISTING ONE]

(* -------------------------------------------------------------------------
** palpak.pas -- Algorithm #13: Palindrome Encryption
** -------------------------------------------------------------------------
** Demonstrates a data encryption technique for producing keys or signatures.
** Microsoft Calendar files, for example, begin with signature bytes formed
** by applying the algorithm to the word Calendar. There is no known
** recovery method for restoring text encrypted using this method.
** Copyright (c) 1993 by Tom Swan. All rights reserved.
** ------------------------------------------------------------------------- *)

program PalPak;
var S1, S2: String;

function Uch(C: Char): Integer;
begin
 if (C in ['a' .. 'z']) then
 Uch := Ord(C) - 32
 else
 Uch := Ord(C)
end;
function Lch(C: Char): Integer;
begin
 if (C in ['A' .. 'Z']) then
 Lch := Ord(C) + 32
 else
 Lch := Ord(C)
end;
procedure Encrypt(S: String);
var I, E, C: Integer;
begin
 I := 1;
 E := Length(S);
 while (I <= Length(S)) do
 begin
 C := Uch(S[I]) + Lch(S[E]);
 Write(C, ' ');
 I := I + 1;
 E := E - 1

 end
end;
var S: String;
begin
 repeat
 Write('Enter a string: ');
 Readln(S);
 if Length(S) > 0 then
 begin
 Write('Encrypted string: ');
 Encrypt(S);
 Writeln
 end
 until Length(S) = 0;
end.

[LISTING TWO]

/* ----------------------------------------------------------------------- *\
** comp5.cpp -- Compress an ASCII text file.                               **
** - A fast, efficient text compressor that can achieve up to about 33    **
**   percent reduction on most text files.                                 **
** - Lowercase characters are compressed 3 per word. Since there are 32    **
**   values in 5 bits, the program can pack the basic 26-character         **
**   alphabet plus 6 special characters: <space>, comma, period,           **
**   semicolon, hyphen, and quote.                                         **
** - If a value of 00 is detected, the next character is taken as a        **
**   literal representation if it has a value in excess of 127 (0x80);     **
**   otherwise, the value represents a repeated-character count of the     **
**   next char.                                                            **
** Copyright (c) 1993 by Chuck Guzis. All rights reserved.                 **
\* ----------------------------------------------------------------------- */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>

#define LINE_MAX 128 // Length of the longest line
#define DUPE_THRESHOLD 4 // Smallest duplicated byte count
typedef unsigned char UCHAR; // Define a UCHAR
typedef unsigned int UINT; // Define a UINT
long Comp_Count; // How many bytes of compressed text?
long Orig_Count; // How many bytes of original text?
UCHAR
 lbi[ LINE_MAX+1], // Input line buffer
 lbo[ LINE_MAX+1]; // Output line buffer
FILE *in; // Input file
FILE *out; // Output file

// Our 5-bit alphabet.
unsigned char Alphabet[] = "abcdefghijklmnopqrstuvwxyz ,.;- ";

// An array for testing for the above.
UCHAR AlphaMap[256]; // One byte per ASCII value

// Function prototypes.
void main(int, char **); // Main function
void Error(char *, char *); // Error routine
void Compress(void); // Compressor
int DupeCheck(UCHAR *); // Scan for duplicate characters


void main( int argc, char *argv[])
{
 UCHAR
 *ch, // Pointer and short int used to
 i; // build AlphaMap.
 in = NULL;
 out = NULL;
 Comp_Count = 0;
 Orig_Count = 0;

// Open the files, give an error if problems arise.
 if ( argc != 3)
 Error( "Command form is - COMP5 <in-file> <out-file>\n", NULL);
 if ( !(in = fopen( argv[1], "r")) )
 Error( "Can\'t open %s.\n", argv[1]);
 if ( !(out = fopen( argv[2], "wb")) )
 Error( "Can\'t create %s.\n", argv[2]);

// Construct the AlphaMap array. This is done purely for speed, as
// the memchr() function could be used to search the string.
 memset( AlphaMap, 0, sizeof( AlphaMap)); // Clear it to 0
 for ( ch = Alphabet, i = 1; *ch; ch++, i++)
 AlphaMap[ *ch] = (UCHAR) i;
// Read the input file, compress and encode it.
 while( fgets( (char *)lbi, sizeof( lbi)-1, in))
 { // Read until eof
 Orig_Count += (strlen( (char *)lbi)); // Total original UCHARs
 Compress();
 } // Digest the file
// Show some summary data.
 printf( "\n\n"
 "\tOriginal text size:\t\t%ld\n"
 "\tCompressed text length:\t\t%ld\n"
 "\tSavings:\t\t\t%ld\n",
 Orig_Count, Comp_Count, Orig_Count-Comp_Count);
 fclose(in);
 fclose( out);
 exit(0);
} // End of main

// Compress - Compress a line and do repeated byte encoding.
// This is a two-pass operation. Internally, we store a flag in lbi of
// 0x8000 + count. The next byte in lbi[] may contain the repeated byte.
void Compress( void)
{
 register UINT k;
 UCHAR
 *ch1, // Source pointer
 *ch2; // Destination pointer
 for ( ch1 = lbi, ch2 = lbo; *ch1;)
 { // Scan the line
 if ( (k = DupeCheck( ch1)) > DUPE_THRESHOLD )
 { // Compression is okay
 *ch2++ = 0;
 *ch2++ = (UCHAR) k;
 *ch2++ = *ch1; // Store the repeated string
 ch1 += k;
 }

 else
 { // See about characters > 127 -- quote them
 if ( *ch1 > 127)
 { // If needs quoting
 *ch2++ = 0;
 *ch2++ = *ch1++;
 }
 else
 {
// See if there are three consecutive symbols that reside in our 5-bit
// alphabet. Note that an end-of-line will fail the test automatically, so
// it doesn't require checking.
 if ( AlphaMap[ *ch1] &&
 AlphaMap[ *(ch1+1)] &&
 AlphaMap[ *(ch1+2)])
 { // Bingo--got all three
 k = 0x8000 | ((AlphaMap[ *ch1]-1) << 10) |
 ((AlphaMap[ *(ch1+1)]-1) << 5) |
 (AlphaMap[ *(ch1+2)] -1);
 *ch2++ = (UCHAR) (k >> 8); // Store first byte
 *ch2++ = (UCHAR) (k & 255); // Store second byte
 ch1 += 3; // Advance
 }
 else
 *ch2++ = *ch1++; // Everything else
 } // If character below 128
 } // If not compressible string
 } // Scan the line
 Comp_Count += (ch2 - lbo); // Update compressed count
 fwrite((char *)lbo, (size_t)(ch2 - lbo), sizeof( UCHAR),
 out); // Dump the line
 return;
} // Compress

// DupeCheck - Return the number of duplicated characters.
// Scans to end of line. Always returns at least 1.
int DupeCheck( UCHAR *what)
{
 UCHAR cref; // Reference character
 int k; // Induction variable
 for ( cref = *what++, k = 1; *what; what++, k++)
 if ( cref != *what)
 break; // Just scan for same character
 return k;
} // DupeCheck

// Error - Give an error and a message, then exit.
// A string argument may be given, if desired.
void Error( char *msg, char *str)
{
 if ( in)
 fclose(in); // Close files if still open
 if ( out)
 fclose( out);
 if ( str)
 fprintf( stderr, msg, str);
 else
 fprintf( stderr, msg);
 exit(1);

} // Error


[LISTING THREE]

/* ----------------------------------------------------------- *\
** decomp5.cpp -- Decompress file compressed with comp5. **
** Copyright (c) 1993 by Chuck Guzis. All rights reserved. **
\* ----------------------------------------------------------- */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>

#define LINE_MAX 128 // Length of the longest line
typedef unsigned char UCHAR; // Define a UCHAR
typedef unsigned int UINT; // Define a UINT
FILE *in; // Input file
FILE *out; // Output file

// Our 5-bit alphabet.
char Alphabet[] = "abcdefghijklmnopqrstuvwxyz ,.;- ";

// Function prototypes.
void main( int, char **); // Main function
void Error( char *, char *); // Error routine
void Decompress( void); // Decompress
void main( int argc, char *argv[])
{
 in = NULL;
 out = NULL;
// Open the files, give an error if problems arise.
 if ( argc != 3)
 Error( "Command form is - DECOMP5 <in-file> <out-file>\n", NULL);
 if ( !(in = fopen( argv[1], "rb")) )
 Error( "Can\'t open %s.\n", argv[1]);
 if ( !(out = fopen( argv[2], "w")) )
 Error( "Can\'t create %s.\n", argv[2]);
// Read the input file, decode it.
 Decompress();
 fclose(in);
 fclose( out);
 exit(0);
} // End of main

// Decompress - Read and decompress the data.
// This routine does all of the work. Single-character I/O is done
// in the interest of simplicity.
void Decompress( void)
{
 UCHAR ch; // Current character
 UINT k; // Scratch word cell
 while( 1)
 { // Read and decode
 ch = (UCHAR) fgetc( in);
 if ( feof( in))
 break; // Stop at end of input
 if ( ch & 0x80)
 { // Special flagging, packed 5 bits
 k = (ch << 8);
 ch = (UCHAR) fgetc( in);
 k |= ch; // Form a word
 fputc( Alphabet[ (k >> 10) & 31], out);
 fputc( Alphabet[ (k >> 5) & 31], out);
 fputc( Alphabet[ k & 31], out); // Unpack all three
 }
 else
 {
 if ( ch == 0)
 { // Escape or repeated byte
 ch = (UCHAR) fgetc( in);
 if ( ch < 128)
 { // Repeated byte
 k = ch;
 ch = (UCHAR) fgetc( in); // Get the actual byte
 while (k--)
 fputc( ch, out); // Write the repeated byte
 }
 else
 fputc( ch, out); // Not repeated, but just quoted
 }
 else
 fputc( ch, out); // Just an ordinary character
 } // If not flagged
 } // Read the input
 return;
} // Decompress

// Error - Give an error and a message, then exit.
// A string argument may be given, if desired.
void Error( char *msg, char *str)
{
 if ( in)
 fclose(in); // Close files if still open
 if ( out)
 fclose( out);
 if ( str)
 fprintf( stderr, msg, str);
 else
 fprintf( stderr, msg);
 exit(1);
} // Error



November, 1993
UNDOCUMENTED CORNER


Novell's NetWare Core Protocol




by Pawel Szczerbina


Pawel has been programming networks for the past five years. He lives in
southern Sweden and works as an independent developer and consultant
specializing in the NetWare environment. You can reach him on CompuServe at
72133,2232 or by phone at +46-40116759.




Introduction




by Andrew Schulman


Since launching the "Undocumented Corner" in March 1993, I've focused entirely
on Microsoft. Since Microsoft is the major force in PC software, that fixation
is appropriate. Some might even charge that Microsoft uses its undocumented
interfaces to shore up a near-monopoly position. If it weren't for all the
undocumented interfaces used by so many popular applications, almost anyone
could create clones of DOS and Windows and wade into the Microsoft-dominated
PC operating-systems market.
However, Microsoft isn't the only company that holds back information vital to
third-party developers. Novell, Microsoft's chief competitor for PC operating
systems and an apparently active supporter of the Federal Trade Commission's
three-year investigation of Microsoft trade practices, has among network
developers a reputation for failing to document vital interfaces in its
NetWare operating system. In fact, practically everyone I've spoken to on this
subject says the same thing: "Novell? When it comes to undocumented
interfaces, they're even worse than Microsoft."
It is odd, therefore, for Novell to complain of unfair treatment at the hands
of Microsoft. In the arena it dominates, and almost monopolizes--network
operating systems--Novell appears to behave in much the same way as Microsoft
does in its home turf. Thus, Novell is what the law sometimes calls in pari
delicto or "in equal fault."
A chief example is the NetWare Core Protocol (NCP), the subject Pawel
Szczerbina tackles this month. NCP is the packet interface that the NetWare
workstation shell, such as NETX.EXE, uses to send requests to a NetWare file
server. NCP is similar to the Server Message Block (SMB) interface used in
PC-NET, LAN Manager, and various other unsuccessful Microsoft networking
products, except that Microsoft, to its credit, has made the SMB specification
available (on CompuServe, type GO MSNETWORKS and download the 250K file
SMBSPE.ZIP which, however, has not been updated since November 1990).
Novell claims that NCP is not undocumented but rather "proprietary." In fact,
Novell last year began licensing NCP to third parties. A July 6, 1992 Novell
press release claims that "NCPs are a documented set of procedures used by
NetWare to accept and respond to requests by workstations on the network.
Under the new program, licensees can develop NCP-compatible Service Requesters
and/or Service Providers."
However, the price appears to range from $15,000 to $30,000. In the words of
Rob Young, director of marketing for Novell's NetWare Systems Group, "Some
people make a connection between openness and free, and we don't necessarily
think those things go hand in hand. We didn't do this to boost revenues; the
fee is for support and certification" (see "Novell Gives Green Light to
Licensing NetWare Core," PC Week, July 13, 1992). Yeah, right. If the price
isn't a stumbling block for you, contact the NetWare Technology Licensing
Program (800-733-9673).
Information on NCP is also scattered in a number of books, including Carl
Malamud's Analyzing Novell Networks (VNR, 1992), Mark Miller's LAN Protocol
Handbook (M&T Books, 1990), and even in Novell's Guide to NetWare LAN Analysis
(Sybex, 1993) by Laura Chappell, a Novell employee. Chappell explains that
"Novell does not publish details regarding its NCPs because the information is
'Novell Confidential.' The NCP codes in this chapter, however, were taken from
the LANalyzer for NetWare screens, which provide clear, accurate decodes of
all NetWare NCP functions and subfunctions."
Indeed, LAN protocol analyzers are the major source of information on NCP. In
addition to Novell's own LANalyzer, other NCP-aware analyzers include The
Sniffer from Network General (Menlo Park, California) and The Snooper from
General Software (Redmond, Washington). The Snooper comes with complete C
source code, including Steve Jones's NCP.C.
Related to NCP is another piece of undocumented NetWare, the so-called F2
interface. The NetWare workstation shell extends INT 21h to provide many
NetWare-specific functions. While many of these extensions are documented by
Novell, some are not, including INT 21h AH=F2h, which can be used to issue
"raw" NCPs to the file server. Novell only documents a tiny subset of this,
INT 21h AX=F244h, to erase files. If you examine Table 4 of Pawel's article,
you will note that 44h is the NCP Erase Files function. In other words, Novell
officially documents only one NCP function--Table 4 lists over 300! The F2h
interface is a good way to issue raw NCPs to the server, and Novell recommends
it as the primary way to do this for certain tasks. Also, there is a file on
CompuServe (GO NOVLIB) called SC3X0n.EXE (where n is a number that increments
as they update it) that shows how to use certain "undocumented" NCPs on
NetWare 3.11. Most of these code samples use the F2h interface. However, all
of the samples bear a big (yet ridiculous) disclaimer that says: "This
software is considered pre-release and may be used at your own risk and has
been provided due to the many requests of our customers. Support for this
module will be provided at the sole discretion of Novell, Inc." Finally,
there's a new Client API for Assembly (now in beta) which covers more F2h
calls, though it doesn't go into detail about how F2h calls translate into NCP
transactions on the wire.
Pawel's article is a first look at the subject of undocumented NetWare. The
second edition of Undocumented DOS, which I am just finishing now (and which
should be in bookstores in November or December 1993), will have a new section
on NetWare, including details on the NETX shell and the F2 interface. Even
better, Tim Farley is actively working on a book, Undocumented NetWare, to be
published in early 1994 as part of a series I edit for Addison-Wesley. Tim
contributed major additions to Pawel's article. Interestingly, the idea for
this book first came from an engineer at Microsoft. With Microsoft's
accusations against Novell, and Novell's accusations against Microsoft, the
safest thing is probably to assume that they're both telling the truth. Our
friend at Microsoft says of NCP, "who knows how many inventions would be
enabled by publishing this."
To really use any of the information Pawel provides here, you need to download
his NCPTEST source code (see "Availability," page 3).
Right now it looks like future "Undocumented Corners" will look at low-level
aspects of Windows, such as undocumented DPMI, the Virtual Machine Control
Block structure, Instance Data lists, and so on. Another possibility is the
fascinating topic of undocumented NT, though I find it difficult to be
enthusiastic about an operating system that requires 16 Mbytes of memory.
Please send your comments and suggestions to me on CompuServe at 76320,302
(that's 76320.302@compuserve.com from the Internet).
The NetWare Core Protocol (NCP) is a packet format that client workstations on
the network use to communicate with a NetWare server. NCP provides services
such as file manipulation, message sending, transaction processing, printing,
and the like.
In a DOS-based workstation, the NetWare shell (NETX.EXE, NET3.EXE, VLM.EXE,
and so on) is responsible for establishing and maintaining an NCP session with
a file server. The NetWare shell also provides a DOS-like interface for
applications to use; higher-level APIs, like the NetWare C Interface, use this
low-level interface.
Requests originating from applications, regardless of the API used, are
eventually converted by the NetWare shell to NCP packets and sent to the
server. Responses from the server are converted back to the format of whatever
API the application is using. Some of the functions specified in the different
interfaces may end up sending several NCP packets. There are also several
calls that access local data within the workstation shell and consequently
never generate any requests to the server. But, in general, almost everything
in NetWare rests upon NCP.
NCP isn't just for communication between users and file servers. Print servers
also rely on NCP, as can any NetWare object type. Regardless of the physical
implementation of the print server (VAP, NLM, EXE, or a standalone device), at
some point it has to use NCP to access the NetWare queue management system;
that is, check for data present, read queue data, and so on. It can then
distribute the print data to the printers (in case they are not directly
connected to the print server itself) using any protocol it likes. Novell's
own PSERVER is a good example: It uses NCP to log in to the server and access
queues, and uses a separate RPRINTER/NPRINTER protocol to communicate with
remote printers.
In the OSI model, NCP covers the transport and session layers, and is
conceptually similar to the SMB protocol in Microsoft's LAN Manager. You can
think of NCP as an operating-system interface or collection of remote
procedure calls. The difference, compared to a traditional operating-system
interface, is that here, the OS itself resides on a different physical
machine--the NetWare server. NCP is a client/server or distributed interface.
NCP uses Novell's Internetwork Packet Exchange (IPX) as the underlying
network-level protocol. IPX is a connectionless (datagram) transport service
that does not guarantee packet delivery. The session control and packet error
checking are performed by NCP.
Now, you may ask: Why should I care about the undocumented NetWare Core
Protocol? Isn't programming using published and documented Novell APIs, like
the NetWare C Interface, the recommended way to develop software in this
environment?
It certainly is, as this makes your programs portable to different client
platforms and frees you from having to write your own NCP requester.
Nevertheless, there are several reasons why NetWare developers have long
wanted information on NCP.
Understanding what happens between the NetWare shell and a file server may
help you to more quickly locate problems, especially in complex network
environments involving several pieces of software. Once you know what NCP
traffic is produced by the standard NetWare API calls, you may find other and
more efficient ways to accomplish the same tasks. Also, some software, such as
programs that boot the workstation from the file server, can't be implemented
without knowing how NCP works.
If you want to understand NCP, you'll need some kind of network traffic
analyzer to see what happens on the wire. LAN analyzers such as The Snooper,
The Sniffer, or NetWare's LANalyzer are essential tools for developing
software in networked environments.
This article explains the basic principles behind NCP. Recent additions to NCP
like packet signing, Burst Mode, and Large Internet Packets aren't covered.
NCPTEST.C, a sample program that shows how to access services of a NetWare
file server is available electronically (see "Availability," page 3). NCPTEST
requires that IPX be loaded, but does not require Novell's NetWare shell. The
NetWare shell typically occupies 30--50 Kbytes, so in some specialized
applications NCP programming might even be a good low-memory alternative to
using the shell. If you compile NCPTEST using Borland C++ 3.1, use command
line: bcc -ml -a -K ncptest.c ipxint.c ncpint.c. (The appropriate memory model
is very important.) Since I use Borland's screen I/O library, NCPTEST isn't
directly portable, and you'll have to make changes before compiling it with
another compiler. NCPTEST requires a NetWare 2.1x or higher server.
NCPTEST finds the nearest NetWare server, and sets up three separate
connections to it, all under the same user ID and password. This is normally
impossible under the regular DOS client software for NetWare, so it's a good
example of something you can do by going directly down to NCP and ignoring the
shell. The program also displays the state as it builds up the connection,
then sits in an "idle" state until you press any key to kill each connection.
Before you kill the program, go to another station that's logged into the same
server, and execute the Novell command USERLIST /A. This shows all the users
logged in, and their addresses. You'll see three separate connections being
taken up by the user specified on the NCPTEST command line, but all of them
have the same net address. This is normally impossible in NetWare--you can log
in more than once, but not from the same machine.


Client-server Communications


NCP is a half-duplex, request/response protocol. The client (generally
NETX.EXE or some other NetWare workstation shell) is the active side: It
initiates the communication process by sending a request. The server
(generally the NetWare file server) processes the request and responds with a
reply. The client then processes the reply and possibly issues a new request.
For example, the client may issue a request to open a file; the server would
then send a response indicating whether the file could be opened. The client
can have only one request (packet) outstanding at any time. The client must
wait for a response before sending the next request (this is not true for
Burst Mode NCP).
Since NCP uses an unreliable network protocol for delivery, it must ensure
that the packets are delivered in sequence and without duplication. The client
accomplishes this by assigning a sequence number to each request. The server
is expected to respond with a packet containing the same sequence number. If
it does, the client increments the sequence number by one and transmits the
next request. Responses having incorrect sequence numbers are discarded by the
client. It is legal for a client to resend a request to which the server
already replied, that is to resubmit the latest request. In this case, the
server assumes that the reply got lost on its way to the client and reexecutes
the request.
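The accept-or-discard rule above is small enough to write down. The function name check_reply is hypothetical; this captures only the sequence-number discipline, not the actual packet I/O:

```c
enum ncp_action { NCP_DISCARD, NCP_ACCEPT };

/* A reply must echo the outstanding request's sequence number;
   anything else is silently discarded.  On a match, the client
   bumps the sequence number for the next request. */
enum ncp_action check_reply(unsigned char *seq, unsigned char reply_seq)
{
    if (reply_seq != *seq)
        return NCP_DISCARD;   /* stale or mismatched reply */
    (*seq)++;                 /* next request, next number */
    return NCP_ACCEPT;
}
```

Note that resubmitting the latest request reuses the current sequence number, which is why the server can recognize it as a duplicate.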
If the server does not respond at all to a request, the client resubmits the
same request a number of times before assuming that the server is unreachable.
This is necessary because of the unreliable underlying transport protocol and
media. Because of a busy network, packets may never get sent. They may get
dropped by a router, malformed on their way to the destination, and so on.
It's up to the client to choose the number of, and time between, retries. For
instance, the DOS client, NETX, has a number of parameters in its INI file
(NET.CFG) which control the client's NCP timeout behavior, including NCP SET
TIMEOUT= and NCP TIMEOUT MULTIPLIER=. Then, if the server can't immediately
respond because it is busy, it will send a special packet (called "Positive
Acknowledge") to the client. This indicates that the request is being
processed; the client should continue waiting for the reply. Such a packet has
the same sequence number as the request that caused it. The server first sends
this packet when it receives a duplicate request (because the client timed
out) while it is still processing the original. (Novell calls this "Positive
Acknowledge" packet "Request Being Processed," at least in LANalyzer traces.)

The server also needs to know if a client is still alive. If the client stops
sending requests without properly terminating the connection, the server must
know whether the person operating the workstation temporarily stopped using it
or if the station has been turned off. Since each workstation occupies at
least one user connection at the server, there's a possibility that a number
of workstations could unnecessarily take up connections that could be made
available for others to use. After a period of workstation inactivity, the
server sends a watchdog packet to the client, asking if the client is still
active. If the client doesn't respond, the server will send a number of
watchdog packets, at regular intervals. Should none of the watchdog packets
produce a reply from the client, the server will break the corresponding
connection and make it available for other workstations.
Except for the information exchanged on the watchdog socket, NCP uses no
special system packets devoted to keeping the session alive, confirming
successful arrival of data packets, and the like. This makes NCP an efficient
protocol. However, the request/response nature of the protocol makes it slow
on WAN links where the time between a request and a response can be
considerable. To remedy this, Novell has introduced Burst Mode NCP which makes
NCP a windowed protocol, able to send several packets in one direction before
requiring a reply. Burst Mode is currently used only for reading and writing
files.


NCP Packet Formats


Regardless of its type, an NCP packet occupies the data portion of an IPX
packet. All NCP packets, except for the Burst Mode NCP, have a structure like
that in Tables 1 and 2.
A request packet starts with a 16-bit Packet Type field, which indicates the
request type. Request types are listed in Table 3. Of the six request types
shown, the three most important are type 0x1111 (Allocate Slot Request), used
when creating a connection; type 0x2222 (Request), used for ordinary NCP
requests; and type 0x5555 (Deallocate Slot Request), used when terminating a
connection. (Novell LANalyzer's terminology for 0x1111 and 0x5555 is "Create
Service Connection" and "Destroy Service Connection.")
The client sets and increments the Packet Sequence Number field, as explained
earlier.
The Client Task Number in Table 1 identifies the client task that issued the
request. Each connection may have several tasks (programs) executing on the
same connection. The server tracks resources like semaphores, file locks, and
so on, for each of the client's tasks. When the client issues an End-of-Job
command, the server releases resources belonging to that task.
Together, the two Server Connection Number fields constitute a 16-bit
connection number, assigned by the server when creating a connection, and
later exchanged between the client and the server in all requests and replies.
Until recently, the connection number high byte was always 0 in all
commercially-available versions of NetWare, none of which allow more than 250
users to be on the server at once and therefore never need a connection number
over 250. However, Novell has been for years selling to special customers a
1000-user version of NetWare 3.11. Unfortunately, Novell didn't publicly
document how the 1000-user API differs from the older APIs which use a single
byte for a connection number. This is being resolved this year, with the
retail version of NetWare 4.0, which supports 500 and 1000 users.
The Function Code in Table 1 indicates which NCP function is to be executed by
this request (see Table 4, page 128). This field may be followed either
directly by the Request Data field, or by the subfunction length and code.
The Subfunction Length, if present, indicates the length of the following
data, excluding this field. If this field is present, the Subfunction Code
field is also present. Some commands, such as the Transaction Tracking System
(TTS) and semaphore handling functions, lack the length field. In this case,
the subfunction code directly follows the function code.
The Request Data field in Table 1 carries zero or more bytes of
function/subfunction dependent data.
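The request layout just walked through can be written as a C struct. The field order follows the article's description of Table 1, but since the table itself isn't reproduced here, treat the exact byte layout as an assumption; the 16-bit packet type is kept as two bytes to sidestep byte-order questions.

```c
#include <stdint.h>
#include <string.h>

/* NCP request header, per the field-by-field description above.
   Every field is a single byte, so there are no padding or
   byte-order concerns. */
struct ncp_request_header {
    uint8_t packet_type[2];  /* 0x22 0x22 for an ordinary request     */
    uint8_t sequence;        /* client-assigned, echoed by the server */
    uint8_t connection_low;  /* low byte of server connection number  */
    uint8_t task;            /* client task number                    */
    uint8_t connection_high; /* high byte; 0 on <=250-user servers    */
    uint8_t function;        /* NCP function code (Table 4)           */
    /* followed by the optional subfunction length and code,
       then zero or more bytes of request data                        */
};

static void make_request(struct ncp_request_header *h, uint8_t seq,
                         uint16_t conn, uint8_t task, uint8_t func)
{
    memset(h, 0, sizeof *h);
    h->packet_type[0] = 0x22;
    h->packet_type[1] = 0x22;
    h->sequence        = seq;
    h->connection_low  = (uint8_t)(conn & 0xFF);
    h->connection_high = (uint8_t)(conn >> 8);
    h->task            = task;
    h->function        = func;
}
```

For example, make_request(&h, 1, 5, 1, 0x44) would frame the Erase Files function (0x44) mentioned in the introduction, on connection 5.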
A reply packet also starts with a 16-bit Packet Type field (see Table 2). The
allowable values are 0x3333 (Reply), used for the normal NCP replies, and
0x9999 (Positive Acknowledge), used to indicate that a server is busy
processing a request.
The reply Sequence Number carries the sequence number of the corresponding
request.
The reply Completion Code field (Table 2) indicates whether or not the request
executed successfully. A 0 indicates success; any other value usually
indicates an error. Novell's NetWare System Calls for DOS lists error codes.
The same completion code can take on different meanings, depending on which
function was executed.
The Connection Status field indicates the status of the connection itself. The
lower four bits should be interpreted as a value between 0 and 15: 0 means that
the connection is alive; any other value indicates that a connection is in
error and is no longer valid. Bit 4 set indicates that the server is down. Bit
6 set means the workstation has a message waiting at the server.
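The status bits just described reduce to three one-line tests (the macro names are mine, not Novell's):

```c
/* Decoding the Connection Status byte as described above. */
#define NCP_CONN_OK(s)      (((s) & 0x0F) == 0)  /* low 4 bits: 0 = alive  */
#define NCP_SERVER_DOWN(s)  (((s) & 0x10) != 0)  /* bit 4: server is down  */
#define NCP_MSG_WAITING(s)  (((s) & 0x40) != 0)  /* bit 6: message waiting */
```

Note that a status byte of 0x40 still describes a live connection; only the low four bits and bit 4 signal trouble.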


Creating an NCP Connection


The NCP client is responsible for creating a connection with the server. A
workstation may have several simultaneous NCP connections open to the same
server or to different servers. Each NCP connection is equivalent to a user
connection at the file server. A full connection address consists of the
4-byte network number, 6-byte physical node number, and 2-byte IPX socket
number. A client needs to open three IPX sockets for every NCP connection it
wishes to maintain. New additions to NCP, like the Burst Mode and Large
Internet Packets, require additional sockets.
The client's socket numbers must be consecutive: If the first socket is
0x4003, the second and third should be 0x4004 and 0x4005. Each socket can be
considered a communication channel, independent of other sockets, on which
conversation with the server takes place.
The first socket is used to exchange NCP requests and replies. The destination
socket at the file server is always 0x0451. NCP's IPX packet type is 0x11. The
second socket is a watchdog socket, used by the file server to determine
whether the client is alive. The third socket is a message socket that
receives message notifications from the server.
Before the client can create a connection to a server, it must know the server
address; that is, its network, node and socket numbers. Given a server name,
one can use the Service Advertising Protocol (SAP) to obtain the server
address. For an example, see the SAPGetNearestServer() function in NCPINT.C
(available electronically).
Once the client knows a server address, it uses an NCP Allocate Slot Request
packet to establish the connection. The client must set the packet type to
0x1111. Also, it should fill the connection bytes with 0xFF to indicate that
the client does not yet have a connection. The sequence number and function
number should be set to 0. The task number may be assigned any value.
The server will respond with a packet type 0x3333. The client should examine
the completion code and connection status for possible errors before
proceeding. For example, the server may be out of user connections. Provided
that the request is successful, the connection bytes in the reply packet will
contain the file server connection number the client has been assigned. This
number, which does not change for the lifetime of the connection, must be
supplied in all subsequent requests to the server until the connection is
terminated. The function NCPAttachToFileServerWithAddress() in NCPINT.C
demonstrates the connection process.
After creating the connection, you should also negotiate the maximum NCP
packet size that can be used. The two sides inform each other of their
respective maximum receive buffer sizes. The smaller of the two will be used
by both the client and the server. The value returned by a NetWare server is
1024 if there are no routers between the server and the client, otherwise it
is 512.
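The outcome of that negotiation is simply the minimum of the two advertised sizes; a trivial sketch:

```c
#include <assert.h>
#include <stdint.h>

/* Both sides advertise their maximum receive buffer size and then use
   the smaller of the two for all subsequent NCP traffic. */
uint16_t negotiated_packet_size(uint16_t client_max, uint16_t server_max)
{
    return client_max < server_max ? client_max : server_max;
}
```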
The procedure described above is basically what happens behind the scenes
whenever a NetWare shell loads, or a program issues the NetWare
AttachToFileServer call. At this point, if everything worked as expected, the
client has created a connection at the server. The client still needs to log
in, however, because a connection by itself provides only very limited access
to the server's resources. You can use the NetWare FCONSOLE
utility to verify that a connection has been created. On a 3.x server you'll
see a string "NOT-LOGGED-IN" as the user name; this indicates that a
connection exists, but no one has yet logged in. No indication is given on 2.x
servers.


Logging In


Any NetWare object can log in to a file server. The most common object is the
user object (0x0001), but other types of objects (such as job and database
servers) also frequently log in.
There are two ways to log in to a server: using unencrypted or encrypted
passwords. Unencrypted passwords are visible on the network and present a
security threat. Encrypted passwords, supported by NetWare 2.15c and higher,
are encrypted at the workstation and not visible as plain text on the network.
Briefly, encryption of the passwords works as follows: The client requests an
encryption key from the server (NCP request 0x17 0x17; see Table 4). It then
asks the user for the plain password and encrypts it using the key and the
object's 4-byte ID number obtained using the GetBinderyObjectID request (0x17
0x35). The encrypted password is then sent to the file server for
verification. For each login attempt, the client needs to request a new
(different) encryption key from the server.
NCPTEST.C and NCPINT.C (available electronically, see page 3) use encrypted
passwords. That part of the source code is a direct translation of a Pascal
program written by Barry Nance. (See "Automatic NetWare Log-ins: How to Log
into NetWare from Your Programs without User Interaction," Byte, March 1993.
The program is available as ELOGON.ZIP in the BPASCAL forum on CompuServe, and
through other sources. According to Nance's article, he got the algorithm from
Terje Mathesen's NETWARE.PAS code on BIX.)


NCP Conversation


Once logged in to a server, the client has access to its resources as
specified by the client's privilege level. To use these resources, you need to
know the format of the NCP functions you plan to use. Table 4 lists over 300
NCP function and subfunction codes. It isn't a complete reference, but rather
a bare listing of the most used calls. To use any of these calls, you'll need
to know the call's specific packet format. Some sample formats are provided in
NCPINT.C (available electronically).
Often, the exact format of a request and reply can be inferred from Novell's
NetWare System Calls for DOS. The request buffers described in that document
are to be appended directly after the NCP request header, starting either at
the Subfunction Length or at the Request Data field, depending on the type of
function. The reply buffers, without the reply length field, normally start at
the Reply Data field in the NCP reply packet.
Also, a careful study of the source code and header files provided with the
NetWare C Interface helps to understand NCP packet formats. Some NCP
functions, however, are not directly accessible through standard APIs and
consequently are not described in easily available documents. One way to learn
about the format of such packets is to use a protocol analyzer capable of
decoding NCP. Another, if you can afford it, is to license NCP documentation
directly from Novell.


Closing the Connection


When the client no longer needs a connection, it should free it, thus making
its corresponding server connection available for others to use. To accomplish
that, the client sends an NCP Deallocate Slot Request to the server. (According
to LANalyzer, "NCP Deallocate Slot" is "Destroy Service Connection.") Such a
request looks like an ordinary NCP request, except for the Packet Type field,
which should be set to 0x5555 (see Table 3). The Function Code field should be
set to 0; no subfunction code is present. Upon successful execution of this
request, the connection is invalid.
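Following Tables 1 and 3, such a request might be assembled as below. The helper is a sketch, not Novell code.

```c
#include <assert.h>
#include <stdint.h>

/* Sketch: fill a 7-byte NCP Deallocate Slot Request (packet type 0x5555,
   function code 0, no subfunction) per Tables 1 and 3. conn_lo/conn_hi
   are the connection number bytes assigned at Allocate Slot time. */
void build_deallocate_slot(uint8_t buf[7], uint8_t conn_lo, uint8_t conn_hi,
                           uint8_t sequence, uint8_t task)
{
    buf[0] = 0x55;      /* packet type, high byte */
    buf[1] = 0x55;      /* packet type, low byte  */
    buf[2] = sequence;
    buf[3] = conn_lo;
    buf[4] = task;
    buf[5] = conn_hi;
    buf[6] = 0x00;      /* function code 0 */
}
```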
The format of the watchdog request and reply, mentioned earlier, differs from
that of normal NCP packets. The file server socket from which these packets
originate is dynamic, often 0x4001. The IPX packet type varies, although it
usually is 0x11 or 0x00. The NetWare 3.1x file servers have three settable
parameters for watchdog packet configuration: "Delay Before First Watchdog
Packet," "Number Of Watchdog Packets," and "Delay Between Watchdog Packets."
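A commonly published description of the watchdog exchange is a two-byte query holding the connection number's low byte plus '?', answered with the same byte plus 'Y'. That layout is an assumption here; treat the NCPTEST source code as the authoritative format.

```c
#include <assert.h>
#include <stdint.h>

/* Hedged sketch of the watchdog exchange as commonly described: a two-byte
   payload holding the connection number's low byte and a signal character,
   '?' from the server, 'Y' from a client that is still alive. */
int is_watchdog_query(const uint8_t pkt[2], uint8_t my_conn_lo)
{
    return pkt[0] == my_conn_lo && pkt[1] == '?';
}

void build_watchdog_reply(uint8_t pkt[2], uint8_t my_conn_lo)
{
    pkt[0] = my_conn_lo;
    pkt[1] = 'Y';
}
```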


The Message Socket



When a server has a message for a client, it informs the client of the
message's presence by sending a packet to the client's message notification
socket. The client is expected to issue an NCP Get Broadcast Message request
(0x15 0x01) to read the message itself, which stops the server from sending
more notifications. The client does not have to read the message immediately;
the server will continue sending notification packets approximately once a
second. Also, all replies originating from the server will set bit 6 in the
Connection Status until the client reads the message. The file server socket
from which these packets originate is dynamic, often 0x4001. The format of the
watchdog and message packets can be found in NCPTEST source code.


Where From Here?


Using NCP could enable a whole range of innovative NetWare programming. Some
possibilities include building your own simple NetWare protocol analyzer,
writing an asynchronous version of the NetWare C interface, connecting client
hardware to NetWare, writing replacement workstation shells, or perhaps (like
Ornetix Technologies' SerView) even writing your own NCP server that
implements only a subset of the NCP functionality.


References


Chappell, Laura. Novell's Guide to NetWare LAN Analysis, Novell Press, ISBN
0-7821-1143-2.
Rose, Charles G. Programmer's Guide to NetWare, McGraw-Hill, ISBN
0-07-607029-8.
Novell, NetWare System Calls for DOS.
Novell Network Encyclopedia. A CD-ROM containing several technical bulletins
and application notes.
Nance, Barry. "Automatic NetWare Log-ins: How to Log into NetWare from Your
Programs without User Interaction," Byte, March 1993.
Table 1: NCP request packet.
Bytes	Description
byte 0, 1 Packet type, 2 bytes (see Table 3).
byte 2 Packet sequence number, 1 byte.
byte 3 Server connection number, low, 1 byte.
byte 4 Client task number, 1 byte.
byte 5 Server connection number, high, 1 byte.
byte 6 NCP function code, 1 byte (see Table 4).
byte 7, 8 Subfunction length, 2 bytes, hi/lo (optional).
byte 9 NCP subfunction code, 1 byte (optional; see Table 4).
byte 10... Request data (optional).
Table 2: NCP reply packet.
Bytes	Description
byte 0, 1 Packet type, 2 bytes (see Table 3).
byte 2 Packet sequence number, 1 byte.
byte 3 Server connection number, low, 1 byte.
byte 4 Server task number, 1 byte.
byte 5 Server connection number, high, 1 byte.
byte 6 Completion code.
byte 7 Connection status.
byte 8... Reply data (optional).
Table 3: NCP packet types (*How Novell LANalyzer refers to packet).
0x1111 Allocate Slot Request (*Create Service Connection).
0x2222 Request (see Table 4).
0x3333 Reply.
0x5555 Deallocate Slot Request (*Destroy Service Connection).
0x7777 NCP Burst packet (request/reply).
0x9999 Positive Acknowledge (*Request Being Processed).
Table 4: NCP functions and subfunctions.
Function Name	Function Code	Subfunction Code
Abort Servicing Queue Job And File 0x17 0x73
Add Audit Property 0x58 0x02
Add Bindery Object to Set 0x17 0x41
Add Trustee to Directory 0x16 0x0D
AFP Alloc Temporary Dir Handle 0x23 0x0B
AFP Create Directory 0x23 0x01
AFP Create File 0x23 0x02
AFP Delete 0x23 0x03
AFP Get Entry ID From Name 0x23 0x04

AFP Get Entry ID From Netware Handle 0x23 0x06
AFP Get Entry ID From Path Name 0x23 0x0C
AFP Get File Information 0x23 0x05
AFP Open File Fork 0x23 0x08
AFP Rename 0x23 0x07
AFP Scan File Information 0x23 0x0A
AFP Set File Information 0x23 0x09
Allocate Permanent Directory Handle 0x16 0x12
Allocate Resource 0x0F None
Allocate Special Temporary Directory Handle 0x16 0x16
Allocate Temp NS Dir Handle 0x57 0x0C
Allocate Temporary Directory Handle 0x16 0x13
Allow Task Access To File 0x4E None
Attach Queue Server To Queue 0x17 0x6F
Broadcast to Console 0x15 0x09
Change Auditor Password 0x58 0x04
Change Bindery Object Password 0x17 0x40
Change Bindery Object Password Encrypted 0x17 0x4B
Change Bindery Object Security 0x17 0x38
Change Property Security 0x17 0x3B
Change Queue Job Entry 0x17 0x6D
Change Queue Job Position 0x17 0x6E
Change To Client Rights 0x17 0x74
Change User Password 0x17 0x01
Check Audit Access 0x58 0x05
Check Audit Level Two Access 0x58 0x16
Check Console Privileges 0x17 0xC8
Check Pipe Status 0x15 0x08
Clear Connection Number 0x17 0xD2
Clear File 0x07 None
Clear File Set 0x08 None
Clear Logical Record 0x0B None
Clear Logical Record Set 0x0E None
Clear Physical Record 0x1E None
Clear Physical Record Set 0x1F None
Clear Volume Restrictions 0x16 0x22
Close And Queue Capture File 0x11 0x01
Close Bindery 0x17 0x44
Close File And Start Job Queue 0x17 0x69
Close Message Pipe 0x15 0x07
Close Old Auditing File 0x58 0x14
Close Semaphore 0x20 0x04
Commit File 0x3D None
Create Bindery Object 0x17 0x32
Create Directory 0x16 0x0A
Create New File 0x4D None
Create Property 0x17 0x39
Create Queue 0x17 0x64
Create Queue Job And File 0x17 0x68
Deallocate Directory Handle 0x16 0x14
Deallocate Resource 0x10 None
Delete Bindery Object 0x17 0x33
Delete Bindery Object From Set 0x17 0x42
Delete Directory 0x16 0x0B
Delete Old Auditing File 0x58 0x15
Delete Property 0x17 0x3A
Delete Trustee 0x16 0x2B
Delete Trustee From Directory 0x16 0x0E
Destroy Queue 0x17 0x65

Detach Queue Server From Queue 0x17 0x70
Disable Auditing On Volume 0x58 0x07
Disable Station Broadcasts 0x15 0x02
Disable File Server Login 0x17 0xCB
Disable Transaction Tracking 0x17 0xCF
Down File Server 0x17 0xD3
Enable Auditing On Volume 0x58 0x08
Enable Station Broadcasts 0x15 0x03
Enable File Server Login 0x17 0xCC
Enable Transaction Tracking 0x17 0xD0
End Of Job 0x18 None
Enter Login Area 0x17 0x0A
Erase Files 0x44 None
Examine Semaphore 0x20 0x01
File Close 0x42 None
File Create 0x43 None
File Open 0x4C None
File Read 0x48 None
File Release Lock 0x02 None
File Rename 0x45 None
File Search Continue 0x3F None
File Search Initialize 0x3E None
File Server Copy 0x4A None
File Set Lock 0x01 None
File Write 0x49 None
Fill Name Space Buffer 0x16 0x2F
Finish Servicing Queue Job And File 0x17 0x72
Get Account Status 0x17 0x96
Get Active Connection List By Type 0x7B 0x0E
Get Active LAN Board List 0x7B 0x14
Get Active Protocol Stacks 0x7B 0x28
Get Auditing Flags 0x58 0x13
Get Bindery Access Level 0x17 0x46
Get Bindery Object Access Level 0x17 0x48
Get Bindery Object Disk Space Left 0x17 0xE6
Get Bindery Object ID 0x17 0x35
Get Bindery Object Name 0x17 0x36
Get Broadcast Message 0x15 0x01
Get Cache Information 0x7B 0x01
Get Connection Information 0x17 0x16
Get Connection List From Object 0x17 0x1F
Get Connection's Open Files 0x17 0xDB
Get Connection's Semaphores 0x17 0xE1
Get Connection's Task Information 0x17 0xDA
Get Connection's Usage Statistics 0x17 0xE5
Get Connections Using A File 0x17 0xDC
Get CPU Information 0x7B 0x08
Get Dir Entry 0x16 0x1F
Get Dir Info 0x16 0x2D
Get Directory Base 0x57 0x16
Get Directory Cache Information 0x7B 0x0C
Get Directory Path 0x16 0x01
Get Disk Cache Statistics 0x17 0xD6
Get Disk Channel Statistics 0x17 0xD9
Get Disk Utilization 0x17 0x0E
Get DM Info 0x5A 0x01
Get Drive Mapping Table 0x17 0xD7
Get Effective Directory Rights 0x16 0x03
Get Effective Rights 0x16 0x2A

Get Encryption Key 0x17 0x17
Get Extended Volume Info 0x16 0x33
Get File Bit Map 0x55 None
Get File Server Date And Time 0x14 None
Get File Server Description Strings 0x17 0xC9
Get File Server Information 0x17 0x11
Get File Server Information 0x7B 0x02
Get File Server LAN I/O Statistics 0x17 0xE7
Get File Server Login Status 0x17 0xCD
Get File Server Misc Information 0x17 0xE8
Get File Size 0x47 None
Get File System Statistics 0x17 0xD4
Get Garbage Collection Information 0x7B 0x07
Get General Router And SAP Information 0x7B 0x32
Get Internet Address 0x17 0x13
Get IPX/SPX Information 0x7B 0x06
Get Known Networks Information 0x7B 0x35
Get Known Servers Information 0x7B 0x38
Get LAN Common Counters Information 0x7B 0x16
Get LAN Config Strings 0x7B 0x18
Get LAN Configuration Information 0x7B 0x15
Get LAN Custom Counters Information 0x7B 0x17
Get LAN Driver's Configuration Information 0x17 0xE3
Get Loaded Media Number List 0x7B 0x2F
Get Logical Record Information 0x17 0xE0
Get Logical Records By Connection 0x17 0xDF
Get LSL Information 0x7B 0x19
Get LSL Logical Board Statistics 0x7B 0x1A
Get Media Manager Object Children List 0x7B 0x20
Get Media Manager Object Information 0x7B 0x1E
Get Media Manager Object List 0x7B 0x1F
Get Media Name By Media Number 0x7B 0x2E
Get Member Set M of Group G 0x17 0x09
Get Name Space Entry 0x16 0x30
Get NetWare File Systems Information 0x7B 0x03
Get Network Router Information 0x7B 0x33
Get Network Routers Information 0x7B 0x34
Get Network Serial Number 0x17 0x12
Get NLM Information 0x7B 0x0B
Get NLM Loaded List 0x7B 0x0A
Get NLM's Resource Tag List 0x7B 0x0F
Get NS Entry Info 0x57 0x06
Get NS Info 0x57 0x17
Get NS Path 0x57 0x1C
Get Object Connection Numbers 0x17 0x15
Get Object Disk Restrictions 0x16 0x29
Get Object Effective Rights 0x16 0x32
Get OS Version Information 0x7B 0x0D
Get Packet Burst Information 0x7B 0x05
Get Path From Directory Entry 0x16 0x1A
Get Personal Message 0x15 0x05
Get Physical Disk Statistics 0x17 0xD8
Get Physical Record Locks By Connection And File 0x17 0xDD
Get Physical Record Locks By File 0x17 0xDE
Get Printer Queue 0x11 0x0A
Get Printer Status 0x11 0x06
Get Protocol Stack Configuration Information 0x7B 0x29
Get Protocol Stack Custom Information 0x7B 0x2B
Get Protocol Stack Numbers By LAN Board Number 0x7B 0x2D

Get Protocol Stack Numbers By Media Number 0x7B 0x2C
Get Protocol Stack Statistics Information 0x7B 0x2A
Get Queue Job List 0x17 0x6B
Get Queue Job's File Size 0x17 0x78
Get Relation Of An Object 0x17 0x4C
Get Semaphore Information 0x17 0xE2
Get Server Information 0x7B 0x36
Get Server Set Categories 0x7B 0x3D
Get Server Set Commands Information 0x7B 0x3C
Get Spool Queue Entry 0x11 0x04
Get Station Number 0x13 None
Get Station's Logged Information 0x17 0x05
Get User Information 0x7B 0x04
Get Volume Audit Statistics 0x58 0x01
Get Volume Info With Handle 0x16 0x15
Get Volume Info with Number 0x12 None
Get Volume Information 0x17 0xE9
Get Volume Name 0x16 0x06
Get Volume Number 0x16 0x05
Get Volume Segment List 0x7B 0x21
Get Volume Switch Information 0x7B 0x09
Get Volume Usage 0x16 0x2C
Is Bindery Object In Set 0x17 0x43
Is Station A Manager 0x17 0x49
Is User Audited 0x58 0x09
Lock File Set 0x04 None
Lock Logical Record Set 0x0A None
Lock Physical Record Set 0x1B None
Log File 0x03 None
Log Logical Record 0x09 None
Log Network Message 0x17 0x0D
Log Physical Record 0x1A None
Login As Volume Auditor 0x58 0x03
Login Object 0x17 0x14
Login Object Encrypted 0x17 0x18
Login User Object 0x17 0x00
Logout 0x19 None
Logout As Volume Auditor 0x58 0x0D
Map Directory Number To Path 0x16 0xF3
Map Number To Group Name 0x17 0x08
Map Number To Object 0x17 0x04
Map Object To Number 0x17 0x03
Map User To Station Set 0x17 0x02
Modify Maximum Rights Mask 0x16 0x04
Move Entry 0x16 0x2E
Negotiate Buffer 0x21 None
Negotiate LIP Buffer 0x61 None
Open Bindery 0x17 0x45
Open Data Stream 0x16 0x31
Open Message Pipe 0x15 0x06
Open Semaphore 0x20 0x00
Packet Burst Connection 0x65 None
Purge All Erased Files 0x17 0xCE
Purge Erased Files 0x16 0x10
Purge Salvagable File 0x16 0x1D
Read Audit Config Header 0x58 0x0B
Read Auditing Bit Map 0x58 0x0A
Read Extended NS Info 0x57 0x1A
Read NS Info 0x57 0x13

Read Property Value 0x17 0x3D
Read Queue Current Status 0x17 0x66
Read Queue Job Entry 0x17 0x6C
Read Queue Server Current Status 0x17 0x76
Recover Salvagable File 0x16 0x1C
Release File 0x05 None
Release File Set 0x06 None
Release Logical Record 0x0C None
Release Logical Record Set 0x0D None
Release Physical Record 0x1C None
Release Physical Record Set 0x1D None
Remove Audit Property 0x58 0x06
Remove Entry From Spool Queue 0x11 0x05
Remove Job From Queue 0x17 0x6A
Rename Bindery Object 0x17 0x34
Rename Directory 0x16 0x0F
Reset Audit History File 0x58 0x0F
Reset Auditing File 0x58 0x0E
Restore Directory Handle 0x16 0x18
Restore Erased File 0x16 0x11
Restore Queue Server Rights 0x17 0x75
Save Directory Handle 0x16 0x17
Scan Bindery Object 0x17 0x37
Scan Bindery Object Trustee Paths 0x17 0x47
Scan Dir Entry 0x16 0x1E
Scan Dir Restrictions 0x16 0x23
Scan Directory For Trustees 0x16 0x0C
Scan Directory Information 0x16 0x02
Scan Entry For Trustees 0x16 0x26
Scan File Information 0x17 0x0F
Scan File Physical 0x16 0x28
Scan NS Entry Info 0x57 0x03
Scan Property 0x17 0x3C
Scan Salvagable Files 0x16 0x1B
Scan Volume For Restrictions 0x16 0x20
Search File 0x40 None
Send Broadcast Message 0x15 0x00
Send Console Broadcast 0x17 0xD1
Send Personal Message 0x15 0x04
Service Queue Job And Open File 0x17 0x71
Set Dir Restriction 0x16 0x24
Set Directory Handle 0x16 0x00
Set Directory Information 0x16 0x19
Set Entry 0x16 0x25
Set Extended File Attributes 0x4F None
Set File Attributes 0x46 None
Set File Information 0x17 0x10
Set File Server Date And Time 0x17 0xCA
Set File Time And Date 0x4B None
Set NS Entry DOS Info 0x57 0x07
Set Queue Current Status 0x17 0x67
Set Queue Server Current Status 0x17 0x77
Set Spool Flags 0x11 0x02
Set Trustee 0x16 0x27
Set Volume Restrictions 0x16 0x21
Signal Semaphore 0x20 0x03
Specify Capture File 0x11 0x09
Spool Data To A Capture File 0x11 0x00
Spool Existing File 0x11 0x03

Submit Account Charge 0x17 0x97
Submit Account Hold 0x17 0x98
Submit Account Note 0x17 0x99
TTS Abort Transaction 0x22 0x03
TTS Begin Transaction 0x22 0x01
TTS End Transaction 0x22 0x02
TTS Get Application Thresholds 0x22 0x05
TTS Get Control Flags 0x22 0x09
TTS Get Statistics 0x17 0xD5
TTS Get Workstation Thresholds 0x22 0x07
TTS Is Available 0x22 0x00
TTS Set Application Thresholds 0x22 0x06
TTS Set Control Flags 0x22 0x0A
TTS Set Workstation Thresholds 0x22 0x08
TTS Transaction Status 0x22 0x04
Verify Bindery Object Password 0x17 0x3F
Verify Bindery Object Password Encrypted 0x17 0x4A
Verify Network Serial Number 0x17 0x0C
Wait On Semaphore 0x20 0x02
Write Audit Config Header 0x58 0x11
Write Auditing Bit Map 0x58 0x10
Write Extended NS Info 0x57 0x1B
Write NS Info 0x57 0x19
Write Property Value 0x17 0x3E






































November, 1993
PROGRAMMER'S BOOKSHELF


Warm and Fuzzy




Peter D. Varhol


Ironically, I began reading Fuzzy Thinking by Bart Kosko on the same day one
of my students gave a presentation about digital logic. The data line is
either high or low, he explained. At five volts or more it was high, at three
or less low. And in between? "Indeterminate," was his response, as though this
state of affairs was only a minor annoyance to engineers. To my student, the
indeterminate condition was something to be hunted down and fixed. To Kosko,
and to fuzzy theorists in general, this state represents an integral part of a
fuzzy condition that includes various degrees of high and low.
Fuzzy Thinking is not a book on computers or engineering, but, rather, one on
a way of thinking. The fuzzy system of thought, which has much in common with
Eastern religions, holds that all states of being exist to some degree. Once you
recognize this as the true and universal worldview, it's possible to model
complex, nonlinear systems without high-order mathematics, and ultimately to
raise the intelligence level of machines through the use of adaptive fuzzy
systems. Kosko debunks the myth of what he calls "bivalence"--that a state is
either A or not-A. While much of our work with computers assumes a bivalence
model, a moment's reflection reveals that it's clearly a poor approximation of
the world. Certainly the grass is green, at least as a first approximation. As
we obtain greater precision, however, we find that some of the grass reflects
light at a slightly different wavelength than the rest. If we look closer
still, sections of each individual blade of grass reflect light differently,
making the concept of green still more fuzzy. His criticism here is
well-taken; it's clear that fuzzy thinking is a correct way of looking at the
world.
Kosko also attempts to debunk the concepts of probability, a somewhat more
difficult proposition since probability is itself a slippery concept. Rather
than debunking it, he decides that it's merely a part of fuzzy theory. He
forms the fuzzy concept of "subsethood," the degree to which one set is a
subset of another set. Probability is the proportion to which one set--the
true outcome--contains the set of all possible outcomes.
However, Kosko has some logical problems in trying to do away with the
probabilistic worldview. First, it's easy to claim that probability and fuzzy
logic are the same thing, or at least produce the same results, as many
probabilistic theorists do. This problem is apparently compounded by a number
of bright scientists who either relegate fuzzy thinking to a semantic niche,
or dismiss it altogether. Even with the subsethood principle, he's
uncomfortable with any substantive discussion of probability.
Fuzzy Thinking is presented in three parts. The first part, "The Fuzzy Past,"
presents the historical antecedents of both Aristotelian and Eastern thought,
and compares the two, largely to the benefit of the Eastern approaches. Kosko
lays down the philosophical and logical antecedents of fuzzy thinking.
The second part, "The Fuzzy Present," examines the founding and application of
fuzzy-set theory. Kosko is at his most convincing here. He combines a layman's
introduction to the fuzzy-application theorem with a glimpse at the
personalities and personal insights that give life to the story. His examples
are straightforward, yet still convey the expressive power of fuzzy thought,
primarily in control applications.
Of particular interest is his assertion that fuzzy logic enables engineers to
control nonlinear systems via fuzzy "rules" without the complexities of
nonlinear mathematics. This is where neural networks come in. But adaptive
fuzzy systems go beyond the neural network, which only creates a mathematical
model of the process: they derive rules from system behavior, and those rules
can be incorporated into fuzzy control devices designed to monitor and control
similar behavior.
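A toy illustration of the kind of rule-based control Kosko describes might look like this in C; the membership ranges and rule outputs are invented for the example.

```c
#include <assert.h>

/* Triangular membership function: 0 outside [a, c], peaking at 1 when x == b. */
static double tri(double x, double a, double b, double c)
{
    if (x <= a || x >= c) return 0.0;
    return x < b ? (x - a) / (b - a) : (c - x) / (c - b);
}

/* Two fuzzy rules for a fan controller (ranges invented for illustration):
     IF temperature is warm THEN fan speed is 50%
     IF temperature is hot  THEN fan speed is 100%
   Defuzzify by a weighted average of the rule outputs. */
double fan_speed(double temp_f)
{
    double warm = tri(temp_f, 65.0, 75.0, 85.0);
    double hot  = tri(temp_f, 75.0, 90.0, 105.0);
    double w = warm + hot;
    if (w == 0.0) return 0.0;             /* no rule fires: fan off */
    return (warm * 50.0 + hot * 100.0) / w;
}
```

Between the two peaks, both rules fire to some degree and the output blends smoothly between 50 and 100 percent, which is precisely the behavior a bivalent threshold cannot produce.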
As a roadmap to fuzzy rules, especially in human-behavioral systems, Kosko
introduces fuzzy cognitive maps (FCMs) which are simply diagrams of states,
actions, and outcomes, along with their relationships. The FCM in Figure 1,
for example, describes the rules by which I plan (or would like to plan) my
day. While FCMs seem to be more suited to group rather than individual
actions, they still portray, reasonably accurately, influences on the
decisions individuals can make. Applying changes to one or more states in the
diagram results in different actions or conclusions. From FCMs, rules can be
derived to determine actions based on the strength of the influence.
In the third part of his book, "The Fuzzy Future," Kosko focuses less on
applications and more on how a fuzzy philosophy of life might change our
society. He points out that many philosophical issues--life and death, ethics,
and God (and presumably questions such as, "Is rap music really music?")--are
all really fuzzy concepts, and can be dealt with rationally by society once
their fuzzy nature is recognized and factored into the debate. Kosko proposes
that machines employing fuzzy thought will gradually increase the wealth and
happiness of society. All in all, this is the most fuzzy part of the book.
But does fuzzy thinking have implications for computer science? According to
Kosko, fuzzy thinking leads to machines with higher IQs. He points out
examples of current products (mostly from Japan and South Korea) that
incorporate fuzzy "rules" in control systems to provide environmental
controls, adjust fuel flow in engines, and fine-tune antilock brakes, among
other applications.
It's worth noting that traditional expert systems assume a bivalent, or, at
best, probabilistic, model of the world. This may be why many expert systems
are difficult and time consuming to build. Most experts express their opinions
not as absolutes or probabilities, but rather as signposts or rules of thumb,
which together tend to point to a conclusion. This process seems much better
modeled by fuzzy thinking.
Fuzzy thinking also leads to conclusions beyond Kosko's. Are computer programs
provably correct? Not insofar as the requirements are fuzzy. Many of the goals
we associate with software engineering may not be bivalent, and requirements
that are expressed bivalently may be artificially so.
Some of Kosko's comments bring fuzzy thinking to a more personal and useful
level. For example, he believes that one of Lotfi Zadeh's motivations in
developing fuzzy theory was a reaction against the tendency of engineers to
solve increasingly difficult real-life problems by "throwing more math at
them." This attitude may be analogous to the programmer who solves
increasingly complex problems by writing more code, rather than stepping back
and questioning the whole approach to the problem. For programmers, different
perspectives often lead to more elegant solutions.
At the end, I couldn't help but shrug my shoulders and say to myself, "Yet
another model of the world." Well argued? Yes, for the most part. The Truth? I
wouldn't go that far. The techniques would seem to be useful in computing
problems where the bivalent model either breaks down or becomes too complex.
However, as Kosko admits, fuzzy rules are not necessarily a snap to derive and
test.
Part of the problem is that, while data and circumstances are often (always,
according to Kosko) fuzzy, the decisions based on these circumstances are, in
fact, bivalent. While a control system may look at the degrees of warmth in
determining how to control the air conditioner, at some point that air
conditioner has to be either on or off. Kosko might argue that the on/off
bivalence is due to a lack of detail in the control system (and he's probably
right) but we may not need to control down to that level of detail.
Kosko's view of the world is valid to a point, and useful for those of us who,
in our daily work, simplify problems down to the bivalent "A or not-A"
paradigm. While he glorifies large chunks of Eastern thought as inherently
fuzzy, Kosko fails to consider Western culture's ability to hold two
contradictory beliefs as an indication of mental dexterity.
I've usually viewed fuzzy logic as no more than an extension of set theory
that allows for partial membership in sets, as a way of modeling certain
aspects of the world. Kosko claims that it is the true and proper way to view
the world. I'm perhaps too comfortable with my familiar paradigms, but I don't
feel I have to give up Western bivalence to see some value in fuzzy thinking.
Still, I'm willing to learn, and have turned to Byte Dynamics' Fuzzy Logic
Designer, a Windows-based development tool and code generator, to explore
fuzzy thinking in more hands-on terms. (Look for my report in an upcoming
issue of DDJ.)
What is significant is that the computational sciences are starting to make
serious efforts to deal with nonlinear systems, which are much more common in
nature than linear systems. Kosko admits that the math of most nonlinear
systems is beyond us right now, and proposes fuzzy thinking as a suitable
alternative. Fuzzy thinking is a systematic approach to this end, and may
serve as a stepping stone into a better understanding and modeling of
nonlinear systems. Then again, to be appropriately fuzzy, maybe not.
Fuzzy
Thinking
Bart Kosko
Hyperion, 1993, 318 pp.
$24.95
ISBN 1-56282-839-8
 Figure 1: Typical fuzzy cognitive map (FCM).
























November, 1993
SWAINE'S FLAMES


Strong Models


In this month's "Programming Paradigms," there appears a claim that may seem
pretty fanciful. It's the strong view of artificial life, the view that
researchers in this area are not just modeling life, but studying it directly.
That the programs they write are not models, but instances.
It's analogous to the strong view of artificial intelligence, which says that
AI attempts to understand the processes of intelligence, human or other. In
both cases, the strong view assumes something that sounds absurd when you put
it bluntly: that you can study the real world by writing computer programs.
If you look closely at the computer models being used in various branches of
science today, you come across many examples of this idea.
Most people would support a weaker view: Although we can learn from models,
the model is not the thing modeled. Programming is not scientific research. A
computer program has no empirical content. The map is not the territory.
The strong view, on the other hand, seems like cheating. You know the story of
the guy who's on his knees under a streetlamp and tells his friend he's
looking for his earring and the friend asks where did you lose it and he says
in the alley and the friend says then why are you looking here and he says the
light's better? The strong view says that sometimes that works.
It would be nice if, instead of having to do messy real world experiments, we
could write a program and do experiments with it, and never have to check our
results against reality, because the program is reality.
That's the strong view. Is there any precedent in science for this kind of
thinking?
Yes, lots. In Galileo's time it was thought that heavier objects fell faster
than light ones. Galileo designed the following experiment: Tie a light object
and a heavy object together with a string and drop them from a height. If
heavy objects fall faster, this joint object should fall faster than either of
its components alone. But since the light component falls slower than the
heavy one, it should act as a drag, so the joint object should also fall
slower than its heavy component alone. This is a contradiction; therefore
heavy and light
objects fall at the same speed. Galileo didn't actually do the experiment;
thinking it through was enough to debunk the accepted theory and support his
new theory. This was a thought experiment with zero empirical content, and it
advanced science. Descriptions of thought experiments make some of the best
reading in science, and are often crucial in advancing scientific thought.
James Robert Brown's book The Laboratory of the Mind (Routledge, 1991) is a
very readable collection of some of the most important thought experiments in
science.
Brown's book also advances a strong view of thought experiments. It runs
something like this: Assume that there are actual laws of the universe to be
discovered. Now, any structure that follows those laws can be used to examine
them, including artificial structures: thought experiments, computer programs.
Science isn't about instances; it's about discovering the laws of the
universe.
So if you write a program that follows some of the laws of life, you can use
it to study those laws.
One step further: If science is the attempt to discover nature's algorithms,
computer programs are the best laboratories.
Michael Swaine
editor-at-large










































December, 1993
EDITORIAL


Cryptography Fires Up the Feds


Cryptography is like one of those West Virginia subterranean fires that smolder
along coal seams for months before flaring up above ground. The current flame
along the encryption firing line involves a pair of Federal grand jury
subpoenas handed out to distributors of Phil Zimmermann's PGP ("Pretty Good
Privacy") message signature and privacy software.
Earlier this fall, the Austin Code Works (a Texas software distributor) and
ViaCrypt (a Phoenix cryptography-tool developer) were slapped with demands to
produce contracts, payments, correspondence, and related information
concerning their international distribution of PGP and RSA cryptography source
code. Neither company was told why they must turn over this information, nor
were they given any indication of when or what the next shoe to drop might be.
For the past year Code Works has been selling Grady Ward's Moby Crypto, a
collection of crypto software that includes PGP, RSA, MD4, DES, and the like.
Although not mentioned in the subpoena, Code Works has also been separately
selling a DES encryption and decryption software package. For the time being,
both have been removed from the Code Works' shelves. ViaCrypt, on the other
hand, licensed PGP from Zimmermann, combined it with ViaCrypt's DigiSig+
cryptographic engine, and released a toolkit called "ViaCrypt PGP," the first
commercial PGP-based package. Interestingly, ViaCrypt is also a sublicensee of
RSA public-key encryption from Public Key Partners, holder of the RSA patent
and a big-time competitor and long-time critic of PGP.
Ostensibly, the subpoenas are part of a U.S. Customs investigation into the
export of PGP. (A letter the State Department's Enforcement Branch fired off
to the Code Works begins with, "It has come to the attention of this office
that your company is making cryptographic source code. . .available for
commercial export. . . .") State Department regulations lump cryptographic
software with munitions and weapons, making it subject to export licenses as
per International Traffic in Arms Regulation guidelines. However, Code Works'
current advertisements clearly state that both Moby Crypto and DES Encryption
are "not for export," and ViaCrypt says sales are made "export regulations
permitting." In short, there's no indication that either company has exported
crypto software, leading you to believe the investigation is really nothing
more than a fishing expedition.
The timing is curious, considering that the Clinton administration views many
high-tech export rules as antiquated Cold War laws that hinder U.S. trade.
Consequently, the administration is rethinking export laws so that U.S.
manufacturers can more easily export communications and other high-tech
equipment--what's protected today may be fair game in a few months. Of course,
the government also wants to make it harder to sell high-tech military
equipment to renegade countries. Unfortunately, cryptography has a foot in
both military and civilian communications camps.
Neither the Code Works nor ViaCrypt had anything to do with developing PGP.
You could even argue that Zimmermann really isn't the "author" of the
software. True, he did write Version 1.0, but subsequent editions (2.3 is the
current release) are the contributed efforts of U.S. and non-U.S. programmers
who've created what's been described as the strongest, easiest-to-use
encryption utility available to the public in source form. There's no question
that PGP was exported, but neither is there a hint that Zimmermann shipped it
overseas. He assiduously avoided any chance of exporting PGP himself, to the
point of having other people upload the software to the nets. The bottom line is
that PGP was legally on the net and anyone with a PC and modem could have
moved it across international borders--just as with DES, which has been on the
nets and authorized by the government for more than a decade.
Still, you have to wonder why the government is taking action now. PGP has
been around for a couple of years. Maybe the Feds are upset that Zimmermann's
encryption scheme is good--PGP is thought to be stronger than DES, the NSA and
FBI reportedly can't crack it, and the thought of publicly available
cryptography scares the dickens out of them. Or maybe the announcement of a
commercial PGP-based application finally hitting the shelves prompted PGP's
competitors to lean on the government. We just don't know, and the Feds aren't
talking.
The government is struggling to cope with a changing world, one in which
technology has altered many of the old rules. Regulations, written for a
paper-based society, aren't adapting well to digital reality. International
electronic networks make it hard to control software distribution and
information dissemination. Like wildfire, bank transfers and e-mail are
circling the globe unfettered--and encryption is keeping secret the contents
of these communications. But the means by which Washington is attempting to
maintain control over cryptography is, in the long run, injurious to us all.
From a business perspective, these tactics hobble U.S. companies from
competing internationally. More importantly, the First Amendment guarantees us
the right to speak in an encrypted way, and insidious attempts to douse public
access to cryptography, cloaked under the guise of software-export
investigations, appear to stifle that right.
Jonathan Erickson
editor-in-chief











































December, 1993
Letters


Floating Point




Dear DDJ,


I found the article "32-bit Floating-point Math" by Al Williams (DDJ, June
1993) both interesting and useful. At first, the answers the package gave to
test calculations were more than adequately accurate. But eventually I noticed
that some calculations were wrong, not just inaccurate. I haven't had use for
the fdivxy function, so I can't vouch for or condemn it, but I've found
problems with both the faddxy and fmulxy routines.
faddxy has two errors. A minor one, as long as the memory model remains small,
is that it ends with a POP DX instruction when a POP DS is what's needed. The
more serious problem is that when both arguments are negative and the addition
results in a cleared sign bit, faddxy does not recognize that an overflow has
occurred and so does not perform the special handling required.
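The overflow case Senchyna describes follows the standard two's-complement rule: when two addends share a sign bit but the sum's sign bit differs, the add has overflowed. A minimal sketch of the check (Python, illustrative only; FPM's actual code tracks signs in registers rather than in the operands):

```python
MASK32 = 0xFFFFFFFF

def add32(a, b):
    """32-bit wrapping add with signed-overflow detection.

    Overflow occurs exactly when both operands have the same sign bit
    and the result's sign bit differs -- for example, two negatives
    whose sum shows a cleared sign bit, the case faddxy missed.
    """
    result = (a + b) & MASK32
    sa, sb, sr = (a >> 31) & 1, (b >> 31) & 1, (result >> 31) & 1
    overflow = (sa == sb) and (sa != sr)
    return result, overflow
```

For instance, add32(0x80000001, 0x80000001) yields (0x00000002, True): two negative operands producing a result with a cleared sign bit.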
fmulxy has one error. When adjusting the radix point after multiplication, it
can attempt a double right shift by 32 bits (SHRD, EAX, EDX, CL). Since the
SHRD instruction uses only the least significant 5 bits of the third argument,
this is equivalent to shifting by 0.
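The 5-bit masking of the count is documented x86 behavior for 32-bit shifts, which is why the 32-bit case must be special-cased. A sketch of what SHRD EAX, EDX, CL actually computes (Python, illustrative):

```python
MASK32 = 0xFFFFFFFF

def shrd32(dst, src, count):
    """Model of x86 SHRD r32, r32, CL: shift dst right, filling its
    high bits from src. The CPU uses only the low 5 bits of the
    count, so a count of 32 is masked to 0 and becomes a no-op."""
    count &= 0x1F                      # hardware masks the count to 5 bits
    wide = ((src & MASK32) << 32) | (dst & MASK32)
    return (wide >> count) & MASK32
```

So shrd32(0x12345678, 0xDEADBEEF, 32) returns 0x12345678 unchanged, rather than the 0xDEADBEEF a true 32-bit double shift would produce.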
Example 1 illustrates very conservative modifications to the original code
which eliminate the problems I've mentioned.
Thomas Senchyna
San Rafael, California
Al replies: Thanks for your interest in FPM. Your observations are correct.
One of the difficulties in writing numerical software is testing all the
possibilities.
Although your changes to FPM were correct, I took the liberty of changing them
somewhat to make the resulting code shorter (see Example 2).


Language Wars




Dear DDJ,


In his letter in the May 1993 DDJ, David Smead states that programming
productivity won't be boosted much by any programming language. Does this mean
that we can program in assembler and still have programs written as fast as in
C or some other languages? Can we extend this argument about C++ and C to say
that since C is translated into native machine language ultimately, we don't
need C?
A programming language is a notation that lets me state the solution to the
problem I am trying to solve which, incidentally, is also understood by the
computer. We can describe the solution to any problem using a Turing machine
whose basic operations are extremely simple, but no one programs for Turing
machines. The first step in solving any problem is to understand it and be
able to describe it using a convenient notation. To say that it does not
matter what notation we use to state and solve problems seems extremely
simplistic.
In spite of nearly nine years of programming in C, I'm neither a fan of C and
C++, nor do I hate them. But I do know that to solve the problems we are
solving today, C is woefully inadequate. Having written an RDBMS engine in C
with OOP mechanisms handcrafted, I have come to appreciate the usefulness of
language constructs. Any new notation (by which I mean the syntactic
constructs and their semantics) goes a long way toward improving the
readability of software. I would gladly take C++ over C, in spite of C++ being
a complex language. I also appreciate the fact that mere language alone does not
guarantee programming nirvana. But in the hands of a trained person, a
language with powerful notational abilities can make a difference by providing
built-in support for things that would have to be poorly hacked in a language
like C.
We should recognize the fact that new languages are something more than just a
fad. While C++ and Objective-C are quite similar in concept, they provide
significantly more expressive power than C or Pascal. Each new class of
languages seems to emphasize a different style of programming and I don't see
why we should not welcome a language that offers a more powerful abstraction
mechanism. It is true that we don't see all the details behind such
abstractions, but I see that as an advantage.
Let's not be overly concerned about new languages and their features. We do
need to be liberated from the highly procedural, unforgiving, and poorly
self-documenting languages like C and instead adopt a language that gives us
mechanisms which aid productivity. (I am talking about a language that offers
error detection through source analysis, garbage collection, hidden pointers,
incremental compilation, source-code portability, and so on. Language
researchers have made a lot of progress in these areas, and we may soon
have these features without the performance penalty.) We have come a long way
from Fortran and Cobol, but we do have a long way to go yet.
M. Prakash
Moline, Illinois


The Book Stops Here




Dear DDJ,


I was glad to see a negative book review in Al Stevens's "C Programming"
column (DDJ, October 1993). I'm not saying I agree with it--I haven't picked
up the book. I have, however, spent a lot of money over the years on this kind
of book, have learned quite a bit from some of the better ones, and have a
pile of some of the not-so-good ones. I have learned to read book reviews and
know that, if over 30 percent of the review is spent discussing the bad
aspects of the book, the book is a piece of trash. I found it very refreshing
to read Al's undiluted thoughts about this particular work, and especially
enjoyed his speculation about the probable interaction between the book
publisher and author. I consider book reviews to be a service for which I
justify part of the subscription prices I pay to several software magazines;
and I consider negative reviews to be every bit as important as rave or mixed
reviews. If I had a Macintosh project coming up and stumbled across this book
before I stumbled across Al's review of it, I might have spent more of my
money for another entry in my pile of door stops.
Thomas J. Murphy
Oswego, New York


Improving on Algorithm #0




Dear DDJ,



Tom Swan's algorithm #0 (see DDJ, May 1993) does a fascinating job of fast
approximation. However, most languages have something like sqr(x) already
built in, so I tried to expand the formula to calculate cube roots. After some
twiddling with exponents and factors, I got it stable and working. My version
of line 6 in Tom's pseudocode is y <-- (x/y^2 + 2*y)/3.
Comparing the two formulas, it seemed to me that both were special cases of
a general algorithm to calculate the mth root of x. (Let m be 2 for sqr(), 3
for cube, and so on.) Fortunately, the solution turns out to be very simple.
The formula for the mth-root approximation is y <-- (x/y^(m-1) + (m-1)*y)/m.
Now, that's definitely something I like to have in my bag of tricks, and I
want to thank Tom for his article.
Another interesting observation is that this algorithm quickly delivers
precise results within the accuracy of the data type, that is, double (64
bits). So it seems to make sense simply to test the y variable after each
iteration. If it doesn't change, you're done and you exit the loop. A typical
numerical example for m=3 and x=70,957,944 results in 414 after 34 iterations.
After 28 iterations the result is still far from accurate (>426), but the last
six passes lead to perfect accuracy. Consider this: For more than 80 percent
of the execution time, the intermediate result is far from accurate. Any
attempt to reduce execution time by one or two passes will be more than
compensated by the computational overhead involved. But if you prefer to write
bigger pieces of code that execute slower, please do not call that mess a
function. Use the classifier "object" instead.
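Schwartzmann's general iteration, together with his stop-when-unchanged test, can be sketched as follows (Python; the function name, starting guess, and iteration cap are illustrative additions):

```python
def mth_root(x, m, y=1.0, max_iter=200):
    """Iterate y <-- (x/y^(m-1) + (m-1)*y)/m for the mth root of x,
    exiting as soon as y stops changing -- that is, once the result
    is exact to the precision of a double."""
    for _ in range(max_iter):          # cap guards against pathological cases
        y_next = (x / y ** (m - 1) + (m - 1) * y) / m
        if y_next == y:                # converged to full precision
            break
        y = y_next
    return y
```

For m=3 and x=70,957,944 this settles at 414, matching the numerical example above.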
Gerhard Schwartzmann
Vienna, Austria


That Sounds "Rational"




Dear DDJ,


The article "Comparing Object-oriented Languages" (DDJ, October 1993)
inadvertently stated the name of our company as "Rational Systems." Rational
Systems is a Massachusetts-based company that sells C compilers and DOS
extenders.
The correct full name of our company is "Rational." Among the
software-engineering tools we provide are Rational Rose, an object-oriented analysis
and design tool; Rational C++ Booch Components, an object-oriented class
library; and Rational Apex, a sophisticated software-engineering environment
for Ada.
Kara Myers
Rational
Santa Clara, California


Pedal Pushers




Dear DDJ,


Al Stevens's solution to adding a satellite keypad is interesting ("C
Programming," DDJ, August 1993), but I ask, "Why do it the hard way?" For an
alternative approach, you might want to take a look at my tip entitled
"Windows, DOS, and...Foot Pedals?" which appeared in Leor Zolman's "Tech Tips"
column (Windows/DOS Developer's Journal, February 1993).
In the column, I describe how to add an "infinite" variety of satellite
keypads with minimal hardware, no chips, and no software. The trick is to tap
into an existing keyboard; really quite easy with the two keyboards I've
mapped (the Keytronic E03435 and the Honeywell 101RXe). From that tip: Such
satellite keypads need contain only switches--no chips, no software. You
simply plug in the one you've chosen to the new connector on your regular
keyboard. I know, it's too simple. I hear that a lot.
Homer B. Tilton
Tucson, Arizona


OOP Languages




Dear DDJ,


I found the articles in your October 1993 issue on object-oriented languages
both informative and inspiring. Your publication is the first I've read that
mentions C+@, Sather, Parasol, or Beta. I was somewhat disappointed, however,
that your coverage of these object-oriented languages didn't include
Actor and Allegro. Allegro appears in an advertisement in your magazine and
Actor has been popular with software developers for years.
Perhaps in a future issue you may cover the current version of Actor. Some of
your readers may be interested in alternative hybrid languages such as
Knowledge Garden's KnowledgePro 2.0 and StepStone's Objective-C.
I believe these software tools will become more significant when the new
object-oriented operating systems become widely available: Taligent's Pink,
Next's NextStep, Microsoft's Cairo, and the like.
Alton Johnson
Cromwell, Connecticut
Example 1: Thomas Senchyna's implementation of faddxy
faddxy PROC USES SI
 .
 .
 .
DOADD2:
 XOR CH, CH
 ADD EAX, EBX

 JNS SHORT nskludge ;modified--was DOADD3
 OR DH, DH
 JZ SHORT DOADD3
 NEG EAX
 NOT CH
 JMP SHORT DOADD3 ;added
nskludge: ;added
 CMP DH, 2 ;added
 JNE SHORT DOADD3 ;added
 STC ;added
 RCR EAX, 1 ;added
 DEC CL ;added
 NEG EAX ;added
 NOT CH ;added
DOADD3:
 PUSH DS
 LDS BX, DWORD PTR ANSOFF
 MOV [BX] .SIGN, CH
 MOV [BX] .SCALE, CL
 MOV [BX] .NUM, EAX
 POP DS ;modified--Dr. Dobb's says POP DX
 RET
faddxy ENDP
fmulxy PROC
 .
 .
 .
 MUL EBX
TSTZLP:
 BSR ECX, EDX
 JZ SHORT DXZ
 CMP CL, 31 ;added
 JB SHORT nokludge ;added
 MOV EAX, EDX ;added
 SUB XARG.SCALE, 32 ;added
 JMP SHORT DXZ ;added
nokludge: ;added
 INC CL
 SHRD EAX, EDX, CL
 SUB XARG.SCALE, CL
DXZ:
 .
 .
 .
fmulxy ENDP
Example 2: Al Williams' improvement on Senchyna's improvement
faddxy PROC USES SI
 .
 .
 .
DOADD2:
 XOR CH, CH
 ADD EAX, EBX
 JNS SHORT nskludge
 OR DH, DH
 JNZ DOADD3A
nskludge:
 CMP DH,2
 JNE SHORT DOADD3

 STC
 RCR EAX,1
 DEC CL
DOADD3A:
 NEG EAX
 NOT CH
DOADD3:
 .
 .
 .
 POP DS ;POP DX in original code was incorrect
 RET
 .
 .
 .
fmulxy PROC
 .
 .
 .
TSTZLP:
 BSR ECX, EDX
 JZ SHORT DXZ
 CMP CL, 31
 JB SHORT nokludge
 MOV EAX, EDX
 SUB XARG.SCALE, 32
 JMP SHORT DXZ
nokludge:
 INC CL
































December, 1993
The Information Utility


Exchanging spatial data




Rolf Oswald


Rolf is the systems manager for LINNET Graphics International Inc. and can be
reached at 600-191 Broadway, Winnipeg, MB R3C 378, or at techmgr@LINNET.ca.


Spatial data, such as that derived from digital satellite and orthophoto
images, provides us with facts about measurements and observations, and
information about location and relationship with respect to other entities on
this planet. This information is typically stored in databases that use
geographic information systems (GIS) to perform complex spatial queries and
data manipulation. Toward this end, large organizations that acquire hardware
and software to support GIS systems must convert existing analog data to any
number of often-incompatible digital formats.
Databases that don't provide access to such foreign datasets can severely
restrict an organization. Standards have been defined for hardware components,
network protocols, and user interfaces for maximum interoperability between
computer systems, but problems remain. Many data-exchange related standards
aren't keeping pace with emerging hardware/software technology. Some industry
standards (SAIF in Canada, DIGEST and SDTS in the U.S.) are often too complex
for the average user and not yet widely supported by vendors, while other
standards based on popular GIS packages (usually derived from early CAD-based
systems) are de facto standards. Governments, utilities, and private companies
must therefore find a mechanism for sharing spatial data.


Spatial Data


Figure 1 is an example of graphically represented information, along with
corresponding attribute information. The graphic component defines the
object(s) in terms of its spatial characteristics--shape, size, coordinates,
color, line style, and the like. This information is usually stored in a
proprietary file format that's viewed and manipulated with CAD or GIS
software. The attribute component describes the object(s) in terms of its
measured characteristics. The attribute data is generally stored in tabular
format in a relational database, then viewed and manipulated with SQL
(Structured Query Language).
Complete exchange of data between GISs requires that both the graphic and
attribute components be transferred. Figure 1 also shows the datastream
describing the graphic and attribute data being transferred. The datastream
consists of:
Header information: a dataset identification block, coordinate-projection
information, and feature-classification structure.
Schema definition of the primary tables, which contain data elements
corresponding directly to the features they describe.
Feature data, the coordinate information for each feature occurrence, and the
feature's primary attributes (such as the key identifiers).
Topology information, which describes the relationships among features--the
connectivity of linear features and the neighbor relationship of area or
polygon features.
Indirect data, which describes any attribute data associated indirectly with
the primary tables, plus the schema definition.
Table relationships, defined by identifying the foreign-key relationships
between the primary and indirect tables.
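The six components above might be modeled as a simple record; the field names below are illustrative and not part of any actual MLRIS format:

```python
from dataclasses import dataclass

@dataclass
class SpatialDataStream:
    """Illustrative container for one graphic-plus-attribute transfer."""
    header: dict                  # dataset id, projection, classification
    primary_schema: list          # schema definitions for the primary tables
    features: list                # coordinates plus primary attributes
    topology: list                # connectivity and neighbor relationships
    indirect_data: list           # indirectly associated tables plus schema
    table_relationships: list     # foreign-key links, primary to indirect
```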


The Spatial Data-exchange Dilemma


Most GISs employ a proprietary internal file structure controlled by the GIS
engine. These systems generally provide a mechanism to import and export a
growing number of de facto standard formats, allowing for limited movement of
data between foreign, albeit similar, systems. Without a correct import or
export format, however, data transfer to a foreign system is cumbersome.
Exchanging spatial data for which a direct exchange format isn't available
involves a programming project, and maintaining more than one such translator
is beyond the reach of most data-processing shops.
If a common exchange format doesn't exist, then a customized translator must
be written; see Figure 2. The number of translators required to facilitate
data exchange between all formats is n*(n-1). As Figure 2 illustrates, data
exchange between four systems employing different data formats would require
12 translators. A better approach is to combine existing data-exchange formats
with custom-developed formats to reduce the number of translators. It's also
possible to move data through a multistep translation process; see Figure 3.
The ideal solution to the data-exchange dilemma is a single, neutral data
format--one that translates all other formats to and from the neutral format.
As Figure 4 shows, the number of translators would be n*2, and the development
effort would be shifted to the vendors of the original data format, not the
end users.
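The two translator counts are easy to verify; a quick sketch (Python, illustrative):

```python
def pairwise_translators(n):
    """Direct exchange: one translator per ordered pair of formats."""
    return n * (n - 1)

def neutral_translators(n):
    """Neutral-format exchange: one import and one export per format."""
    return n * 2
```

Four formats need 12 direct translators but only 8 through a neutral format; at ten formats the gap widens to 90 versus 20.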
This solution is difficult because it requires support from all developers and
vendors. Progress is being made in this direction with the acceptance of
industry-standard formats such as SAIF and DIGEST. For now, however, the best
solution is to use existing standards where possible, combined with
custom-developed translation mechanisms.


The Information Utility


The Manitoba Land Related Information System (MLRIS) is an agreement between
Manitoba's government departments and utility companies to collect land-based
information to a specified standard, and provide this data to an Information
Utility (IU) to be stored and distributed on demand to authorized users. A
common base map is used as the geodetic control base for all other land-based
data.
In addition to providing the data-exchange facility to end users, the IU will
exist as a single source for land-related data. Users won't need to search for
foreign datasets, whose existence is often unknown, thus avoiding duplicate
collections of land-based information. A streamlined data-exchange process
allows users to concentrate on acquiring foreign data and incorporating it
into analysis and decision-making processes. Collecting data according to
provincially accepted standards ensures that it is accurate and of high
quality.


The IU System Architecture


The IU is a massive database that stores spatial data as vector, attribute,
and image data. The database is managed by a spatial database manager and
Oracle relational database management system (RDBMS), along with
inhouse-developed system utilities for importing, exporting, and cataloging
the data.
Figure 5 shows the IU system architecture, which consists of a number of
interconnected modules that provide data indexing, transfer, viewing, and
analysis functions.
The host system is an RS/6000-based processor running AIX with a large amount
of online storage. All vector and attribute data are stored online in the
RDBMS controlled by the GIS engine; all image data are stored on CD-ROM.
The IU database is actually a composite of many databases: the enterprise
database, which contains the data the IU is capable of serving; an observation
database, containing a subset of data that end users can view directly; and a
thematic database, used for custom mapping and analysis.
Data is moved in and out of the enterprise database through an interchange
facility which interprets an incoming datastream and translates it to a
neutral format to be stored on the enterprise database. Conversely, it can
translate data destined for a foreign system to the proper exchange format.
The data-directory module acts as the catalog and administration component of
the IU. Popkin Software's System Architect is used to build and store data
profiles in a data dictionary. User profiles and user privileges are
maintained to assist in the routing of export data and preventing unauthorized
access to proprietary or sensitive information. An embedded accounting module
provides statistics required for usage billing and royalty disbursements. An
e-mail subsystem enables communication with the IU system administrator for
dataset-retrieval requests and/or data profile and metadata exchange. Various
administration modules assist the IU administrator in processing
data-retrieval requests and maintaining the data directory.
A communications facility connects the IU to the outside world for dial-up
access to the IU via 14.4 Kbps lines. Once connected to the IU, the end user
can access the data directory for an index to the IU database contents, as
well as metadata for any available datasets. By connecting to the Observation
database through an interface program called IUACCESS, the user can view
graphics, key maps, or other data classified as "high-demand" data. This
permits the user to make graphic queries on selected features or view the IU
database through a visual interface.



Simplifying the Data Exchange


Requirements for spatial-data exchange are often based on the user's need to
know the location and identification of specific features, and maybe some
characteristics about those features. Therefore, it isn't necessary to
translate and transfer all of the original data from one spatial database to
another. In fact, spatially oriented data can be represented as vector
information (points, lines, polygons), raster data, and attribute data
(tabular data). By minimizing the amount and type of information to be moved
from one system to another, less effort is needed to create the translator.
This minimizing process should begin at the source system of the data to
ensure that the correct level of detail is abstracted from the source
database. The agency responsible for the data is best equipped to make this
decision.
The considerations in data-transfer mechanism development between the IU and
foreign databases are as follows:
Original-feature symbol definitions aren't transferred during the exchange
process. Symbology is the graphic representation of the vector data. No
industry standard currently exists, so each user can define his own symbology.
Naturally a default symbol set should be available.
Features imported into the IU are reclassified for consistency with a standard
established strictly for the IU. Feature classification is very user
dependent. Although some industry standards exist, users often reassign
feature codes, networks, and layers to meet their requirements and avoid
duplication of coding. The translation mechanism allows for reclassification
of the features based on data provided by the user in the form of a parameter
file.
Attribute data is treated independently from the feature data (where
possible). Users can request attribute data only, provided that the graphic
component (location and occurrence) hasn't changed since a previous retrieval.
Accepting and providing text files accompanied by the corresponding schema
makes this data portable to most database systems.
Topological relationships aren't transferred between systems because no
established standard facilitates this. Topological structures can, however, be
reconstructed on the destination GIS.


Importing Data into the IU


Importing data into the IU database is a multistep process involving a
database administrator. Two issues are important during data preparation: The
data must be properly identified and described to maintain a meaningful data
directory, and the source data must be reorganized into a neutral format for
storage and future distribution.
Figure 6 shows the graphic representation of data to be imported and attribute
information maintained by the data provider. In this example a subdivision
survey plan will be imported, along with key information (primary table) that
identifies each subdivision parcel, and owner and assessment data for each
parcel (indirect table). The indirect table contains information that doesn't
directly describe the feature object, but contains measurements that may or
may not exist for the feature objects. A primary table, on the other hand,
must contain only one identifier record for each feature object.
The import procedure begins with the preparation of the data model; see Figure
7. To create the model, enough information must be received from the data
provider about the dataset(s) to be imported. Using a CASE tool, the data
model is constructed, which defines the entities for which the data provider
is supplying data. Data structures are built for each entity, defining all of
the data elements, and metadata that describes each entity is captured.
Metadata capture follows a standard that sufficiently describes the data
entities within the IU database. The data model describes the features
comprising the subdivision survey plan and associated attribute data.
The purpose for creating a data model is twofold: to export a schema
definition in the form of data definition language (DDL) to be used later for
loading data into the IU database; and to capture information required by the
data directory informing users as to available data. When the data model is
complete, the metadata and the schema definition are exported and used as
input to a load procedure that populates the data directory.
The graphic-data import process involves importing a data file that describes
the spatial constructs of the features. The data provider extracts from the
corporate database a graphic data file, usually in a proprietary format, and
sends it to the IU administrator. A translation process is determined and
format-translation parameters are fed into the translation program along with
the source input file. Under the MLRIS initiative, custom translation programs
were developed for the most common formats: AutoCAD (DXF), GDS, and Intergraph.
translation parameters that the administrator creates depend on the features
described by the input dataset. For example, feature codes determined by the
data provider are usually reclassified to adhere to a standard developed by the
IU administrator. The translation process produces a data file in a neutral
format that describes the spatial orientation of each feature and its primary
identifier or database key. In our example, the graphic data file consists of
point, line, and centroid information describing the spatial composition of
the subdivision survey plan, along with the feature code associated with the
survey parcels and the parcel identifier attached to each parcel centroid. For
more details on the translation process, see the accompanying text box
entitled, "Translating Geographical Data."
Next, a load-preprocessor program is executed, which ensures that the source
input file is acceptable for loading into the IU database. This process
verifies that the source input file does not contain any unwanted or undefined
features, as described in the data directory in the prior steps. If exceptions
are found, the load process can't continue until the data model is revised to
include the missing features, or the input file is filtered to remove unwanted
features. The preprocessor also attaches a standard header to the input file
to ensure that each dataset is loaded with the same coordinate-projection
information and other key database load parameters.
Finally, the preprocessor program collects statistics and counts of features
and creates an area-coverage index of each dataset processed. An area-coverage
index is created by matching the coordinate information for each feature
against a master tile grid (see Figure 8), where each tile is predetermined in
size and location and varies from 100 to 10,000 sq. km. An area-coverage index
file is created that identifies each tile within Manitoba for which feature
data is present in the dataset being processed. The resulting dataset
area-coverage index is loaded into the data directory so users can view
graphically the area coverage of a dataset and retrieve data by a given tile
number. The area-coverage index is also used for maintenance and retrieval of
data from the database on a tile basis.
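The tile-matching step can be sketched in C. This is a minimal illustration, not MLRIS code: the tile size, grid origin, and column count below are assumptions (the real master grid mixes tile sizes from 100 to 10,000 sq. km).

```c
#include <stdio.h>

/* Hypothetical grid parameters -- assumptions for illustration only */
#define TILE_SIZE_M   10000.0   /* 10 km x 10 km = 100 sq. km tiles   */
#define GRID_ORIGIN_X 600000.0  /* easting of the grid origin         */
#define GRID_ORIGIN_Y 5500000.0 /* northing of the grid origin        */
#define GRID_COLUMNS  100       /* tiles per grid row                 */

/* Map a feature coordinate to its tile number in the master grid */
static int tile_number(double x, double y)
{
    int col = (int)((x - GRID_ORIGIN_X) / TILE_SIZE_M);
    int row = (int)((y - GRID_ORIGIN_Y) / TILE_SIZE_M);
    return row * GRID_COLUMNS + col;
}
```

Running every vertex of a dataset through such a function and collecting the distinct tile numbers yields the area-coverage index for that dataset.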
The final procedure in loading information into the IU database may involve
the import of attribute data. Indirect attribute information is data that
describes a measurement or occurrence associated with a feature. This
information may or may not exist for each feature and is therefore stored and
maintained separately from the primary table.
The data provider typically delivers the indirect attribute data file in
fixed-length-record ASCII format, which is easily manipulated through SQL and
reformatted in preparation for loading into the IU database. (Reformatting is
necessary only where MLRIS standards haven't been maintained by the data
provider, or where key information is reorganized to improve storage and
retrieval.)
The corresponding schema definitions for tables to be created in the IU
database are obtained from the previously constructed data model. An automatic
process generates an import data file of comma-delimited records, a
load-control file containing the SQL instructions for inserting records into
a relational-database table, and the DDL for creating the database objects
necessary to store the indirect attribute information. The load process for
the indirect attribute data is queued and executed in batch mode.
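The reformatting step amounts to slicing fixed-length fields and emitting delimited values. A minimal C sketch, assuming a hypothetical layout of a 10-character parcel identifier followed by a 12-character area field (the real record layouts are defined by the data model):

```c
#include <stdio.h>
#include <string.h>

/* Convert one fixed-length attribute record into a comma-delimited
   record ready for a relational load. Field widths are assumptions. */
static void to_delimited(const char *fixed, char *out)
{
    char parcel[11], area[13];

    memcpy(parcel, fixed, 10);      parcel[10] = '\0';
    memcpy(area, fixed + 10, 12);   area[12]   = '\0';

    /* Trim the trailing blanks that pad fixed-length fields */
    for (int i = 9;  i >= 0 && parcel[i] == ' '; i--) parcel[i] = '\0';
    for (int i = 11; i >= 0 && area[i]   == ' '; i--) area[i]   = '\0';

    sprintf(out, "\"%s\",%s", parcel, area);
}
```

Applied to each record in turn, this produces the comma-delimited import file consumed by the batch load.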
To ensure a successful data-load procedure, the IU administrator reviews the
log files maintained during the load procedure, uses the GIS engine to display
the graphic data, and queries the attached database records. Once verified,
the data directory is updated one last time to indicate that the loaded
datasets are available for distribution.


Exporting Data from the IU


Exporting data from the IU is initiated by an end user who, by viewing the
data directory on a local PC, has determined that the IU database contains
needed information. When required, the IU
host system downloads a fresh copy of the directory to ensure that the user is
getting current information. The user can query the data directory either
through a tabular mechanism or thematic display which identifies the areas of
coverage for each dataset theme. After querying the data directory and
selecting a dataset, the user connects to the IU host system via modem, and
the request is queued there for processing.
In processing the dataset-retrieval request, the IU administrator must first
extract the data from the IU enterprise database. This extract file is in a
neutral format and must be translated to the format required by the
destination system. The user profile, as stored in the data directory,
contains the information and parameters required to deliver the dataset in the
correct format. The database schema that corresponds to the dataset being
delivered is exported from the data dictionary and integrated with the
dataset. The final dataset can be delivered via modem or put on tape,
diskette, or CD-ROM.


Conclusion


The Information Utility is Manitoba's answer to the data-exchange dilemma. By
using only one format for land-related data, government officials and private
businesses can concentrate on using data to make business decisions instead of
struggling to convert data into workable formats.


Translating Geographical Data


To illustrate the mechanics of translating graphic and attribute data from a
foreign system to the Information Utility's neutral format, I'll examine a
portion of the process used to translate data from a typical geographical
information system called Graphic Data System (GDS).
The translation process is initiated by a user with a GDS subsystem developed
to assist in the interchange of data between GDS and the Information Utility
(IU). The first step in the process is to identify the map sheet and area to
be extracted, giving the minimum and maximum coordinate extents. The
translator process is then initiated. It first creates a header for the
translated file, then retrieves all of the feature objects defined in the GDS
database for the given area.
A translation table (see Listing One, page 194) verifies that each feature has
been predefined, and that proper feature coding has been assigned. Some
feature objects are defined as complex objects and require further parsing to
translate the elementary objects that make up the complex object. Each feature
object found in the specified extract area is translated according to its
object type: line, circle, text, or attributes (see Listing Two, page 104).
The only real anomaly in the process is translating circle features. Circles
are defined differently in different GDS systems; we chose to describe the
circle as a series of line segments. In the translation program, each chord
segment is 0.5 meters in length. This length is variable and can be set
according to the level of resolution required. Sample output from the
translator is available electronically; see "Availability," page 3.
--R.O.
 Figure 1: Spatial data as viewed through an application and transferred
through an exchange format.
 Figure 2: Direct data exchange between formats [n*(n-1)].
 Figure 3: Data exchange through multiple formats.
 Figure 4: Data exchange utilizing a common format [n*2].
 Figure 5: The Information Utility system architecture.
 Figure 6: An example of graphic and attribute data to be loaded into the
Information Utility database.
 Figure 7: Procedural steps for importing graphic and attribute data into the
Information Utility database.
 Figure 8: Creating an area-coverage index by matching feature coordinates
against a tile grid.
[LISTING ONE]


;*******************************************************************************
; GDS OCD to IU feature code two-way translation table
;*******************************************************************************
; This table provides the translation of GDS Object Code Descriptors to their
; respective feature code, layer, network, and feature type assignments. The
; explode flag indicates whether or not the GDS object is a complex object
; that requires parsing into elementary objects.
; Format must follow this structure:
; GDS OCD (or portion of) = feature code,layer,network,feature type,explode
;*******************************************************************************
;Water FM entities
;*******************************************************************************
VALVE:WATER = wr_main_valv, 301, 301, n, N
HYDRTEE:WATER = wr_hydr_tee, 301, 301, n, N
HYDRANT:WATER = wr_hydrant, 301, 301, n, N
HYDRBRAN:WATER = wr_hydr_brch, 301, 301, l, N
HYDRVALV:WATER = wr_hydr_valv, 301, 301, n, N
MAINS:WATER = wr_watermain, 301, 301, l, N
TEE:WATER = wr_fit_tee, 301, 301, n, N
BEND:WATER = wr_fit_bend, 301, 301, n, N
CROSS:WATER = wr_fit_cross, 301, 301, n, N
PLUG:WATER = wr_fit_plug, 301, 301, n, N
JUNCTION:WATER = wr_junction, 301, 301, n, N
COUPLER:WATER = wr_fit_cplr, 301, 301, n, N
REDUCER:WATER = wr_fit_reduc, 301, 301, n, N
CASEMENT:WATER = wr_casement, 301, 301, l, N
ANODE:WATER = wr_anode, 301, 302, pt, N
PIT:WATER = wr_pit, 301, 301, l, N
DEPTH:WATER = wr_depth_mrk, 301, 302, pt, N
REPAIR:WATER = wr_repair_mk, 301, 302, pt, N
MAINANNO:WATER = wr_anno, 301, 303, pt, Y
VALVANNO:WATER = wr_anno, 301, 303, pt, Y
STREANNO:WATER = wr_anno, 301, 303, pt, N
MISCANNO:WATER = wr_anno, 301, 303, pt, N
BLDGANNO:WATER = wr_anno, 301, 303, pt, N
;*******************************************************************************



[LISTING TWO]

/* *************************************************************************
*/
/* IUC_TRANSLATE_TO_IU - Function to translate GDS graphics to the IU */
/* *************************************************************************
*/
/* received: *library, *details, *IUDATA, count */
/* returned: *num_transed, TRUE if successful */
/* *************************************************************************
*/

#define OBJECT 0 /* GET_ITEM returns 0 to type if object */
#define ITEM 3 /* GET_ITEM returns 3 to type if item */
#define LINE 4 /* GET_ITEM returns 4 to itype if open line */
#define CIRCLE 7 /* GET_ITEM returns 7 to itype if circle */
#define TEXT -1 /* GET_ITEM returns -1 to itype if text block */

/* ****** PROTOTYPE DEFINITIONS *******/
/* TRANSLATE_LINE - Function to translate a line */
void TRANSLATE_LINE(
 FILE *IUDATA);


/* TRANSLATE_CIRCLE - Function to translate a circle */
void TRANSLATE_CIRCLE(
 FILE *IUDATA);

/* TRANSLATE_TEXT - Function to translate text */
void TRANSLATE_TEXT(
 FILE *IUDATA);

int IUC_TRANSLATE_TO_IU(
 IUR_TRANS *library, /* translation library structure */
 ASR_GLODET *details, /* GDS object details structure */
 FILE *IUDATA, /* File handle structure for IUDATA file */
 unsigned *num_transed, /* number of objects translated */
 int lib_pos) /* library structure position */
{
 /* Variable declaration */
 IUR_FACILITIES facilities; /* Structure to hold facilities RDB info */
 IUR_DEPTH_MRK depth; /* Structure to hold depth RDB info */
 ASR_GLEDET other_details; /* GDS object extended details structure */
 int block_num; /* block number inside object to explode */
 int block_type; /* block type inside object to explode */
 char scratch[81]; /* Scratch string writing area */

 /* Get extra details of object */
 ASC_ITEM_DETAILS(&other_details);

 /* Is this graphic an object? */
 if(other_details.type == OBJECT)
 {
 /* Check the explode flag. If it's set, explode object into blocks */
 if(library->explode[lib_pos] == TRUE)
 {
 /* Set up extract loop */
 for(block_num = 1; block_num <= details->nblock; block_num++)
 {
 /* Make this block current and get the type */
 ASC_BLOCK(block_num);
 ASC_GET_BLOCK_TYPE(&block_type);
 /* Is this block a line */
 if(block_type == LINE)
 {
 /* Output the library header. Force type to linear */
 *num_transed = *num_transed + 1;
 fprintf(IUDATA, "feat %d %s %s %s l xy 0.000000 0.000000 1\n",
 *num_transed, library->iu_code[lib_pos][IUDATA_FEAT],
 library->iu_code[lib_pos][IUDATA_LAYER],
 library->iu_code[lib_pos][IUDATA_NETWORK]);
 TRANSLATE_LINE(IUDATA);
 }
 /* Is this block a circle */
 else if(block_type == CIRCLE)
 {
 /* Output the library header. Force type to linear */
 *num_transed = *num_transed + 1;
 fprintf(IUDATA, "feat %d %s %s %s l xy 0.000000 0.000000 1\n",
 *num_transed, library->iu_code[lib_pos][IUDATA_FEAT],
 library->iu_code[lib_pos][IUDATA_LAYER],
 library->iu_code[lib_pos][IUDATA_NETWORK]);

 TRANSLATE_CIRCLE(IUDATA);
 }
 /* Is this block a text */
 else if(block_type == TEXT)
 {
 /* Output the library header */
 *num_transed = *num_transed + 1;
 fprintf(IUDATA, "feat %d %s %s %s %s xy 0.000000 0.000000 1\n",
 *num_transed, library->iu_code[lib_pos][IUDATA_FEAT],
 library->iu_code[lib_pos][IUDATA_LAYER],
 library->iu_code[lib_pos][IUDATA_NETWORK],
 library->iu_code[lib_pos][IUDATA_TYPE]);
 TRANSLATE_TEXT(IUDATA);
 }
 else
 {
 /* This is an unknown block type */
 sprintf(scratch, "%d (%s)", other_details.itype, details->ocd);
 IUC_ERRORS(5, scratch);
 }
 }
 }
 else
 {
 /* Put in the library information */
 *num_transed = *num_transed + 1;
 fprintf(IUDATA, "feat %d %s %s %s %s xy 0.000000 0.000000 1\n",
 *num_transed, library->iu_code[lib_pos][IUDATA_FEAT],
 library->iu_code[lib_pos][IUDATA_LAYER],
 library->iu_code[lib_pos][IUDATA_NETWORK],
 library->iu_code[lib_pos][IUDATA_TYPE]);
 /* Put in the coordinates */
 fprintf(IUDATA, "coor %.3lf %.3lf\n", details->pos.x, details->pos.y);
 }
 }
 /* Is this graphic an ITEM */
 else if(other_details.type == ITEM)
 {
 /* Put in the library information */
 *num_transed = *num_transed + 1;
 fprintf(IUDATA, "feat %d %s %s %s %s xy 0.000000 0.000000 1\n",
 *num_transed, library->iu_code[lib_pos][IUDATA_FEAT],
 library->iu_code[lib_pos][IUDATA_LAYER],
 library->iu_code[lib_pos][IUDATA_NETWORK],
 library->iu_code[lib_pos][IUDATA_TYPE]);
 /* Is this item a line? */
 if(other_details.itype == LINE)
 {
 TRANSLATE_LINE(IUDATA);
 }
 else if(other_details.itype == CIRCLE)
 {
 TRANSLATE_CIRCLE(IUDATA);
 }
 else if(other_details.itype == TEXT)
 {
 TRANSLATE_TEXT(IUDATA);
 }
 else

 {
 /* This is an unknown item type */
 sprintf(scratch, "%d (%s)", other_details.itype, details->ocd);
 IUC_ERRORS(5, scratch);
 *num_transed = *num_transed - 1;
 }
 }
 else
 {
 /* This is not an object or an item */
 sprintf(scratch, "%d", other_details.type);
 IUC_ERRORS(4, scratch);
 return FALSE;
 }
 /* Get the attributes and place them in IUDATA file */
 if(_OW)
 {
 if(IUC_GET_RDB_INFO(details->ocd, &facilities, &depth))
 {
 /* If this is a depth marker, we are interested in depth */
 if(strstr(details->ocd, "DEPTH"))
 {
 fprintf(IUDATA, "attr \"%s\", \"%s\", %3.2lf, %3.2lf\n",
 depth.id, depth.ocd, depth.invert, depth.depth);
 }
 else
 {
 fprintf(IUDATA, "attr \"%s\", %2.1lf, \"%s\", \"%s\"\n",
 facilities.ocd, facilities.size, facilities.install_date,
 facilities.material);
 }
 }
 }
 /* Everything worked OK */
 return TRUE;
}
/* TRANSLATE_LINE - Function to translate a line */
/* *************************************************************************
*/
void TRANSLATE_LINE(
 FILE *IUDATA)
{
 /* Variable declaration */
 int num_vertices; /* number of vertices in linear object */
 int vertex; /* vertex counter */
 double bulge; /* bulge factor between vertices */
 double vertex_x, vertex_y; /* x and y coords of line vertex */
 /* ********************************************************************* */
 /* Get the number of vertices in this line item */
 ASC_GET_BLOCK_LENGTH(&num_vertices);
 /* Loop through number of vertices until complete */
 for(vertex = 1; vertex <= num_vertices; vertex++)
 {
 /* Get the vertex and print out to the file */
 ASC_GET_BLOCK_VERTEX(vertex, &bulge, &vertex_x, &vertex_y);
 /* Print out the vertex */
 fprintf(IUDATA, "coor %.3lf %.3lf\n", vertex_x, vertex_y);
 }
}
/* TRANSLATE_CIRCLE - Function to translate a circle */

/* *************************************************************************
*/
void TRANSLATE_CIRCLE(
 FILE *IUDATA)
{
 /* Variable declaration */
 double ctr_x, ctr_y; /* x and y coords of circle centre */
 double radius; /* radius of a circle */
 double chord_length = 0.5; /* chord length for splitting circle */
 double dx, dy; /* delta x and y coords for splitting circle */
 double dO; /* delta theta for the angle to split circle */
 double total_dO; /* summing variable for dO */
 double pi = 3.14159265359; /* value of pi */
 double vertex_x, vertex_y; /* x and y coords of line vertex */
 /* ********************************************************************** */
 /* Get the details of the circle */
 ASC_GET_ARC_DETAILS(2, &ctr_x, &ctr_y, &radius);
 /* Output the first point to the IUDATA file */
 fprintf(IUDATA, "coor %.3lf %.3lf\n", ctr_x + radius, ctr_y);
 /* Calculate the delta angle to split based on chord length */
 dx = ((2 * pow(radius, 2.0)) - pow(chord_length, 2.0)) / (2 * radius);
 dy = sqrt(pow(radius, 2.0) - pow(dx, 2.0));
 dO = atan2(dy, dx);
 /* Loop through and create a line with the calc'd delta theta */
 total_dO = 0.0;
 while(total_dO <= (2 * pi))
 {
 /* Calculate the line vertices */
 vertex_x = ctr_x + dx;
 vertex_y = ctr_y + dy;

 /* Print out the vertex */
 fprintf(IUDATA, "coor %.3lf %.3lf\n", vertex_x, vertex_y);
 /* Increment the delta theta of the circle angle */
 total_dO = total_dO + dO;
 /* Re-calc the dx and dy */
 dy = radius * sin(total_dO);
 dx = radius * cos(total_dO);
 }
 /* Close the circle by outputting the first point again */
 fprintf(IUDATA, "coor %.3lf %.3lf\n", ctr_x + radius, ctr_y);
}
/* TRANSLATE_TEXT - Function to translate text */
/* **************************************************************************
*/
void TRANSLATE_TEXT(
 FILE *IUDATA)
{
 /* Variable declaration */
 double justpos_x, justpos_y; /* x and y coords of text position */
 double rotation_x, rotation_y;/* sin and cos of text rotation */
 double rotate_x, rotate_y; /* x and y coords of text rotation */
 char text[241]; /* text to be translated */
 /* *********************************************************************** */
 /* Get the text position and rotation */
 ASC_GET_BLOCK_TEXT_POSITION(&justpos_x, &justpos_y);
 ASC_GET_BLOCK_TEXT_ROTATION(&rotation_x, &rotation_y);
 /* The rotation supplied is a sine and cosine. Change to xy */
 rotate_x = justpos_x + rotation_x;
 rotate_y = justpos_y + rotation_y;
 /* Put in the coordinates */

 fprintf(IUDATA, "coor %.3lf %.3lf %.3lf %.3lf\n", justpos_x, justpos_y,
 rotate_x, rotate_y);
 /* Put in the text itself */
 ASC_GET_BLOCK_TEXT(text);
 fprintf(IUDATA, "text \"%s\"\n", text);
}


[LISTING THREE]

udb-feature
feat 1 cow_surv_lin 901 904 l xy 0.000000 0.000000 1
coor 623153.468 5526056.618
coor 623140.811 5526070.650
feat 2 cow_pid_anno 901 905 pt xy 0.000000 0.000000 1
coor 623156.580 5526071.060 623157.315 5526071.738
text "4-1-16966"
feat 3 cow_surv_lin 901 904 l xy 0.000000 0.000000 1
coor 623140.641 5526000.000
coor 623126.927 5526008.687
feat 4 cow_surv_lin 901 904 l xy 0.000000 0.000000 1
coor 623064.398 5526000.000
coor 623065.035 5526011.885
coor 623066.079 5526031.367
coor 623067.123 5526050.850
coor 623069.496 5526095.131
coor 623074.089 5526098.292
feat 5 cow_surv_lin 901 904 l xy 0.000000 0.000000 1
coor 623046.226 5526000.697
coor 623012.741 5526002.458
feat 6 cow_surv_lin 901 904 l xy 0.000000 0.000000 1
coor 623013.857 5526023.146
coor 623012.741 5526002.458
feat 7 cow_pid_anno 901 905 pt xy 0.000000 0.000000 1
coor 623109.960 5526009.726 623110.702 5526010.395
text "11058"
feat 8 cow_pid_anno 901 905 pt xy 0.000000 0.000000 1
coor 623077.940 5526000.900 623078.939 5526000.851
text "17-3-11058"
feat 9 cow_pid_anno 901 905 pt xy 0.000000 0.000000 1
coor 623115.930 5526002.695 623116.836 5526002.273
text "8-3-11058"
feat 10 cow_surv_lin 901 904 l xy 0.000000 0.000000 1
coor 623000.000 5526003.125
coor 623012.741 5526002.458
feat 11 cow_surv_lin 901 904 l xy 0.000000 0.000000 1
coor 623049.517 5526061.554
coor 623048.431 5526041.472
coor 623047.345 5526021.389
coor 623046.226 5526000.697
coor 623046.188 5526000.000
feat 12 cow_surv_lin 901 904 l xy 0.000000 0.000000 1
coor 623012.741 5526002.458
coor 623012.608 5526000.000
feat 13 cow_surv_lin 901 904 l xy 0.000000 0.000000 1
coor 623099.635 5526000.000






December, 1993
Cross-platform Compression


From PCs to UNIX to MVS--and back




Pierre J. Dion


Pierre is an operation support-system manager for the provincial government of
Alberta (Canada) and a private consultant. He can be reached on CompuServe at
73740,3452.


As the volume of data transferred between disparate computer systems
increases, so does the demand for efficient data-transmission tools--and data
compression is a proven approach to transfer efficiency. The problem is that
compression utilities for transferring data between mainframe, workstation,
and PC environments are hard to come by. While numerous compression programs
exist for PCs and UNIX workstations, there are few for mainframe environments
such as IBM MVS--and even fewer cross-platform ones. In this article, I'll
address this void by presenting a cross-platform implementation of an existing
Lempel, Ziv, and Huffman (LZH) compression program. My approach to
cross-platform compression is based on the LZHUF.C program mentioned in Mark
Nelson's The Data Compression Book (M&T Books, 1991). The program, originally
written by Haruyasu Yoshizaki and modified by Paul Edwards and Mark Nelson,
provides encoding/decoding functions in a single ANSI C source file.
I had no problems compiling LZHUF.C using MVS SAS/C 5.00 for IBM MVS and Turbo
C++ Professional for the PC. However, in the MVS version, the characters "["
and "]", which aren't supported by MVS EBCDIC, must be replaced with their
corresponding digraphs. ANSI C does offer trigraphs as substitutes, but these
only make C code even more difficult to read,
especially if arrays are used extensively.
During my MVS testing, LZHUF would initially hang when encoding files greater
than 250K. (The CPU would time out.) In an e-mail exchange, Paul Edwards told
me he had the same problem with the LZHUF MVS-compiled code. Instead of
pursuing a solution to the LZHUF code, Paul opted to work with AR.C, an
alternate compression source file written by Okumura. After considering the
rules of portable C, I noticed that LZHUF.C integer declarations which
operated properly in the 16-bit PC environment caused problems in the 32-bit
MVS environment. Following standard C guidelines, I converted all
declarations, and the code worked fine. Furthermore, by adjusting the LZHUF
lookahead buffer from the original setting of 60 to 15, performance improved
while maintaining over 80 percent compression on data files. More specifically,
MVS LZHUF encodes a 1-Mbyte data file in about 18 host-CPU seconds and decodes
the compressed file in 3 host-CPU seconds. On a DOS-based 486/33 with 64K
internal cache, LZHUF encodes a 1-Mbyte data file in about 82 CPU seconds and
decodes the compressed file in 45 CPU seconds. In a single VAX session,
preliminary encoding and decoding tests were carried out with success, but the
session was interrupted for maintenance, leaving no time to establish
performance estimates.
Although LZHUF.C for MS-DOS doesn't offer performance equal to PKWARE, LHARC,
or ARJ, it had the potential for considerable savings in storage space and
transfer time, while minimizing communication risks when used as the basis for
developing a cross-platform compression program. On the other hand, MVS LZHUF
is resource-intensive. During prime time, some sites can charge $1.00 or more
per host CPU second. These charges may motivate you to take advantage of site
discounts for night runs.
Even though I got LZHUF to work properly in each environment, I still needed
to address the task of converting ASCII to EBCDIC and the tricky issue of MVS
record formats.


ASCII and EBCDIC Translation


IBM mainframe environments such as MVS and VM rely on a character set called
"EBCDIC." In addition, any mainframe installation can implement specific
versions of EBCDIC defined by a code-page numbering system. The IBM 3270
terminal emulation software user's guide provides a translation table for the
EBCDIC code page which (I assume) is generally used for North American
installations. However, the code page should be verified with each site's
installation to avoid translation errors.
Most EBCDIC characters are represented in ASCII but with different decimal
values that can be translated by cross-references. For example, the character
A is equal to decimal 193 in EBCDIC and 65 in ASCII. As such, in the MVS
version of LZHUF, ASCII character values are used to reference the subscript
of an array of EBCDIC characters. The inverse is used in the MS-DOS version of
LZHUF. This approach to character translation is shown in Listing One (page
98). Since encoding is a slower process, translation is done when decoding and
just before the byte gets written to the output file.
To further enhance the translation function and minimize the use of
command-line arguments, I modified the MVS and PC versions of LZHUF to
indicate the origin of the data file by writing either a 1Eh or 1Fh,
respectively, as the first byte of the encoded file. Thus, this first header
byte is verified before decoding and if the encoded file origin is different
from the decoding environment, translation occurs accordingly; see Listing Two
(page 98).
The values 1Eh and 1Fh were chosen because they are the same in ASCII and
EBCDIC. Also, in the IBM PC character set, 1Eh displays as an arrow pointing
upward (MVS) and 1Fh as one pointing downward (MS-DOS). Interestingly enough,
18h and 19h could have been used for the exact same reasons.
Because of the change in the encoded file header, LZHUF was renamed LZH. Thus,
the LZH-encoded file has a header in which the first byte designates origin
and the next four bytes are an unsigned long integer representing the text
size.
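Reading and writing that five-byte header can be sketched as follows. The byte order and the helper names write_header/read_header are assumptions for illustration, not the published LZH code:

```c
#include <stdio.h>

#define ORIGIN_MVS 0x1E   /* encoded on MVS    */
#define ORIGIN_DOS 0x1F   /* encoded on MS-DOS */

/* Write the origin byte followed by the original text size,
   least-significant byte first (byte order is an assumption). */
static void write_header(FILE *out, unsigned char origin,
                         unsigned long textsize)
{
    fputc(origin, out);
    for (int i = 0; i < 4; i++)
        fputc((int)((textsize >> (8 * i)) & 0xFF), out);
}

/* Read the header back; returns 0 if the origin byte is unrecognized */
static int read_header(FILE *in, unsigned char *origin,
                       unsigned long *textsize)
{
    int c = fgetc(in);
    if (c != ORIGIN_MVS && c != ORIGIN_DOS)
        return 0;                 /* not an LZH-encoded file */
    *origin = (unsigned char)c;
    *textsize = 0;
    for (int i = 0; i < 4; i++)
        *textsize |= (unsigned long)fgetc(in) << (8 * i);
    return 1;
}
```

The decoder checks the origin byte against its own environment to decide whether character translation is required.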


Handling MVS and MS-DOS Files


MVS file allocation and handling is special. Unlike MS-DOS and UNIX, the unit
of file processing on MVS is a record, not a character. This makes relative
access difficult to attain. In MS-DOS and UNIX, you can move a file pointer
back and forth with ease. In MVS, you can move a file pointer, but with
certain restrictions and only on files allocated with particular record
formats. To use standard C functions such as fseek and rewind, your file must
be allocated with a record-format fixed-block sequential (FBS). However, most
sites predominantly use other record formats such as fixed block (FB) and
variable block (V), which do not support relative access. Even though FBS is
highly recommended as an efficient record format, changing existing practice
is problematic. Currently, IBM is rewriting its mainframe operating systems in
C, which may increase usage of record formats that support relative file
access. Until then, some changes to LZHUF.C are needed to support various
types of record format, by removing functions requiring relative access.
In LZH, the relative access functions fseek and rewind are used to determine
text size before encoding (fseek moves to the end of the file, ftell returns
the file size, and rewind returns to the beginning). Alternatively, MVS LZH
goes through the whole
file and counts each byte, then closes and reopens the file as shown in
Listing Three (page 98). In the original LZHUF, all files are open and closed
in the main function using command-line arguments and relative file access is
performed in the function Encode(). Because MVS LZH closes and reopens the
data file, the procedure to determine text size was moved into main. The
alternative is to write some Job Control Language or CList that creates a
temporary file allocated with a record format of FBS and copy the original
data into it, leaving LZH to encode the temporary file. But this option
consumes too much system overhead when compared to the simple changes I made
to MVS LZH. On DOS and UNIX, LZH should maintain the relative access
functions.
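The byte-counting workaround can be sketched as follows. This is an illustration of the approach, not the published MVS LZH code; the function name and the filename handling are assumptions:

```c
#include <stdio.h>

/* Determine the text size without fseek/rewind: read every byte,
   then close and reopen the file, since rewind() can't be relied
   on for non-FBS record formats under MVS. */
static unsigned long text_size(const char *name, FILE **infile)
{
    unsigned long textsize = 0;
    FILE *fp = fopen(name, "rb");

    if (fp == NULL)
        return 0;
    while (fgetc(fp) != EOF)
        textsize++;
    fclose(fp);                   /* no rewind(): close ...          */
    *infile = fopen(name, "rb");  /* ... and reopen from the start   */
    return textsize;
}
```

The reopened stream is then handed to the encoder in place of the rewound file pointer.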
Okumura's AR.C, as chosen by Edwards, makes extensive use of relative access
methods including fwrite and fread. Applying this program as is to MVS would
require a standardized usage of the file record format FBS--a change justified
by the benefits.
Under MVS, the LZH-encoded file should be allocated with record format
variable since the size of the encoded file is not determined until
compression is completed. If the record format is fixed, MVS by default
returns an error for the last incomplete record and pads the record. This
isn't a critical error since LZH decodes every byte until it reaches the
original text-file size specified in the encoded file header. What is critical
here is decoding to the right record size.
When encoding a DOS file, each byte is processed, including carriage return
(CR) and linefeed (LF). These two sequential bytes are significant to DOS
since they establish the end of a record and allow its length to vary. In
MVS, you can decode to a variable-block record format and translate and write
all DOS-legible characters to that file. However, on MVS, the proper way to
decode a file encoded under DOS is to allocate the MVS file with record format
fixed, preferably fixed-block sequential, complying with the largest record
length of the DOS data file. LZH must then be informed by a command-line
argument of the length of that record. LZH will refer to this value whenever
it encounters a CRLF sequence and pad accordingly as shown in Listing Four
(page 98). This will be different for UNIX-encoded files and should be
modified as needed. Moreover, when decoding files from other environments,
control and extended characters are often irrelevant. My experiments in MVS
concluded that these characters should not be translated and written to the
decoded output data set, as they often disrupt simple file-manipulation tasks
such as browsing or editing. If the record-length command-line argument is
omitted, then MVS LZH will, by default, decode according to record-format size
specified during file allocation. Finally, before closing the MVS data file,
the Decode() function pushes a CRLF sequence through Translate(), allowing a
verification of the line position at the DOS end of file and performing any
required padding.
Since MVS provides no characters to indicate the end of a record, decoding an
MVS LZH-compressed file in DOS requires knowledge of the record length, as given
by the command-line argument, in order to insert a CRLF at the end of each
record of the DOS output file. However, if the command-line argument for the
record length is not specified, then the CRLF insertion will be omitted, and
the DOS output file will be a data stream of fixed-size records with no
end-of-record marker. The DOS version of the LZH Translate() function is
shown in Listing Five (page 98). For practical purposes, record size could be
noted in encoded
filenames.
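The DOS-side CRLF insertion can be sketched like this; the variable and function names are assumptions, not the published code, and the EBCDIC-to-ASCII table lookup is omitted for brevity:

```c
#include <stdio.h>

static int reclen  = 80;  /* record length from the command line      */
static int linepos = 0;   /* bytes written on the current record      */

/* Emit one decoded byte; after every reclen bytes append a CRLF,
   since MVS fixed-length records carry no end-of-record marker. */
static void translate_from_mvs(unsigned char c, FILE *outfile)
{
    putc(c, outfile);     /* (EBCDIC-to-ASCII lookup would go here)   */
    if (reclen > 0 && ++linepos == reclen) {
        putc('\r', outfile);
        putc('\n', outfile);
        linepos = 0;
    }
}
```

With reclen left at zero, the same loop degenerates to the unmarked data stream described above.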
The complete C source code for MVS LZH and DOS LZH is available
electronically; see "Availability," page 3.


LZH: An Applied Solution


LZH for MS-DOS and MVS is being used by the Alberta Government for bulk data
transfers in health-care billing. In this context, compressed data transferred
between private operators and the government provides transmission-cost
savings. In addition, data compression plays a crucial role in capacity
planning. A compression factor of 80 percent or better relaxes serious
constraints associated with transfer schedules and bulk data transfers
crowding the allotted time frame. For example, in the same transfer window,
data compression can offer sites the ability to increase data bulk and
implement procedures to resend on error; alternatively, the government can
revise the schedule to absorb growth in the number of participating sites. LZH
is also made available for use within Alberta government departments, where
other potential savings can be exploited.
The richness and abundance of system utilities documented in portable C source
files should be justification enough to promote the implementation and use of
C development tools on every platform. A critical review of this worldwide C
source library would turn up many utilities that can generate savings of
various kinds once they're adapted to serve a purpose.
The LZH cross-platform compression project is an example of this potential.
However, my personal experience indicates that C is under-utilized in IBM
mainframe environments. Opening access to C tools would invite a greater
following of C programmers and a related proliferation of development. In the
spirit of open systems, this would represent a significant contribution.


References


IBM PC 3270 Terminal Emulation Software V1.21 User's Guide (1989). Austin,
Texas: IBM.
Nelson, Mark. The Data Compression Book. San Mateo, CA: M&T Books, 1991.
Plauger, P.J. and Jim Brodie. Standard C (Programmer's Quick Reference
Series). Redmond, WA: Microsoft Press, 1989.
[LISTING ONE]


 /* ascii to ebcdic code page 437 - MVS version */
unsigned char ebcdic[256] = {
 0x00, 0x01, 0x02, 0x03, 0x37, 0x2D, 0x2E, 0x2F,
 0x16, 0x05, 0x25, 0x0B, 0x0C, 0x0D, 0x0E, 0x0F,
 ...
 0xEE, 0xEF, 0xFA, 0xFB, 0xFC, 0xFD, 0xFE, 0xFF
 };
 /* ebcdic to ascii code page 437 - DOS version */
unsigned char ascii[256] = {
 0x00, 0x01, 0x02, 0x03, 0x9C, 0x09, 0x86, 0x7F,
 0x97, 0x8D, 0x8E, 0x0B, 0x0C, 0x0D, 0x0E, 0x0F,
 ...
 0x38, 0x39, 0xFA, 0xFB, 0xFC, 0xFD, 0xFE, 0xFF
 };
static void Translate( unsigned char c)
{
 ... for MVS version
 putc(ebcdic[c],outfile);
 /* write corresponding ebcdic character to output file by
 referring to subscript c (ascii char) in ebcdic table */
 ... for DOS version
 putc(ascii[c],outfile);
 /* write corresponding ascii character to output file by
 referring to subscript c (ebcdic char) in ascii table */
 ...
}



[LISTING TWO]

unsigned char origin = 0x1E; /* initializing to MVS origin */
 /* ascii to ebcdic code page 437 */
unsigned char ebcdic[256] = {
 0x00, 0x01, 0x02, 0x03, 0x37, 0x2D, 0x2E, 0x2F,
 0x16, 0x05, 0x25, 0x0B, 0x0C, 0x0D, 0x0E, 0x0F,
 ...
 0xEE, 0xEF, 0xFA, 0xFB, 0xFC, 0xFD, 0xFE, 0xFF
 };
/* character translation */
static void Translate( unsigned short c )
{
 ...
 if ( origin == 0x1F ) {
 /* DOS file origin decoding on MVS
 convert ascii to ebcdic from table */
 if ( putc( ebcdic[c], outfile ) == EOF ) Error(wterr);
 ...
 } else {
 /* MVS file origin decoding on MVS */
 if ( putc( c, outfile ) == EOF ) Error(wterr);
 }
}
/* compress */
static void Encode( void )
{
 ...
 /* write first byte to compressed file */

 fputc(origin,outfile);
 ...
}
/* decode */
static void Decode( void )
{
 unsigned short c;
 ...
 Translate(c);
 ...
}
int main( int argc, char *argv[] )
{
 ...
 if (toupper(*argv[1]) == 'E') {
 ...
 Encode();
 } else {
 ...
 /* read first byte of compressed file
 and verify MVS or PC DOS origin */
 origin = fgetc(infile);
 if ( origin == 0x1E || origin == 0x1F )
 Decode();
 ...
 }
 ...
}


[LISTING THREE]

Original relative file access used in LZHUF function Encode()

 fseek(infile, 0L, 2); /* go to end of file */
 textsize = ftell(infile); /* get file size */
 rewind(infile); /* go to beginning of file */


Determining text size without relative access.
Procedure moved to the main function of MVS LZH.

 /* count bytes to textsize */
 while ( getc( infile ) != EOF ) textsize++;
 /* close data file */
 fclose( infile );
 /* re-open data file, position at beginning of file */
 if (s = argv[2], (infile = fopen(s, "rb")) == NULL) {
 printf("Cannot re-open %s\n", s);
 return EXIT_FAILURE;
 }
 Encode();




[LISTING FOUR]

unsigned long llen = 0; /* initializing record line length */

 /* translate ascii to ebcdic and pad record length */
static void Translate(unsigned short c)
{
 unsigned short l = 0;
 static unsigned short lpos = 0, b = 0x40;
 if ( origin == 0x1F ) { /* DOS origin - MVS decoding */
 if ( b == 0x0D && c == 0x0A && lpos > 0 ) {
 /* 0x0D CR and 0x0A LF indicating DOS end of record
 with line position (lpos) active */
 for ( l = lpos; l < llen; l++)
 /* pad with EBCDIC spaces */
 if ( putc(0x40, outfile ) == EOF ) Error(wterr);
 lpos = 0; /* reset line position */
 }
 if ( c > 0x1F ) { /* skip DOS control characters */
 /* translate ascii to ebcdic */
 if ( putc(ebcdic[c], outfile ) == EOF ) Error(wterr);
 /* keep track of line position if required */
 if ( llen > 0 ) lpos++ ;
 }
 } else { /* MVS origin - MVS decoding */
 if ( putc(c, outfile ) == EOF ) Error(wterr);
 }
 b = c; /* since CR LF is sequential, b must retain c */
}
int main( int argc, char *argv[] )
{
 ...
 /* convert argument to llen*/
 if (argv[4] != NULL) llen = atol(argv[4]);
 ...
}



[LISTING FIVE]

unsigned long llen = 0; /* initializing record line length */
/* translate ebcdic to ascii and control record length */
static void Translate( unsigned short c )
{
 static unsigned short lpos = 0;
 if ( origin == 0x1E ) { /* MVS origin - DOS decoding */
 /* convert ebcdic to ascii from table */
 if ( putc( ascii[c], outfile ) == EOF ) Error(wterr);
 /* perform record length control as requested */
 if (llen > 0) {
 if (lpos == llen) {
 /* insert end of record DOS CR LF */
 if ( ( putc( 0x0D, outfile ) == EOF ) ||
 ( putc( 0x0A, outfile ) == EOF ) ) Error(wterr);
 /* reset line position */
 lpos = 0;
 } else {
 /* keep track of line position */
 lpos++;
 }
 }
 } else { /* DOS origin - MVS decoding */

 if ( putc( c, outfile ) == EOF ) Error(wterr);
 }
}
int main( int argc, char *argv[] )
{
 ...
 /* convert argument to llen*/
 if (argv[4] != NULL) llen = atol(argv[4]);
 ...
}


[LISTING 6] MVS LZH

/**************************************************************
 MVS LZH based on:
 lzhuf.c
 written by Haruyasu Yoshizaki 1988/11/20
 some minor changes 1989/04/06
 comments translated by Haruhiko Okumura 1989/04/07
 getbit and getbyte modified 1990/03/23 by Paul Edwards
 so that they would work on machines where integers are
 not necessarily 16 bits (although ANSI guarantees a
 minimum of 16). This program has compiled and run with
 no errors under Turbo C 2.0, Power C, and SAS/C 4.5
 (running on an IBM mainframe under MVS/XA 2.2). Could
 people please use YYYY/MM/DD date format so that everyone
 in the world can know what format the date is in?
 external storage of filesize changed 1990/04/18 by Paul Edwards to
 Intel's "little endian" rather than a machine-dependent style so
 that files produced on one machine with lzhuf can be decoded on
 any other. "little endian" style was chosen since lzhuf
 originated on PC's, and therefore they should dictate the
 standard.
 initialization of something predicting spaces changed 1990/04/22 by
 Paul Edwards so that when the compressed file is taken somewhere
 else, it will decode properly, without changing ascii spaces to
 ebcdic spaces. This was done by changing the ' ' (space literal)
 to 0x20 (which is the far most likely character to occur, if you
 don't know what environment it will be running on).
 storage of filesize modified 1990/06/02 by Mark Nelson.
 When reading in the file size, I was getting sign extension
 when reading in bytes greater than or equal to 0x80, which
 messed everything up.
 MVS version modified 1993/03/05 by Pierre Dion.
 Added Translate() for ASCII to EBCDIC translation and padding
 to record size as defined by arguments. Also, 0x0D and 0x0A
 (DOS CR LF) and other DOS control characters are held back.
 Encoded file header modified to designate origin 0x1E for
 MVS and 0x1F for DOS.
**************************************************************/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>

FILE *infile, *outfile;
static unsigned long textsize = 0, codesize = 0, printcount = 0;
char wterr[] = "Can't write.";


unsigned long llen = 0;
unsigned char origin = 0x1E; /* initializing MVS origin */

 /* ascii to ebcdic code page 437 */
unsigned char ebcdic[256] = {
 0x00, 0x01, 0x02, 0x03, 0x37, 0x2D, 0x2E, 0x2F,
 0x16, 0x05, 0x25, 0x0B, 0x0C, 0x0D, 0x0E, 0x0F,
 0x10, 0x11, 0x12, 0x13, 0x3C, 0x3D, 0x32, 0x26,
 0x18, 0x19, 0x3F, 0x27, 0x1C, 0x1D, 0x1E, 0x1F,
 0x40, 0x5A, 0x7F, 0x7B, 0x5B, 0x6C, 0x50, 0x7D,
 0x4D, 0x5D, 0x5C, 0x4E, 0x6B, 0x60, 0x4B, 0x61,
 0xF0, 0xF1, 0xF2, 0xF3, 0xF4, 0xF5, 0xF6, 0xF7,
 0xF8, 0xF9, 0x7A, 0x5E, 0x4C, 0x7E, 0x6E, 0x6F,
 0x7C, 0xC1, 0xC2, 0xC3, 0xC4, 0xC5, 0xC6, 0xC7,
 0xC8, 0xC9, 0xD1, 0xD2, 0xD3, 0xD4, 0xD5, 0xD6,
 0xD7, 0xD8, 0xD9, 0xE2, 0xE3, 0xE4, 0xE5, 0xE6,
 0xE7, 0xE8, 0xE9, 0x4A, 0xE0, 0x4F, 0x5F, 0x6D,
 0x79, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
 0x88, 0x89, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96,
 0x97, 0x98, 0x99, 0xA2, 0xA3, 0xA4, 0xA5, 0xA6,
 0xA7, 0xA8, 0xA9, 0xC0, 0x6A, 0xD0, 0xA1, 0x07,
 0x20, 0x21, 0x22, 0x23, 0x24, 0x15, 0x06, 0x17,
 0x28, 0x29, 0x2A, 0x2B, 0x2C, 0x09, 0x0A, 0x1B,
 0x30, 0x31, 0x1A, 0x33, 0x34, 0x35, 0x36, 0x08,
 0x38, 0x39, 0x3A, 0x3B, 0x04, 0x14, 0x3E, 0xE1,
 0x41, 0x42, 0x43, 0x44, 0x45, 0x46, 0x47, 0x48,
 0x49, 0x51, 0x52, 0x53, 0x54, 0x55, 0x56, 0x57,
 0x58, 0x59, 0x62, 0x63, 0x64, 0x65, 0x66, 0x67,
 0x68, 0x69, 0x70, 0x71, 0x72, 0x73, 0x74, 0x75,
 0x76, 0x77, 0x78, 0x80, 0x8A, 0x8B, 0x8C, 0x8D,
 0x8E, 0x8F, 0x90, 0x9A, 0x9B, 0x9C, 0x9D, 0x9E,
 0x9F, 0xA0, 0xAA, 0xAB, 0xAC, 0xAD, 0xAE, 0xAF,
 0xB0, 0xB1, 0xB2, 0xB3, 0xB4, 0xB5, 0xB6, 0xB7,
 0xB8, 0xB9, 0xBA, 0xBB, 0xBC, 0xBD, 0xBE, 0xBF,
 0xCA, 0xCB, 0xCC, 0xCD, 0xCE, 0xCF, 0xDA, 0xDB,
 0xDC, 0xDD, 0xDE, 0xDF, 0xEA, 0xEB, 0xEC, 0xED,
 0xEE, 0xEF, 0xFA, 0xFB, 0xFC, 0xFD, 0xFE, 0xFF
 };


/*************** Start of LZHUF unchanged code section ***************/


static void Error(char *message)
{
 fprintf(stderr,"\n%s\n", message);
 exit(EXIT_FAILURE);
}

/********** LZSS compression **********/

#define N 4096 /* buffer size */
#define F 15 /* lookahead buffer size */
#define THRESHOLD 2
#define NIL N /* leaf of tree */

unsigned char text_buf[N + F - 1];


static unsigned short match_position, match_length,
 lson[N + 1], rson[N + 257], dad[N + 1];

static void InitTree(void) /* initialize trees */
{
 short i;

 for (i = N + 1; i <= N + 256; i++)
 rson[i] = NIL; /* root */
 for (i = 0; i < N; i++)
 dad[i] = NIL; /* node */
}

static void InsertNode(short r) /* insert to tree */
{
 short i, p, cmp;
 unsigned char *key;
 unsigned short c;

 cmp = 1;
 key = &text_buf[r];
 p = N + 1 + key[0];
 rson[r] = lson[r] = NIL;
 match_length = 0;
 for ( ; ; ) {
 if (cmp >= 0) {
 if (rson[p] != NIL)
 p = rson[p];
 else {
 rson[p] = r;
 dad[r] = p;
 return;
 }
 } else {
 if (lson[p] != NIL)
 p = lson[p];
 else {
 lson[p] = r;
 dad[r] = p;
 return;
 }
 }
 for (i = 1; i < F; i++)
 if ((cmp = key[i] - text_buf[p + i]) != 0)
 break;
 if (i > THRESHOLD) {
 if (i > match_length) {
 match_position = ((r - p) & (N - 1)) - 1;
 if ((match_length = i) >= F)
 break;
 }
 if (i == match_length) {
 if ((c = ((r - p) & (N - 1)) - 1) < match_position) {
 match_position = c;
 }
 }
 }
 }
 dad[r] = dad[p];

 lson[r] = lson[p];
 rson[r] = rson[p];
 dad[lson[p]] = r;
 dad[rson[p]] = r;
 if (rson[dad[p]] == p)
 rson[dad[p]] = r;
 else
 lson[dad[p]] = r;
 dad[p] = NIL; /* remove p */
}

static void DeleteNode(short p) /* remove from tree */
{
 short q;

 if (dad[p] == NIL)
 return; /* not registered */
 if (rson[p] == NIL)
 q = lson[p];
 else
 if (lson[p] == NIL)
 q = rson[p];
 else {
 q = lson[p];
 if (rson[q] != NIL) {
 do {
 q = rson[q];
 } while (rson[q] != NIL);
 rson[dad[q]] = lson[q];
 dad[lson[q]] = dad[q];
 lson[q] = lson[p];
 dad[lson[p]] = q;
 }
 rson[q] = rson[p];
 dad[rson[p]] = q;
 }
 dad[q] = dad[p];
 if (rson[dad[p]] == p)
 rson[dad[p]] = q;
 else
 lson[dad[p]] = q;
 dad[p] = NIL;
}

/* Huffman coding */

#define N_CHAR (256 - THRESHOLD + F)
 /* kinds of characters (character code = 0..N_CHAR-1) */
#define T (N_CHAR * 2 - 1) /* size of table */
#define R (T - 1) /* position of root */
#define MAX_FREQ 0x8000 /* updates tree when the */
 /* root frequency comes to this value. */

typedef unsigned char uchar;


/* table for encoding and decoding the upper 6 bits of position */

/* for encoding */

uchar p_len[64] = {
 0x03, 0x04, 0x04, 0x04, 0x05, 0x05, 0x05, 0x05,
 0x05, 0x05, 0x05, 0x05, 0x06, 0x06, 0x06, 0x06,
 0x06, 0x06, 0x06, 0x06, 0x06, 0x06, 0x06, 0x06,
 0x07, 0x07, 0x07, 0x07, 0x07, 0x07, 0x07, 0x07,
 0x07, 0x07, 0x07, 0x07, 0x07, 0x07, 0x07, 0x07,
 0x07, 0x07, 0x07, 0x07, 0x07, 0x07, 0x07, 0x07,
 0x08, 0x08, 0x08, 0x08, 0x08, 0x08, 0x08, 0x08,
 0x08, 0x08, 0x08, 0x08, 0x08, 0x08, 0x08, 0x08
};
uchar p_code[64] = {
 0x00, 0x20, 0x30, 0x40, 0x50, 0x58, 0x60, 0x68,
 0x70, 0x78, 0x80, 0x88, 0x90, 0x94, 0x98, 0x9C,
 0xA0, 0xA4, 0xA8, 0xAC, 0xB0, 0xB4, 0xB8, 0xBC,
 0xC0, 0xC2, 0xC4, 0xC6, 0xC8, 0xCA, 0xCC, 0xCE,
 0xD0, 0xD2, 0xD4, 0xD6, 0xD8, 0xDA, 0xDC, 0xDE,
 0xE0, 0xE2, 0xE4, 0xE6, 0xE8, 0xEA, 0xEC, 0xEE,
 0xF0, 0xF1, 0xF2, 0xF3, 0xF4, 0xF5, 0xF6, 0xF7,
 0xF8, 0xF9, 0xFA, 0xFB, 0xFC, 0xFD, 0xFE, 0xFF
};

/* for decoding */
uchar d_code[256] = {
 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01,
 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01,
 0x02, 0x02, 0x02, 0x02, 0x02, 0x02, 0x02, 0x02,
 0x02, 0x02, 0x02, 0x02, 0x02, 0x02, 0x02, 0x02,
 0x03, 0x03, 0x03, 0x03, 0x03, 0x03, 0x03, 0x03,
 0x03, 0x03, 0x03, 0x03, 0x03, 0x03, 0x03, 0x03,
 0x04, 0x04, 0x04, 0x04, 0x04, 0x04, 0x04, 0x04,
 0x05, 0x05, 0x05, 0x05, 0x05, 0x05, 0x05, 0x05,
 0x06, 0x06, 0x06, 0x06, 0x06, 0x06, 0x06, 0x06,
 0x07, 0x07, 0x07, 0x07, 0x07, 0x07, 0x07, 0x07,
 0x08, 0x08, 0x08, 0x08, 0x08, 0x08, 0x08, 0x08,
 0x09, 0x09, 0x09, 0x09, 0x09, 0x09, 0x09, 0x09,
 0x0A, 0x0A, 0x0A, 0x0A, 0x0A, 0x0A, 0x0A, 0x0A,
 0x0B, 0x0B, 0x0B, 0x0B, 0x0B, 0x0B, 0x0B, 0x0B,
 0x0C, 0x0C, 0x0C, 0x0C, 0x0D, 0x0D, 0x0D, 0x0D,
 0x0E, 0x0E, 0x0E, 0x0E, 0x0F, 0x0F, 0x0F, 0x0F,
 0x10, 0x10, 0x10, 0x10, 0x11, 0x11, 0x11, 0x11,
 0x12, 0x12, 0x12, 0x12, 0x13, 0x13, 0x13, 0x13,
 0x14, 0x14, 0x14, 0x14, 0x15, 0x15, 0x15, 0x15,
 0x16, 0x16, 0x16, 0x16, 0x17, 0x17, 0x17, 0x17,
 0x18, 0x18, 0x19, 0x19, 0x1A, 0x1A, 0x1B, 0x1B,
 0x1C, 0x1C, 0x1D, 0x1D, 0x1E, 0x1E, 0x1F, 0x1F,
 0x20, 0x20, 0x21, 0x21, 0x22, 0x22, 0x23, 0x23,
 0x24, 0x24, 0x25, 0x25, 0x26, 0x26, 0x27, 0x27,
 0x28, 0x28, 0x29, 0x29, 0x2A, 0x2A, 0x2B, 0x2B,
 0x2C, 0x2C, 0x2D, 0x2D, 0x2E, 0x2E, 0x2F, 0x2F,
 0x30, 0x31, 0x32, 0x33, 0x34, 0x35, 0x36, 0x37,
 0x38, 0x39, 0x3A, 0x3B, 0x3C, 0x3D, 0x3E, 0x3F,
};

uchar d_len[256] = {
 0x03, 0x03, 0x03, 0x03, 0x03, 0x03, 0x03, 0x03,

 0x03, 0x03, 0x03, 0x03, 0x03, 0x03, 0x03, 0x03,
 0x03, 0x03, 0x03, 0x03, 0x03, 0x03, 0x03, 0x03,
 0x03, 0x03, 0x03, 0x03, 0x03, 0x03, 0x03, 0x03,
 0x04, 0x04, 0x04, 0x04, 0x04, 0x04, 0x04, 0x04,
 0x04, 0x04, 0x04, 0x04, 0x04, 0x04, 0x04, 0x04,
 0x04, 0x04, 0x04, 0x04, 0x04, 0x04, 0x04, 0x04,
 0x04, 0x04, 0x04, 0x04, 0x04, 0x04, 0x04, 0x04,
 0x04, 0x04, 0x04, 0x04, 0x04, 0x04, 0x04, 0x04,
 0x04, 0x04, 0x04, 0x04, 0x04, 0x04, 0x04, 0x04,
 0x05, 0x05, 0x05, 0x05, 0x05, 0x05, 0x05, 0x05,
 0x05, 0x05, 0x05, 0x05, 0x05, 0x05, 0x05, 0x05,
 0x05, 0x05, 0x05, 0x05, 0x05, 0x05, 0x05, 0x05,
 0x05, 0x05, 0x05, 0x05, 0x05, 0x05, 0x05, 0x05,
 0x05, 0x05, 0x05, 0x05, 0x05, 0x05, 0x05, 0x05,
 0x05, 0x05, 0x05, 0x05, 0x05, 0x05, 0x05, 0x05,
 0x05, 0x05, 0x05, 0x05, 0x05, 0x05, 0x05, 0x05,
 0x05, 0x05, 0x05, 0x05, 0x05, 0x05, 0x05, 0x05,
 0x06, 0x06, 0x06, 0x06, 0x06, 0x06, 0x06, 0x06,
 0x06, 0x06, 0x06, 0x06, 0x06, 0x06, 0x06, 0x06,
 0x06, 0x06, 0x06, 0x06, 0x06, 0x06, 0x06, 0x06,
 0x06, 0x06, 0x06, 0x06, 0x06, 0x06, 0x06, 0x06,
 0x06, 0x06, 0x06, 0x06, 0x06, 0x06, 0x06, 0x06,
 0x06, 0x06, 0x06, 0x06, 0x06, 0x06, 0x06, 0x06,
 0x07, 0x07, 0x07, 0x07, 0x07, 0x07, 0x07, 0x07,
 0x07, 0x07, 0x07, 0x07, 0x07, 0x07, 0x07, 0x07,
 0x07, 0x07, 0x07, 0x07, 0x07, 0x07, 0x07, 0x07,
 0x07, 0x07, 0x07, 0x07, 0x07, 0x07, 0x07, 0x07,
 0x07, 0x07, 0x07, 0x07, 0x07, 0x07, 0x07, 0x07,
 0x07, 0x07, 0x07, 0x07, 0x07, 0x07, 0x07, 0x07,
 0x08, 0x08, 0x08, 0x08, 0x08, 0x08, 0x08, 0x08,
 0x08, 0x08, 0x08, 0x08, 0x08, 0x08, 0x08, 0x08,
};

unsigned short freq[T + 1]; /* frequency table */

short prnt[T + N_CHAR];/* pointers to parent nodes, except for the */
 /* elements [T..T + N_CHAR - 1] which are used to get */
 /* the positions of leaves corresponding to the codes. */

short son[T]; /* pointers to child nodes (son[], son[] + 1) */

unsigned short getbuf = 0;
uchar getlen = 0;

static short GetBit(void) /* get one bit */
{
 unsigned short i;

 while (getlen <= 8) {
 if ((short)(i = getc(infile)) < 0) i = 0;
 getbuf |= i << (8 - getlen);
 getlen += 8;
 }
 i = getbuf;
 getbuf <<= 1;
 getlen--;
 return (short)((i & 0x8000) >> 15);
}


static short GetByte(void) /* get one byte */
{
 unsigned short i;

 while (getlen <= 8) {
 if ((short)(i = getc(infile)) < 0) i = 0;
 getbuf |= i << (8 - getlen);
 getlen += 8;
 }
 i = getbuf;
 getbuf <<= 8;
 getlen -= 8;
 return (short)((i & 0xff00) >> 8);
}

unsigned short putbuf = 0;
uchar putlen = 0;


/* output c bits of code */

static void Putcode(short l, unsigned long c)
{
 putbuf |= c >> putlen;
 if ((putlen += l) >= 8) {
 if (putc(putbuf >> 8, outfile) == EOF) {
 Error(wterr);
 }
 if ((putlen -= 8) >= 8) {
 if (putc(putbuf, outfile) == EOF) {
 Error(wterr);
 }
 codesize += 2;
 putlen -= 8;
 putbuf = c << (l - putlen);
 } else {
 putbuf <<= 8;
 codesize++;
 }
 }
}


/* initialization of tree */

static void StartHuff(void)
{
 short i, j;

 for (i = 0; i < N_CHAR; i++) {
 freq[i] = 1;
 son[i] = i + T;
 prnt[i + T] = i;
 }
 i = 0; j = N_CHAR;
 while (j <= R) {
 freq[j] = freq[i] + freq[i + 1];
 son[j] = i;
 prnt[i] = prnt[i + 1] = j;

 i += 2; j++;
 }
 freq[T] = 0xffff;
 prnt[R] = 0;
}


/* reconstruction of tree */

static void reconst(void)
{
 short i, j, k;
 unsigned short f, l;

 /* collect leaf nodes in the first half of the table */
 /* and replace the freq by (freq + 1) / 2. */
 j = 0;
 for (i = 0; i < T; i++) {
 if (son[i] >= T) {
 freq[j] = (freq[i] + 1) / 2;
 son[j] = son[i];
 j++;
 }
 }
 /* begin constructing tree by connecting sons */
 for (i = 0, j = N_CHAR; j < T; i += 2, j++) {
 k = i + 1;
 f = freq[j] = freq[i] + freq[k];
 for (k = j - 1; f < freq[k]; k--);
 k++;
 l = (j - k) * 2;
 memmove(&freq[k + 1], &freq[k], l);
 freq[k] = f;
 memmove(&son[k + 1], &son[k], l);
 son[k] = i;
 }
 /* connect prnt */
 for (i = 0; i < T; i++) {
 if ( (k = son[i]) >= T ) {
 prnt[k] = i;
 } else {
 prnt[k] = prnt[ k + 1 ] = i;
 }
 }
}


/* increment frequency of given code by one, and update tree */
static void update(unsigned short c)
{
 short i, j, k, l;

 if (freq[R] == MAX_FREQ) {
 reconst();
 }
 c = prnt[c + T];
 do {
 k = ++freq[c];


 /* if the order is disturbed, exchange nodes */
 if (k > freq[l = c + 1]) {
 while (k > freq[++l]);
 l--;
 freq[c] = freq[l];
 freq[l] = k;

 i = son[c];
 prnt[i] = l;
 if (i < T) prnt[i + 1] = l;

 j = son[l];
 son[l] = i;

 prnt[j] = c;
 if (j < T) prnt[j + 1] = c;
 son[c] = j;

 c = l;
 }
 } while ((c = prnt[c]) != 0); /* repeat up to root */
}
unsigned short code, len;

static void EncodeChar(unsigned short c)
{
 unsigned short i;
 short j, k;
 i = 0;
 j = 0;
 k = prnt[c + T];

 /* travel from leaf to root */
 do {
 i >>= 1;

 /* if node's address is odd-numbered, choose bigger brother node */
 if (k & 1) i += 0x8000;

 j++;
 } while ((k = prnt[k]) != R);
 Putcode(j, i);
 code = i;
 len = j;
 update(c);
}

static void EncodePosition(unsigned short c)
{
 unsigned short i;

 /* output upper 6 bits by table lookup */
 i = c >> 6;
 Putcode((short) p_len[i],(unsigned long) p_code[i] << 8);
 /* output lower 6 bits verbatim */
 Putcode(6, (c & 0x3f) << 10);
}

static void EncodeEnd(void)

{
 if (putlen) {
 if (putc(putbuf >> 8, outfile) == EOF) {
 Error(wterr);
 }
 codesize++;
 }
}

static short DecodeChar(void)
{
 unsigned short c;

 c = son[R];

 /* travel from root to leaf, */
 /* choosing the smaller child node (son[]) if the read bit is 0, */
 /* the bigger (son[]+1) if 1 */
 while (c < T) {
 c += GetBit();
 c = son[c];

 }
 c -= T;
 update(c);
 return (short)c;
}

static short DecodePosition(void)
{
 unsigned short i, j, c;

 /* recover upper 6 bits from table */
 i = GetByte();
 c = (unsigned short)d_code[i] << 6;
 j = d_len[i];

 /* read lower 6 bits verbatim */
 j -= 2;
 while (j--) {
 i = (i << 1) + GetBit();
 }
 return (short)(c | (i & 0x3f));
}

/******************* End of LZHUF unchanged code section ***************/


/* translate ascii to ebcdic and pad record length */

static void Translate(unsigned short c)
{
 unsigned short l = 0;
 static unsigned short lpos = 0, b = 0x40;

 if ( origin == 0x1F ) { /* DOS origin - MVS decoding */
 if ( b == 0x0D && c == 0x0A && lpos > 0 ) {
 /* 0x0D CR and 0x0A LF indicating DOS end of record
 with line position (lpos) active */

 for ( l = lpos; l < llen; l++)
 /* pad with EBCDIC spaces */
 if ( putc(0x40, outfile ) == EOF ) Error(wterr);
 lpos = 0; /* reset line position */
 }
 if ( c > 0x1F ) { /* skip DOS control characters */
 /* translate ascii to ebcdic */
 if ( putc(ebcdic[c], outfile ) == EOF ) Error(wterr);
 /* keep track of line position if required */
 if ( llen > 0 ) lpos++ ;
 }
 } else { /* MVS origin - MVS decoding */
 if ( putc(c, outfile ) == EOF ) Error(wterr);
 }
 b = c; /* since CR LF is sequential, b must retain c */
}


/* compression */

static void Encode(void)
{
 short i, c, len, r, s, last_match_length;

 fputc(0x1E,outfile); /* designate MVS origin */

 fputc((short)((textsize & 0xff000000L) >> 24),outfile);
 fputc((short)((textsize & 0xff0000L) >> 16),outfile);
 fputc((short)((textsize & 0xff00) >> 8),outfile);
 fputc((short)((textsize & 0xff)),outfile);
 if (ferror(outfile))
 Error(wterr); /* output size of text */
 if (textsize == 0)
 return;

 printf("In : %ld bytes\n", textsize);
 textsize = 0; /* rewind and re-read */
 StartHuff();
 InitTree();
 s = 0;
 r = N - F;
 for (i = s; i < r; i++)
 text_buf[i] = 0x20;
 for (len = 0; len < F && (c = getc(infile)) != EOF; len++)
 text_buf[r + len] = c;
 textsize = len;
 for (i = 1; i <= F; i++)
 InsertNode((short) (r - i));
 InsertNode(r);
 do {
 if (match_length > len)
 match_length = len;
 if (match_length <= THRESHOLD) {
 match_length = 1;
 EncodeChar((unsigned short) text_buf[r]);
 } else {
 EncodeChar((unsigned short) (255 - THRESHOLD + match_length));
 EncodePosition(match_position);
 }

 last_match_length = match_length;
 for (i = 0; i < last_match_length &&
 (c = getc(infile)) != EOF; i++) {
 DeleteNode(s);
 text_buf[s] = c;
 if (s < F - 1)
 text_buf[s + N] = c;
 s = (s + 1) & (N - 1);
 r = (r + 1) & (N - 1);
 InsertNode(r);
 }
 if ((textsize += i) > printcount) {
 printcount += 4096;
 }
 while (i++ < last_match_length) {
 DeleteNode(s);
 s = (s + 1) & (N - 1);
 r = (r + 1) & (N - 1);
 if (--len) InsertNode(r);
 }
 } while (len > 0);
 EncodeEnd();
 printf("\nOut: %ld bytes\n", codesize);
 printf("Out/In: %.3f\n", (double)codesize / textsize);
}

static void Decode(void) /* recover */
{
 short i, j, k, r, c;
 unsigned long count;

 textsize = (unsigned char) fgetc(infile);
 textsize <<= 8;
 textsize |= (unsigned char) fgetc(infile);
 textsize <<= 8;
 textsize |= (unsigned char) fgetc(infile);
 textsize <<= 8;
 textsize |= (unsigned char) fgetc(infile);
 if (ferror(infile))
 Error("Can't read"); /* read size of text */
 if (textsize == 0)
 return;
 printf("Original Text size = %ld\n", textsize );
 StartHuff();
 for (i = 0; i < N - F; i++)
 text_buf[i] = 0x20;
 r = N - F;
 for (count = 0; count < textsize; ) {
 c = DecodeChar();
 if (c < 256) {
 Translate(c);
 text_buf[r++] = c;
 r &= (N - 1);
 count++;
 } else {
 i = (r - DecodePosition() - 1) & (N - 1);
 j = c - 255 + THRESHOLD;
 for (k = 0; k < j; k++) {
 c = text_buf[(i + k) & (N - 1)];

 Translate(c);
 text_buf[r++] = c;
 r &= (N - 1);
 count++;
 }
 }
 if (count > printcount) {
 printcount += 4096;
 }
 }
 /* forcing DOS CR LF on Translate ensures proper closing of MVS dataset */
 if ( origin == 0x1F ) {
 Translate(0x0D);
 Translate(0x0A);
 }
 printf("\nRestored\n");
}

int main(int argc, char *argv[])
{
 char *s;

 printf("\nLZH 1.02 - Multi-Platform Compression/Decoding,\n"
 " Based on LZHUF written by Haruyasu Yoshizaki 1988 (Japan),\n"
 " modified by Paul Edwards 1990 (Australia),\n"
 " and Mark Nelson 1990 (USA),\n"
 " IBM/MVS<>PC/DOS by Pierre Dion 1993 (Canada),\n");

 if (argc < 4) {
 printf("'lzh e file1 file2' encodes file1 into file2.\n"
 "'lzh d file2 file1 l' decodes file2 into file1.\n"
 " 'l' specifies record length,\n"
 " LZH will pad original ascii file to record length\n");
 return EXIT_FAILURE;
 }
 if ((s = argv[1], s[1] || strpbrk(s, "DEde") == NULL)
 || (s = argv[2], (infile = fopen(s, "rb")) == NULL)
 || (s = argv[3], (outfile = fopen(s, "wb")) == NULL)) {
 printf("??? %s\n", s);
 return EXIT_FAILURE;
 }
 if (toupper(*argv[1]) == 'E') { /* a workaround for relative file access */
 while ( getc( infile ) != EOF ) textsize++;
 fclose( infile );
 if (s = argv[2], (infile = fopen(s, "rb")) == NULL) {
 printf("Cannot re-open %s\n", s);
 return EXIT_FAILURE;
 }
 Encode();
 } else {
 if (argv[4] != NULL) llen = atol(argv[4]);
 origin = fgetc(infile); /* determine file origin */
 if ( origin == 0x1E || origin == 0x1F ) { /* MVS or PC DOS */
 Decode();
 } else
 printf("\nError: %s is not a LZH compressed file\n",argv[2]);
 }
 fclose(infile);
 fclose(outfile);

 return EXIT_SUCCESS;
}


[LISTING 7] DOS LZH

/**********************************************************************
 DOS LZH based on lzhuf.c
 written by Haruyasu Yoshizaki 1988/11/20
 see MVS LZH listing for comments
 DOS version modified 1993/03/05 by Pierre Dion.
 Added Translate() for EBCDIC to ASCII translation and inserting
 0x0D and 0x0A (DOS CR LF) for decoding MVS record format.
 Encoded file header modified to designate origin 0x1E for
 MVS and 0x1F for DOS.
***********************************************************************/

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>

FILE *infile, *outfile;
static unsigned long int textsize = 0, codesize = 0, printcount = 0;
char wterr[] = "Can't write.";

unsigned long int llen = 0;
unsigned char origin = 0x1F; /* ascii origin */
 /* ebcdic to ascii translation code page 437 */
unsigned char ascii[256] = {
 0x00, 0x01, 0x02, 0x03, 0x9C, 0x09, 0x86, 0x7F,
 0x97, 0x8D, 0x8E, 0x0B, 0x0C, 0x0D, 0x0E, 0x0F,
 0x10, 0x11, 0x12, 0x13, 0x9D, 0x85, 0x08, 0x87,
 0x18, 0x19, 0x92, 0x8F, 0x1C, 0x1D, 0x1E, 0x1F,
 0x80, 0x81, 0x82, 0x83, 0x84, 0x0A, 0x17, 0x1B,
 0x88, 0x89, 0x8A, 0x8B, 0x8C, 0x05, 0x06, 0x07,
 0x90, 0x91, 0x16, 0x93, 0x94, 0x95, 0x96, 0x04,
 0x98, 0x99, 0x9A, 0x9B, 0x14, 0x15, 0x9E, 0x1A,
 0x20, 0xA0, 0xA1, 0xA2, 0xA3, 0xA4, 0xA5, 0xA6,
 0xA7, 0xA8, 0x5B, 0x2E, 0x3C, 0x28, 0x2B, 0x5D,
 0x26, 0xA9, 0xAA, 0xAB, 0xAC, 0xAD, 0xAE, 0xAF,
 0xB0, 0xB1, 0x21, 0x24, 0x2A, 0x29, 0x3B, 0x5E,
 0x2D, 0x2F, 0xB2, 0xB3, 0xB4, 0xB5, 0xB6, 0xB7,
 0xB8, 0xB9, 0x7C, 0x2C, 0x25, 0x5F, 0x3E, 0x3F,
 0xBA, 0xBB, 0xBC, 0xBD, 0xBE, 0xBF, 0xC0, 0xC1,
 0xC2, 0x60, 0x3A, 0x23, 0x40, 0x27, 0x3D, 0x22,
 0xC3, 0x61, 0x62, 0x63, 0x64, 0x65, 0x66, 0x67,
 0x68, 0x69, 0xC4, 0xC5, 0xC6, 0xC7, 0xC8, 0xC9,
 0xCA, 0x6A, 0x6B, 0x6C, 0x6D, 0x6E, 0x6F, 0x70,
 0x71, 0x72, 0xCB, 0xCC, 0xCD, 0xCE, 0xCF, 0xD0,
 0xD1, 0x7E, 0x73, 0x74, 0x75, 0x76, 0x77, 0x78,
 0x79, 0x7A, 0xD2, 0xD3, 0xD4, 0xD5, 0xD6, 0xD7,
 0xD8, 0xD9, 0xDA, 0xDB, 0xDC, 0xDD, 0xDE, 0xDF,
 0xE0, 0xE1, 0xE2, 0xE3, 0xE4, 0xE5, 0xE6, 0xE7,
 0x7B, 0x41, 0x42, 0x43, 0x44, 0x45, 0x46, 0x47,
 0x48, 0x49, 0xE8, 0xE9, 0xEA, 0xEB, 0xEC, 0xED,
 0x7D, 0x4A, 0x4B, 0x4C, 0x4D, 0x4E, 0x4F, 0x50,
 0x51, 0x52, 0xEE, 0xEF, 0xF0, 0xF1, 0xF2, 0xF3,
 0x5C, 0x9F, 0x53, 0x54, 0x55, 0x56, 0x57, 0x58,

 0x59, 0x5A, 0xF4, 0xF5, 0xF6, 0xF7, 0xF8, 0xF9,
 0x30, 0x31, 0x32, 0x33, 0x34, 0x35, 0x36, 0x37,
 0x38, 0x39, 0xFA, 0xFB, 0xFC, 0xFD, 0xFE, 0xFF
 };
/************* Start of LZHUF unchanged code section ***************/
/* see MVS LZH listing 6 for section source */
/************* End of LZHUF unchanged code section ******************/
/* translate ebcdic to ascii and record length control */
static void Translate( unsigned short c )
{
 static unsigned short lpos = 0;
 if ( origin == 0x1E ) { /* MVS origin - DOS decoding */
 /* convert ebcdic to ascii from table */
 if ( putc( ascii[c], outfile ) == EOF ) Error(wterr);
 /* perform record length control as requested */
 if (llen > 0) {
 if (lpos == llen) {
 /* insert end of record DOS CR LF */
 if ( ( putc( 0x0D, outfile ) == EOF ) ||
 ( putc( 0x0A, outfile ) == EOF ) ) Error(wterr);
 /* reset line position */
 lpos = 0;
 } else {
 /* keep track of line position */
 lpos++;
 }
 }
 } else { /* DOS origin - MVS decoding */
 if ( putc( c, outfile ) == EOF ) Error(wterr);
 }
}
/* compression */
static void Encode(void)
{
 short i, c, len, r, s, last_match_length;
 unsigned long bar = 0;
 fseek(infile, 0L, 2);
 textsize = ftell(infile);
 fputc(0x1F,outfile); /* output ascii origin */
 fputc((short)((textsize & 0xff000000L) >> 24),outfile);
 fputc((short)((textsize & 0xff0000L) >> 16),outfile);
 fputc((short)((textsize & 0xff00) >> 8),outfile);
 fputc((short)((textsize & 0xff)),outfile);
 if (ferror(outfile))
 Error(wterr); /* output size of text */
 if (textsize == 0)
 return;
 bar = textsize / 39; /* encode status bar */
 printf("In : %ld bytes\n"
 "--10--20--30--40--50--60--70--80--90--100%%\r", textsize);
 rewind(infile);
 textsize = 0; /* rewind and re-read */
 StartHuff();
 InitTree();
 s = 0;
 r = N - F;
 for (i = s; i < r; i++)
 text_buf[i] = 0x20;
 for (len = 0; len < F && (c = getc(infile)) != EOF; len++)

 text_buf[r + len] = c;
 textsize = len;
 for (i = 1; i <= F; i++)
 InsertNode(r - i);
 InsertNode(r);
 do {
 if (match_length > len)
 match_length = len;
 if (match_length <= THRESHOLD) {
 match_length = 1;
 EncodeChar(text_buf[r]);
 } else {
 EncodeChar(255 - THRESHOLD + match_length);
 EncodePosition(match_position);
 }
 last_match_length = match_length;
 for (i = 0; i < last_match_length &&
 (c = getc(infile)) != EOF; i++) {
 DeleteNode(s);
 text_buf[s] = c;
 if (s < F - 1)
 text_buf[s + N] = c;
 s = (s + 1) & (N - 1);
 r = (r + 1) & (N - 1);
 InsertNode(r);
 }
 if ((textsize += i) > printcount) {
 printcount += bar;
 printf("%c",0xFE);
 }
 while (i++ < last_match_length) {
 DeleteNode(s);
 s = (s + 1) & (N - 1);
 r = (r + 1) & (N - 1);
 if (--len) InsertNode(r);
 }
 } while (len > 0);
 EncodeEnd();
 printf(" \nOut: %ld bytes\n", codesize);
 printf("Out/In: %.3f\n", (double)codesize / textsize);
}
static void Decode(void) /* recover */
{
 unsigned short i, j, k, r, c;
 unsigned long count, bar = 0;

 textsize = (unsigned char) fgetc(infile);
 textsize <<= 8;
 textsize |= (unsigned char) fgetc(infile);
 textsize <<= 8;
 textsize |= (unsigned char) fgetc(infile);
 textsize <<= 8;
 textsize |= (unsigned char) fgetc(infile);
 if (ferror(infile))
 Error("Can't read"); /* read size of text */
 if (textsize == 0)
 return;
 bar = textsize / 40; /* decode status bar */
 printf("In : %ld bytes\n"

 "--10--20--30--40--50--60--70--80--90--100%%\r", textsize);
 StartHuff();
 for (i = 0; i < N - F; i++)
 text_buf[i] = 0x20;
 r = N - F;
 for (count = 0; count < textsize; ) {
 c = DecodeChar();
 if (c < 256) {
 Translate(c);
 text_buf[r++] = c;
 r &= (N - 1);
 count++;
 } else {
 i = (r - DecodePosition() - 1) & (N - 1);
 j = c - 255 + THRESHOLD;
 for (k = 0; k < j; k++) {
 c = text_buf[(i + k) & (N - 1)];
 Translate(c);
 text_buf[r++] = c;
 r &= (N - 1);
 count++;
 }
 }
 if (count > printcount) {
 printcount += bar;
 printf("%c",0xFE);
 }
 }
 printf(" \nRestored\n");
}
int main(int argc, char *argv[])
{
 char *s;
 printf("\nLZH ver 1.02 - Multi-platform Data Compression/Decoding,\n"
 "Based on LZHUF written by Haruyasu Yoshizaki 1988 (Japan),\n"
 " modified by Paul Edwards 1990 (Australia),\n"
 " and Mark Nelson 1990 (USA),\n"
 "IBM/MVS<>PC/DOS by Pierre Dion 1993 (Canada),\n\n");

 if (argc < 4) {
 printf("'lzh e file1 file2' encodes file1 into file2.\n"
 "'lzh d file2 file1 l' decodes file2 into file1.\n"
 " 'l' (optional) specifies line length for crlf\n"
 " as defined by the original ebcdic file and\n"
 " must be expressed as an integer.\n"
 " Examples: lzh e datafile compfile\n"
 " lzh d compfile datafile 150\n");
 return EXIT_FAILURE;
 }
 if ((s = argv[1], s[1] || strpbrk(s, "DEde") == NULL)
 || (s = argv[2], (infile = fopen(s, "rb")) == NULL)
 || (s = argv[3], (outfile = fopen(s, "wb")) == NULL)) {
 printf("??? %s\n", s);
 return EXIT_FAILURE;
 }
 if (toupper(*argv[1]) == 'E') {
 Encode();
 } else {
 if (argv[4] != NULL) llen = atol(argv[4]);


 origin = fgetc(infile); /* read infile origin */

 if ( origin == 0x1E || origin == 0x1F ) /* MVS or DOS */
 Decode();
 else
 printf("\nError: %s is not a LZH compressed file.\n",argv[2]);
 }
 fclose(infile);
 fclose(outfile);
 return EXIT_SUCCESS;
}




December, 1993
Database Interoperability and Application Transportability


Tool strategies for database interoperability




Edward Dowgiallo


Ed is an independent database developer and can be reached at P.O. Box 390,
Effort, PA 18330, or on MCI Mail at 590-6310.


In the database world, "interoperability" refers to the ability to build
applications that can simultaneously access data in different databases
provided by different vendors. And since most databases are supposedly
compliant with ANSI/ISO/IFIPS SQL standards, you'd expect smooth
interoperability between them. After all, with relatively little trouble, you
can compile C programs using different ANSI C compilers. But the database
world is different.
Transportability, on the other hand, is the ability to move an application
from one network/hardware combination to another while continuing to use the
same database. Being able to do this lets you scale hardware to match changes
in the size of the user organization and the volume of application traffic. In
another sense, transportability is the ability to write an application and
move it from one vendor's database to another.
Interoperability and transportability are achieved using various strategies,
among them: single vendor, 4GL, middleware, gateway, 3GL, and CASE. Each
strategy is most appropriate in certain situations, and every major database
provider supports two or three--even though none are 100 percent effective.


The Single-vendor Strategy


In the single-vendor approach to an open-systems environment, you achieve a
heterogeneous hardware environment by standardizing on the database vendor. By
leveraging the UNIX operating system, most of the database vendors will claim
to support upwards of 100 operating environments. The vendor supplies you with
the database engine, application-development tools, and (for client/server
application architectures) a networking layer that makes the particular
network you're using transparent to the vendor's engine and tools. If you
adopt this strategy, you can scale your database server with little impact on
application code. The cost is that you're locked into a single vendor.
This strategy has its complications. Vendors gear their development efforts
toward systems used by most of their customers. Thus, new versions are slow to
appear on otherwise popular operating systems not traditionally used to
support database servers. SCO UNIX is one example of this.
Networking is not always well implemented because, again, vendors put their
efforts where customers are, not where they might be going. Thus, a typical
vendor will support TCP/IP for its UNIX servers, SNA protocols for IBM
servers, and SPX/IPX for PC-based servers. But what if you'd like a PC server
to participate in a distributed application with a UNIX or IBM server?
Many companies with PC support under Oracle v6 discovered that they would have
to implement TCP/IP on all their PCs to support Oracle's SQL*Net to standard
UNIX servers. Even under Oracle v7, instead of committing to implementing
SPX/IPX listeners for the various UNIX servers, Oracle rewrote SQL*Net to
permit the equivalent of a SQL*Net router which converts packets from one
protocol to another.
Support is also an issue in the single-vendor strategy. Of the numerous
platforms that a database might claim to support, many are not directly
supported by the vendor. If you're buying Oracle for a Silicon Graphics
computer, for instance, the port to the Silicon Graphics version of UNIX was
done by Silicon Graphics. That machine doesn't have enough database customers
to warrant Oracle investing in a port. Silicon Graphics wants Oracle on their
box, so they work with Oracle to do the port. If you run into real problems,
you'll receive support from both companies, but neither will deal with your
problem well. Oracle lacks the experience with the native environment, and the
small vendor typically doesn't have a lot of database experience. Similar
situations exist for Ingres, Sybase, Informix and others.
Finally, the single vendor you select may be slow to adopt new technologies.
Most database vendors were slow to move to PCs and even slower in supporting
GUI environments. Complete toolsets under Windows, Presentation Manager, and
so on still aren't available for most of the major vendors.


The 4GL Strategy


4GL vendors base their strategy on the fact that the database vendors tend to
leapfrog each other in engine technology. Why standardize on a database engine
when the engine you choose will be surpassed by a competitor within a year?
The benchmark competition among the database vendors is legendary, with
vendors routinely announcing new performance records on a hardware platform
different from last week's record breaker.
Instead of standardizing on the database, the 4GL strategy is to standardize
on the application-building tool. After all, you want all your applications to
have a common look-and-feel. Therefore, you should let the 4GL tool hide the
complex details of the underlying database so that you can write applications
in a standard environment and treat the database as a commodity item.
However, most 4GL tools require the networking components provided by the
underlying database. If you're using Oracle, the tool will probably need
SQL*Net; for Sybase, it will need db/library. You'll then have problems in DOS
and Windows with the number of drivers you have to load in real memory not
leaving enough room to run much of anything else.
When you need to do distributed queries across databases from more than one
vendor, the data is typically joined by the tool in your PC. This bypasses all
the global-query optimization algorithms implemented by the database vendors
to avoid the kinds of horrendous network traffic this generates. Both Paradox
and Advanced Revelation take this approach, as do most 4GL vendors that
support multivendor joins.
Even if you just want to write applications against one database at a time,
you may incur suboptimal performance. Many PC development tools have added SQL
support by providing a driver in the tool that treats the SQL database like a
flat-file manager. Anytime data is needed, a SELECT statement is issued
against a single table. The performance of this type of implementation
strategy can be devastating. Similarly, many tools simply turn on auto-commit
and have no real transaction support. If your tool supports SQL, ASCII, dBase,
and Paradox data formats, ask the vendors about the sophistication of their
drivers.
UniFace, on the other hand, has a component called a Polyserver, which resides
in the database server and implements UniFace's own data dictionary. Features
are used in the underlying database when available, and simulated when not.
UniFace thus guarantees a set of features much greater than a least-common
denominator approach--there's an intelligent driver for each database
supported. The downside is that you're committed to using UniFace's networking
and application-development components, which have been trailing their
competitors in supporting GUI features.


The Middleware Strategy


Middleware vendors include companies such as Information Builders with its
EDA/SQL product. This strategy provides a networking interface that's
standardized on both sides. This supports an open-system strategy in that the
database networking infrastructure is independent of front-end and back-end
vendors. This enables you to potentially change vendors on either side because
the change doesn't involve altering the infrastructure. If nothing else, this
gives you greater clout when negotiating contracts with database and front-end
tool vendors.
At the programming level, the benefit is that the client software uses exactly
the same call-level interface, regardless of the database on the back end.
This makes it easy for a tool vendor to support numerous databases. Simply
writing one driver for the tool that supports a popular middleware product
makes all the database engines that have a driver for that middleware product
"supported" by the tool.
However, because the tool vendor is writing to a generic interface and can't
be sure which database engine is on the other side of that interface, the SQL
supported is typically generic ANSI SQL. This virtually guarantees a
least-common denominator level of support--SQL features that ensure
performance, data integrity, and location transparency probably won't be
available to your application, which nullifies most of your reasons for
switching to SQL in the first place. This will change as the interface
standards being promulgated by the SQL Access group become more feature rich.
A variation of the middleware strategy is the RPC approach, which database
vendors use to implement their own networking components. I avoid this unless
I'm developing an application with very special requirements because RPCs are
a major complication. They bypass existing technology, increase the skill
needed by maintenance personnel, and require you to rewrite existing
applications to take advantage of RPC.


The Gateway Strategy


The gateway strategy provides a co-existence stage for converting from one
database vendor to another. The purpose of a database gateway is to make a
"foreign" database behave like the database provided by the gateway vendor.
For instance, the Oracle gateway for DB2 allows DB2 to be accessed by tools
that can interface with Oracle's SQL*Net. If you're familiar with Oracle and
want to access DB2 data from your local workstation, this is a viable
approach: You can move data very easily from DB2 into Oracle, run SQL*Plus
scripts, generate ad hoc reports, and the like.
Unfortunately, any tool that depends on the Oracle system tables won't work
since the Oracle gateway doesn't simulate these tables in DB2. That excludes
SQL*Forms and most other development tools. SQL extensions such as Oracle's
ability to do self-joins (tree traversals in an inventory parts explosion, for
example) aren't supported since DB2 doesn't have this facility, so you end up
with the intersection of the features supported by the two database engines
and the ability to use tools that do not use information in the system tables
of either database.
This isn't to pick on Oracle. Most database gateways are simply dynamic SQL
programs with little specific knowledge about the database they interface
with. Typically, such programs don't do much beyond data-type translation
between the two database systems, but they aren't expensive to develop, and
there are marketing reasons to do so.
It's more expensive to develop a higher-quality gateway that really masks the
differences between the two systems. Oracle is planning to redo their DB2
gateway to make it more functional and competitive with similar gateways from
Ingres and Sybase, but differences between databases will always be visible
through the gateway.

The gateway strategy is inappropriate as a general-purpose infrastructure
component. Information Builders has many database gateways available for
FOC-Net and EDA/SQL, but it's unwise to include applications that send
transaction-oriented production traffic through such gateways as part of a
middleware strategy.
Gateways do have good specific uses, though. It's reasonable to make a gateway
part of a decision-support infrastructure, as long as the traffic is
relatively low volume. It's also reasonable to make a gateway part of a
data-replication or data-warehouse strategy. The key is that transaction
traffic should never run directly against the gateway.


The 3GL Strategy


3GL support for SQL generally comes as precompilers and call interfaces. Of
the two, precompilers are the easier way to write transportable code. The
typical precompiler supports either an SAA or ANSI syntax of SQL. Support is
typically provided for both static and dynamic SQL. Truly static SQL requires
that the engine support some form of stored execution plan, which is usually
implemented as a stored procedure or access plan.
If you want to support a specific group of databases when using a precompiler,
the differences can be handled via the conditional compile facility. Oddly,
connection management lacks standardization, so no matter how plain-vanilla
you write the SQL, you'll still have to at least conditionally compile the
database logon and logoff. Most shops will find SQL-standardized syntax for
error checking inadequate and want to use the lower-level facility. One or
more communication-area variables are usually available. It's best to write a
subroutine, which is called after each statement, checks the return codes,
handles the issuing of any message, and returns a standardized code telling your
application whether to continue, do something over, or abort.
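Such a subroutine can be sketched in C. The sqlca structure and the specific code values below are illustrative stand-ins, not the communication area of any particular engine (real precompilers such as DB2's or Oracle's define their own):

```c
#include <stdio.h>

/* Hypothetical communication area; real precompilers define their own
   sqlca with a numeric sqlcode filled in after each SQL statement. */
struct sqlca { long sqlcode; };
struct sqlca sqlca;

/* Standardized result codes returned to the application. */
enum sql_action { SQL_CONTINUE, SQL_RETRY, SQL_ABORT };

/* Called after every SQL statement: checks the return code, issues any
   message, and tells the caller whether to continue, retry, or abort.
   The code values tested here are illustrative. */
enum sql_action check_sql(void)
{
    if (sqlca.sqlcode == 0 || sqlca.sqlcode == 100)  /* OK, or no more rows */
        return SQL_CONTINUE;
    if (sqlca.sqlcode == -911)                       /* e.g., deadlock victim */
        return SQL_RETRY;
    fprintf(stderr, "SQL error %ld\n", sqlca.sqlcode);
    return SQL_ABORT;
}
```

Because all engine-specific knowledge is concentrated in this one routine, only it (and the logon/logoff code) needs to change under the conditional compile.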
If any of the databases you want to write for only supports a call interface,
the problem becomes more complex. Most call interfaces have subroutines that
perform identical functions, although their names and parameters differ. The
basic steps in executing any SQL statement are to place the statement in a
command buffer, bind program variables, execute the statement, get a
completion code, and retrieve data returned, if any. Between a specific pair
of databases, it's usually possible to come up with a set of macros that mask
all these differences. But this isn't a trivial task. You get a break at the
call-interface level in that the differences between static- and dynamic-SQL
syntax disappear. All the SQL statements are strings passed to the interface.
The more dynamically the string is assembled at run time, the more complicated
the code, but the basic steps remain identical. Static SQL is usually only
available via a stored-procedure mechanism when using a call interface.
Conversely, all engines that support the static-SQL stored-access plan use a
precompiler interface. This static-SQL based strategy transfers the burden of
knowledge to the developer. If the developers aren't versed in several
engines, you can't achieve transportability between engines.
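The macro layer described above can be sketched as follows. The two "vendor" functions are stubs with invented names, standing in for real call interfaces that perform the same function under different names and signatures; a compile-time switch selects which one the generic SQL_EXEC macro expands to:

```c
#include <string.h>

char last_stmt[256];  /* records what the stub "engine" received */

/* Stub for a hypothetical vendor-A call interface. */
int vendorA_submit(const char *stmt)
{
    strncpy(last_stmt, stmt, sizeof last_stmt - 1);
    return 0;                                  /* 0 == success */
}

/* Stub for a hypothetical vendor-B interface with an extra flags arg. */
int vendorB_run(const char *stmt, int flags)
{
    (void)flags;
    strncpy(last_stmt, stmt, sizeof last_stmt - 1);
    return 0;
}

/* The masking layer: application code calls only SQL_EXEC, and the
   conditional compile hides the signature differences. */
#ifdef USE_VENDOR_B
#define SQL_EXEC(stmt) vendorB_run((stmt), 0)
#else
#define SQL_EXEC(stmt) vendorA_submit((stmt))
#endif
```

In a real pair of interfaces, the same approach would cover the full sequence of steps: filling the command buffer, binding variables, executing, and fetching.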


The CASE Strategy


CASE vendors are starting to provide serious support for SQL. Tools such as
IEF from Texas Instruments, Telon from Computer Associates, and APS from Sage
Software are viable for organizations doing in-house development of major
systems. For transportability and interoperability, products that generate
code can go a long way toward making the differences between database products
transparent. A CASE tool's degree of success in doing this depends on how much
custom knowledge about the target database has been built into its code
generator. If the degree of knowledge is high, you can build an application
with the CASE tool and migrate it to different operating environments just by
changing a few check boxes on the code-generator screen.
Unfortunately, while these tools are well developed for generating Cobol in an
MVS environment using various IBM front ends (CICS, IMS/DC, and so on) and
back ends (VSAM, IMS/DB, DB2, and the like), support for C, non-MVS
environments, and UNIX-based SQL database engines is only beginning to appear.
Take care to choose a tool with adequate support for the environments for
which you'll be developing. Consider, also, the degree to which particular
databases are supported, as these environments are still new to CASE
developers. Don't be surprised if the code generated resembles ANSI SQL and
ignores all the high-performance, nonstandard features of the target database
engine.













































December, 1993
REXX and the OS/2 Workplace Shell


A procedural language now, an OO language in the near future




Eric Giguere


Eric is a developer with Watcom International and the chief architect of the
VX-REXX library. He can be reached via the Internet at giguere@watcom.on.ca.


One way to approach object-oriented programming (OOP) is to ignore the
"language wars"--those debates on whether a particular language contains
polymorphically correct constructs--and to simply start programming with
objects. And if, by choice or by fiat, your language happens to be procedural,
you can still use object-oriented techniques. This article discusses using
such an approach with the REXX language, as found in OS/2 in the Workplace
Shell and also in Watcom's VX-REXX. I'll also take a peek at a true
object-oriented REXX.


A REXX Primer


REXX was developed in the early '80s by Mike Cowlishaw of IBM. It first gained
popularity on IBM mainframe systems but has since migrated to several
platforms, including OS/2 (a REXX interpreter comes with every copy of OS/2
2.x) and the Amiga.
REXX is a procedural language with all the usual features: control structures,
complex variables, operators of various kinds, and functions. Example 1 shows
a REXX program to sum the first N integers. REXX is a typeless language; no
declarations are needed. Instead, all data is stored as strings. REXX is also
designed to be interpreted and includes an INTERPRET statement for run-time
interpretation of arbitrary strings, akin to Lisp's eval.
Expandability is REXX's greatest strength. REXX has a subcommand facility that
allows it to be linked to an application and used as a macro language. New
functions can be written in C and included in your application, or packaged
together into a DLL for use by other applications.


Workplace Shell Objects


The OS/2 Workplace Shell (WPS) is an object-oriented user interface (not to be
confused with an object-oriented implementation of such an interface). WPS
objects are represented by icons that can be directly manipulated by the user.
These objects are not simple files, but independent entities derived from the
WPS class hierarchy. Programmers can extend the class hierarchy to build their
own object types.
Although WPS objects are geared toward direct user interaction, the system
provides an interface that allows REXX programs to create and manipulate these
objects. WPS objects exist independently of any user program, so the REXX
interface
provides a simple way to customize the OS/2 environment.
The REXX interface is implemented as a set of functions in the REXXUTIL.DLL
external function library. To use these functions from REXX, it's necessary to
first call the built-in RXFuncAdd function, as in Example 2(a). This must only
be done once, usually on system startup in the STARTUP.CMD file, but it
doesn't hurt to include these lines in your own REXX programs to ensure the
functions are loaded.
The SysCreateObject function is used to create a WPS object; see Example 2(b).
The first parameter, wpsclass, is the type of object to create. WPS provides
an extensive list of classes and allows programmers to create new classes. The
parameter title will be the on-screen title of the object (if it is visible).
Location is the folder in which the object is to be created--it may be a
pathname or a WPS object identifier such as <WP_DESKTOP>. The parameter
setupdata is a list of options to use in creating the object--the options
depend on the type of object being created. If you specify an object ID as one
of the setup options, the replaceflag parameter (Fail, Replace, or Update)
tells the WPS what to do if an object with that ID already exists. (For a
complete discussion of object IDs and the various setup options available,
refer to OS/2 2.1 Unleashed, by David Moskowitz and David Kerr, Sams
Publishing, 1993.)
If an object already exists, SysSetObjectData may be used to modify its data,
and SysDestroyObject to destroy it. (No function exists to retrieve an
object's data, however.) Listings One and Two (Page 100) demonstrate the use
of SysCreateObject and SysSetObjectData. Listing One, books.cmd, creates a
folder object on the desktop and populates it with program references for all
the .INF (Information) files on your system. Clicking on one of these objects
invokes the VIEW.EXE program on the respective file. Listing Two, bitmaps.cmd,
cycles through all the .BMP (bitmap) files on your system and sets the desktop
background to a new bitmap file every minute.
Typically, SysCreateObject is used to install software in appropriate folders.
A good example of this is found in buildvrx.cmd, which is available
electronically; see "Availability," page 3.


VX-REXX Objects


Watcom's VX-REXX adds Presentation Manager support to the OS/2 REXX
interpreter. An object library accessible from REXX lies at the heart of the
VX-REXX system. The library defines a complete class hierarchy of window
objects similar to what you would find in a traditional object-oriented system
like C++. The objects are defined in C using IBM's System Object Model (SOM),
allowing programmers to extend the library by subclassing an existing class
and packaging it in a DLL.
VX-REXX objects are internal to a given REXX program, but objects are accessed
through function calls. Any other method would require changes to the language
definition. This is a typical restriction when using object-oriented
techniques with procedural languages. The X Window Toolkit, for example, uses
similar techniques for programming in C.
VX-REXX objects are created with VRCreate; see Example 2(c). Since most
VX-REXX objects represent visible windows, a parent object is specified as the
first parameter, followed by the class of object to create. Object properties
(state data) may be set at creation time in pairs of arguments (property name
and initial value); see Example 3(a).
VRCreate returns an object handle, or a null string if the object could not be
created. The object handle uniquely identifies the object, but each object is
also identified by its name, a settable property. Names and object handles may
be used interchangeably. An object may be destroyed at any time by calling the
VRDestroy function. Object properties may be set and retrieved after creation
using VRSet and VRGet, as in Example 3(b). However, some properties are
read-only after an object has been created.
VX-REXX objects have methods (or "member functions" in C++ parlance). Your
program can invoke a method by using VRMethod, as follows: call VRMethod win,
'centerwindow'. Methods are procedures attached to an object, though currently
VX-REXX does not allow methods to be defined in REXX.
VX-REXX programs are event driven--once the window objects have been created,
the program sleeps until something interesting happens and an event is
generated. When an event does occur, it is placed onto a queue for the program
to fetch and process. Different objects respond to different events--a push
button, for example, generates a Click event when the user clicks on it. The
event data returned to the program is set by the user when the object is
created, and is typically the name of a procedure to call. Listing Three (page
100) is a simple example of an event-driven program.
There's more to VX-REXX than the object library. A complete design environment
allows you to create objects and set their properties by direct manipulation.
VX-REXX generates the objects and the program framework for you. You just fill
in the event procedures, using VRGet, VRSet, and VRMethod as required.
It's sometimes useful, however, to create objects on the fly, and this is
where VRCreate comes in handy. Say you want to create a grid of 20 PictureBox
objects on your window to display some bitmaps, and have the window size
itself to the screen. Listing Four (page 100) is a procedure to do just that,
extracted from the SHOWBMPS.EXE VX-REXX example (available electronically).
The example creates the grid dynamically on the main thread while collecting
file information on a secondary thread. When the bitmap information is ready,
the main thread merely loops through the grid and sets the PicturePath property
to display the bitmaps, via a call to VRSet, as follows: call VRSet name,
'PicturePath', bitmaps.i. Objects are treated equally whether created in the
design environment or at run time.
While this functional approach to object handling has its limitations, it can
be used to build complex, GUI-based applications. The best example of this is
the VX-REXX design environment itself, a hybrid of REXX and C code that uses
the same object library that user programs use.


Object-oriented REXX


Although oriented specifically to window objects, the VX-REXX object library
could be easily extended to include purely abstract object classes that are
manipulated with the same set of functions. The ability to add new methods
written in REXX to a class would be another extension. However, at this point
the syntax and scoping limitations imposed by the language hardly make it
worth the effort. What's really needed is a truly object-oriented REXX
(OOREXX).
OOREXX started out as a research project at IBM, primarily at its U.K.
laboratories. The idea was to add object-oriented programming concepts to REXX
while maintaining maximal compatibility with existing REXX syntax. This meant
writing a new interpreter that treated all data, including strings, as objects
and transforming all operations into messages that invoke methods. Even simple
expressions like sum=3+4 are transformed into two messages (one to add a
number to the "3" object, one to assign the resulting object to the variable).
All operators and function calls can be handled this way, ensuring
compatibility with procedural REXX. In general, however, messages are sent to
an object by using the tilde (~) operator, as in:
height=window~get_height('pixels'). Messages can include arguments.
Objects in OOREXX are created from classes definable in REXX. The basic
mechanism is to create a new class from the global class object and attach
code to it; see Example 4(a). You can then instantiate an object of that class
by invoking new method, as in Example 4(b). Of course, inheritance is
supported; see Example 4(c).
The self object is defined within a method to represent the object the method
was invoked on. This allows the object's other methods and any methods from
the superclass to be invoked; see Example 4(d). Objects have their own
variable namespaces, as can each method. Variables from the object space can
be imported into the method space using the METHOD EXPOSE statement; see
Example 4(e).

OOREXX also offers features such as multiple inheritance and support for
concurrency. At recent conferences, IBM has been demonstrating an OS/2 version
of OOREXX that includes support for manipulating WPS objects directly within
the OOREXX framework--quite an improvement over the current REXX support for
WPS objects. Though no firm date for OOREXX has been announced, the OS/2
version is expected to be released by IBM sometime in 1994.

Example 1: A simple REXX program to add the first N integers.

n = arg( 1 )
t = 0
do i = 1 to n
 t = t + i
end
say "Sum from 1 to" n "is" t
exit


Example 2: (a) Loading external functions from the REXXUTIL.DLL library; (b)
creating a Workplace Shell object; (c) creating a VX-REXX object.

(a)
call RxFuncAdd 'SysLoadFuncs', 'REXXUTIL', 'SysLoadFuncs'
call SysLoadFuncs

(b)
result = SysCreateObject( wpsclass, title, location, setupdata, replaceflag )

(c)
object = VRCreate( parent, classname, [property_1, value_1], ... )


Example 3: (a) Setting object properties at creation time; (b) setting object
properties after creation.

(a)

win = VRCreate( '', 'Window', 'height', 1000, 'width', 1000 )

(b)

call VRSet win, 'name', 'MyWindow'
height = VRGet( 'MyWindow', 'height' )




Example 4: (a) Creating a class in OOREXX; (b) instantiating an object of
class hello; (c) using inheritance in defining a new class; (d) a method that
uses messages to self; (e) controlling visibility of method variables.

(a)

hello = ~class~new( 'Hello' )
hello~define( 'SAY', 'say "hello"' )


(b)

object = hello~new
object~say


(c)

big_hello = ~class~new( 'Big Hello' )~inherit( hello )
big_hello~define( 'SAY', 'say "HELLO"' )



(d)

big_hello~define( 'SAY_TWICE', 'self~say; self~say.super' )

(e)


counter = ~class~new( 'Counter' )
counter~define( 'INIT', 'method expose count; count = 0' )

counter~define( 'INCREMENT', 'method expose count; count = count + 1' )

[LISTING ONE]

/* BOOKS.CMD -- Makes a folder containing all the .INF files on your drives */
call RXFuncAdd 'SysLoadFuncs', 'REXXUTIL', 'SysLoadFuncs'
call SysLoadFuncs

/* Make a folder object (first erase old one) */
say 'Building Books Folder'
call SysDestroyObject '<BOOKS_FOLDER>'

classname = 'WPFolder'
title = 'All Books'
location = '<WP_DESKTOP>'
setup = 'OBJECTID=<BOOKS_FOLDER>;'
call SysCreateObject classname, title, location, setup, 'r'

/* Get the list of local drives starting at C: */
drives = SysDriveMap( 'C', 'local' )

/* For each drive, search for .INF files */
classname = 'WPProgram'
location = '<BOOKS_FOLDER>'

count = 1
do while( drives <> '' )
 parse var drives drive drives
 say 'Searching disk' drive
 drop books.
 call SysFileTree drive '\*.INF', 'books', 'FSO'
 if( books.0 > 0 )then
 do

 say books.0 '.INF files found on drive' drive
 do i = 1 to books.0
 title = filespec( 'name', books.i )
 setup = 'EXENAME=view.exe;' ,
 'PROGTYPE=PM;' ,
 'PARAMETERS='books.i';' ,
 'OBJECTID=BOOK_'count';'
 call SysCreateObject classname, title, location, setup, 'r'
 count = count + 1
 end
 end
 else
 say 'No .INF files found on drive' drive
end

say count '.INF files added to Books folder.'



[LISTING TWO]

/* BITMAP.CMD -- show system bitmaps */

call RXFuncAdd 'SysLoadFuncs', 'REXXUTIL', 'SysLoadFuncs'
call SysLoadFuncs

/* Some magic to put ourselves in the background. We run with PMREXX.EXE
 * in a minimized state, leaving an icon that the user can click on later...
*/
if( arg(1) = '' )then do
 say 'Updating the desktop and running the program in the background'
 parse source . . program
 setup = 'EXENAME=PMREXX.EXE;PARAMETERS='program' anyparm;' ,
 'PROGTYPE=PM;MINWIN=DESKTOP;MINIMIZED=YES;' ,
 'OPEN=DEFAULT;OBJECTID=CYCLE_BITMAPS'
 call SysCreateObject 'WPProgram', 'Cycle Bitmaps', '<WP_DESKTOP>', ,
 setup, 'update'
 exit
end
/* Get the list of local drives starting at C: */
drives = SysDriveMap( 'C', 'local' )
bitmaps.0 = 0
/* For each drive, search for .BMP files */
do while( drives <> '' )
 parse var drives drive drives
 say 'Searching drive' drive'...'
 drop tmp.
 call SysFileTree drive '\*.BMP', 'tmp', 'FSO'

 do i = 1 to tmp.0
 j = bitmaps.0 + 1
 bitmaps.j = tmp.i
 bitmaps.0 = j
 end
end
/* Now cycle through the list, showing each bitmap for 1 minute */
do forever
 do i = 1 to bitmaps.0
 say 'Displaying bitmap' bitmaps.i
 call SysSetObjectData '<WP_DESKTOP>', 'BACKGROUND='bitmaps.i';'
 call SysSleep 60
 end
end


[LISTING THREE]

/* EVENT.CMD -- A simple event-driven program. Requires VX-REXX.
 * Run using the command "vrx event" */
 /* Build the window and add a pushbutton to it */
 win = VRCreate( '', 'Window', 'height', 1000, 'width', 3000 )
 pb = VRCreate( win, 'PushButton', 'left', 100, 'top', 100, ,
 'height', 500, 'width', 2700 )
 /* When the window is closed, return an 'exit' */
 call VRSet win, 'close', 'exit'

 call VRMethod win, 'centerwindow'
 /* When the pushbutton is clicked, return a call statement */
 call VRSet pb, 'click', 'call increment'
 call VRSet pb, 'caption', 'Push me'
 count = 0
 /*---- Event loop: wait for and interpret events ---------*/
 do forever
 interpret VREvent()
 end
 exit
/*----- Update the pushbutton caption -----------*/
increment:
 count = count + 1
 call VRSet pb, 'caption', 'You pushed me' count 'time(s)'
 return


[LISTING FOUR]

/* Example of adding 20 PictureBox objects dynamically to a window */
ih = VRGet( 'Window1', 'InteriorHeight' )
iw = VRGet( 'Window1', 'InteriorWidth' )
x_incr = trunc( iw / 5 )
y_incr = trunc( ih / 4 )
top = 0
do i = 1 to 4
 left = 0
 do j = 1 to 5
 x = (i-1) * 5 + j
 call VRCreate 'Window1', 'PictureBox',,
 'name', 'Picture'x,,
 'width', x_incr,,
 'height', y_incr,,
 'left', left,,
 'top', top,,
 'resizepicture', 'true',,
 'backcolor', 'white',,
 'border', 'true',,
 'bordercolor', 'black'
 left = left + x_incr
 end
 top = top + y_incr
end



















December, 1993
The IDEA Encryption Algorithm


An advanced block-cipher approach to encryption




Bruce Schneier


Bruce is author of Applied Cryptography (John Wiley & Sons, 1993) and can be
contacted at 730 Fair Oaks Ave., Oak Park, IL 60302.


For the past 15 years, the best security most of us have heard about is the
Data Encryption Standard (DES). It's a good algorithm, and secure against
mid-seventies technology. With the computing power of today's machines,
however, DES's small key size has made the algorithm vulnerable, leading
cryptographers to look for and propose new, stronger algorithms.
One such proposal, the International Data Encryption Algorithm (IDEA), invented
in Switzerland by Xuejia Lai and James Massey, may be one of the most secure block
algorithms available to the public today. IDEA is a block cipher, operating on
64-bit data blocks: 64 bits of plaintext go in one end and 64 bits of
ciphertext come out the other. The key is 128 bits long. The same algorithm is
used for both encryption and decryption.
The design philosophy behind the algorithm is one of "mixing operations from
different algebraic groups." There are three algebraic groups whose operations
are being mixed, all of which are easily implemented in both hardware and
software:
XOR.
Addition modulo 2^16 (addition, ignoring any overflow).
Multiplication modulo 2^16+1 (where the all-zero subblock represents 2^16).
All these operations (and these are the only operations in the
algorithm--there are no permutations) operate on 16-bit subblocks, making it
efficient even on 16-bit processors.
Figure 1 shows a diagram of IDEA. The 64-bit data block is divided into four
16-bit subblocks, which are X1, X2, X3, and X4. These four subblocks become
the input to the first round of the algorithm. There are eight rounds total.
In each round, the four subblocks are XORed, added, and multiplied with each
other and six 16-bit subblocks of key material. Between each round, the second
and third subblocks are swapped.
The sequence of events in each round is as follows:
1 Multiply X1 and the first key subblock.
2 Add X2 and the second key subblock.
3 Add X3 and the third key subblock.
4 Multiply X4 and the fourth key subblock.
5 XOR the results of step #1 and #3.
6 XOR the results of step #2 and #4.
7 Multiply the results of step #5 with the fifth key subblock.
8 Add the results of step #6 and #7.
9 Multiply the results of step #8 with the sixth key subblock.
10 Add the results of step #7 and #9.
11 XOR the results of step #1 and #9.
12 XOR the results of step #3 and #9.
13 XOR the results of steps #2 and #10.
14 XOR the results of steps #4 and #10.
That's it. The output of the round is the four subblocks that result from
steps 11, 13, 12, and 14. Swap the two inner blocks (except for the last
round), and that's the input to the next round. After the eighth round,
there's a final output transform:
1 Multiply X1 and the first key subblock.
2 Add X2 and the second key subblock.
3 Add X3 and the third key subblock.
4 Multiply X4 and the fourth key subblock.
Finally, the four subblocks are reattached to produce the ciphertext.
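The 14 steps above, with the inner-block swap folded in, can be sketched in C. This is a minimal illustration rather than the optimized code in Listing One; the names mul16 and idea_round are invented here, and mul16 uses IDEA's convention that the all-zero subblock represents 2^16.

```c
#include <stdint.h>

/* Multiplication modulo 2^16+1; the all-zero subblock stands for 2^16. */
static uint16_t mul16(uint16_t a, uint16_t b)
{
    uint32_t p, lo, hi;
    if (a == 0) return (uint16_t)(1 - b);  /* a represents 2^16 = -1 mod 2^16+1 */
    if (b == 0) return (uint16_t)(1 - a);
    p  = (uint32_t)a * b;
    lo = p & 0xFFFF;
    hi = p >> 16;
    return (uint16_t)(lo - hi + (lo < hi));  /* p mod 2^16+1 */
}

/* One IDEA round (steps 1-14 in the text), with the swap of the two inner
 * subblocks already applied to the output. X[0..3] are the 16-bit data
 * subblocks; K[0..5] are the round's six key subblocks. */
static void idea_round(uint16_t X[4], const uint16_t K[6])
{
    uint16_t s1 = mul16(X[0], K[0]);            /* step 1 */
    uint16_t s2 = (uint16_t)(X[1] + K[1]);      /* step 2: addition mod 2^16 */
    uint16_t s3 = (uint16_t)(X[2] + K[2]);      /* step 3 */
    uint16_t s4 = mul16(X[3], K[3]);            /* step 4 */
    uint16_t s5 = s1 ^ s3;                      /* step 5 */
    uint16_t s6 = s2 ^ s4;                      /* step 6 */
    uint16_t s7 = mul16(s5, K[4]);              /* step 7 */
    uint16_t s8 = (uint16_t)(s6 + s7);          /* step 8 */
    uint16_t s9 = mul16(s8, K[5]);              /* step 9 */
    uint16_t s10 = (uint16_t)(s7 + s9);         /* step 10 */
    X[0] = (uint16_t)(s1 ^ s9);                 /* step 11 */
    X[1] = (uint16_t)(s3 ^ s9);                 /* step 12 (swapped inward) */
    X[2] = (uint16_t)(s2 ^ s10);                /* step 13 (swapped inward) */
    X[3] = (uint16_t)(s4 ^ s10);                /* step 14 */
}
```

Eight applications of idea_round, followed by the four-step output transform, produce the ciphertext block.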
Creating the key subblocks is also easy. The algorithm uses 52 of them (six
for each of the eight rounds, and four more for the output transform). First,
the 128-bit key is divided into eight 16-bit subkeys. These are the first
eight subkeys for the algorithm (the six for the first round, and the first
two for the second round). Then, the key is rotated 25 bits to the left and
again divided into eight subkeys. The first four are used in round two; the
last four are used in round three. The key is rotated another 25 bits to the
left for the next eight subkeys, and so on until the end of the algorithm.
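The rotate-and-split procedure can be sketched in C. This is a simplified illustration (the name expand_key is invented here); it assumes the 128-bit key is held as eight 16-bit words, most significant word first.

```c
#include <stdint.h>

/* Generate all 52 IDEA key subblocks: take eight 16-bit subkeys from the
 * 128-bit key, rotate the whole key 25 bits to the left, take eight more,
 * and so on. key[0] is the most significant word of the user key. */
static void expand_key(const uint16_t key[8], uint16_t Z[52])
{
    uint16_t k[8], t[8];
    int i, j, n = 0;
    for (i = 0; i < 8; i++)
        k[i] = key[i];
    for (;;) {
        for (i = 0; i < 8 && n < 52; i++)
            Z[n++] = k[i];
        if (n == 52)
            break;
        /* Rotate the 128-bit value in k[] left by 25 bits: new word j is
         * the bottom 7 bits of word j+1 followed by the top 9 bits of
         * word j+2 (indices mod 8). */
        for (j = 0; j < 8; j++)
            t[j] = (uint16_t)((k[(j + 1) & 7] << 9) | (k[(j + 2) & 7] >> 7));
        for (j = 0; j < 8; j++)
            k[j] = t[j];
    }
}
```

With the sample key 1, 2, ..., 8 used by the test driver in Listing One, the first eight subblocks are the key words themselves, and the ninth (the third subblock of round two) is 1024.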
Decryption is the same, except that the key subblocks are reversed and
slightly different. The decryption key subblocks are either the additive or
multiplicative inverses of the encryption-key subblocks. (For the purposes of
IDEA, the multiplicative inverse of 0 is 0.) Calculating these takes some
doing, but you only have to do it once for each decryption key. Table 1 shows
the encryption key subblocks and the corresponding decryption key subblocks.
Current software implementations of IDEA are faster than DES (about 1.5 to 2
times as fast). On a 33-MHz 386 machine, IDEA encrypts data at 880 kbps. (An
equivalent DES implementation encrypts data at only 584 kbps.) On a VAX 9000,
the speed is almost four times greater. Listing One (page 105) is an
implementation of IDEA in C.
A VLSI implementation of IDEA (developed at ETH Zurich), consisting of 251,000
transistors on a 107.8 mm^2 chip, encrypts data at a 177 Mbits/second data
rate when clocked at 25 MHz.
IDEA can work within any block-cipher mode discussed in relation to the DES
algorithm: Electronic Codebook (ECB), Cipher Block Chaining (CBC), Output
Feedback (OFB), and Cipher Feedback (CFB). ECB (see Figure 2) is the simplest
mode. The plaintext is encrypted in blocks of 64 bits. The first 64 bits of
plaintext are encrypted to become the first 64 bits of ciphertext; the second
64 bits of plaintext are encrypted to become the second 64 bits of ciphertext;
and so on. Each block is encrypted independently of all other blocks.
Decryption is the reverse.
The problem with ECB mode is that, if a cryptanalyst has the plaintext and
ciphertext for several messages, he can start to compile a codebook without
knowing the key. In most real-world situations, fragments of messages tend to
repeat. One message may have bit sequences in common with another.
Computer-generated messages (such as electronic mail) may have a regular
structure. Messages may have parameters that take only a few values, or long
strings of zeros or spaces. Other computer-generated messages may always have
important data in the same place.
If the cryptanalyst learns that the plaintext block "5e081bc5" encrypts to the
ciphertext block "7ea593a4", he can immediately decrypt that ciphertext block
whenever he sees it in another message. If the application encrypts messages
with many redundancies, and these redundancies tend to show up in the same
places in the message, this can be a very powerful attack.
The other three modes defend against this sort of attack. In CBC mode, the
plaintext is XORed with the previous ciphertext block before it is encrypted.
Figure 3 shows CBC in action. After a plaintext block is encrypted, the
resulting ciphertext is also stored in a feedback register. Before the next
plaintext block is encrypted, it is XORed with the feedback register to become
the next input to the encrypting routine. The resultant ciphertext is again
stored in the feedback register, to be XORed with the next plaintext block.
And so on until the end of the message. The encryption of each block depends
on all the previous blocks.
Decryption is just as straightforward. A ciphertext block is decrypted
normally, and also saved in a feedback register. After the next block is
decrypted, it is XORed with the results of the feedback register. Then the
next ciphertext block is stored in the feedback register, and so on, until the
end of the message.
Mathematically, this appears thus:
C_i = E_K(P_i XOR C_(i-1))
P_i = C_(i-1) XOR D_K(C_i)
OFB and CFB (described in standard cryptography books) are two other feedback
mechanisms.
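The chaining just described can be sketched as follows. This is a structural illustration only: toy_encrypt is a stand-in for a real 64-bit block cipher such as IDEA (it merely XORs with the key), and the names are invented here.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for a 64-bit block cipher -- NOT a real cipher. */
static void toy_encrypt(uint8_t block[8], const uint8_t key[8])
{
    int i;
    for (i = 0; i < 8; i++)
        block[i] ^= key[i];
}

/* CBC encryption: XOR each plaintext block with the feedback register
 * (the previous ciphertext block, initially the IV), encrypt it, and
 * store the result back into the feedback register. len must be a
 * multiple of 8. */
static void cbc_encrypt(uint8_t *buf, size_t len,
                        const uint8_t key[8], const uint8_t iv[8])
{
    uint8_t fb[8];
    size_t off;
    int i;
    memcpy(fb, iv, 8);
    for (off = 0; off < len; off += 8) {
        for (i = 0; i < 8; i++)
            buf[off + i] ^= fb[i];     /* P_i XOR C_(i-1) */
        toy_encrypt(buf + off, key);   /* C_i = E_K(...)  */
        memcpy(fb, buf + off, 8);      /* feedback register */
    }
}
```

Because each block is chained to the previous ciphertext, two identical plaintext blocks produce different ciphertext blocks, which is exactly the property ECB mode lacks.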



IDEA Security


IDEA's key length is 128 bits--over twice as long as the DES key. Assuming
that testing every possible key (a brute-force attack) is the most efficient
way to break the algorithm, it would require 2^128 encryptions to recover the
key. DES only requires 2^56 encryptions to break; a million chips capable of
testing a million keys a second can break DES in 20 hours. IDEA is much
stronger. Design a chip that can test a billion keys per second, and throw a
billion of them at the problem, and it will still take 10^13 years to break
IDEA. An array of 10^24 such chips could find the key in a day, but it is
questionable whether there are enough silicon atoms in the universe to build
such a machine.
Unless, of course, brute force isn't the best way to attack IDEA. The
algorithm is still too new for any definitive statements about its security.
The designers have done their best to make the algorithm immune to all known
cryptanalytic attacks (including a new and powerful attack called
"differential cryptanalysis").
While IDEA appears to be significantly more secure than DES, it isn't always
easy to substitute one for the other in an existing application. If your
database and message templates are hardwired to accept a 64-bit key, it may be
impossible to implement IDEA's 128-bit key.
For those applications, generate a 128-bit key by concatenating the 64-bit key
with itself. Remember, however, that IDEA is weakened considerably by this
modification.
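A sketch of that workaround, assuming the keys are held as raw bytes (double_key is a name invented here):

```c
#include <stdint.h>
#include <string.h>

/* Build a 128-bit IDEA key from a 64-bit key by concatenating the key
 * with itself. The effective key space is still only 2^64, which is why
 * the text warns that IDEA is weakened considerably by this shortcut. */
static void double_key(const uint8_t key64[8], uint8_t key128[16])
{
    memcpy(key128, key64, 8);
    memcpy(key128 + 8, key64, 8);
}
```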


Conclusions


IDEA is a new algorithm that looks secure. While it appears to be resistant to
differential and related-key cryptanalysis, secure-looking algorithms have
fallen to new forms of cryptanalysis time and time again. Several academic and
military groups are currently cryptanalyzing IDEA. None of them has succeeded
yet, but who knows what they might come up with tomorrow.
The IDEA block cipher is patented in Europe, and the patent is pending in the
United States. The patent is held by Ascom-Tech AG. No license fee is required
for noncommercial use. Commercial users interested in licensing the algorithm
should contact Dr. Dieter Profos, Ascom Tech AG, Solothurn Lab, Postfach 151,
4502 Solothurn, Switzerland (Tel: +41 65 242-885 or fax: +41 65 235-761).


References


Lai, X., J. Massey, and S. Murphy. "Markov Ciphers and Differential
Cryptanalysis." Advances in Cryptology--EUROCRYPT '91 Proceedings.
Springer-Verlag, 1991.
Massey, J.L. and X. Lai, "Device for Converting a Digital Block and the Use
Thereof." International Patent PCT/CH91/00117, November 28, 1991.
Schneier, Bruce. Applied Cryptography. New York: John Wiley & Sons, 1993.
 Figure 1: Diagram of the IDEA approach to cryptography.
Table 1: IDEA encryption- and decryption-key subblocks.
Round        Encryption-key Subblocks              Decryption-key Subblocks
1            Z1(1) Z2(1) Z3(1) Z4(1) Z5(1) Z6(1)   Z1(9)^-1 -Z2(9) -Z3(9) Z4(9)^-1 Z5(8) Z6(8)
2            Z1(2) Z2(2) Z3(2) Z4(2) Z5(2) Z6(2)   Z1(8)^-1 -Z2(8) -Z3(8) Z4(8)^-1 Z5(7) Z6(7)
3            Z1(3) Z2(3) Z3(3) Z4(3) Z5(3) Z6(3)   Z1(7)^-1 -Z2(7) -Z3(7) Z4(7)^-1 Z5(6) Z6(6)
4            Z1(4) Z2(4) Z3(4) Z4(4) Z5(4) Z6(4)   Z1(6)^-1 -Z2(6) -Z3(6) Z4(6)^-1 Z5(5) Z6(5)
5            Z1(5) Z2(5) Z3(5) Z4(5) Z5(5) Z6(5)   Z1(5)^-1 -Z2(5) -Z3(5) Z4(5)^-1 Z5(4) Z6(4)
6            Z1(6) Z2(6) Z3(6) Z4(6) Z5(6) Z6(6)   Z1(4)^-1 -Z2(4) -Z3(4) Z4(4)^-1 Z5(3) Z6(3)
7            Z1(7) Z2(7) Z3(7) Z4(7) Z5(7) Z6(7)   Z1(3)^-1 -Z2(3) -Z3(3) Z4(3)^-1 Z5(2) Z6(2)
8            Z1(8) Z2(8) Z3(8) Z4(8) Z5(8) Z6(8)   Z1(2)^-1 -Z2(2) -Z3(2) Z4(2)^-1 Z5(1) Z6(1)
Output       Z1(9) Z2(9) Z3(9) Z4(9)               Z1(1)^-1 -Z2(1) -Z3(1) Z4(1)^-1
transform
(Zx(r) is the xth key subblock of round r; Z^-1 is the multiplicative inverse
modulo 2^16+1, and -Z the additive inverse modulo 2^16.)
 Figure 2: Electronic codebook (ECB) mode.
 Figure 3: Cipher-block chaining (CBC) mode.
[LISTING ONE]

/* idea.h - header file for idea.c */

#include "usuals.h" /* typedefs for byte, word16, boolean, etc. */
#define IDEAKEYSIZE 16
#define IDEABLOCKSIZE 8
void initcfb_idea(word16 iv0[4], byte key[16], boolean decryp);
void ideacfb(byteptr buf, int count);
void close_idea(void);
void init_idearand(byte key[16], byte seed[8], word32 tstamp);
byte idearand(void);
void close_idearand(void);

/* prototypes for passwd.c */
/* GetHashedPassPhrase - get pass phrase from user, hash it to an IDEA key. */
int GetHashedPassPhrase(char *keystring, char *hash, boolean noecho);

/* hashpass - Hash pass phrase down to 128 bits (16 bytes). */
void hashpass (char *keystring, int keylen, byte *hash);

/********************************IDEA.C*******************************/

/* idea.c - C source code for IDEA block cipher. IDEA (International Data
 * Encryption Algorithm), formerly known as IPES (Improved Proposed Encryption
 * Standard). Algorithm developed by Xuejia Lai and James L. Massey, of ETH
 * Zurich. This implementation modified and derived from original C code
 * developed by Xuejia Lai. Zero-based indexing added, names changed from IPES
 * to IDEA. CFB functions added. Random number routines added. Optimized for
 * speed 21 Oct 92 by Colin Plumb <colin@nsq.gts.org>. This code assumes that
 * each pair of 8-bit bytes comprising a 16-bit word in the key and in the
 * cipher block are externally represented with the Most Significant Byte
 * (MSB) first, regardless of internal native byte order of the target CPU. */

#include "idea.h"

#ifdef TEST
#include <stdio.h>
#include <time.h>
#endif

#define ROUNDS 8 /* Don't change this value, should be 8 */
#define KEYLEN (6*ROUNDS+4) /* length of key schedule */

typedef word16 IDEAkey[KEYLEN];

#ifdef IDEA32/* Use >16-bit temporaries */
#define low16(x) ((x) & 0xFFFF)
typedef unsigned int uint16; /* at LEAST 16 bits, maybe more */
#else
#define low16(x) (x) /* this is only ever applied to uint16's */
typedef word16 uint16;
#endif

#ifdef __GNUC__
/* __const__ simply means there are no side effects for this function,
 * which is useful info for the gcc optimizer */
#define CONST __const__
#else
#define CONST
#endif

static void en_key_idea(word16 userkey[8], IDEAkey Z);
static void de_key_idea(IDEAkey Z, IDEAkey DK);
static void cipher_idea(word16 in[4], word16 out[4], CONST IDEAkey Z);
/* Multiplication, modulo (2**16)+1. Note that this code is structured like
 * this on the assumption that untaken branches are cheaper than taken
 * branches, and the compiler doesn't schedule branches. */
#ifdef SMALL_CACHE
CONST static uint16 mul(register uint16 a, register uint16 b)
{
 register word32 p;
 if (a)
 { if (b)
 { p = (word32)a * b;
 b = low16(p);
 a = p>>16;
 return b - a + (b < a);
 }
 else
 { return 1-a;
 }

 }
 else
 { return 1-b;
 }
} /* mul */
#endif /* SMALL_CACHE */
/* Compute multiplicative inverse of x, modulo (2**16)+1, using Euclid's GCD
 * algorithm. It is unrolled twice to avoid swapping the meaning of the
 * registers each iteration; some subtracts of t have been changed to adds. */
CONST static uint16 inv(uint16 x)
{
 uint16 t0, t1;
 uint16 q, y;
 if (x <= 1)
 return x; /* 0 and 1 are self-inverse */
 t1 = 0x10001 / x; /* Since x >= 2, this fits into 16 bits */
 y = 0x10001 % x;
 if (y == 1)
 return low16(1-t1);
 t0 = 1;
 do
 { q = x / y;
 x = x % y;
 t0 += q * t1;
 if (x == 1)
 return t0;
 q = y / x;
 y = y % x;
 t1 += q * t0;
 } while (y != 1);
 return low16(1-t1);
} /* inv */
/* Compute IDEA encryption subkeys Z */
static void en_key_idea(word16 *userkey, word16 *Z)
{
 int i,j;
 /* shifts */
 for (j=0; j<8; j++)
 Z[j] = *userkey++;
 for (i=0; j<KEYLEN; j++)
 { i++;
 Z[i+7] = (Z[i & 7] << 9) | (Z[(i+1) & 7] >> 7);
 Z += i & 8;
 i &= 7;
 }
} /* en_key_idea */
/* Compute IDEA decryption subkeys DK from encryption subkeys Z */
/* Note: these buffers *may* overlap! */
static void de_key_idea(IDEAkey Z, IDEAkey DK)
{
 int j;
 uint16 t1, t2, t3;
 IDEAkey T;
 word16 *p = T + KEYLEN;
 t1 = inv(*Z++);
 t2 = -*Z++;
 t3 = -*Z++;
 *--p = inv(*Z++);
 *--p = t3;

 *--p = t2;
 *--p = t1;
 for (j = 1; j < ROUNDS; j++)
 {
 t1 = *Z++;
 *--p = *Z++;
 *--p = t1;
 t1 = inv(*Z++);
 t2 = -*Z++;
 t3 = -*Z++;
 *--p = inv(*Z++);
 *--p = t2;
 *--p = t3;
 *--p = t1;
 }
 t1 = *Z++;
 *--p = *Z++;
 *--p = t1;
 t1 = inv(*Z++);
 t2 = -*Z++;
 t3 = -*Z++;
 *--p = inv(*Z++);
 *--p = t3;
 *--p = t2;
 *--p = t1;
/* Copy and destroy temp copy */
 for (j = 0, p = T; j < KEYLEN; j++)
 {
 *DK++ = *p;
 *p++ = 0;
 }
} /* de_key_idea */
/* MUL(x,y) computes x = x*y, modulo 0x10001. Requires two temps, t16 and t32.
 * x must be a side-effect-free lvalue. y may be anything, but unlike x, must
 * be strictly 16 bits even if low16() is #defined. All of these are
 * equivalent; see which is faster on your machine. */
#ifdef SMALL_CACHE
#define MUL(x,y) (x = mul(low16(x),y))
#else
#ifdef AVOID_JUMPS
#define MUL(x,y) (x = low16(x-1), t16 = low16((y)-1), \
 t32 = (word32)x*t16+x+t16+1, x = low16(t32), \
 t16 = t32>>16, x = x-t16+(x<t16) )
#else
#define MUL(x,y) ((t16 = (y)) ? (x=low16(x)) ? \
 t32 = (word32)x*t16, x = low16(t32), t16 = t32>>16, \
 x = x-t16+(x<t16) : \
 (x = 1-t16) : (x = 1-x))
#endif
#endif
/* IDEA encryption/decryption algorithm. In/out can be the same buffer */
static void cipher_idea(word16 in[4], word16 out[4], register CONST IDEAkey Z)
{
 register uint16 x1, x2, x3, x4, t1, t2;
 register uint16 t16;
 register word32 t32;
 int r = ROUNDS;
 x1 = *in++; x2 = *in++;
 x3 = *in++; x4 = *in;

 do
 {
 MUL(x1,*Z++);
 x2 += *Z++;
 x3 += *Z++;
 MUL(x4, *Z++);
 t2 = x1^x3;
 MUL(t2, *Z++);
 t1 = t2 + (x2^x4);
 MUL(t1, *Z++);
 t2 = t1+t2;
 x1 ^= t1;
 x4 ^= t2;
 t2 ^= x2;
 x2 = x3^t1;
 x3 = t2;
 } while (--r);
 MUL(x1, *Z++);
 *out++ = x1;
 *out++ = x3 + *Z++;
 *out++ = x2 + *Z++;
 MUL(x4, *Z);
 *out = x4;
} /* cipher_idea */
/*-------------------------------------------------------------*/
#ifdef TEST
/* Number of Kbytes of test data to encrypt. Defaults to 1 MByte. */
#ifndef KBYTES
#define KBYTES 1024
#endif
void main(void)
{ /* Test driver for IDEA cipher */
 int i, j, k;
 IDEAkey Z, DK;
 word16 XX[4], TT[4], YY[4];
 word16 userkey[8];
 clock_t start, end;
 long l;
 /* Make a sample user key for testing... */
 for(i=0; i<8; i++)
 userkey[i] = i+1;
 /* Compute encryption subkeys from user key... */
 en_key_idea(userkey,Z);
 printf("\nEncryption key subblocks: ");
 for(j=0; j<ROUNDS+1; j++)
 {
 printf("\nround %d: ", j+1);
 if (j==ROUNDS)
 for(i=0; i<4; i++)
 printf(" %6u", Z[j*6+i]);
 else
 for(i=0; i<6; i++)
 printf(" %6u", Z[j*6+i]);
 }
 /* Compute decryption subkeys from encryption subkeys... */
 de_key_idea(Z,DK);
 printf("\nDecryption key subblocks: ");
 for(j=0; j<ROUNDS+1; j++)

 {
 printf("\nround %d: ", j+1);
 if (j==ROUNDS)
 for(i=0; i<4; i++)
 printf(" %6u", DK[j*6+i]);
 else
 for(i=0; i<6; i++)
 printf(" %6u", DK[j*6+i]);
 }
 /* Make a sample plaintext pattern for testing... */
 for (k=0; k<4; k++)
 XX[k] = k;
 printf("\n Encrypting %d KBytes (%ld blocks)...", KBYTES, KBYTES*64l);
 fflush(stdout);
 start = clock();
 cipher_idea(XX,YY,Z); /* encrypt plaintext XX, making YY */
 for (l = 1; l < 64*KBYTES; l++)
 cipher_idea(YY,YY,Z); /* repeated encryption */
 cipher_idea(YY,TT,DK); /* decrypt ciphertext YY, making TT */
 for (l = 1; l < 64*KBYTES; l++)
 cipher_idea(TT,TT,DK); /* repeated decryption */
 end = clock() - start;
 l = end * 1000. / CLOCKS_PER_SEC + 1;
 i = l/1000;
 j = l%1000;
 l = KBYTES * 1024. * CLOCKS_PER_SEC / end;
 printf("%d.%03d seconds = %ld bytes per second\n", i, j, l);
 printf("\nX %6u %6u %6u %6u \n",
 XX[0], XX[1], XX[2], XX[3]);
 printf("Y %6u %6u %6u %6u \n",
 YY[0], YY[1], YY[2], YY[3]);
 printf("T %6u %6u %6u %6u \n",
 TT[0], TT[1], TT[2], TT[3]);
 /* Now decrypted TT should be same as original XX */
 for (k=0; k<4; k++)
 if (TT[k] != XX[k])
 {
 printf("\n\07Error! Noninvertible encryption.\n");
 exit(-1); /* error exit */
 }
 printf("\nNormal exit.\n");
 exit(0);/* normal exit */
} /* main */
#endif /* TEST */
/*************************************************************************/
/* xorbuf - change buffer via xor with random mask block. Used for Cipher
 * Feedback (CFB) or Cipher Block Chaining (CBC) modes of encryption. Can be
 * applied for any block encryption algorithm, with any block size, such as
 * the DES or the IDEA cipher. */
static void xorbuf(register byteptr buf, register byteptr mask,
 register int count)
/* count must be > 0 */
{
 if (count)
 do
 *buf++ ^= *mask++;
 while (--count);
} /* xorbuf */
/* cfbshift - shift bytes into IV for CFB input. Used only for Cipher Feedback

 * (CFB) mode of encryption. Can be applied for any block encryption algorithm
 * with any block size, such as the DES or the IDEA cipher. */
static void cfbshift(register byteptr iv, register byteptr buf,
 register int count, int blocksize)
/* iv is initialization vector. buf is buffer pointer. count is number of
 * bytes to shift in...must be > 0. blocksize is 8 bytes for DES or IDEA
 * ciphers. */
{
 int retained;
 if (count)
 {
 retained = blocksize-count; /* number bytes in iv to retain */
 /* left-shift retained bytes of IV over by count bytes to make room */
 while (retained--)
 {
 *iv = *(iv+count);
 iv++;
 }
 /* now copy count bytes from buf to shifted tail of IV */
 do *iv++ = *buf++;
 while (--count);
 }
} /* cfbshift */
/* Key schedules for IDEA encryption and decryption */
static IDEAkey Z, DK;
static word16 *iv_idea; /* pointer to IV for CFB or CBC */
static boolean cfb_dc_idea; /* TRUE iff CFB decrypting */
/* initkey_idea initializes IDEA for ECB mode operations */
void initkey_idea(byte key[16], boolean decryp)
{
 word16 userkey[8]; /* IDEA key is 16 bytes long */
 int i;
 /* Assume each pair of bytes comprising a word is ordered MSB-first. */
 for (i=0; i<8; i++)
 {
 userkey[i] = (key[0]<<8) + key[1];
 key++; key++;
 }
 en_key_idea(userkey,Z);
 if (decryp)
 {
 de_key_idea(Z,Z); /* compute inverse key schedule in place */
 }
 for (i=0; i<8; i++)/* Erase dangerous traces */
 userkey[i] = 0;
} /* initkey_idea */
/* Run a 64-bit block thru IDEA in ECB (Electronic Code Book) mode, using the
 * currently selected key schedule. */
void idea_ecb(word16 *inbuf, word16 *outbuf)
{
 /* Assume each pair of bytes comprising a word is ordered MSB-first. */
#ifndef HIGHFIRST /* If this is a least-significant-byte-first CPU */
 word16 x;
 /* Invert the byte order for each 16-bit word for internal use. */
 x = inbuf[0]; outbuf[0] = (x >> 8) | (x << 8);
 x = inbuf[1]; outbuf[1] = (x >> 8) | (x << 8);
 x = inbuf[2]; outbuf[2] = (x >> 8) | (x << 8);
 x = inbuf[3]; outbuf[3] = (x >> 8) | (x << 8);
 cipher_idea(outbuf, outbuf, Z);
 x = outbuf[0]; outbuf[0] = (x >> 8) | (x << 8);
 x = outbuf[1]; outbuf[1] = (x >> 8) | (x << 8);
 x = outbuf[2]; outbuf[2] = (x >> 8) | (x << 8);
 x = outbuf[3]; outbuf[3] = (x >> 8) | (x << 8);
#else /* HIGHFIRST */
 /* Byte order for internal and external representations is the same. */
 cipher_idea(inbuf, outbuf, Z);
#endif /* HIGHFIRST */
} /* idea_ecb */
/* initcfb - Initializes IDEA key schedule tables via key; initializes Cipher
 * Feedback mode IV. References context variables cfb_dc_idea and iv_idea. */
void initcfb_idea(word16 iv0[4], byte key[16], boolean decryp)
/* iv0 is copied to global iv_idea; buffer will be destroyed by ideacfb. key
 * is pointer to key buffer. decryp is TRUE if decrypting, FALSE if
 * encrypting. */
{
 iv_idea = iv0;
 cfb_dc_idea = decryp;
 initkey_idea(key,FALSE);
} /* initcfb_idea */
/* ideacfb - encipher a buffer with IDEA enciphering algorithm, using Cipher
 * Feedback (CFB) mode. Assumes initcfb_idea has already been called.
 * References context variables cfb_dc_idea and iv_idea. */
void ideacfb(byteptr buf, int count)
/* buf is input/output buffer; may be more than 1 block. count is byte count
 * of buffer, and may be > IDEABLOCKSIZE. */
{
 int chunksize; /* smaller of count, IDEABLOCKSIZE */
 word16 temp[IDEABLOCKSIZE/2];

 while ((chunksize = min(count,IDEABLOCKSIZE)) > 0)
 {
 idea_ecb(iv_idea,temp); /* encrypt iv_idea, making temp. */
 if (cfb_dc_idea)/* buf is ciphertext */
 /* shift in ciphertext to IV... */
 cfbshift((byte *)iv_idea,buf,chunksize,IDEABLOCKSIZE);
 /* convert buf via xor */
 xorbuf(buf,(byte *)temp,chunksize);/* buf has enciphered output */
 if (!cfb_dc_idea)/* buf was plaintext, is now ciphertext */
 /* shift in ciphertext to IV... */
 cfbshift((byte *)iv_idea,buf,chunksize,IDEABLOCKSIZE);
 count -= chunksize;
 buf += chunksize;
 }
} /* ideacfb */
/* close_idea function erases all the key schedule information when we are
 * done with a set of operations for a particular IDEA key context. This is to
 * prevent any sensitive data from being left around in memory. */
void close_idea(void) /* erase current key schedule tables */
{
 short i;
 for (i = 0; i < KEYLEN; i++)
 Z[i] = 0;
} /* close_idea() */
/********************************************************************/
/* These buffers are used by init_idearand, idearand, and close_idearand. */
static word16 dtbuf_idea[4] = {0}; /* buffer for enciphered timestamp */
static word16 randseed_idea[4] = {0}; /* seed for IDEA random # generator */
static word16 randbuf_idea[4] = {0}; /* buffer for IDEA random # generator */
static byte randbuf_idea_counter = 0; /* random bytes left in randbuf_idea */
/* init_idearand - initialize idearand, IDEA random number generator. Used for

 * generating cryptographically strong random numbers. Much of design comes
 * from Appendix C of ANSI X9.17. key is pointer to IDEA key buffer. seed is
 * pointer to random number seed buffer. tstamp is a 32-bit timestamp */
void init_idearand(byte key[16], byte seed[8], word32 tstamp)
{
 int i;
 initkey_idea(key, FALSE); /* initialize IDEA */
 for (i=0; i<4; i++) /* capture timestamp material */
 { dtbuf_idea[i] = tstamp; /* get bottom word */
 tstamp = tstamp >> 16; /* drop bottom word */
 /* tstamp has only 4 bytes-- last 4 bytes will always be 0 */
 }
 /* Start with enciphered timestamp: */
 idea_ecb(dtbuf_idea,dtbuf_idea);
 /* initialize seed material */
 for (i=0; i<8; i++)
 ((byte *)randseed_idea)[i] = seed[i];
 randbuf_idea_counter = 0; /* # of random bytes left in randbuf_idea */
} /* init_idearand */
/* idearand - IDEA pseudo-random number generator. Used for generating
 * cryptographically strong random numbers. Much of design comes from Appendix
 * C of ANSI X9.17. */
byte idearand(void)
{
 int i;
 if (randbuf_idea_counter==0) /* if random buffer is spent...*/
 { /* Combine enciphered timestamp with seed material: */
 for (i=0; i<4; i++)
 randseed_idea[i] ^= dtbuf_idea[i];
 idea_ecb(randseed_idea,randbuf_idea); /* fill new block */
 /* Compute new seed vector: */
 for (i=0; i<4; i++)
 randseed_idea[i] = randbuf_idea[i] ^ dtbuf_idea[i];
 idea_ecb(randseed_idea,randseed_idea); /* fill new seed */
 randbuf_idea_counter = 8; /* reset counter for full buffer */
 }
 /* Take a byte from randbuf_idea: */
 return(((byte *)randbuf_idea)[--randbuf_idea_counter]);
} /* idearand */
void close_idearand(void)
{ /* Erase random IDEA buffers and wipe out IDEA key info */
 int i;
 for (i=0; i<4; i++)
 { randbuf_idea[i] = 0;
 randseed_idea[i] = 0;
 dtbuf_idea[i] = 0;
 }
 close_idea();/* erase current key schedule tables */
} /* close_idearand */
/* end of idea.c */


December, 1993
Visualizing Data in Real Time


A Windows-based front-end tool puts the best face on data acquisition




James F. Farley and Peter D. Varhol


James is a project manager at Armtec/Regan Inc., located in Manchester, New
Hampshire. Peter is an assistant professor of computer science and mathematics
at Rivier College in Nashua, New Hampshire.


One of the last things a data-acquisition engineer wants to worry about is the
user interface of the data-acquisition code. This code, frequently written for
a single use, is meant to acquire the data, display answers to very specific
questions, and save the data for further analysis in a spreadsheet or other
analysis package. From the engineer's standpoint, the user interface (UI) is
extra work that doesn't relate to answering the real-time questions posed by
the data flow.
In certain cases, however, someone other than the engineer has to examine the
data as it's being produced. For example, the process may be monitored by a
technician, or it may be used by a sales or marketing professional to help
market a product. In other cases, the engineer may not even be interested in
the data itself, but in whether the data stream produces results beyond an
acceptable range. The message may be "Alert me only if something unusual
happens."
For these uses, it makes sense to put some effort into the UI. This doesn't
mean that the engineer has to spend some time with the Windows SDK, however.
One solution to such real-time UI problems is Laboratory Technology's Vision,
a graphical front end for data display. It includes a collection of graphical
plots and instruments for data display, and the ability to create more, if
needed.
Vision is deceptively simple to use. Each of the plots and instruments can be
manipulated as a whole on the screen. Once they're in the desired positions,
double-clicking on them opens up a dialog box in which data parameters can be
specified. Since the primary purpose of Vision is to receive and display data,
there are few parameters to worry about. Depending upon the type of
instrument, you can adjust the range of display values, the reading labels,
and in some cases the alarm conditions. Through another mechanism, you also
set the DDE links with the server application.


A Graphic Front-end Requirement


The application we built with Vision acquires and displays sensor data from an
optical fire detector. The sensors collect data on the ultraviolet and
infrared radiation in the surrounding atmosphere, along with the normal
background radiation, and the detector determines whether or not a fire exists
based on the ratio between the ultraviolet and infrared signals. The
determination is pure, rather than fuzzy: a ratio within a specified region
indicates a fire condition, while a ratio outside of that region does not.
The ultraviolet sensor works on the Geiger-Muller principle. A photon of light
strikes the high-potential filament of the UV tube, knocking off an electron
that is then attracted to the low-potential filament. This causes a momentary
conduction within the tube. Whenever a conduction event occurs, it's referred
to as a "count," and there is one count per photon release.
The infrared sensor is a thermopile. This solid-state device is made up of two
dissimilar materials. When a photon of light strikes the active surface of the
IR sensor, a potential is generated between the two materials. This signal,
which is proportional to the number of photons striking the thermopile, is
then amplified and fed into an analog-to-digital converter. The strength of
the signal generated as a result of this process is known as "converter
steps."
The sensors are made in such a way (or, in the case of the thermopile, put
behind filters) that they respond only to specified wavelengths of ultraviolet
and infrared radiation. These regions of the electromagnetic spectrum
correspond to a hydrocarbon fire signature. The region is centered around 4.3
microns for the IR sensor and 0.22 microns for the UV sensor. This is all
necessary to make the fire detector immune to false-positive signals.
The detector is designed with an embedded microcontroller that gathers data
from the sensors and conditions the pulse streams, using the ratio of the
pulse streams to determine when a fire exists. The data then can be sent out
into the serial port of the computer and either gathered into an ASCII file or
displayed in real time on a terminal.
Vision provided the UI for the recorded signals. We selected simple plots for
the sensor data, so that we could observe the level of radiation over time,
something not always possible to do with a text display. These plots are also
alarm enabled; that is, they can be configured to perform an action when the
input signal falls within a certain range of values. That was also a part of
our goal, so that the user could receive a clear notification that a fire was
detected.
Vision cannot access the data stream directly. It is only a graphical front
end for another application--a spreadsheet, database, or a data-acquisition
program. The package includes an interface for XBase-compatible databases, but
the primary method of getting data for display is to communicate with the
actual acquisition application via a DDE link. At first glance, this seems
unnecessarily complex, but this approach can save an investment in code. If
the engineer already has C routines for data acquisition, these routines can
be enhanced with the canned DDE routines provided with Borland or Microsoft C,
and immediately adapted to work with Vision. The effort needn't begin from
scratch, as is often the case with Windows applications.
A sample DDE link is shown in Figure 1. The link is specified by the
application name, function name, and variable name. This means, incidentally,
that a link can be established between the Vision UI and more than one server
application. The Vision front end also uses the OLE Paste Link function, which
can display objects that update automatically when the host file is updated. A
Net Link function allows you to establish the DDE link from any drive on the
network, so that the DDE interface need not be local to a single machine.
Vision is capable of exerting some control over its displays, primarily
through the DDE link. It includes glyphs for switches, knobs, and sliders that
can be manipulated by the user while the application is running. Signals from
these tools can be sent to data-display tools in the Vision application or
through a DDE link back to the server program. This is primarily useful for
simulating processes rather than displaying actual data.
To round out the UI building facilities, Vision provides basic drawing tools
to create lines, text, circles, and polygons, which can be used to improve the
overall appearance of the user interface; see Figure 1. They can also be
converted into display "hot spots," such as our fire-alarm button, which can
then serve as a DDE connection.


The DDE Server


We prototyped the data-acquisition application using Visual Basic. This may
seem unusual, but it afforded several advantages. First, it enabled us to
visually examine our data as we read it into the program. This was not
required from the data-display standpoint, but it helped in the debugging
process. We also imagined a situation in which an engineer would prefer to
view the raw data as it was acquired, as well as examine the graphical plots
in Vision. We also did some data manipulation in Visual Basic, mostly in
checking the validity of data and computing the ratio between IR and UV
signals.
Second, rather than having to manually launch two different applications in
order to create the DDE link, Visual Basic lets you start another application
from within a running program. We did just that using the RUNAPP call. This
meant that the application could be started by a novice user, without starting
two separate applications and establishing the appropriate DDE links between
them.
We confess that for the prototype, we read the data from a file, rather than
from an active data stream. However, this was merely so that the serial-input
code wouldn't have to be written and tested first. After ensuring that this performed as
expected, we set to work writing a separate program to read our data from the
serial port.
The end result is shown in Figure 2. We plotted the IR and UV signals along
with the ratio between them, so that an observer could examine the history of
signals leading up to the fire determination. Each of these plots was
connected via a DDE link to the Visual Basic program reading the data. As
mentioned, the calculations occur in Visual Basic, so that just the resulting
ratio is passed into the instrument. This UI also includes a simple "idiot
light" icon, whose default color is green. When the ratio signal passes its
threshold, the light turns to red. This provides an easy visual verification
when a fire is detected.
As we indicated, Visual Basic is an unlikely back end, especially for reading
data coming into the computer in real time. After demonstrating to our
satisfaction that the DDE interface worked properly, we began work on a C
interface to read data directly from the serial port and pump it into Vision,
without the intermediate step of displaying it first. This is likely the more
common use for this type of front end.


Why Use a Graphical Front-end Package?


There is no facility for real-time process control from a Vision application.
However, much of this type of work can be done from the calling application.
If, for example, the IR and UV signal ratio were to determine that a fire
existed, it would be possible to send a "Fire" signal back to the Visual Basic
server application, which could then be used in conjunction with a hardware
interface to activate the fire-suppression system.
With the potential for so much to be accomplished in our Visual Basic
data-acquisition application, why bother with Vision at all? The answer is
that Vision has most of the display tools required to quickly prototype the
graphical data display, while in our case we would have to design and write
these for Visual Basic separately. Although possible, this reduces the
effectiveness of a rapidly prototyped user interface.
If you're using existing C code and adding Vision as the UI, this approach
makes more sense than porting the data-acquisition program into a graphical
language first. There isn't a good correspondence between a purely procedural
language like C and a forms-based language like Visual Basic or even Visual
C++.
Further, Vision makes it possible to display separate data signals
simultaneously on the same application desktop. Our application establishes
three separate DDE links to Visual Basic, and updates all plots at virtually
the same time. This isn't possible with most other data-display packages,
which either limit you to a single X-axis on a graph, or don't update the
graph in real time.


Limitations and Promises



Just about the only problem we had with Vision is a tendency to lose the DDE
links between application runs. This wouldn't matter if the data-acquisition
process were running continuously, but it could cause a problem when stopping
and restarting the application. The example applications provided with Vision
didn't exhibit this, and we couldn't find a pattern behind the problem.
Another problem--or perhaps a "feature"--is that Vision doesn't update signals
that don't change. The DDE transfer occurs only when the current value differs
from the previous one. This certainly saves processor cycles, but in our case,
prevented us from viewing a data history within Vision. Anyone using a plot to
view data over time should be aware that the plot won't be updated unless the
acquired data values change.
The network link means that one computer can be used for real-time data
acquisition, and the UI can be displayed on another. This allows for a
conceptual and physical separation between the acquisition and display of
data. This feature could be even more useful if the Vision front end could
alert the user to an alarm condition, even if the front end were iconized on
the desktop and the user were actively working in another application. The
user could then monitor a process while doing other work.
With the popularity of tools such as PowerBuilder, the concept of
software-development packages for building graphical front ends to existing
code is becoming more accepted in general. Vision brings such a capability to
the real-time data-acquisition world, using client/server technology. By
itself, it does little to improve the data-acquisition process. However,
working with existing code, or with a DDE-enabled commercial application, it
can make creating and using graphical, instrument-based user interfaces a
snap.


For More Information


Laboratory Technologies Inc.
400 Research Drive
Wilmington, MA 01887
508-657-5400
 Figure 1: A graphical front end with DDE links displayed.
 Figure 2: Our fire-detector graphical front end, recording signals over time
and warning of a fire.
[LISTING ONE]

Sub FileCommand_Click ()
 FireData.Hide
 FileForm.Caption = "Files"
 FileForm.FIL_Files.Refresh
 FileForm.Show 1
 FireData.Show 0
 ReadData (DataFile)
End Sub

Sub ReadData (FileSpec As String)
 Open FileSpec For Input Access Read As #1
 FileForm.Hide

 While Not EOF(1)
 dummy% = DoEvents()
 Input #1, FileData$
 SpcPos1& = InStr(1, FileData$, " ") ' Find space.
 RawUV.Text = Format$(Val("&H" + Left$(FileData$, SpcPos1& - 1)), " ##0")
 SpcPos1& = SpcPos1& + 1
 SpcPos2& = InStr(SpcPos1&, FileData$, " ") ' Find space.
 RawIR.Text = Format$(Val("&H" + Mid$(FileData$, SpcPos1&, SpcPos2& - SpcPos1&)), " ##0")
 SpcPos2& = SpcPos2& + 1
 SpcPos1& = SpcPos2&
 SpcPos2& = InStr(SpcPos2&, FileData$, " ") ' Find space.
 Background.Text = Format$(Val("&H" + Mid$(FileData$, SpcPos1&, SpcPos2& - SpcPos1&)), " ##0")
 SpcPos2& = SpcPos2& + 1
 SpcPos1& = SpcPos2&
 SpcPos2& = InStr(SpcPos2&, FileData$, " ") ' Find space.
 UVLimited.Text = Format$(Val("&H" + Mid$(FileData$, SpcPos1&, SpcPos2& - SpcPos1&)), " ##0")
 SpcPos2& = SpcPos2& + 1
 SpcPos1& = SpcPos2&
 SpcPos2& = InStr(SpcPos2&, FileData$, " ") ' Find space.
 IRLimited.Text = Format$(Val("&H" + Mid$(FileData$, SpcPos1&, SpcPos2& - SpcPos1&)) / 256, " ##0.0000")
 SpcPos2& = SpcPos2& + 1
 SpcPos1& = SpcPos2&
 SpcPos2& = InStr(SpcPos2&, FileData$, " ") ' Find space.
 Ratio.Text = Format$(Val("&H" + Mid$(FileData$, SpcPos1&, SpcPos2& - SpcPos1&)) / 256, " ##0.0000")
 SpcPos2& = SpcPos2& + 1
 SpcPos1& = SpcPos2&
 SpcPos2& = InStr(SpcPos2&, FileData$, " ") ' Find space.
 FireRegister.Text = Format$(Val("&H" + Mid$(FileData$, SpcPos1&)), " ##0")
 Wend
 Close #1
End Sub
Sub Quit_Click ()
 End
End Sub


December, 1993
Understanding OSI Network Management


The object-oriented model is the key




William Stallings


Bill, an independent consultant and president of Comp-Comm Consulting, is the
author of over a dozen books on data communications and computer networking.
This article is based on his recent book, SNMP, SNMPv2, and CMIP: The
Practical Guide to Network Management Standards (Addison-Wesley, 1993). He can
be reached at 72500.3562@compuserve.com.


It's rare indeed for organizations to maintain networks in which all
workstations, PCs, routers, and associated gear are from a single vendor using
a single operating system. Thus it becomes necessary to use some sort of
vendor-independent network-management scheme to tie everything together for
management and control. The most ambitious scheme, based on the open systems
interconnection (OSI) model, is a set of standards for OSI-based network
management, referred to as "OSI systems management."
A network-management system consists of one or more network-management
stations that provide an operator interface for monitoring and controlling
network activity, plus software in the various nodes of the
network--workstations, bridges, routers, and the like--that interact with the
network-management stations and respond to network-management commands. The
foundation of any network-management system is a database (maintained at each
node) containing information about the resources and elements to be managed.
Typically, this database is referred to as a "management information base"
(MIB). The general framework within which a MIB can be defined and constructed
is the structure of management information (SMI). The SMI identifies the data
types used in the MIB, and determines how MIB information is represented and
named.
Key to understanding OSI systems management is an understanding of the SMI and
MIB structure. This article explores the underlying concepts used in defining
OSI MIBs and the types of management operations that can be performed on them.


Basic Concepts of the Information Model


OSI systems management relies heavily on object-oriented design. Each resource
that's monitored and controlled by OSI systems management is represented by a
managed object. The MIB is a structured collection of such objects. An object
can be defined for any resource that an organization needs to monitor and/or
control. Examples of hardware resources include switches, workstations, PBXs,
LAN port cards, and multiplexers. Software resources include queuing programs,
routing algorithms, and buffer-management routines. There are several points
about managed objects you need to keep in mind:
A managed object is an abstraction directly available to the
systems-management function. Other system software, outside the scope of the
OSI management standards, maintains the relationship between the managed object
and the actual resource.
A single managed object may represent a single resource, or many resources.
The same network resource may be represented by a single managed object, or a
number of different managed objects, each of which represents a particular
aspect of the resource.
Not all resources need be represented by a managed object.
Some managed objects are defined solely for the support of management
functions and do not represent resources--event logs and filters, for
instance.
A managed object is defined in terms of attributes it possesses, operations
performed on it, notifications it issues, and relationships with other managed
objects. (For more details, see the accompanying text box entitled "Defining
Managed Objects.") To structure an MIB definition, each managed object is an
instance of a managed-object class--a model or template for managed-object
instances that share the same attributes, notifications, and management
operations. A managed-object class, as specified by the template, consists of:
Attributes visible at the managed-object boundary.
System-management operations that can be applied to the managed object.
Behavior exhibited by the managed object in response to management operations.
Notifications that can be emitted by the managed object.
Conditional packages that can be encapsulated in the managed object.
Position of the managed object in the inheritance hierarchy.


Attributes


The actual data elements contained in a managed object are called
"attributes." Each attribute represents a property of the resource that the
object represents, such as the operational characteristics, current state, or
conditions of operation. Attributes are commonly used for monitoring, with the
attribute value reflecting the status of the underlying resource. An attribute
can also be used for control, with the setting of an attribute value causing a
change in the behavior or status of the underlying resource.
The data type of an attribute may be integer, real, Boolean, character string,
or some composite type constructed from the basic types. In addition to a data
type, each attribute has access rules such as read, write, read/write. There
are also rules by which it can be located as the result of a filtered search
(for instance, matching rules).
An attribute can be a simple scalar variable. Read (get) and write (set,
replace) operations are possible on scalar attributes. Additionally, an
attribute may be set-valued, consisting of an unordered, variable number (zero
or greater) of elements, all of one type. In addition to read and write
operations performed on all attributes, operations to add or remove elements
from a set-valued attribute are possible.


Object Classes and Inheritance


A managed-object class is a template that defines management operations,
attributes, packages, notifications, and behavior included in a particular
type of object. All object instances that share these same elements are
members of the same class. The individual object instances may differ in the
values of their attributes. The class concept is thus a macro-type facility
that allows a general type of object to be defined once, and allows that
definition to be re-used many times for each actual instance of the object
type.
More significantly, the class construct allows for the definition of new
object classes in terms of existing classes, a process referred to as
"specialization." A new object class is referred to as a "subclass" of the
class from which it is specialized. The use of the subclass concept allows the
development of a class hierarchy, with a subclass in turn having its own
subclasses. This structure mirrors the actual structure of resources to be
modeled in almost every case. The subclass retains characteristics of its
superclass, a concept known as "inheritance." This minimizes the need to
specify characteristics of individual objects.
In the OSI systems-management context, specialization is achieved by extending
the characteristics of an object class in one or more of the following ways:
the addition of new attributes; the extension or restriction of the range of
an existing attribute; the addition of new operations and notifications; the
addition of arguments to existing operations and notifications; and the
extension or restriction of the ranges of arguments to operations and
notifications.
Unlike a general-purpose object-oriented scheme, OSI systems management
doesn't allow the definition of a subclass by deleting any of the
characteristics of its superclass.
All object classes ultimately derive from a unique object class referred to as
"top." This is the ultimate superclass, and the other object classes form an
inheritance hierarchy with top as the root. Figure 1 is an example of a
portion of an inheritance hierarchy.


Behavior



A managed object exhibits behavioral characteristics, including how the object
reacts to operations performed on it, and the constraints placed on its
behavior. The behavior of a managed object occurs in response to external
stimuli (system-management operations delivered in the form of messages) and
internal stimuli (events internal to the managed object and its associated
resource, such as timers).
All managed-object instances of the same managed-object class exhibit the same
behavior. The behavior defines the semantics of the attributes, operations,
and notifications, the response to management operations being invoked on the
managed object, the circumstances under which notifications will be emitted,
the dependencies between values of particular attributes, and the effects of
relationships on the participating managed objects.


Notifications


Managed objects are said to emit notifications when an internal or external
occurrence affecting the object is detected. Notifications may be transmitted
externally in a protocol, or logged. Managing systems may request that some or
all of the notifications emitted by a managed object are to be sent to it.
Notifications that are sent to a manager or to a log are contained in an event
report.


Containment and Naming


The object-oriented subclass facility allows for the creation of an
inheritance hierarchy (Figure 1) that reflects the relationship among various
types of objects. This hierarchy represents a convenience for defining a
variety of object types with the minimum of text. It's also a useful
structuring tool in designing objects for an MIB. However, the inheritance
hierarchy does not reflect the structure of an actual MIB. This structure is
defined using the object-oriented containment facility. The containment
facility lets one object "contain" one or more other objects. Containment is
achieved by including a reference to the subordinate (contained) object in the
superior (containing) object. The reference is in the form of the object
identifier of the subordinate object, and is stored as the value of an
attribute in the superior object. A subordinate managed object may be
contained in only one superior managed object, enforcing the condition that
the MIB structure be a tree structure. A containing object may itself be
contained in another object, allowing the construction of a tree of arbitrary
depth. Thus, the MIB structure can directly model real-world hierarchical
structures, such as an assembly, its subassemblies, and their components, or a
directory, its files, and their fields.
Just as there's a distinction between the inheritance hierarchy, which defines
the relationship among object classes, and the containment hierarchy, which
defines the relationship among object instances in the MIB, there's also a
distinction between the naming scheme for object classes and that for object
instances.
Each object class is registered in the registration tree and is identified by
a unique object identifier. The object identifier is a sequence of integers
that navigate through the registration tree of assigned identifiers to the
managed object class. Figure 2 shows the top levels of the ISO/CCITT object
identifier tree.
The naming scheme for object instances is completely distinct from that for
object classes, and is dictated by the containment relationship. The naming
scheme works as follows: Each managed object class includes an attribute that
is used in naming instances of that object. The relative distinguished name of
an object instance corresponds to a specific value of the naming attribute.
This value must be unique among all objects that are subordinate to the same
superior. The actual form of a relative distinguished name is an assertion
that an attribute has a particular value, for example, MS-Id = "BDC" where
MS-Id is the name of the attribute and "BDC" is the desired value. The
distinguished name of an object instance is formed as the sequence of relative
distinguished names from the root of the containment tree to this object.
Figure 3 shows a containment tree. A managed object instance name (the value
of the naming attribute) is created when the instance is created. These names
don't have to be registered or made public. They do have to be exchanged
between interoperating managed systems to permit access to the object. Also,
although the naming scheme is based on containment, not all forms of
containment are necessarily used for naming. Containment can be used to create
pointers between object instances that reflect the structure of the MIB and
that go beyond a simple tree structure.
The ISO registration tree is a naming tree which registers the definition of
managed objects, attribute definitions, actions, notifications, and packages.
You can think of the registration tree as a dictionary or library of "stubs"
that can be plugged into new managed-object class definitions. Since they're registered, the
stubs have well-known names and agreed-upon semantics. This illustrates the
benefit of object-oriented reuse.
The inheritance tree (see Figure 1) shows how the definition of object classes
is derived from other object classes using object-oriented principles.
Inheritance allows for reuse of an object class structure, with refinements to
define a related but distinct object class.
Finally, the containment tree is the MIB structure. It shows the objects an
agent contains and the hierarchy/containment of those objects. This tree is
used not only to define the MIB structure but as a means of unambiguously
referencing object instances.


Systems-management Operations


The definition of the management-information model includes a specification of
the operations that may be performed on objects. These operations are
performed by a management entity by means of a message sent to the object,
using a network-management protocol. Systems-management operations apply to
the attributes of an object or to the managed object as a whole. An operation
performed on a managed object can succeed only if the invoking managing system
has the access rights necessary to perform the operation, and consistency
constraints are not violated.
Management operations apply to attributes of an object and to the object as a
whole.


Attribute-oriented Operations


There are a number of operations that may be sent to an object to be applied
to one or more of its contained attributes, including get-attribute value,
replace-attribute value, set-attribute value to default, add member to a
set-valued attribute, and remove member from a set-valued attribute.
Any operation may request that the same function be performed on a list of
attributes. For example, an operation to replace attribute values could
specify a list of attributes with the new value for each attribute. An
operation may specify that the individual operations are to be performed
atomically; that is, either all operations succeed or none are performed.
If atomic operation is not requested, the managed object will attempt the
individual operation on each attribute in the list of attributes for which the
operation is requested. As a result of the operation, the object will report
the attribute identifiers and their associated values for those attributes
whose values could be operated on, and error indications for those attributes
that could not be operated on.


Object-oriented Operations


The create, delete, and action operations apply to managed objects as a whole.
The semantics of these operations are part of the definition of the
managed-object class. In particular, the effect of these operations on other
related managed objects (for example, superior or subordinate objects) must be
specified.
Central to the concept of automated network management is a requirement that
the resources to be managed and the actions that can be taken are represented
in a systematic way. The object-oriented approach used for OSI systems
management provides a flexible foundation for meeting this requirement.


Defining Managed Objects


The OSI systems-management standard includes a specification for the format of
definitions for managed objects and their attributes. The specification
consists of a set of templates, which are standardized formats to be used in
the definitions. The templates summarize the elements that should be included
in a definition and the notational tools that are recommended to be used in
the definition. Templates are constructed using these conventions:
Terms in uppercase letters in the template appear in the definition in the
same form.
Terms in lowercase letters in angle brackets are variable names; the actual
name is substituted in the definition.
Terms in square brackets are optional.
Terms in square brackets followed by an asterisk may appear zero or more
times.
The template for defining managed-object classes has the structure shown in
Listing One (page 77). The DERIVED FROM construct indicates the superclass or
superclasses from which this class is inherited. This construct is required in
all definitions except for object class top. Characteristics (attributes,
notifications, and so forth) that are inherited are not repeated in the
definition unless an extension or modification is made.
The CHARACTERIZED BY construct allows one or more mandatory packages to be
included in the object-class definition. The CONDITIONAL PACKAGES construct
allows one or more conditional packages to be included.
As an example, consider an object-class definition from the standard which
defines transport-layer management. One of the definitions is for the
managed-object class for the connection-oriented transport-protocol machine;
see Listing Two (page 77).
This managed object contains key parameters for monitoring the operation of a
transport-protocol entity in a managed system. Note that a number of the
defined attributes are grouped together for convenient access to the counters
as a unit.
--W.S.
 Figure 1: Inheritance example.

 Figure 2: Top-level object-identifier assignments.
 Figure 3: Transport-layer managed object-containment structure.
[LISTING ONE]

<class-label> MANAGED OBJECT CLASS
 [DERIVED FROM <class-label> [,<class-label>]* ; ]
 [CHARACTERIZED BY <package-label> [,<package-label>]* ; ]
 [CONDITIONAL PACKAGES <package-label> PRESENT IF
condition-definition
 [,<package-label> PRESENT IF condition-definition]* ; ]
 REGISTERED AS object-identifier ;
 supporting productions


[LISTING TWO]

comodePM MANAGED OBJECT CLASS
 DERIVED FROM "ISO/IEC 10165-2":top;
 CHARACTERIZED BY comodePM PACKAGE
 ATTRIBUTES
 coPMName DEFAULT VALUE TLM.null-MO-Name-Value
 PERMITTED VALUE TLM.null-MO-Name-Syntax
 REQUIRED VALUE TLM.null-MO-Name-Syntax GET,
 "ISO/IEC 10165-2":octetsSentCounter GET,
 "ISO/IEC 10165-2":octetsReceivedCounter GET,
 "ISO/IEC 10165-2":incomingProtocolErrorsCounter GET,
 openConnections GET
 maxConnections REPLACE-WITH-DEFAULT
 DEFAULT VALUE implementation dependent GET-REPLACE
 localSuccessfulConnections GET,
 remoteSuccessfulConnections GET,
 localUnsuccessfulConnections GET,
 remoteUnsuccessfulConnections GET,

 localErrorDisconnects GET,
 remoteErrorDisconnects GET,
 unassociatedTPDUs GET,
 maxOpenConnections REPLACE-WITH-DEFAULT
 DEFAULT VALUE see attribute behavior GET;
 ATTRIBUTE GROUPS
 TLPMCounters
 openConnections
 localSuccessfulConnections
 remoteSuccessfulConnections
 remoteUnsuccessfulConnections
 localErrorDisconnects
 remoteErrorDisconnects
 unassociatedTPDUs
 maxOpenConnections;;;
 REGISTERED AS { TLM.moe comodePM (4) } ;



December, 1993
Examining the StarView Application Framework


Cross-platform GUI development




Ramin Firoozy


Ramin heads rp&A Inc. in San Francisco, California. He can be reached through
CompuServe at 70751,252 or via the Internet at rpa@netcom.com.


It's no secret that GUIs are hard to program. Application frameworks like the
Microsoft Foundation Classes (MFC) and Borland's Object-Windows Library (OWL)
under Windows, Apple's MacApp, and Symantec's Think Class Library (TCL) under
MacOS, and InterViews under the X Window System, all use an object-oriented
paradigm to simplify traditional event-based programming. However, all these
frameworks are inextricably tied to their specific platforms. There are also
portable GUI libraries like XVT and Neuron Data's Open Interface that replace
the standard API calls and create a common and portable interface to the
native GUI. But these GUI libraries don't necessarily simplify
programming--they merely replace the one standard, event-based API with
another.
Since, by definition, application frameworks hide the low-level mechanisms of
a GUI system, they're ideal for this sort of portable abstraction. A gauge of
an application framework's portability is how far you can go without dropping
into the native GUI to do something. Any native GUI code instantly renders the
program nonportable, and the developer is responsible for porting that code to
all target machines. The theory is that if you do stick to framework code, you
can move the same source code to another platform, recompile and link, and you
have a fully functional application with the look-and-feel of the native
system.
StarView 1.0 from Star Division is a cross-platform C++ application framework
that operates across Windows, Macintosh, NT, OS/2, OpenLook, and Motif. In
addition to offering StarView as a separate product, Star Division uses
StarView internally to develop GUI versions of its application products
(primarily its StarWriter word processor). C++ code written to its application
framework can be copied from one machine to another and rebuilt. What comes
out is a native executable that has the speed of a C++ application and the
look-and-feel of the native GUI. StarView provides an impressively large
assortment of classes covering a wide variety of areas in a GUI application.
There are a few areas where it doesn't offer any classes, but where it goes,
it goes well.


A GUI Sampler


To get a better idea of how it actually works, I created a "GUI Sampler" that
examines how well StarView meets its claims. Source code for the program is
available electronically; see "Availability," page 3. I then took the source
and recompiled it on other platforms to see if it worked as claimed. I wanted
this GUI Sampler to show how things like buttons, lists, colors, dialogs, and
fonts moved from one platform to another, and also whether the application
framework was robust enough for a general-purpose native application.
I started development under Windows 3.1 with Borland C/C++. StarView also
supports Microsoft C++ (as of this writing it has not been tested against
Visual C++) and Zortech C++. After the skeleton was developed, I switched to a
Macintosh running System 7.1 and MPW C++ and continued development. The final
application was moved back to the PC and rebuilt. There's also an NT version
of StarView, which I did not test. The NT system I have is the October '92
beta, as is the NT version of StarView, which would make it difficult to
determine whether problems were due to StarView or to NT.
The Windows version comes on six diskettes, two each for Borland, Microsoft,
and Zortech. The Macintosh version is on one diskette that contains the
libraries and MPW tools. There are two hefty three-ring bound manuals. The
user's guide has a good description of the application framework and covers
the rather large set of examples provided with the system. The reference
manual is an alphabetical description of the classes.
StarView has three main components: the framework library, the DesignEd
resource editor, and the resource compiler. The library interface comes in a
single #include file (sv.hxx) that contains all the classes and methods
implemented under StarView. The library comes in the form of linkable
libraries (Mac) or a set of DLLs (under Windows). The DesignEd editor is a
stand-alone application that allows you to "paint the screen" instead of
having to write text resources by hand. It generates text resource files with
StarView's portable resource language. The resource compiler translates this
resource file into the local system's resource format, which is then compiled
with the native GUI's resource compiler (rc under Borland and rez under MPW).
I could not find a description of StarView's resource language in the
documentation, although presumably, DesignEd takes care of generating these
resources.
An important issue when evaluating a cross-platform tool is whether all the
tools are symmetrically available on all its target platforms. Most often, you
are expected to develop on one platform and just rebuild on all the others. As
far as source code is concerned, StarView is symmetric across these platforms.
You can develop under one system and move the source to another. Resources,
however, are not symmetric. They're best designed under Windows using a local
resource editor such as Whitewater Resource Toolkit or Borland's Resource
Editor, and then incorporated into DesignEd. Even though DesignEd supports
Macintosh resources such as icons, you can't easily move them back to Windows.
Also, DesignEd has no support for editing bitmaps or icons. Again, you need a
program like Borland's Resource Workshop or ResEdit on the Mac. DesignEd had
no trouble loading these resources from external resource files. Although it
would have been nice to be able to do everything inside DesignEd, most
development environments come with their own bitmap resource editor, so this
was not much of a problem. DesignEd performs competently, but after using
resource editors like Whitewater Resource Toolkit and Resorcerer, you wish
some more user-interface testing had been performed on it. The Macintosh
version under the current release also requires more QA testing. It crashed a
few times and pronounced some resource language syntax errors, refreshingly
enough, in German.
When demonstrating GUI development systems, it is obligatory to show the
standard Hello World application. Early Windows, Macintosh, and X examples
were written in perverse ways to see how many pages of C code they would take.
A fully functional framework reverses this trend. Figure 1 shows the Hello
World program written with StarView (I stripped out the comments to show how
small it can get). The executable this code generates automatically supports
resizable windows, iconize and maximize operations, a close box, and redrawing
when uncovered by another window.


Browsing the Library


The classes have few surprises. They're well laid out and instantly familiar
to anyone who has used an application framework. You subclass Application and
override those functions you want to customize. You are required to override
the Application::Main virtual function, which is automatically called when you
start up. You subclass WorkWindow to create a "main" window. In the
application's Main function, you create an instance of your main window, show
it, then go into the endless event loop by sending the application an Execute
message. The framework automatically calls any function you have overridden
when an event triggers it. In the case of the Hello World program, the Paint
function of the window is called to redraw the contents of the main window.
The framework is allowed to handle everything else in its default fashion.
For a framework to be effective, it has to provide a fair amount of coverage.
Table 1 lists the available classes. The implementation of the OutputDevice is
very flexible and basically provides "free" printing and print-previewing. You
use the same functions for output to the screen, the print-preview window, and
the printer. In order to implement this, the OutputDevice class encapsulates
some of the functions of the Windows GDI and Macintosh QuickDraw, such as
routines to draw boxes and arcs. It's not Display PostScript, but it works
well and simplifies printing a great deal.
The Help and HelpEvent classes support context-sensitive help. Under Windows,
this translates to calls to the Windows Help engine. Under Macintosh, StarView
includes a help engine and help compiler that recognize the same Windows Rich
Text Format (RTF) files as the Windows Help engine. Although the Mac engine is
somewhat slow, having a common source to online help easily outweighs its
shortcomings. You can associate a help ID with every user-interface element.
If the user presses F1 or Help, the help engine can automatically display help
on the workings of that element. By selecting proper help IDs, you can provide
online help down to the workings of a single button.
The Font class is one of the more useful features of this library. Ordinarily,
if you write out text in a given typeface, chances are good that the same
typeface will not be available on another machine. To help prevent this, the
StarView Font class allows you to define a font through its characteristics
rather than its name. For example, instead of displaying text in Helvetica
bold, you can request a font from the Swiss family of fonts with a heavy
weight setting. At run time, StarView maps the specification onto a locally
available font that comes closest to the requested specs. Figure 2 shows a
screen from the Sampler application, demonstrating some of the families and
attributes available on the Macintosh. Of course, this does not prohibit you
or the user from specifying local fonts. The OutputDevice class allows you to
access all local fonts by name. The Font screen of Sampler uses this to show a
list of all local fonts.
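The matching idea is easy to sketch outside StarView. The scoring scheme below is my own illustration (the article doesn't document StarView's actual algorithm): score each installed font by family match and weight distance, then take the closest.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical font description: a family class plus a numeric weight,
// in the spirit of StarView's attribute-based Font requests.
enum Family { FAMILY_SWISS, FAMILY_ROMAN, FAMILY_MODERN };

struct FontSpec {
    std::string name;   // local face name, e.g. "Helvetica"
    Family      family;
    int         weight; // 100 (light) .. 900 (heavy)
};

// Pick the installed font closest to the request: an exact family match
// is strongly preferred, then the smallest weight distance wins.
const FontSpec *ClosestFont(const std::vector<FontSpec> &installed,
                            Family family, int weight)
{
    const FontSpec *best = 0;
    long bestScore = 0;
    for (std::size_t i = 0; i < installed.size(); ++i) {
        long score = std::abs(installed[i].weight - weight)
                   + (installed[i].family == family ? 0 : 10000);
        if (best == 0 || score < bestScore) {
            best = &installed[i];
            bestScore = score;
        }
    }
    return best; // 0 only if no fonts are installed
}
```

Requesting a Swiss face with a heavy weight setting would then land on the boldest local Swiss font, whatever its actual name happens to be on that machine.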
Colors are handled as RGB values with up to 16-bits per component. Common
colors and patterns can be identified by constants like COL_YELLOW. The set of
named constants provides portability and covers basic application needs. For
more advanced color handling or painting, you will probably have to subclass
the Color class and add a few more features (such as HLS or CYMK support). The
color section of the Sampler shows some basic patterns and colors.
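A minimal sketch of this kind of color class follows; the names are illustrative rather than StarView's actual API, and the Luminance function stands in for the sort of extension you might add by subclassing.

```cpp
#include <cassert>

// Device-independent RGB color with 16-bit components, after the fashion
// of StarView's Color class and its COL_* constants. Names are my own.
struct Rgb {
    unsigned short r, g, b; // 0..65535 per component
};

const Rgb COL_BLACK  = {0, 0, 0};
const Rgb COL_YELLOW = {0xFFFF, 0xFFFF, 0};

// An example of an added feature: perceptual luminance, using the integer
// approximation Y = (30R + 59G + 11B) / 100.
unsigned short Luminance(const Rgb &c)
{
    return (unsigned short)((30UL * c.r + 59UL * c.g + 11UL * c.b) / 100);
}
```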
Other interesting features are the MDIWindow class and the ListBox and
ComboBox family of classes. What makes them interesting is that they are
implemented to behave consistently even under environments that have no
direct support for them. MDI under Windows is handled as you would expect.
On the Mac, MDI is simulated as an application with multiple open document
windows. You can't iconize the windows, but you get tiling, cascading, and
other MDI-type activity for free. DropDownLists map to standard popup menus
on the Mac. Variations such as combo boxes are implemented as controls that
behave the same as their Windows counterparts. The Lists section of the
Sampler demonstrates the variations in list boxes.


Event Handling


Two issues I was most curious about when setting out to look at StarView were
how it maps events to functions, and how it handles dialogs. In a normal C
program, events are handled inside an event loop with a switch statement.
Since StarView takes care of all that in the background, how would it notify
your program when someone presses a button or selects an item from a menu bar?
This is not a unique problem. Any application framework has to tackle it
somehow. A library like MFC takes the route that the programmer is responsible
for associating a specific Windows event code with a function through a
message map. But this is not a viable solution when you need portability.
StarView tackles this by matching a set of common events with handlers for
each object. For example, a push button generally needs to know when it has
been pressed. You have to define a member handler function of the PushButton
class. You then call the ChangeClickHdl method of the PushButton class to link
the "click" event with the member function. When the user presses the button,
your handler function is automatically called. These handlers abstract system
dependencies and reduce the need to deal with nonportable event codes.
The handlers cover the most frequently used events for each control. You are
free, of course, to go nonportable, patching into the event loop and
dealing with custom events in any way you want. I found the handler method to
be simple and effective for taking care of all the needs of the Sampler
routine. Since buttons are derived from controls (which are derived from
windows), you can associate handlers with a fairly large number of events
available to all the classes in this hierarchy.
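The handler mechanism amounts to storing a bound member function in the control and invoking it when the event fires. The sketch below models that idea; PushButton and ChangeClickHdl are patterned after StarView's names, but the rest of the scaffolding is my own.

```cpp
#include <cassert>

class PushButton;

// Abstract callback: something that can be invoked with the control
// that generated the event.
class Handler {
public:
    virtual void Call(PushButton *) = 0;
    virtual ~Handler() {}
};

// Binds an object instance to one of its member functions.
template <class T>
class MemberHandler : public Handler {
    T *obj;
    void (T::*fn)(PushButton *);
public:
    MemberHandler(T *o, void (T::*f)(PushButton *)) : obj(o), fn(f) {}
    virtual void Call(PushButton *b) { (obj->*fn)(b); }
};

class PushButton {
    Handler *clickHdl;
public:
    PushButton() : clickHdl(0) {}
    void ChangeClickHdl(Handler *h) { clickHdl = h; }  // link click event
    void SimulateClick() { if (clickHdl) clickHdl->Call(this); }
};

// A window that wants to know about clicks just supplies a member function.
struct MyWindow {
    int clicks;
    MyWindow() : clicks(0) {}
    void OnClick(PushButton *) { ++clicks; }
};
```

The framework's event loop, not your code, decides when SimulateClick's real equivalent runs; your program only sees the portable member-function call.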
Dialog boxes were another area of curiosity. One of the distinct
characteristics of an application framework is how it transfers information to
and from a dialog box. Dialog boxes display information on the state of the
program and let the user change its settings. The framework should provide a
way to set the values of the individual elements and an easy way to obtain
their values when done. MFC and OWL use special transfer-buffer mechanisms
that are automatically read and loaded by dialogs.
StarView takes a more traditional approach. A dialog layout is customarily
designed using DesignEd. In the source code, you have to subclass ModalDialog
or ModelessDialog. For each control defined in the dialog, you need to provide
an instance of the control and handler methods to take care of any immediate
action like selection or mouse clicks. Many frameworks implement controls in a
similar manner. Personally, I think MFC 2.0's Dialog Data Exchange (DDX)
mechanism is one of the best implementations for automating dialog setup and
unloading. I hope the StarView design team takes a strong look at DDX for
their future releases. On all other accounts, the dialog mechanism works well
and is capable enough.
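For comparison, the transfer-buffer idea can be sketched in a few lines: the dialog loads its controls from a plain struct and unloads them back when done, so the caller never touches individual controls. All names here are illustrative, not MFC's or StarView's.

```cpp
#include <cassert>
#include <string>

// The plain-data transfer buffer the caller deals with.
struct Settings {
    std::string name;
    bool        sorted;
};

// A dialog whose controls are set and read in bulk. The two member
// variables stand in for an Edit control and a CheckBox control.
class SettingsDialog {
    std::string editField;
    bool        checkBox;
public:
    SettingsDialog() : checkBox(false) {}
    // "DDX"-style bulk transfer in both directions:
    void SetValues(const Settings &s) { editField = s.name; checkBox = s.sorted; }
    void GetValues(Settings &s) const { s.name = editField; s.sorted = checkBox; }
    // Stand-ins for user interaction while the dialog is up:
    void TypeName(const std::string &n) { editField = n; }
    void ToggleSort() { checkBox = !checkBox; }
};
```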


Missing Links


In addition to the GUI classes, StarView supports classes that implement a
variety of abstract data types; see Table 2. A few problems should be
mentioned, however. Even with such an extensive collection of classes, some
areas are not covered by StarView (though they are arguably not part of the GUI
proper). Two important areas are classes that handle directories and files in
a portable manner, and support for object persistence. Star Division
recommends use of BKS's Poet package for object persistence. I would have
preferred to see simple persistence built in. There are many occasions where
you don't need a heavy-duty and relatively expensive library like Poet.
Without a portable directory and file class, however, the code automatically
becomes nonportable anywhere files are referenced. The most glaring problem
comes up when external resources are referenced in a StarView resource file.
The path to the file is stored in the resource file for subsequent access. The
path to a DOS file does not work under any other system. A portable path class
would take care of that.

Another problem has to do with class naming. Class names under StarView are
easy to read and comprehend: The main application class is called Application.
This, however, increases the chance of conflict when sharing class libraries
with others or when using third-party class libraries. OWL and MacApp both
prepend the letter T to each class (for example, TApplication). TCL and MFC
use the C character instead (CApplication). A simple scheme, such as
SVApplication, would help avoid future class-naming conflicts.


On the Mac Side


The Macintosh version has problems with the implementation of some of the
interface elements and the way it stores resources. The Mac toolbox provides
built-in support for interface elements like modal dialogs and alerts.
StarView appears to bypass these and implement its own, presumably to provide
full cross-platform support. However, the StarView MessageBoxes do not look
anything like standard Mac Alerts, and ModalDialogs appear not to be
completely modal within the application. When you have a modal dialog box
open, your menu items should not be accessible. I found that I could invoke
another menu item from the application through its accelerator key, even when
a modal dialog was up. Now this is not necessarily a big problem if you have
the sources to the class library handy and can track down the bugs, but
StarView does not include class-library sources. The sources are available,
but you have to make separate arrangements with Star Division. I think it's
very important to provide sources with a class library, even for an additional
(but reasonable) price. There is a burgeoning trade on bug-fixes to various
other class-library sources on CompuServe and the Internet. This would not be
possible if the sources were not easily accessible. I hope the lack of
included sources is just a packaging oversight and Star Division remedies
this.
The other Mac problem has to do with the resources created by DesignEd. When
compiled, these are stored inside custom resources with names starting with
SV. Modal dialog layouts, button settings, menus, and other common resources
are all stored in these custom resources, presumably to simplify access by the
class library. However, you cannot look at the contents of these resources
with a program like ResEdit. This can be a problem when debugging and removes
one of the benefits of resources; namely, the ability to localize a binary
program using resource-editing tools like ResEdit. StarView should either use
standard Mac resources or at least provide ResEdit TMPL resources to allow the
contents of the custom resources to be viewed and edited.
Being cross-platform unfortunately also means not implementing certain useful,
but platform-dependent, functions like AppleEvents. This raises the general
issue that cross-platform development will never be a simple recompiling task.
When developing across platforms, a prudent developer will allocate some time
for fine-tuning the application to match prevailing local standards. Using
#ifdefs to bracket platform-dependent code will help keep nonportable code out
of the core functions.


Conclusion


Overall, I found StarView to be a highly competent and well-designed
application framework. Any problems I encountered with the components were
minor and paled in comparison to the amount of time saved moving from one
platform to another. It was very comforting to know that programmers at Star
Division were using the same class library to implement their own mainstream
applications. This, I think, makes Star Division more adept at catching
problems and fixing them than other library vendors, who wait for their users
to shake out their bugs for them.
StarView's main competition in the area of cross-platform C++ application
frameworks is Inmark's zApp and the forthcoming Bedrock from Symantec/Apple.
However, zApp does not currently support the Macintosh, and Bedrock appears to
be targeted at Macintosh and Windows only. With Microsoft's Visual C++, all
application frameworks have to compete against a highly visual and interactive
development environment. If StarView is to become a viable mainstream
development environment, it will undoubtedly have to withstand comparison with
Visual C++ and may have to work hard to match its features.
Regardless, I found StarView's features sufficient for the majority of
application development tasks a developer will encounter in the brave new GUI
world.


For More Information


Star Division
2180 Sand Hill Road, Suite 320
Menlo Park, CA 94025
800-888-8527
Windows 3.x, Macintosh, OS/2 2.1, or Windows NT: $499.00
Openlook or Motif: $1499.00
(no run-time royalties)
Table 1: Built-in classes
Accelerator Keyboard accelerator
Application General-purpose application framework
AutoCheckBox Automatically checked check box
AutoRadioButton Automatically checked radio button
AutoScrollBar Scrollbar with thumb managed automatically
AutoTimer Timer that is automatically restarted
AutoTriStateBox Automatically checked tri-state box
Bitmap Color bitmaps
Brush Logical brush used for general-purpose output
Button General-purpose button (clickable region)
CheckBox Manual check box
Clipboard General-purpose clipboard for Cut/Copy/Paste
Color Device-independent RGB color
ComboBox A combo box (single-line edit and a list box)
Config Configuration-file support (like WIN.INI)
Control General-purpose control
Cursor Text cursor
DefPushButton Default push button
Dialog Standard dialog class
DropDownComboBox A drop-down combo box (single-line edit and popup)
DropDownListBox A drop-down list box (list box as popup)
Edit Editable text field
ErrorBox Message box for errors
FixedBitmap Static bitmaps for dialog boxes and windows
FixedIcon Static icon for dialog boxes and windows
FixedText Static text in dialog boxes and windows
Font Portable font class
FontMetric Characteristics of fonts
GDIMetaFile Stores a sequence of output operations

GroupBox Groups controls together
Help Interface to the help-viewer subsystem
HelpEvent Supports context-sensitive help
Icon Color and black-and-white icons
InfoBox Message box for displaying information
KeyCode Physical keyboard values
KeyEvent Keyboard-press events
ListBox Scrollable list box
MapMode Portable measurement units with scaling
MDIApplication Application that supports MDI windows
MDIWindow MDI window supported by MDIApplication
Menu Menu bars, pop-up menus, bitmap menus, submenus
MessBox Message box
ModalDialog Modal dialog box
MouseEvent Mouse movement and click events
MultiLineEdit Multiple-line edit field
OutputDevice Portable support for all output
Pen Logical pen for output operations
PhysicalPrinter Physical printer interface
Pointer Mouse pointer
PopupMenu General-purpose pop-up menus
Preview Automatic print-preview support
Printer Portable printer-output device
PushButton Standard push button
QueryBox Message box to present a query
RadioButton Manual radio button
ResId Interface to portable resources
ResMgr Support for multiple resource files
Resource Root of all classes loaded from resources
ScrollBar Manually controlled scrollbars
SingleLineEdit One-line editable text field
Sound System beep
SysMessBox Message box that is system modal (on top)
System Global information about local platform
SystemWindow Base class for Dialogs and WorkWindows
Timer Timed event generation
TriStateBox Manual on/off/don't know check box
VirtualDevice Offscreen bitmap support
WarningBox Message boxes to show warnings
Window General-purpose window class
WorkWindow Top-level application window
Table 2: General-purpose non-GUI classes
Container General-purpose collection of pointers
DynArray Dynamic arrays
Fraction Single and double floating-point numbers
Link Mechanism to link events to objects
List Linked List
MetaAction Support for actions
MetaFile A file containing MetaActions
Pair A pair of values
Point X and Y coordinates
Queue A FIFO queue of pointers
Range Minimum and maximum range of values
Rectangle General-purpose rectangles
Selection Range of values within a pair
Size Two-dimensional representation of size
Stack A stack of pointers
String General support for strings
Table Table class with key access

UniqueIndex Table of unique indexes
SysDepen Nonportable system-dependent internals
Figure 1: StarView version of the Hello World program.
#include <sv.hxx>
class TheApplication : public Application
{
public:
 virtual void Main(int, char *[]);
};
class TheWindow : public WorkWindow
{
public:
 TheWindow(Window *parent, WinBits windowStyle) :
 WorkWindow(parent, windowStyle) {}
 virtual void Paint(const Rectangle &);
};
void TheWindow::Paint(const Rectangle&)
{
 DrawText(Point(100,100), String("Hello World!"));
}
void TheApplication::Main(int, char *[])
{
 TheWindow aWindow(NULL, WB_APP | WB_STDWORK);
 aWindow.Show();
 Execute();
}
TheApplication anApplication;
 Figure 2: Text screen from Sampler.


December, 1993
Wireless Data and Minimum Airtime Software


Cutting down on data-transfer errors




Darrell Diem


Darrell is software development manager for Motorola EMBARC advanced
messaging. Darrell can be contacted at 407-463-3151.


Wireless networks can be implemented as LANs, MANs (metropolitan-area
networks), or WANs (wide-area networks). By providing coverage for a building,
city, region, or even a nation, wireless-network technology makes it possible
for you to stay connected while still on the move. However, software
developers who wish to create wireless-enabled applications need to understand
some of the technical problems and user limitations of wireless networks.
The biggest limitation of wireless networks is the range and signal strength
of transmitters. The location of the receiver in relation to structural
surroundings can also affect reception quality. Some radio signals do a better
job of penetrating buildings and earth than others. Wireless networks that use
paging technology, for example, provide for signal overlap, and the receivers
are designed to catch even the faintest signals. But in many cases, an end
user can do little to improve the likelihood of error-free data
reception--other than change his or her location.
Even if the receiving unit is within the transmitter's range and the signal
comes through clearly, problems such as electrical interference (either
atmospheric or man-made) can corrupt a file during wireless transmission.
Often, the entire file needs to be transmitted again. This does more than just
frustrate the person waiting for information in the field--it becomes a costly
way to manage wireless data.
Obviously, the less time the data spends flying across the air, the better the
chance for error-free reception. All else being equal, the time it takes to
move data across a network is directly related to the amount of data being
moved. Of course, other variables apply to this equation, like data
compression and throughput speed. But in general, small files take less time
to move across networks, regardless of whether they're wired or wireless.
When coded into applications, intelligent data-management software techniques
give the user control over the amount of time data spends in the air. This not
only reduces the chance for errors, but significantly reduces a user's bill for
wireless-network airtime. These approaches incorporate "minimum airtime
software" (MATS) algorithms.
Airbase, the MATS implementation described in this article, was developed for
Motorola's EMBARC (Electronic Mail Broadcast to a Roaming Computer)
nationwide, wireless e-mail and data network. EMBARC, a one-way wireless
approach to roaming data, is based on paging broadcast technology. You can use
EMBARC to send e-mail, transfer files, and update databases simultaneously to
one or more users who may be widely dispersed--and it costs the same no matter
how many people you send the message to.
Mobile computer users need low-error wireless delivery of data that they can
control, and they need the ability to quickly recover corrupted or lost data.
These needs led to the development of an "intelligent packet" MATS solution
that addresses the problems associated with lengthy over-the-air transmission
time.


MATS: A Practical Database Example


Most applications carried by mobile PCs don't require the transfer of large
amounts of data--usually the bulk of the data is already on the PC and only
needs to be "freshened." Sending a full file over the air when only a small
part has changed results in higher costs and a greater chance for error.
MATS first assumes that common data is dispersed. It then updates that
database via packets containing new information, along with instructions on
how to process it into this roaming copy of the database.
Sending data in small packets minimizes exposure to interference-induced
errors. Since packets are also identified as a part of a larger "set," they
can be recombined into their original size and format on the receiving
platform (usually a PC-type device). Error checking and ID tracking on each
packet provides the user with information on a missed or corrupt packet. The
receiving person or system can request retransmission of a missing or corrupt
packet.
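The bookkeeping behind such a retransmission request is straightforward. In this hypothetical sketch (the packet layout is my own, not EMBARC's), each packet carries a set ID, its index within the set, and the set's total count, and the receiver scans for gaps.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Minimal packet header for set tracking; real packets would also carry
// error-checking data and the payload itself.
struct Packet {
    int setId; // identifies which larger "set" (file) this belongs to
    int index; // position within the set, 0-based
    int total; // how many packets make up the whole set
};

// Return the indexes of packets from the given set that never arrived
// (or arrived corrupt and were discarded); these are the ones to request
// for retransmission.
std::vector<int> MissingPackets(const std::vector<Packet> &received,
                                int setId, int total)
{
    std::set<int> seen;
    for (std::size_t i = 0; i < received.size(); ++i)
        if (received[i].setId == setId)
            seen.insert(received[i].index);
    std::vector<int> missing;
    for (int i = 0; i < total; ++i)
        if (seen.find(i) == seen.end())
            missing.push_back(i);
    return missing;
}
```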
However, multiple errors in a file can still occur if a large number of
packets are sent and rebuilt by the receiving application. An error in an
ASCII e-mail file is usually tolerable since the document is read by a human,
who can usually interpolate the correct answer from the context.
Binary file transfers are a different story. A 1-byte error can crash a
program or change a significant digit in a database. Since databases tend to
be large (frequently running into multi-megabytes), certain methods must be
used to minimize over-the-air time. In addition to reducing the chance for
random externally induced errors that can trash a file transfer, these methods
are also less expensive than retransmitting the complete file.
Figure 1 illustrates the complex interrelationship of a number of variables,
all contributing to message errors or corruption. Length of message, mobility
of receiver, and signal strength are some sources of externally induced
errors.
The MATS approach we implemented to solve these large data-transfer problems
involved the development of a DBF (xBase) file editing and packetization
software tool called "Airbase." The tool, which creates a standardized packet
with the changes made only to each DBF record in a file, provides a means of
launching a batch process upon receipt of the data, and identifying and
recovering any lost or corrupt packets. (Airbase is available at no charge
from the author.)


Creating Packets


In the editing, or update, process, the server-based user selects a DBF file
to update, and Airbase creates a "shadow" or companion database file. The
names of the databases are the same except that the extension on the shadow
copy is DB$. Figure 2 shows how the structure of the shadow file differs from
the original. Notice the two additional fields in the DB$ structure. All other
fields match the DBF structure. The ACTION field in SAMPLE.DB$ is a 1-byte
character field containing the intended edit activity: add, update, copy, or
delete.
The edit process copies all data from an edited record into the DB$ file.
The specific edit process of adding, updating, copying, and deleting causes
the appropriate character to be placed in the ACTION field of the DB$
database. The sort, or key-field, data of the original SAMPLE.DBF record is
also placed in the K_FIELD field of DB$. Figure 3 shows a sample record and
its DB$ shadow. This data represents an update to the LAST_NAME, ADDRESS, and
CITY/ST/ZIP fields. The sort field is LAST_NAME and the sort data is
"Smith." The record in the SAMPLE.DBF database isn't changed at this point.
Changes are accumulated in the shadow file until the session is complete or
until the Transmit option is selected from the edit menu. This means that all
databases are the same, including the master database, until Transmit is
invoked, at which time the following happens:
Each record in DB$ is compared to the original in the DBF file.
Each field with a change is extracted into a buffer.
A packet is created for each changed record.
The packet is submitted for transmission.
The packet is itself placed in a transmission packet and transmitted.
The file SAMPLE.DBF record is updated.
The SAMPLE.DB$ record is marked "sent" (deleted with *).
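The comparison step above, extracting only the changed fields from a shadow record, can be sketched as follows. The field layout is illustrative; Airbase works directly on DBF records.

```cpp
#include <cassert>
#include <string>
#include <vector>

// A record modeled as an ordered list of named fields, standing in for
// one DBF/DB$ record. Real DBF fields are fixed-width, but the idea is
// the same: compare field by field and keep only the differences.
struct Field {
    std::string name;
    std::string value;
};
typedef std::vector<Field> Record;

// Compare the original DBF record against its DB$ shadow and return only
// the fields that changed; these become the body of the update packet.
Record ChangedFields(const Record &original, const Record &shadow)
{
    Record changed;
    for (std::size_t i = 0; i < original.size() && i < shadow.size(); ++i)
        if (original[i].value != shadow[i].value)
            changed.push_back(shadow[i]); // new value, keyed by field name
    return changed;
}
```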
The wireless modems receive and either store or download the transmitted data
packets. Table 1 shows the basic structure of an Airbase packet.


Receiver Processing


The packet is received by a wireless receiver (the NewsStream receiver, in the
case of EMBARC) and stored in the unit's internal buffer. When connected to a
PC equipped with receiving software, the packets are downloaded to mailboxes
and stored on disk. The system-level transmission packet header contains
information including its own CRC error-checking data, source of message, time
sent, number of pieces to come, application tag, unique ID, and so forth.
The download software uses the application tag to determine if the packet is
an Airbase data packet. If the download software is installed with the
auto-process flag set to "on" and an Airbase packet is received, then upon
processing the last packet the software stuffs the keyboard buffer with the
name of a numerically named batch file. If no foreground process is running, this will
launch the batch file for postprocessing the packet according to the
batch-file contents. This could include a dial-back request for lost or
corrupt packets to be re-broadcast. Listing One (page 102) shows how
postprocessing is implemented. The first byte of the packet is the batch-file
number. For example, a "5" in the first byte will launch a batch file named
5.bat, and so on. If the first byte is a 0, no keyboard stuffing occurs.
Listing Two (page 103) provides details on download processing.



User Control


Recipients of the data can use Airbase to review the mailbox data and process
it into a target database, delete the data, launch the indicated batch file,
or do nothing.
The received data-packet header is used to find the database file using the
pathname in the packet. This lets the sender and user update or maintain
multiple copies on the receiving platform. The recipient controls access to
the data by installing the wireless-modem download software with the
auto-process flag set on or off. Updates can then be executed manually via
menus.
Airbase is a three-way application: It's an editor and wireless transmit
package, it provides manual review and update options for received data, and
it can be called with command-line processing strings in a batch-file mode for
auto-updating. See Listing One for details on how to process the packet.
(Airbase, a free library, and sample source for this process are available
from the author.)


Conclusion


The benefits of a small packet of "difference only" data such as that
provided by MATS are:
Reduced opportunity for random errors due to interference
Receiver "overview" of data before application of updates to a data file
Identification of lost or corrupt packets
Lower cost of transmission
The small size also reduces the high cost of retransmitting data if an error
occurs, since only the affected packets, and not the full file, are
rebroadcast. Recipient
control over auto-batch or manual processing provides error-management control
and callback capability for corrupt or missing packets.
Other approaches to reducing the airtime for over-the-air file updates are
required when the data structures are too complex to address with a
special-purpose application such as Airbase. Spreadsheets and word-processing
files are two such examples. To minimize file size and airtime exposure, the
files could be compressed, but in most cases they will still be large. Also
available are utilities that create a "difference" file by comparing an
original with a changed file. One such "differencing" package is .RTPatch by
Pocket Soft (Houston, Texas).
Coverage, error management, and the cost of wireless data distribution will
improve over time. The users of the technology need to know the strengths and
limits of any given implementation in making choices. Software selection and
data-management strategies are opportunities for users to improve delivery
quality and reduce costs. Management of data before, during, and after
wireless delivery is equally important. A little thought can provide a
higher-quality, cost-effective solution for maintaining "roaming data."
 Figure 1: Variables that can contribute to message errors or corruption.
 Figure 2: The structure of the original (SAMPLE.DBF) vs. a shadow file
(SAMPLE.DB$).
 Figure 3: A sample record (SAMPLE.DBF) and its DB$ shadow (SAMPLE.DB$).
Table 1: Airbase data-packet structure
NAME Bytes TYPE CONTENTS
Batch Tag 1 Binary Batch-file tag number; 0x00 = no tag used
Trans Type 1 ASCII Transaction Type:
 A ADD
 U Update
 C Copy
 D Delete
Time Stamp 17 ASCII Nonnull terminated time stamp. String format:
"MM/DD/YY<space>HH:MM:SS".
Num of Fields 1 Binary Number of fields in the packet body
BCC 1 Binary Block -check character. Calculated over entire packet except for
the BCC component.
Dest DB Name Length 1 Binary Length of destination DB, 64-character maximum.
Dest DB Name 0 to 64 ASCII Pathname of destination DB, nonnull term.
Field Name Length 1 Binary Length of field name.
Field Name 1 to 25 ASCII Field name of changed data, nonnull term.
Key Index 1 Binary Key indicator level of field:
 0x00 not a key field
 0x01 primary key field
 0x02 secondary key field
 0x03 tertiary key
 0xnn 0xnnth key
Data Length 2 Binary Length of data.
Data 0 to 64K ASCII/ Binary Actual data.
_WIRELESS DATA AND MINIMUM AIRTIME SOFTWARE_
by Darrell Diem

[LISTING ONE]

/*****************************************************************************
** FUNCTION: DBUpdate
** DESCRIPTION: Dummy master-database update routine invoked by the
** TSR record handler for live updates
** PARAMETERS:
** iApiIdTag - API identifier tag
** Flag_argc - command-line record-selection flag
** RETURN VALUES:
** ERRCTKNONE - Master D/B update performed successfully
** ERRxxx - Master D/B update not performed. The Message Handler
** will write the record to the Holding File
** ERROR HANDLING:
** NOTES:
** PSEUDOCODE:
** BEGIN DBUpdate
** END DBUpdate
******************************************************************************/
int DBUpdate(int iApiIdTag, int Flag_argc)
{
 int iRecNum,icounter;
 char cTransType;
 int iErCode = 0;
 int iRetn = 0;
 int iNumFields;
 MBOX *pstMbx; // pointer to mailbox structure created by API
 MBOX_MSG_HDR *pstHdr; // buffer for mbox hdr data
 static char szOldKeyVal[FLDVALSIZ + 1];
 static char szNewKeyVal[FLDVALSIZ + 1];
 static char szTimeStamp[CTKTIMESTAMPLEN];
 static char szDataFile[_MAX_PATH];
 static char szIndxFile[_MAX_PATH];

 static char szDrive[_MAX_DRIVE];
 static char szDir[_MAX_DIR];
 static char szFname[_MAX_FNAME];
 static char szExt[_MAX_EXT];
 PDBINDFILEX pstDbFileX;

 /* perform CtkBeginTrans to get the Transaction Type, the Time Stamp,
 ** the Number of Fields in the Client/Server Packet, and the Destination
 ** Database Name */
 CtkQueueSize(iApiIdTag,&iRecNum);

 icounter = iRecNum; //number of records in mbox
 if(Flag_argc == 0) //flag to do only newest data
iRecNum = 1; //newest 1st so updates reflect only latest data

 //command line overrides the num recs selection
if(Flag_argc > 0 && Flag_argc != 'X') {
iRecNum =Flag_argc;
Flag_argc = 0; //do only indicated rec Number
 //flag_argc to break while loop 1st pass
}
 /* loop until all processed or break encountered */
 while (iRecNum > 0 && iRecNum <= icounter ) {

 /* open mailbox to check app_flag processing status 3 == done */
if ((iErCode = API_open_mbox(iApiIdTag, MBX_RD_WRT, &pstMbx))
 != EPC_NO_ERR)
{
return iErCode;
}
 /* retrieve mailbox message header for this message and get message length */
if ((pstHdr = malloc((size_t) sizeof(MBOX_MSG_HDR))) == NULL)
{
API_close_mbox(&pstMbx);
return ERRCTKMEMALLOCHDR;
}
if ((iErCode = API_get_msg_hdr_from_mbox(pstMbx, iRecNum, pstHdr))
 != EPC_NO_ERR)

{
free(pstHdr);
API_close_mbox(&pstMbx);
return iErCode;
}
iRetn = pstHdr->app_flag; //if zero set app_flag to
API_set_app_flag(pstMbx,iRecNum,3); //num 3 for 'processed' data
free(pstHdr);
API_close_mbox(&pstMbx);
 //this process is for command-line updates
if(iRetn >= VIEWED && Flag_argc > 0) {
iRecNum--;
if(iRecNum <= 0) break;   //no more records to examine
continue;                 //skip this already-processed record
 }
 CtkReadQueue(iApiIdTag, iRecNum); //set up data packet buffer
 if ((iRetn = CtkBeginTrans(&cTransType, szTimeStamp, &iNumFields,
 szDataFile)) == ERRCTKNONE)
 {
/* construct index file name from the name of the data file by
** changing the extension from (nominally) ".DBF" to ".NDX" */
_splitpath(szDataFile, szDrive, szDir, szFname, szExt);
_makepath(szIndxFile, szDrive, szDir, szFname, ".NDX");

/* attempt to open the data and index files for writing */
if ((pstDbFileX = DbOpenX(szDataFile, szIndxFile, TRUE)) == NULL)
{
 iRetn = ERRCTKDBUPDATEFAILED;
}
else
{
 /* attempt to retrieve the record from the database */
 iRetn = LocateDbRec(pstDbFileX, iNumFields,
szOldKeyVal, szNewKeyVal);
 switch(iRetn)
 {
/* record was found in D/B */
case ERRDB3NONE:
 switch (cTransType)
 {
case CTKTRANSADD:
 iRetn = ERRCTKDUPLICATEREC;
 break;
case CTKTRANSUPDATE:
 iRetn = TransUpd(pstDbFileX, iNumFields,
 szOldKeyVal, szNewKeyVal);
 break;
case CTKTRANSDELETE:
 iRetn = TransDel(pstDbFileX);
 break;
default:
 iRetn = ERRCTKINTERROR;
 break;
 }
 break;
/* record was not found in D/B */
case ERRDB3READFAILED:
case ERRDB3SEARCHFAILED:
 switch (cTransType)

 {
case CTKTRANSADD:
 iRetn = TransAdd(pstDbFileX, iNumFields, szNewKeyVal);
 break;
case CTKTRANSUPDATE:
case CTKTRANSDELETE:
 iRetn = ERRCTKNOSUCHREC;
 break;
default:
 iRetn = ERRCTKINTERROR;
 break;
 }
default:
 break;
 }
 /* close the data and index files */
 DbCloseX(pstDbFileX);
}
/* end transaction, and if it failed then record error */
if ((iErCode = CtkEndTrans()) != ERRCTKNONE && iRetn == ERRCTKNONE)
{
 iRetn = ERRCTKDBUPDATEFAILED;
}
iRecNum--;
if(iRecNum <= 0 || Flag_argc == 0) break;
 } //end CtkBeginTrans if block
API_enable_core(API_NO_FUNCT);
 }
 fclose(pstTraceFile);
 return iRetn;
}


[LISTING TWO]

 // Batch file stuff goes here
 //
#ifdef BATCH_STUFF
 if(iBatch_Tag == 1) {
 if ( MbxHeader.api_tag >= 33 && MbxHeader.api_tag <= 39 )
 {
 if ( do_batch_thing != 0 )
 {
 char batch_key_stuffin[30];
 // the first byte of the message is called do_batch_thing
 _itoa(do_batch_thing, batch_key_stuffin, 10);
 strcat(batch_key_stuffin,".bat\r\n");
 kbflush();
 if(MbxHeader.k_block == MbxHeader.n_blocks) //use only last request in list
 kbstuff(1, batch_key_stuffin);
 }
 }
 }
#endif /* BATCH_STUFF */










December, 1993
PROGRAMMING PARADIGMS


Pittman's Progress




Michael Swaine


Tom Pittman is no stranger to these pages.
In the first year of this publication's existence, back when it was called Dr.
Dobb's Journal of Computer Calisthenics and Orthodontia and the personal
computer industry was a handful of crazy hackers trying to build computers in
their garages, the founders of this venerable journal published a rough spec
for a programming language that could be implemented on almost any machine,
requiring practically no memory. It was a clever design, and it required some
cleverness to see how to implement it. The language was Tiny Basic, and one of
the few people who took Dennis and Bob seriously enough to actually implement
Tiny Basic was Tom Pittman.
Tom has never lost his interest in programming challenges. This interest has
led him down strange paths. When Apple released HyperCard in 1987, Tom got
intrigued with this user-programming tool, and wound up writing CompileIt!, a
tool that compiles HyperTalk code into HyperCard external commands and
functions.


Generation X


Now, HyperCard externals are a nice idea. Apple built into HyperCard the means
of extending HyperTalk, and hence HyperCard itself. XCMD developers could replace
commands and functions of the language with more powerful or specialized
versions; access the ROM Toolbox to do things otherwise completely beyond
HyperCard's capabilities; and, with the release of a better XCMD interface in
HyperCard Version 2, control external windows that didn't have to look or act
like HyperCard stacks. The external interface, like the loophole in DOS that
allowed developers to create TSRs, was a hacker's dream.
But it wasn't a stack developer's dream, because stack developers couldn't,
generally speaking, write the things. These XCMDs formerly had to be written
in another language like C or Pascal, by someone who understood a lot more
about programming than the average HyperTalk coder. CompileIt! broke down this
barrier. With CompileIt!, anyone who could write HyperTalk code could write
XCMDs.
That was the goal, and for Tom it was more a programming challenge than a
product idea.


Generation XX


This brings us to this year, when Tom's efforts began to get more generally
interesting.
As I said, Tom approached CompileIt! initially as a programming challenge.
Only when it became successful did he begin to think of it as a real product,
which today it is. CompileIt! is a must-have for any serious HyperCard stack
developer.
The problem with CompileIt!, though, is that that's all it is. It's
interesting only to HyperCard stack developers or to those using Macromedia
Director or some other XCMD-supporting application/tool. All right, you can
create other kinds of code resources, other than XCMDs, using CompileIt!, but
why bother? There are existing tools, like ResEdit, specifically designed for
building resources, and there is no dearth of tools for writing and compiling
code. For an experienced Mac programmer, CompileIt! fills an unfelt need.
That wasn't good enough for Tom, and he set out to make it more generally
useful. One approach would have been to turn it into a general
code-development tool that could be used to develop full applications. But
again, why? Why go into competition with compiler vendors?
The approach Tom took instead was to create a separate, complementary product
that allows XCMDs to become stand-alone, double-clickable applications. This
was made possible by the improved XCMD interface that HyperCard Version 2
brought; in particular by the possibility of creating external windows, owned
and controlled by individual XCMDs.
The thing to do, Tom decided, was to create an application shell that the
XCMDs could live in, and that's what his new product, Double-XX (available
from Heizer Software, Pleasant Hill, California), is.
This alone is sort of technically interesting. It makes my rough analogy with
creating TSRs in DOS a little more germane: You can now use the capabilities
supplied by the operating system vendor to do things it never intended. Using
Double-XX to create little stand-alone apps feels like hacking the system. But
Double-XX has some features that make it more interesting and useful than
this.


Seeing Double


There are a lot of reasons to take a look at Double-XX. Some of these are
Tom's or Heizer's reasons, some mine, and some, perhaps, are yours. Since I
know mine best, I'll state them, clearly labeled as such.
1 Tom's latest project is always interesting: clever, deep, and well-executed.
And this is his latest.
2 I'm doing XCMD development, and I've been using CompileIt!, and now,
Double-XX.
3 It's useful for noodling around on the Mac. I find that I can test an idea
quickly with Double-XX. I'm not one of those who knows Inside Mac by heart,
but by using CompileIt! and Double-XX, I can write code that calls ROM ToolBox
routines.
4 The Apple and Claris programmers who created and enhanced HyperCard put a
lot into it. Third-party developers who wrote thousands of XCMDs over the
years have built a large library of useful tools. Most of what's in HyperCard
and all of the externals are usable in stand-alone applications created with
Double-XX. I am using Double-XX to build a utility app into which I plan to
drop new XCMDs as I discover a need for them.
5 I use it to explore AppleScript and AppleEvent programming and scripting on
the Mac. I can use it to quickly create little scriptable applications to test
the ability of other apps to send and receive AppleScript messages.
6 It's one of the quickest and most convenient ways I know of to create a
small, double-clickable app on the Mac.
If any of that appeals to you, maybe you should look at Double-XX. I'll tell
you a little more about it to help you resolve that maybe into a definitely or
a nah.
The bad news first: The interface to Double-XX is a HyperCard stack. I can't
pretend otherwise. But don't let that fool you; this is a powerful tool for
its intended purposes. Also, the apps you develop using Double-XX don't look
anything like HyperCard stacks, because they aren't. They are external windows
controlled by XCMDs which are in turn controlled by an event loop that
Double-XX supplies. The apps don't typically include any of the usual
HyperCard elements: cards, fields, and the like. (It is, however, possible to
put these HyperCard user-interface elements into your apps, and it can be done
easily. This can be very handy, as discussed below.)


The XX Encounter


When you first begin to use Double-XX, here's what you encounter:
Since Double-XX appears as a stack, you'll see a card. Buttons on this card
help you in assembling the pieces of your app: the XCMD, of course, but also
menus and other resources.
The Resources button brings up a list of resources associated with your
project (it's a project until you compile it into an application). The list
can include XCMDs, XFCNs, and resources for cursors, menus, icons, text, and so
forth. You can add resources from other stacks, including the resources
supplied with Double-XX, but for the most part you'll have to step outside
Double-XX to create new resources (using CompileIt! or some other tool).
Menus are handled by a separate button. When you click on it, you see a
scrolling list that contains all menus, menu items, and messages associated
with the menu items. A message can be a HyperTalk command, a series of
HyperTalk commands, or, more likely, an XCMD call.

Any Mac application that is to appear on the Finder desktop as an icon must
have ICN#, BNDL, and FREF resources for the Finder to use. It also needs a
unique creator type and a vers (version) resource containing version number,
copyright information, and country code. Buttons and fields on the card walk
you through creating these things, but you have to call Apple yourself to
register your creator code if you plan to distribute your app.
Two other elements on the card are of special interest if you want your app to
communicate with other apps. These are the AppleEvents check box and the aete
Resource button. If you check the former, the system will send your app
AppleEvents, and it had better be ready to handle them. Double-XX will
automatically include code to handle the oapp, quit, dosc, and eval messages,
but all others will fail unless you supply code to handle them.
The aete Resource list is the starting point for supplying this code. It lists
all (or any) AppleEvents your app handles. The syntax is complex, but
Double-XX gives some help in building the aete resource. Reading the
AppleScript manual and various Apple documents on AppleEvents and the Object
Model is also helpful.
So to build an application, you collect or write the XCMDs that do all the
things you want your app to do, then package them into an app using Double-XX.
Double-XX provides the parts that the Finder wants to find and plugs the XCMDs
into its event loop.


Let's Face It


What's missing here is the user interface. Since Double-XX itself doesn't give
you much help in building the user interface, you need to look to other
sources for that. WindowScript (also marketed by Heizer) is a good UI tool and
works well with Double-XX and CompileIt!. But if you're willing to use a
HyperCard-like interface, with HyperCard-like buttons, fields, and cards, you
can do the job just using the (nonsupported) stack compiler that comes with
Double-XX. You just create the HyperCard-like UI elements as though you were
authoring a HyperCard stack, and the stack compiler turns them into resources
that become part of your stand-alone application.
(The stack compiler is not complete enough for compiling existing stacks into
stand-alone apps. That would be nice, but it's not what it's for. Its purpose
is simply to give you a convenient way to slap a HyperCard-like UI onto your
stand-alone app. For some apps, this can radically reduce the time you'd spend
making the things usable.)
Double-XX is a nifty little tool, in my experience. It comes with decent
debugging aids. You can use it in combination with C or Pascal. And you can
replace elements of its interface and debugging support with tools you like
better.


When the User is You


What's nice about Double-XX, apart from the facts that it makes it easy to
develop stand-alone apps and that these apps are small, is the power you can
put into your apps. Since the target user for a small, rapidly written app is
likely to be the developer him or herself, that's particularly interesting.
Tom has built into Double-XX a HyperTalk interpreter (or actually a
HyperTalk-subset interpreter). It supports the bulk of HyperCard's language,
although some language elements not meaningful in Double-XX are not supported.
That's reasonable; Double-XX creates standalone applications, not HyperCard
stacks, so references to stack elements like fields and cards are not
relevant.
This interpreter can be made available to the user of your app. What this
means is that you can, with virtually no extra effort, make your app
scriptable. With slightly more effort, you can decide what elements of the
language you don't want to support and trap commands that involve those
elements.
There are several ways you can make this scripting capability usable. You can
include in your app the MessageBox XCMD, which is supplied with Double-XX. It
implements a message box similar to HyperCard's into which the user can type
and execute single-line commands. Or you can create a window in which the user
can type multiline blocks of code that can be executed using the Do command.
The interpreter also makes it possible to leave parts of the functionality of
your app uncompiled, either for the user or for you to tweak. For example, you
have a special operation, specialOp. Rather than supply a specialOp XCMD, you
put the HyperTalk code for the operation into a TEXT resource. Then in the
menu card you associate a menu selection with the command Do getTEXTresource
(nnnn). getTEXTresource is a supplied XCMD; nnnn would be the number of the
TEXT resource.
You could even, with a little more effort, track user actions and turn them
into code, saving the code as executable TEXT resources as above. In other
words, you can create a watch-me mode in your app.


Talking to Yourself


If you check the AppleEvents check box in the Double-XX stack, your app will
inform System 7 that it is AppleEvents-aware, and the system will send it
AppleEvents. As mentioned, Double-XX will handle four AppleEvents
messages--oapp, quit, dosc, and eval--and you're on your own for the rest. You
make your app handle other messages by supplying aete resource information and
XCMDs for them. But you can also override Double-XX's handling of the four
messages in the same way.
The interpreter is also able to catch AppleEvents and send them to other
AppleEvent-aware programs, but only if they are running on the same machine.
The next version may support cross-network AppleEvents.
You can code your app in such a way that it does everything via sending
AppleEvents to itself. This might seem like overkill for the small, ad hoc
applications likely to be generated with Double-XX, but there are some reasons
to consider doing this. An app that can be driven from the outside, say via
AppleScript, is more than an app. It's a packaged chunk of functionality, and
you or some user may find an entirely unexpected use for it.
It's also intriguing to reflect that any application that responds to
AppleEvents will automatically become voice controllable when speech input
really takes off on the Mac. I'm sure there are people who enjoy typing, but
as someone who has never really befriended the keyboard, I find this
possibility very appealing.
This has been a pretty enthusiastic report, and I feel compelled to find
something to complain about. I guess I'll pick on the name. The name Double-XX
is supposed to mean "Double-click your XCMDs." I dunno about that; I look at
it and immediately parse it into "XXXX." My guess is that it was named by
Prince's press agent.





























December, 1993
C PROGRAMMING


Of Monkeys and Tools




Al Stevens


In The Origin of Species, Darwin presents his theory of evolution, which
postulates that we and the other primates are derived classes with a common
base. Darwin acknowledges the problems with his theory and tries to deal with
each of them. Why, he asks, if we evolved from the lower species through a
gradual process of natural selection, are there not large numbers of
incremental species walking around the earth, each one representing one tiny
advance in the evolutionary chain? His guess is that each new superior species
eliminated its inferior ancestors by virtue of the survival of the fittest.
Why, then, have no paleontological records of those interim species been
found? His answer to that question is that the records themselves are
incomplete, having been mostly erased by natural disasters and phenomena and
that we are lucky to have uncovered the few traces that do remain.
The problem as Darwin saw it is not that there's a "missing link" but that
there is a missing chain, with each link being only slightly different from
and improved over the one before it.
We have tools today that Darwin could not have imagined. We have on our
desktops technology advanced enough to build a computer model of this missing
chain. The process called "morphing," which DDJ covered in the July, 1993
issue, is the key. If we start with pictures of an ape and a human and morph
from one to the other, we should have a sequence of image frames that would
constitute, by computer simulation, the missing evolutionary chain. In the
dead center of the sequence just might be that elusive missing link. A century
and a half of scientific mystery and speculation would be ended.
I put it to the test, and it worked. I morphed an image of an ape into an
image of a contemporary male human. Then, with hands trembling and to test my
theory, I retrieved the image that occurred midway in the morphing sequence.
The image that sprang to my screen, the missing link in the evolutionary
chain, the answer to a question that has stumped the best minds of science for
decades, is a dead ringer for The Kramer.
I'm working on my Nobel acceptance speech in my spare time.


D-Flat++: ToolBars and TED


I liked the way I implemented dialog boxes in D-Flat++. You derive a class
from the DialogBox class and embed control objects in it. That's all it takes.
The reason that Windows developers needed resource compilers was that the code
to implement a dialog box or a menu did not resemble the output, structurally
or any other way. It involved a bunch of function calls. The tabular format of
the resource-compiler languages was an improvement because it gave mnemonic
representation to the design. D-Flat++ substitutes a class design for a
resource language, and loses none of the notational convenience in the
process. It worked so well that I used the same approach for the ToolBar and
ToolButton classes. You derive your tool bar class from the base ToolBar class
and embed ToolButton objects in it. Then you embed an object of the derived
tool bar class in your derived application class. That worked so well that I
went back to the way that the menu bar is implemented and changed it to use
the same approach.
No matter how well something is working, we can always find a reason to fix
it.
This month I'll show you the new example application that demonstrates the
D-Flat++ class library. It's a simple text editor called TED. It has a menu
bar, a status bar, and a tool bar, and it lets you work on one text file at a
time. After discussing TED, I'll look at how the tool bar and buttons work.


TED Version 1


Listing One, page 134, is ted.h, the source file that defines the Ted
application class. Version 1 is a simple application. It has a menu bar and a
tool bar. The application class embeds an EditBox class control object to
handle the text editing. The class intercepts the Size message to change the
size of the edit box whenever the user changes the size of the application
window. There's a member function to build the application window's title to
reflect the name of the file that the user is editing. Another member function
prompts the user to save the file when changes have been made and the program
is going to exit or use the window for a different file. The application class
intercepts the CloseWindow message to make that test. There are a constructor
and a destructor, and there is a member function for each of the commands on
the menu.
Listing Two, page 134, is ted.cpp, which contains the member functions for the
Ted class. There's little in the way of a text editor here. The EditBox class
takes care of all that.
The menus for TED are built in the tedmenu.cpp source file, Listing Three,
page 134. First are declarations of all the menu commands, which are instances
of the MenuSelection class. I discussed that class and the others used in this
source file in my April 1993 column. I'm including it here to show you what
the TED menus look like. There are File, Edit, and Options menus. The File
menu has New, Open, Save, Save As, and Exit commands. The Edit menu has Cut,
Copy, and Paste. The Options menu has Insert and Word Wrap. Each of these
commands, except for Word Wrap, has an associated member function in the
derived Ted application class. Word Wrap is a toggle.
The TED toolbar is built in two files, tedtools.h, Listing Four, page 134, and
tedtools.cpp, Listing Five, page 135. tedtools.h defines the tool bar, which
is a class derived from the ToolBar class. It has three tool buttons to
correspond to the New, Open, and Save menu commands. CUA programs typically
use tool bars as mouse shortcuts to the commands available on menus. An
application defines a tool bar by deriving a class from the ToolBar base
class, embedding some ToolButton objects in the class and providing a
constructor to initialize the buttons with labels and functions to call when
they are pressed. Listing Five contains the constructor, which initializes the
three ToolButton objects and calls their SetButtonFunction method to assign
functions to them.


Designer Classes


The design of a complex class library from the ground up tends to topple
conventional wisdom. Just when you think that you have a mature, working
class, one whose details you can tuck away and forget about in true black-box
engineering style, it comes back to haunt you. Why? Because you try to derive
a new class from it, that's why. Nothing humbles a class designer more than
the realization that some old and trusted class can't live up to its promise
because it doesn't host a particular newly derived class with grace and
hospitality. The notion that C++ and object-oriented design promote reusable
software any more than traditional procedural design is a mistaken one. C++
encourages such design, to be sure, because the class structure holds the key
to encapsulation. But designs are made to be modified, and the notion that you
can do it once, top down, bottom up, upside down, or any other way, and chisel
the details in stone never to be budged again is a flawed one.
Case in point: I have this handy PushButton class in D-Flat++. It derives from
a base Button class that handles all the details common to all control
buttons--whether they're enabled, their current setting, when they get pushed, and so
on. The PushButton class implements the command button. When the user pushes
it--releases it, actually--the class calls a function in the program's
application class. The PushButton class has all the stuff that it needs to
watch the keyboard and the mouse while the user holds the key or button down
and moves it around--everything that a well-behaved push button has to do.
Buttons on a tool bar work just like push buttons except that they have a
different look and are grouped on the tool bar instead of coexisting with
other unrelated push buttons on a dialog box. Just perfect for inheritance.
Most of the details are the same, and only two areas need to be customized. So
I did it; I derived the ToolButton class from the PushButton class. And it
works now. The trouble was, however, that there were a lot of details in the
PushButton's member functions that should be inherited and some others that
should be customized. What's that, you say? Why is that a problem? Isn't that
exactly what polymorphism is all about? Yep. Except that those reasonable and
replaceable implementation details were usually tucked together in the same
PushButton member functions. You can't polymorphize part of a function and
inherit the rest. It's all one way or the other. The pure object-oriented
designer would have gritted the old choppers and overridden all those virtual
functions. Not me. I know all about those details in that base class. I
designed and built it. If they were coded differently--better--it would be
easier to inherit their good stuff and ignore the parts that I don't need in
the new class. So I just waded in and changed it.
That's a difficult problem to avoid because you can't always know in advance
which classes are going to become base classes and what parts of them will be
inherited and overridden. I haven't yet figured out a way to look at a class during
design and spot all the functionally independent implementation details. I
think that it takes a while to develop that eye, and I haven't seen any class
designs yet that evince that kind of designer foresight.


Implementing the Tool Bar


Off the soapbox. Listing Six, page 135, is toolbar.h, the source file that
defines the ToolBar and ToolButton classes. A tool bar is a simple thing. It
is no more than a blank spot on the screen that stretches the length of the
application window and that is the parent of the tool buttons.
When you are dealing with a 25x80 text-mode screen, there is not a lot of
space to waste. These tool buttons are three character positions high to allow
for frames and a label. The tool bar will be the same height. It is necessary
to provide visual separation between the menu bar above and the document
window below. There aren't enough character rows to let all these window
pieces have their own frames, so the implementation uses color to separate
them.
Listing Seven, page 135, is toolbar.cpp, the code that implements the ToolBar
class. All that it does is position itself in the right place on the
application window, and see that its size changes appropriately when the
application window size changes.


Ted's Tools


Listing Eight, page 135, is toolbutt.cpp. (Don't laugh, they only give you
eight characters for a filename.) The ToolButton class member functions are in
this file. To visually differentiate these buttons from other controls, I
implemented them as small windows with frames. I use the single-line frame
characters for the top and left sides of the frame and the double-line
characters for the right and bottom sides. That configuration gives the button
a 3-D look. When the user clicks on the button, the program reverses the
single and double lines, which makes it look like the button recesses when it
is pushed.
There are three color configurations for a tool button, a normal one, a color
set for while the button is pressed, and one for when the button is disabled.
The three Color objects at the top of toolbutt.cpp define these colors. The
two BoxLines structures define the frame characters for the two
configurations. There are two constructors, one to make the button
automatically position itself on the tool bar and one to assign a screen
location for the button. The class intercepts the Border method to draw one
frame or another, depending on whether the button is being pressed. The Paint
interception assigns the correct color configuration depending on the button's
current state. The SetFocus and ButtonCommand interceptions see to it that
whichever window had the focus before the button was pressed gets it back
after the button's command function returns. Other than that, the code lets
the Button and PushButton base classes do most of the work.



How to Get the Source Code


D-Flat++ is still preliminary but far closer to a working version than the
first two. I am writing this column in September and will soon upload version
3. The first version was incomplete, but there was enough of an implementation
to give you an idea of how DF++ works and how it differs from D-Flat. The
second version had enough functionality to build an application, and the third
one improves on that although there are more features to come. Later versions
will be released by the time you read this. The C D-Flat function library is
still available, too. You can download DF and DF++ from the CompuServe DDJ
Forum or from M&T Online. You can also get them by sending a stamped,
self-addressed diskette mailer and a formatted diskette to me at Dr. Dobb's
Journal, 411 Borel Avenue, San Mateo, CA 94402. The software is free, but if
you wish, include a dollar for my Careware charity, the Brevard County Food
Bank.


Hooked on Templates


In August, I attended Miller Freeman's East Coast edition of Software
Development '93 in Boston. The show grows every year. I conducted a workshop
on C++ object-database management based on the contents of my C++ Database
Development (MIS Press, 1993).
End of self-serving plug--it's the Pournellian imperative rearing its ugly mug
again. The point of all this is that as I was stepping through the overhead
slides and holding forth on the code in that book, I kept tripping over how
much easier everything I was describing would have been if the C++ compilers
had included templates a year ago when I wrote it. I would explain some
complex and contrived base class that lets programmers wedge their designs
into the parts and pieces of the database manager, and I could see as I spoke
that there was a far better way. Using inheritance to add management to objects
of a class is cumbersome, but it used to be the only way to do it. Inheritance
should be used to add and inherit behavior. Templates should be used to wrap a
management function around objects. I'm going to have to do a second edition.
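The distinction is easy to illustrate. Here is a minimal sketch of the template approach (the class names are mine, not from the book): rather than deriving every application class from a database base class, a template wraps a management function, change tracking in this example, around any type.

```cpp
#include <cassert>

// Managed<T> wraps a management function (here, change tracking)
// around an object without touching T's inheritance hierarchy.
template <class T>
class Managed {
    T object;
    bool dirty;
public:
    Managed(const T &obj) : object(obj), dirty(false) {}
    T &Get()              { dirty = true; return object; } // write access
    const T &Peek() const { return object; }               // read access
    bool Changed() const  { return dirty; }
};

// Any class works; no special base class is required.
struct Employee {
    int id;
    explicit Employee(int i) : id(i) {}
};
```

A Managed&lt;Employee&gt; knows whether it needs to be written back; Employee itself never hears about the database.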


Multimedia Prez


Another bonus of that seminar was an entertaining demonstration of Windows
multimedia by Charles Petzold, the Windows programming guru who wrote the
definitive book on the subject.
With a few simple demonstrations, Charles unlocked the MIDI puzzle that I
posed a few months back. I came home and downloaded some of Charles's C
programs from ZiffNet, and the shrouds of mystery fell away.
Charles demonstrated some impressive video and sound on a lunch box portable.
He showed how the "not" could be removed from Nixon's "I am not a crook" quote
by a simple swipe (pun intended) of the mouse. Then he ran an animation of
Bill Gates's head on Ronald Reagan's body dancing to Mozart. He didn't pick on
any Democrats. Maybe Mozart was a Democrat.
Fired up by Charles, I came home and got into MIDI again. Don Menza, a fine
bebop tenor saxophone player, had given me a transcription of Bill Clinton's
saxophone solo on "Your Mama Don't Dance and Your Daddy Don't Rock 'n' Roll."
Clinton played that at an inaugural party, sitting in with the band. I encoded
the transcribed solo into a MIDI file and added electric piano accompaniment
so a listener could find the context of the improvised solo. You can hear it
yourself if you have a sound board and MIDI sequencer software. It's in the
MIDI forum on CompuServe under the name CLNTON.ZIP.
After listening to the clip a few times, I decided to send it to Philippe
Kahn. There's an open chair in the Turbo Jazz sax section after a Symantec
defection, and I figure Bill might be looking for a gig in about three years.
[LISTING ONE]

// --------- ted.h
#ifndef TED_H
#define TED_H

#include "dflatpp.h"
#include "fileopen.h"
#include "tedtools.h"

#define Df void (DFWindow::*)()
#define Ap void (Application::*)()

extern MenuBarItem TedMenu[];
extern MenuSelection InsertCmd, WordWrapCmd;

// ------- Ted application definition
class Ted : public Application {
 MenuBar menubar;
 TedTools toolbar;
 String fname;
 EditBox editor;
protected:
 virtual void Size(int x, int y);
 void BuildTitle();
 void TestChanged();
public:
 Ted();
 virtual ~Ted() {}
 // ----- menu command functions
 void CmNew();
 void CmOpen();
 void CmSave();
 void CmSaveAs();
 void CmInsert();
 void CmCut() {}

 void CmCopy() {}
 void CmPaste() {}
 void CmExit() { CloseWindow(); }
 virtual void CloseWindow();
};
#endif


[LISTING TWO]

// ------------- ted.cpp
#include <fstream.h>
#include "ted.h"

static char untitled[] = "(untitled)";
main()
{
 Ted ma;
 ma.Execute();
 return 0;
}

// ---- construct application
Ted::Ted() : menubar(TedMenu, this),
 toolbar(this),
 editor(ClientLeft(),
 ClientTop(),
 ClientHeight(),
 ClientWidth(),
 this),
 fname(untitled)
{
 SetAttribute(SIZEABLE | MOVEABLE);
 SetClearChar(' ');
 BuildTitle();
 Show();
 editor.SetFocus();
}
// ---- builds the title with the current document name
void Ted::BuildTitle()
{
 SetTitle(String("TED: ") + fname);
 Title();
}
// ---- File/New Menu Command
void Ted::CmNew()
{
 TestChanged();
 editor.ClearText();
 editor.Paint();
}
// ---- File/Open Menu Command
void Ted::CmOpen()
{
 TestChanged();
 FileOpen fo;
 fo.Execute();
 if (fo.OKExit()) {
 editor.ClearText();

 fname = String(fo.FileName());
 BuildTitle();
 editor.ClearChanged();
 ifstream tfile(fname);
 String ip(200);
 while (!tfile.eof()) {
 tfile.getline((char *) ip, 200);
 editor.AddText(ip);
 }
 editor.Paint();
 }
}
// ---- File/Save Menu Command
void Ted::CmSave()
{
 if (fname == String(untitled))
 CmSaveAs();
 if (fname != String(untitled)) {
 editor.ClearChanged();
 ofstream tfile(fname);
 tfile.write((char *)*editor.Text(),editor.TextLength());
 }
}
// ---- File/Save As Menu Command
void Ted::CmSaveAs()
{
 SaveAs sa;
 sa.Execute();
 if (sa.OKExit()) {
 fname = String(sa.FileName());
 BuildTitle();
 CmSave();
 }
}
// ---- Options/Insert Menu Command
void Ted::CmInsert()
{
 if (InsertCmd.isToggled())
 desktop.keyboard().SetInsertMode();
 else
 desktop.keyboard().ClearInsertMode();
}
// ----- resize the editor when the application resizes
void Ted::Size(int x, int y)
{
 editor.Hide();
 editor.Size(editor.Right()+(x-Right()),
 editor.Bottom()+(y-Bottom()));
 Application::Size(x, y);
 editor.Show();
}
// ---- test for changes to the document before discarding
void Ted::TestChanged()
{
 if (editor.Changed()) {
 String msg(fname + " has changed. Save?");
 if (YesNo(msg))
 CmSave();
 }

}
// ---- test for changes before closing
void Ted::CloseWindow()
{
 TestChanged();
 Application::CloseWindow();
}



[LISTING THREE]

// ---------- tedmenu.cpp
#include "dflatpp.h"
#include "ted.h"

// --------- MenuSelection objects
MenuSelection
 NewCmd ("~New", (Ap) &Ted::CmNew ),
 OpenCmd ("~Open...", (Ap) &Ted::CmOpen ),
 SaveCmd ("~Save", (Ap) &Ted::CmSave ),
 SaveAsCmd ("Save ~As...", (Ap) &Ted::CmSaveAs ),
 ExitCmd ("E~xit Alt+F4", (Ap) &Ted::CmExit, ALT_F4 ),
 CutCmd ("~Cut Ctrl+X", (Ap) &Ted::CmCut, CTRL_X ),
 CopyCmd ("C~opy Ctrl+C", (Ap) &Ted::CmCopy, CTRL_C ),
 PasteCmd ("~Paste Ctrl+V", (Ap) &Ted::CmPaste, CTRL_V ),
 InsertCmd ("~Insert Ins", (Ap) &Ted::CmInsert, On, INS ),
 WordWrapCmd ("~Word wrap", On );
// --------- File menu definition
MenuSelection *File[] = {
 &NewCmd,
 &OpenCmd,
 &SaveCmd,
 &SaveAsCmd,
 &SelectionSeparator,
 &ExitCmd,
 0
};
MenuSelection *Edit[] = {
 &CutCmd,
 &CopyCmd,
 &PasteCmd,
 0
};
MenuSelection *Options[] = {
 &InsertCmd,
 &WordWrapCmd,
 0
};
// --------- menu bar definition
MenuBarItem TedMenu[] = {
 MenuBarItem( "~File", File ),
 MenuBarItem( "~Edit", Edit ),
 MenuBarItem( "~Options", Options ),
 MenuBarItem( 0 )
};


[LISTING FOUR]


// ---------- tedtools.h
#ifndef TEDTOOLS_H
#define TEDTOOLS_H

#include "toolbar.h"

// -------- the TED toolbar
class TedTools : public ToolBar {
 ToolButton newtool;
 ToolButton opentool;
 ToolButton savetool;
public:
 TedTools(DFWindow *par);
};
#endif



[LISTING FIVE]

// ---------- tedtools.cpp
#include "ted.h"

// -------- the TED toolbar
TedTools::TedTools(DFWindow *par) : ToolBar(par),
 newtool("New", (DFWindow *) this),
 opentool("Open", (DFWindow *) this),
 savetool("Save", (DFWindow *) this)
{
 newtool.SetButtonFunction(this->Parent(),
 (Df) (Ap) &Ted::CmNew);
 opentool.SetButtonFunction(this->Parent(),
 (Df) (Ap) &Ted::CmOpen);
 savetool.SetButtonFunction(this->Parent(),
 (Df) (Ap) &Ted::CmSave);
}




[LISTING SIX]

// -------- toolbar.h
#ifndef TOOLBAR_H
#define TOOLBAR_H

#include "pbutton.h"

const COLORS ToolBarBG = BLUE;

// ------ Toolbar class
class ToolBar : public DFWindow {
 void ParentSized(int xdif, int ydif);
public:
 ToolBar(DFWindow *par);
};
// ------- Toolbar button class
class ToolButton : public PushButton {

 void SetColors();
 void InitWindow(char *lbl);
 DFWindow *oldFocus;
 virtual void ButtonCommand();
 virtual Bool SetFocus();
public:
 ToolButton(char *lbl, int lf, int tp, DFWindow *par=0);
 ToolButton(char *lbl, DFWindow *par=0);
 virtual void Paint();
 virtual void Border();
};
#endif


[LISTING SEVEN]

// ------------ toolbar.cpp
#include "toolbar.h"

// ----- construct a Toolbar
ToolBar::ToolBar(DFWindow *par) : DFWindow(par)
{
 windowtype = ToolbarWindow;
 if (par != 0) {
 // --- put it into the application window
 Move(par->ClientLeft(), par->ClientTop());
 Size(par->ClientRight(), par->ClientTop()+2);
 par->SetAttribute(TOOLBAR);
 }
 colors.fg = colors.bg = ToolBarBG;
}
// ---- resize the menubar when the application window resizes
void ToolBar::ParentSized(int xdif, int)
{
 Size(Right()+xdif, Bottom());
}


[LISTING EIGHT]


// ---------- toolbutt.cpp
#include "toolbar.h"
#include "desktop.h"

// ------- various color patterns
static Color EnabledColor = {
 LIGHTGRAY, // fg
 ToolBarBG, // bg
 WHITE, // selected fg
 ToolBarBG, // selected bg
 CYAN, // frame fg
 ToolBarBG, // frame bg
 WHITE, // highlighted fg
 ToolBarBG // highlighted bg
};
static Color PressedColor = {
 BLACK, // fg
 ToolBarBG, // bg

 ToolBarBG, // selected fg
 ToolBarBG, // selected bg
 CYAN, // frame fg
 ToolBarBG, // frame bg
 ToolBarBG, // highlighted fg
 ToolBarBG // highlighted bg
};
static Color DisabledColor = {
 BLACK, // fg
 ToolBarBG, // bg
 BLACK, // selected fg
 ToolBarBG, // selected bg
 BLACK, // frame fg
 ToolBarBG, // frame bg
 BLACK, // highlighted fg
 ToolBarBG // highlighted bg
};
// ---- pressed and unpressed frame corner characters
const int PRESSED_NE = 184;
const int PRESSED_SW = 211;
const int UNPRESSED_NE = 183;
const int UNPRESSED_SW = 212;
// ----- button frame when pressed
static BoxLines PressedBorder = {
 FOCUS_NW,
 FOCUS_LINE,
 PRESSED_NE,
 SIDE,
 SE,
 LINE,
 PRESSED_SW,
 FOCUS_SIDE,
};
// ----- button frame when not pressed
static BoxLines UnPressedBorder = {
 NW,
 LINE,
 UNPRESSED_NE,
 FOCUS_SIDE,
 FOCUS_SE,
 FOCUS_LINE,
 UNPRESSED_SW,
 SIDE
};
// --- common constructor code
void ToolButton::InitWindow(char *lbl)
{
 oldFocus = 0;
 windowtype = ToolButtonWindow;
 Size(Left()+5, Top()+2);
 ClearAttribute(SHADOW);
 SetAttribute(BORDER);
 SetText(lbl);
 SetColors();
}
// --------- construct a Toolbar button specifying position
ToolButton::ToolButton(char *lbl, int lf, int tp, DFWindow *par)
 : PushButton(lbl, lf, tp, par)
{

 InitWindow(lbl);
}
// ------ construct a Toolbar button, self-positioning
ToolButton::ToolButton(char *lbl, DFWindow *par)
 : PushButton(lbl, 0, 0, par)
{
 InitWindow(lbl);
 if (par != 0 && par->WindowType() == ToolbarWindow) {
 int btcount = 0;
 DFWindow *Wnd = par->First();
 while (Wnd != 0 && Wnd != this) {
 if (Wnd->WindowType() == ToolButtonWindow)
 btcount++;
 Wnd = Wnd->Next();
 }
 Move(par->Left()+btcount*6, par->Top());
 }
}
// ----- draw the button's frame
void ToolButton::Border()
{
 if (visible) {
 if (pressed) {
 colors = PressedColor;
 DrawBorder(PressedBorder);
 }
 else {
 SetColors();
 DrawBorder(UnPressedBorder);
 }
 }
}
// -------- set the button's colors
void ToolButton::SetColors()
{
 if (isEnabled())
 colors = EnabledColor;
 else
 colors = DisabledColor;
}
// ------- paint the button
void ToolButton::Paint()
{
 SetColors();
 TextBox::Paint();
}
// -------- the button was pressed
void ToolButton::ButtonCommand()
{
 PushButton::ButtonCommand();
 if (oldFocus != 0)
 oldFocus->SetFocus();
 oldFocus = 0;
}
// ---- remember who had the focus before the button got it
Bool ToolButton::SetFocus()
{
 if (oldFocus == 0)
 oldFocus = desktop.InFocus();

 return PushButton::SetFocus();
}


December, 1993
ALGORITHM ALLEY


Heads I Win, Tails You Lose




Tom Swan


Among programmers, the search for a better mousetrap is matched only by the
hunt for the ultimate random-number generator. Or, perhaps I ought to call it
a random-sequence generator, because, as you probably know, computers can't
generate random numbers. They can produce only pseudo-random numbers--values
that, when juxtaposed, seem genuinely random.
Stock-market prices and lottery numbers are good examples of true random
numbers. Never mind whether those values satisfy someone's favorite numerical
test for randomness--if the Dow Jones industrial average didn't behave
randomly, everyone would be wealthy. In fact, the Daily Lottorama is probably
a better source of unpredictable numbers than the random-number generator
supplied with many compilers.
I'll probably hear from mathematicians about this, but when it comes to
unpredictability, only real-world future events can possibly be random.
Numerical sequences generated by a computer are repeatable and are therefore
predictable by definition. They are not random. They merely possess the
characteristics of randomness. A random sequence, for instance, may have an
even frequency distribution, an unpredictable "gap length," uniformly
distributed pairs of successive numbers, and other qualities devised by
researchers as tests of randomness. Some have even suggested that a definition
for a random sequence is impossible, but I'll stick my neck out and offer my
own recursive rule of thumb: A random sequence is any series of values that
can't be proved to be nonrandom.
It's a clear case of heads I win, tails you lose. You cannot prove that a
random-number generator is working; all you can say for sure is whether the
algorithm behind the function has failed.


In Rand We Trust


Why not trust the random-number generator that came with your compiler? There
are several reasons why that may not be wise:
Few random-number generators are supplied with test results. At the very
least, a random-number generator should be accompanied with a suitable test
suite, though this is rarely done. When was the last time you tested your
compiler's Rand function?
A random-number generator should include an option for producing non-random
sequences, useful for debugging. You might, for example, test your code with a
broken generator--one that consistently returns 0 or 1, or that produces an
obviously nonrandom sequence such as 1, 2, 3, . . ., N.
Compiler manufacturers have been known to alter a random function's algorithm
without notice, adversely affecting simulation, cryptography, game, and other
software that might be sensitive to a particular generator. It's especially
important to use your own random-number generator in programs that must
produce a repeatable sequence from a specific starting value, or seed.
Programs that depend heavily on random numbers might benefit from a generator
that has been optimized for the computer's word size, or one written in
assembly language.
Random-number generators can be combined to produce sequences more random than
those produced by individual methods. Because this technique tends to waste
memory, and the functions are less efficient than those that use simpler
algorithms, few compiler manufacturers use this method for stock generators.
If you need a combination generator, you probably have to write one yourself.
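One classic combination method, sketched here in C++, is L'Ecuyer's pairing of two linear congruential generators; the constants below come from his 1988 CACM paper, not from any particular compiler's library.

```cpp
#include <cstdint>
#include <cassert>

// Two LCGs combined by subtraction (L'Ecuyer, CACM 1988). Each LCG
// uses Schrage's method so the products never overflow 32 bits.
class CombinedLCG {
    int32_t s1, s2;
public:
    CombinedLCG(int32_t seed1 = 1, int32_t seed2 = 1)
        : s1(seed1), s2(seed2) {}
    int32_t Next()   // returns a value in 1 .. 2147483562
    {
        int32_t k = s1 / 53668;
        s1 = 40014 * (s1 - k * 53668) - k * 12211;   // modulus 2147483563
        if (s1 < 0) s1 += 2147483563;
        k = s2 / 52774;
        s2 = 40692 * (s2 - k * 52774) - k * 3791;    // modulus 2147483399
        if (s2 < 0) s2 += 2147483399;
        int32_t z = s1 - s2;
        if (z < 1) z += 2147483562;
        return z;
    }
};
```

The price is extra state and a few more operations per call, which is exactly the overhead the paragraph above describes.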


Testing for Nonrandomness


In his quintessential opus, The Art of Computer Programming, Vol. 2,
Seminumerical Algorithms, Donald Knuth proposed nine tests for determining
randomness. Perhaps, however, we should consider them to be tests of
nonrandomness, because the algorithms are particularly adept at weeding out
generators that are not producing usable random sequences. Before writing your
own random-number generator, it's important to know how to implement at least
one of these tests.
The best-known analysis examines a random sequence's frequency of
distribution. The test is easily performed, but the results are often
misinterpreted. Let's take a look at the method's inner workings, then apply
it to your compiler's random-number generator. Next month, I'll list
random-number algorithms that you can implement and test using the algorithm
explained here.
To measure a frequency distribution, we need to select a number of categories
(K) and a number of tests to perform (N). The tests must have independent
outcomes--that is, test N-1 must have no bearing on the results of any other
test. Too many categories are unwieldy, so K is usually set to a small value
such as 100. In other words, we will test numbers selected at random in 100
categories from 0 to 99. N should be relatively large, say 1000, 10,000, or
even higher. (Robert Sedgewick suggests performing at least K*10 tests.) The
algorithm uses an array of integers to record the frequency of numbers
generated at random within the defined range. After the test, the array's
counts are processed using a statistical method known as the chi-square
distribution; see Figure 1.
The chi-square distribution formula has been published in a variety of
derivations, but the one in Knuth's text is among the simplest to understand
and implement. If mathematical formulas make you glassy eyed, bear with me.
It's not as difficult to comprehend as it may appear, and besides, you don't
have to understand the formula to use it. (Knuth provides an excruciatingly
detailed description of the chi-square statistical method.) Briefly stated,
the formula sums the squares of the random-number counts (Y) divided by
probability (P). Because each number should be equally probable if the
generator is working, P should be set to 1/K. (Remember, K is the number of
test categories, 100 in this example.) The reason for using the chi-square
formula rather than directly comparing random-number occurrences and their
probabilities is that the statistical method mirrors true randomness as found
in the real world (wherever that is). We don't want each number to be
generated with a frequency of exactly 1/100. That would hardly be random!
Table 1 shows a portion of a chi-square distribution table that you can use to
analyze the formula's results. I extracted the table's data from a portion of
a Microsoft Excel spreadsheet included with this month's files. If you have
Excel, open file CHISQR.XLS and enter a starting value into cell F2 (not shown
here). For example, to produce the figures in Table 1, enter 95. The other
cells are protected, so you can't accidentally enter a value into the wrong
location. When creating new tables, be patient--a complete recalculation takes
a half minute or longer if your system doesn't have a math coprocessor.
To use Table 1, determine the number of degrees of freedom used for the test,
equal to one less than the number of categories. For example, given 100
categories, examine the expected values in row 99. According to the table, the
chi-square formula should yield 98.33414 about 50 percent of the time (the
middle column). The leftmost column tells you that, 99 percent of the time, V
should be greater than 69.23. Or, to put that another way, V should be less
than 69.23 no more than about 1 percent of the time. The right-most columns in
row 99 indicate that V should be greater than 117.41 no more than about 10
percent of the time. A simple way to use the table is to look at the 70-40
percent columns and ignore the others. If V is between 91.17 and 101.93 most
of the time, the random-number generator is probably working.
Example 1, Algorithm #14, lists the Pascal code for the frequency distribution
test using the chi-square formula. The algorithm assumes that the array Data
holds the counts of random numbers generated for a range 0. . .degree-1.
RANDTEST.PAS, Listing One (page 136) implements the formula and uses it to
test Borland Pascal's random-number generator. When I ran the program several
times, specifying 1000 iterations for each run, I received the chi-square
results 112.0, 100.8, 94.8, 99.6, 87.2, and 105.8. These values are clustered
around the 50 percent column for row 99, so it's reasonable to conclude that
Borland Pascal's random-number generator is working (or, at least, it isn't
broken according to the frequency-distribution test). Enable the test
program's DEBUG symbol by deleting the space before the dollar sign in
$DEFINE, and rerun RANDTEST to examine the results of a degenerate generator,
which I included in the program just for fun.
On one occasion, I received a chi-square result of 77.4 from RANDTEST. Does
this mean that Borland Pascal's random-number generator sometimes produces
nonrandom sequences? Not at all! From Table 1, we should expect to receive a
similarly low value about once in every ten trials. In fact, if the results
were right on the money in the 50 percent column every time, the generator
would be as suspect as it would be if it produced results that were out of
whack on every go-around. After all, it's a random-number generator, not a
Swiss watch.
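If Pascal isn't your language, the same test is only a few lines of C++. This is my own sketch, with the standard library's Mersenne Twister standing in as the generator under test; swap in whatever generator you want to examine.

```cpp
#include <cassert>
#include <random>
#include <vector>

// V = (1/N) * sum(Y^2 / P) - N, with P = 1/K  (Knuth's formula)
double ChiSquare(const std::vector<long> &counts, long n)
{
    double k = static_cast<double>(counts.size());
    double v = 0.0;
    for (long y : counts)
        v += static_cast<double>(y) * y * k;   // Y*Y / (1/K) == Y*Y * K
    return v / n - n;
}

// Tally n draws into k categories and return the chi-square statistic.
double FrequencyTest(unsigned seed, long n, int k)
{
    std::mt19937 gen(seed);
    std::uniform_int_distribution<int> dist(0, k - 1);
    std::vector<long> counts(k, 0);
    for (long i = 0; i < n; ++i)
        ++counts[dist(gen)];
    return ChiSquare(counts, n);
}
```

With 100 categories and 10,000 draws, the result should usually land near the 50 percent column of row 99 in Table 1.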


Your Turn


I almost forgot to mention some notable advice from Robert Sedgewick's
Algorithms in C++. Keep this one in mind the next time your
statistical-analysis application develops a bug. If the program fails, always
blame the random-number generator.
Next time: random number algorithms you can implement in most programming
languages.
 Figure 1: Chi-square distribution formula.
Example 1: Pascal code for Algorithm #14 (chi-square distribution).
const
 degree = 100; { Test range is 0 .. degree - 1 }
var
 Data: array[0 .. degree - 1] of LongInt;
function ChiSquare(N: LongInt; R: Word): Double;
var
 V, P: Double;
 I: Integer;
begin

 V := 0; { Initialize result }
 P := 1 / R; { Probability of Random(R) }
 for I := 0 to R - 1 do
 V := V + (Sqr(Data[I]) / P);
 ChiSquare := ((1.0 / N) * V) - N
end;
Table 1. Chi-square distribution probabilities
Degrees
of
freedom 99% 90% 70% 50% 40% 10% 1%
95 65.89826 77.81844 87.31749 94.33416 97.8549 113.0377 129.9725
96 66.73003 78.72541 88.27938 95.33417 98.8733 114.1307 131.1411
97 67.56234 79.63287 89.24148 96.33415 99.8916 115.2232 132.3089
98 68.39571 80.54082 90.20378 97.33415 100.9098 116.3153 133.4756
99 69.22986 81.44925 91.16627 98.33414 101.9279 117.4069 134.6415
100 70.065 82.35813 92.12895 99.33413 102.9459 118.498 135.8069
101 70.90072 83.26747 93.09182 100.3341 103.9639 119.5887 136.9711
102 71.7373 84.17727 94.05486 101.3341 104.9817 120.6789 138.1343
[LISTING ONE]

(* ----------------------------------------------------------- *(
** randtest.pas -- Random sequence test program **
** Copyright (c) 1993 by Tom Swan. All rights reserved. **
)* ----------------------------------------------------------- *)

{ $DEFINE DEBUG} { Delete space before $ for debugging }

{$N+,E+} { Required for 8087 mode }
{$R+} { Halt on range errors }

program RandTest;
uses Dos;
const
 degree = 100; { Test range is 0 .. degree - 1 }
var
 Data: array[0 .. degree - 1] of LongInt;

{$IFDEF DEBUG}
const
 a: LongInt = 0;

{ Degenerate generator }
function Rand: Word;
begin
 inc(a);
 Rand := a;
end;

{ Degenerate randomizer }
procedure Scramble;
begin
 a := -1
end;

{$ELSE}

{ Call Borland Pascal's Random function }
function Rand: Word;
begin

 Rand := Random(65535)
end;

{ Call Borland Pascal's Randomize function }
procedure Scramble;
begin
 Randomize
end;

{$ENDIF}

{ Return a word at random in the range 0 to M-1 }
function RandMod(M: Word): Word;
begin
 RandMod := Rand MOD M
end;

{ Perform Chi-Square analysis of counts in Data array }
function ChiSquare(N: LongInt; R: Word): Double;
var
 V, P: Double;
 I: Integer;
begin
 V := 0; { Initialize result }
 P := 1 / R; { Probability of RandMod(R) }
 for I := 0 to R - 1 do
 V := V + (Sqr(Data[I]) / P);
 ChiSquare := ((1.0 / N) * V) - N
end;

var
 N: LongInt; { Number of tests to perform }
 I: LongInt; { For-loop control variable }
 E: Integer; { Error result for val function }
begin
 if ParamCount = 0 then
 begin
 Writeln;
 Writeln('Random Sequence Test');
 Writeln('(C) 1993 by Tom Swan');
 Writeln;
 Writeln('Enter number of tests to perform.');
 Writeln;
 Writeln('ex. randtest 10000')
 end else
 begin
 Scramble;
 val(ParamStr(1), N, E);
 if E <> 0 then
 begin
 Writeln(ParamStr(1));
 Writeln('^':E, '--- Error!')
 end else
 begin
 if N < 10 * degree then
 begin
 Writeln;
 Writeln('WARNING: For accurate results, run program');

 Writeln(' with at least ', 10 * degree, ' tests.')
 end;
 Writeln;
 Writeln('Test range: 0 to ', degree - 1);
 Write('Performing ', N, ' tests...');
 for I := 0 to degree - 1 do
 Data[I] := 0;
 for I := 1 to N do
 Inc(Data[RandMod(degree)]);
 Writeln;
 Writeln('Chi-Square result = ',
 ChiSquare(N, degree):0:4)
 end
 end
end.


December, 1993
UNDOCUMENTED CORNER


Walking the VxD Chain in Windows (and Chicago)




Andrew Schulman


Many PC programmers have written, or at least seen, a program to list all the
DOS device drivers installed on a machine. Probably every PC system
administrator has, at one time or another, used such a program. MS-DOS keeps
device-driver headers in a linked list called the device "chain," and, while
Microsoft doesn't document how to find this chain, doing so is extremely
simple. More because of this combination of semi-illicitness and extreme
simplicity, and less because of any genuine utility, programs to walk the DOS
device chain are ubiquitous.
But this is the '90s! Whether you like it or not, real-mode DOS
memory-resident programs (TSRs) and DOS device drivers are becoming less
important. Three years ago, Art Rothstein wrote a program to walk the OS/2
device chain (see "Opening OS/2's Backdoor," DDJ, October 1990). What matters
more and more these days are Windows virtual device drivers (VxDs). Think of
VxDs as 32-bit protected-mode TSRs. Just as many programmers were able to use
TSRs to accomplish the otherwise impossible under MS-DOS, likewise VxDs will
be where programmers "push the envelope" of Windows. What TSRs were in the
mid-eighties, VxDs will be in the mid-nineties.
Since VxDs are 32-bit protected-mode code, Windows need not make an expensive
mode switch when a VxD rather than a real-mode TSR or driver provides some
service. For example, programs that hook DOS INT 21h should (if only for
performance reasons) be written as VxDs rather than as TSRs. This will be
particularly true in Microsoft's forthcoming "Chicago" operating system
(Windows 4/DOS 7), in which the presence of even one real-mode TSR or driver
may seriously impact the whole system. Furthermore, VxDs are allocated out of
extended memory, so they don't occupy any precious conventional memory below
one megabyte. Finally, Windows VxDs have available hundreds of services that
plain-vanilla DOS just doesn't provide.
In this month's "Undocumented Corner," I'll throw out the traditional program
to walk the DOS device chain, and write a new program, VXDLIST, that walks the
Windows Enhanced mode VxD chain and displays the names of all VxDs loaded on
the system. This program is similar to the VXD command in Nu-Mega's
Soft-ICE/Windows debugger. In particular, VXDLIST will be useful as a tool for
exploring Chicago, which is based on VxDs.
While Microsoft has failed to document the Linear Executable (LE) file format
used by VxDs, and the W3 file format used by the Enhanced mode WIN386.EXE file
(DOS386.EXE in Chicago), you can bypass this problem by examining VxDs in
memory. Microsoft's Windows Device Driver Kit (DDK) documents almost
everything needed to implement a VXDLIST program, and the remaining pieces are
readily inferred from the documentation.
While VXDLIST depends almost entirely on documented interfaces, it is useful
for uncovering undocumented interfaces. For example, while Microsoft clearly
documents how a VxD can provide a programming interface to applications
running under Windows, the documentation says next to nothing about which
built-in Windows VxDs actually do provide APIs. VXDLIST shows which ones do.


Chicago Looks Cool!


Figure 1 shows sample output from VXDLIST running under Windows 3.1 Enhanced
Mode. Figure 2 shows sample output when running under the August 1993
prerelease of Chicago.
What do we see here? Just a list of VxDs. In one sense, the VXDLIST output
looks like a slightly complicated version of the [386Enh] section of the
Windows SYSTEM.INI file, which uses device= statements to specify the VxDs to
be loaded. So, if you want to find out which VxDs are loaded on a system, why
not just print out the [386Enh] section of SYSTEM.INI? Because the user might
have modified the .INI file without restarting Windows. In addition, TSRs and
drivers running before Windows can bypass SYSTEM.INI by hooking an INT 2Fh
AX=1605h broadcast to tell Windows to load a VxD (which might be embedded
right inside the TSR or driver). For example, this is how MS-DOS 5 and 6
forced Windows 3.0 to load WINA20.386.
You can learn a lot by staring at the VXDLIST output. First, there's the sheer
number of VxDs that happened to be loaded in this standard configuration:
about 30 in Windows 3.1 and almost 50 in Chicago. Many of these (including
VMM, the Windows Virtual Machine Manager) are built into WIN386.EXE or
DOS386.EXE; others are supplied as *.386 files, or built into DOS programs
such as EMM386.EXE or INTERLNK.EXE.
In addition to VMM, there are VPICD (the Virtual Programmable Interrupt
Controller [PIC] Device), VTD (Virtual Timer Device), VDD (Virtual Display
Device), and so on. DOSMGR is the Enhanced Mode DOS extender and Windows'
(largely undocumented) interface to MS-DOS. PageFile and PageSwap manage the
virtual-memory swap file; the Reboot VxD virtualizes Ctrl-Alt-Del.
In Figure 2, you can see that Chicago relies more heavily on VxDs than Windows
3.1 does. Chicago includes many new VxDs, including VFAT (a 32-bit
protected mode version of the DOS FAT file system), IFSMGR (an installable
file-system manager), VXDLDR (a dynamic VxD loader), VSHARE (a 32-bit
protected mode version of SHARE), and so on. Microsoft is providing an early
look at many of these new Chicago VxDs in version 3.11 of Windows for
Workgroups (WfW).
The order in which VxDs appear in this list reflects the order in which the
VxDs initialized. For example, it is significant that SHELL appears last, as
this VxD is responsible for starting the Windows kernel (KRNL386.EXE), which
in turn boots the rest of what we normally think of as Windows (USER, GDI, and
so on). SHELL is an interface between the upper and lower parts of Windows
Enhanced mode.
The third column in Figures 1 and 2 shows VxD identification numbers. For
example, VMM is considered VxD #1, VPICD is #3, VTD is #5, and so on.
Microsoft issues these numbers to third-party VxD developers
(vxdid@microsoft.com). Only VxDs that provide APIs require an ID number.
One VxD calls functions in another VxD by issuing an INT 20h, specifying the
VxD ID and a function number. Windows and DOS programs can call down to a VxD
through a callback address; programs obtain this address by calling INT 2Fh
AX=1684h with BX set to the VxD ID.
The VxD can provide separate APIs for protected mode (PM) and Virtual-86 (V86)
mode. A real-mode DOS program running under Windows could use the V86 API; a
Windows application would use the PM API, of course. Figures 1 and 2 show which
VxDs provide APIs in each mode. An asterisk next to the address of the API
indicates that it has a corresponding callback address, which in turn
indicates that some program has probably used the API. To determine what the
API actually does, you would need to disassemble at the address displayed by
VXDLIST, or consult the documentation. For an examination of a sample VxD API,
see "Fast Interrupt Handling Without VxDs" by Karen Hazzah (Windows/DOS
Developer's Journal, June 1993).
The last VXDLIST column indicates the size of the VxD's "service table." This
is the table of function pointers used to service API calls from other VxDs.
For example, in Windows 3.1, VMM provides 242 services, VPICD provides 21, VTD
provides 8, and so on. For the most part, these are documented in the Windows
DDK Virtual Device Adaptation Guide.
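The service-table mechanism lends itself to a compact model: conceptually, each VxD exports an array of function pointers, and a service identifier such as 210001h selects VxD 21h (PageFile), function 1. The following portable C sketch shows the dispatch idea; the function names echo Figure 3, but the code itself is invented for illustration, not taken from VMM:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a VxD service table: an array of function
   pointers, indexed by the function number in the low word of a
   service identifier such as 210001h (VxD 21h, function 1). */
typedef int (*service_fn)(void);

static int pagefile_get_version(void) { return 0x0200; } /* version 2.00 */
static int pagefile_clean_up(void)    { return 0; }

static service_fn pagefile_services[] = {
    pagefile_get_version,   /* function 0 */
    pagefile_clean_up,      /* function 1 (simplified numbering) */
};

/* Dispatch a service call: the high word selects the VxD (only one
   in this toy model), the low word selects its table entry. */
int call_service(unsigned long service_id)
{
    size_t fn = (size_t)(service_id & 0xFFFF);
    size_t count = sizeof(pagefile_services) / sizeof(pagefile_services[0]);
    assert(fn < count);
    return pagefile_services[fn]();
}
```

In the real thing, the INT 20h "dynamic link" instruction gets patched into a direct call through exactly this kind of table.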
Notice in Figure 2 that Chicago provides many new functions: VMM in Chicago
provides 350 services, over 100 more than in Windows 3.1. These new VMM
services include calls for threads, mutexes, memory commit/uncommit, and all
the other low-level functionality you would expect VMM to provide, given that
Microsoft intends for Chicago to support all the same Win32 calls as Windows
NT, except for security and Unicode. And in four megabytes, instead of twelve,
or whatever ridiculous amount of memory NT requires.
Other important new areas of functionality in Chicago include IFSMGR, which
appears to provide 61 services. According to a developer at Microsoft, IFSMGR
will be a full-blown, documented manager for installable file systems--greatly
preferable to using the undocumented DOS network redirector.
As seen in the VXDLIST output, the SHELL VxD provides only six services in
Windows 3.1, but 25 in Chicago. The new SHELL seems to include services for
accessing the Windows API (PostMessage, WinExec, the clipboard, and DLL
calling) from a DOS box; this presumably underlies Chicago's ability to start
a Windows program from the C:\> prompt in a "unified" command shell.
VXDLIST has a -verbose option that outputs the number, address, and name
for each function in the service table; Figure 3 shows an example.
Unfortunately, the function names are not built into the VxDs themselves.
VXDLIST includes a list of function names that were culled from DDK header
files. Since I didn't have a Chicago DDK, VXDLIST doesn't know the names of
the new Chicago services.


A Slimy Hack


VXDLIST.C is unfortunately too long to reprint here; full source code is
available electronically (see "Availability," page 3). Because VxDs reside in
extended memory, walking the VxD list is best accomplished by a protected-mode
program. This can either be a DOS program that uses the DOS Protected Mode
Interface (DPMI) services provided by Windows, or it can be a Windows
application (which is really just a certain type of protected-mode DOS
program). VXDLIST is a DOS program that uses DPMI to switch itself into
protected mode. DPMISHEL.C, also available electronically, provides a C
interface to DPMI INT 2Fh and INT 31h services. Incidentally, in Windows
Enhanced Mode the DPMI server is located in VMM, and in many ways is just a
thin layer on top of VMM services.
The key portion of VXDLIST is the loop that runs over the VxD chain; this is
shown in Listing One (page 137) along with an extract from VXDLIST's display
function to printf selected fields from each VxD's header. The Windows DDK
documents the VxD header structure, not in the manual, but in the VMM.INC
file. Listing Two (page 137) shows a C version of this structure, called the
Device Descriptor Block (DDB).
As you can see in Listing One, VXDLIST locates the root of the VxD chain by
calling a function, Get_First_VxD. It moves from one VxD to the next by
calling Get_Next_VxD. Unfortunately, there are no VMM functions to walk the
VxD chain. Instead, Get_First_VxD and Get_Next_VxD are provided by another
module, VXDCHAIN.C, which is also unfortunately too long to reprint here.
However, Listing Three (page 137) shows the key Get_First_VxD and Get_Next_VxD
functions from VXDCHAIN.C.
Get_Next_VxD is simple. The Windows VMM keeps all VxD headers in a linked
list, in the order in which they initialized. The DDB has a "reserved" field,
DDB_Next. It is not explained in the documentation, but this field is
obviously a pointer to the next VxD in the chain.
So, all you need is the root of the VxD chain (Get_First_VxD). This will
always be the VMM itself, which initializes before all VxDs. While VMM really
isn't a VxD (among other responsibilities, it is the VxD manager and supplier
of most functions that VxDs call), it has a standard VxD DDB. Its DDB_Name
field will contain the string "VMM," followed by five spaces. Its VxD ID
number (see DDB_Req_Device_Number in Listing Two) will be 1.
Implementing Get_First_VxD is the only tricky or undocumented part of this
code. Frankly, the function uses brute force: it searches in memory for the
string "VMM", and sees if any such strings are indeed VMM's DDB_Name field.
Windows doesn't contain an undocumented Get_First_VxD function; instead, we
have to implement one, by any means necessary. Thus, this month's
"Undocumented Corner" has revealed, not so much an undocumented interface, as
a slimy hack.
Where does Get_First_VxD look for the VMM DDB? We know it's in extended
memory, but where exactly? At the VxD level, Windows deals in what are called
"linear" addresses: the machine is treated as a linear (that is, flat)
4-gigabyte array of bytes. This is a sparse array, in that, except on a dream
machine with four gigabytes of physical memory or swap-file disk space, most
of the addresses are empty. Windows leaves empty slots in the page directory
and page tables. As presently constructed, Windows Enhanced Mode reserves one
page table (four megabytes of linear address space) for VxDs, starting at
address 80000000h (two gigabytes).
To find the start of the VxD chain, Get_First_VxD must search in the range
80000000h-80400000h for the string "VMM". It can't start the search right at
80000000h, however, because Windows seems to use the first 4K as a "guard"
page to catch null pointers. Thus, it starts searching at 80001000h (see
Listing Three).
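The brute-force scan can be modeled portably. The sketch below runs the same logic over an ordinary byte buffer standing in for the 80001000h-80400000h region; the struct is a trimmed-down DDB, and find_vmm_ddb is a hypothetical name, not a function from VXDCHAIN.C:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified DDB carrying only the fields the scan checks. */
typedef struct {
    unsigned long  DDB_Next;
    unsigned short DDB_SDK_Version;
    unsigned short DDB_Req_Device_Number;  /* VMM is VxD #1 */
    unsigned char  DDB_Dev_Major_Version, DDB_Dev_Minor_Version;
    unsigned short DDB_Flags;
    char           DDB_Name[8];            /* padded with spaces */
} MiniDDB;

/* Scan buf for the space-padded name "VMM", back up by
   offsetof(MiniDDB, DDB_Name), and verify the candidate really looks
   like VMM's DDB. Returns the DDB's offset in buf, or -1. */
long find_vmm_ddb(const unsigned char *buf, size_t len)
{
    size_t i;
    for (i = 0; i + sizeof(MiniDDB) <= len; i++) {
        if (memcmp(buf + i, "VMM     ", 8) == 0 &&
            i >= offsetof(MiniDDB, DDB_Name)) {
            MiniDDB ddb;   /* copy out to avoid misaligned access */
            memcpy(&ddb, buf + i - offsetof(MiniDDB, DDB_Name), sizeof ddb);
            if (ddb.DDB_Req_Device_Number == 1)
                return (long)(i - offsetof(MiniDDB, DDB_Name));
        }
    }
    return -1;
}
```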
Linear addresses such as 80001000h are not immediately useful to a 16-bit
program such as VXDLIST. To turn this linear address into a useful
selector:offset pointer that it can dereference, VXDLIST must "map" the
address into its address space. This is done using Get_VxD, which calls the
map_linear function, which in turn uses DPMI equivalents of the AllocSelector,
SetSelectorBase, and SetSelectorLimit Windows API functions (see Listings
Three and Four, page 137). To turn a linear address into a selector:offset
pointer, map_linear allocates a selector/descriptor and sets its base address
to the desired linear address. This is explained in detail in the new
"Undocumented DOS Meets Windows" chapter of Undocumented DOS, second edition.


Where Now?


Besides helping reveal how Windows Enhanced Mode and Chicago work, does
walking the VxD chain have any practical use? Surprisingly, it does. I was
first alerted to the need for this utility by a system administrator who
needed to incorporate a program into a batch file to tell whether a named VxD
was running. Placing a VxD name or ID number on VXDLIST's command line makes
it search for the specified VxD. VXDLIST will return a DOS error code based on
whether the VxD is loaded. A batch file ERRORLEVEL statement can then test
this:
vxdlist %1 > nul

if errorlevel 2 goto not_found
if errorlevel 1 goto end
echo VxD %1 is loaded
The code to search for a specified VxD can be found in the Find_VxD and
Find_VxD_ID functions in VXDCHAIN.C, available electronically.
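Though Find_VxD itself isn't reprinted, its logic is a straightforward name match during the chain walk. Here's a minimal portable model, assuming a synthetic in-memory chain in place of the real DDB list (FakeDDB and find_vxd are invented names for this sketch):

```c
#include <assert.h>
#include <string.h>

/* Hypothetical in-memory model of the VxD chain: each node carries an
   8-character, space-padded name and a pointer to the next node,
   playing the role of the DDB_Next field. */
typedef struct FakeDDB {
    char name[9];          /* 8 chars plus NUL for convenience */
    struct FakeDDB *next;  /* NULL marks the end of the chain */
} FakeDDB;

/* Walk from the root (VMM) until a node's name matches; return the
   node, or NULL if the named VxD isn't loaded. */
FakeDDB *find_vxd(FakeDDB *root, const char *name)
{
    FakeDDB *p;
    for (p = root; p != NULL; p = p->next)
        if (strncmp(p->name, name, 8) == 0)
            return p;
    return NULL;
}
```

The real Find_VxD uses Get_First_VxD/Get_Next_VxD for the traversal and maps each DDB before comparing, but the control flow is the same.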
The VxD manipulation functions in VXDCHAIN.C might also be useful as the basis
for a dynamic VxD loader for Windows 3.1, similar to VXDLDR in Chicago. It's
also interesting to consider how a non-VxD program might call functions such
as those displayed in Figure 3. One possibility would be a 32-bit Ring 0 call
gate (see Matt Pietrek's "Run Privileged Code from Your Windows-based Program
Using Call Gates," Microsoft Systems Journal, May 1993).
That's it for this month's "Undocumented Corner." Future columns will take up
the undocumented LE and W3 file formats I mentioned earlier. I've already
received a useful write-up of LE from Doug McIntyre. This material will be
useful in Chicago as well as in Windows 3.1. I've also just received a
fascinating two-part article by Kelly Zytaruk on the Windows Virtual Machine
Control Block (VM CB) structure, most of whose fields are undocumented. It
looks, then, like we're in for several months of coverage for Windows Enhanced
Mode and Chicago. Please send your suggestions to me at 76320,302 on
CompuServe.
Figure 1: VXDLIST output under Windows 3.1 Enhanced Mode
Name Vers ID DDB Control V86 API PM API #Srvc
-------- ---- ----- -------- -------- -------- -------- -----
VMM 3.10 0001h 80011A74 8000ADFF 242
VPICD 3.10 0003h 80020EFC 8001FF10 80020624 80020624* 21
VTD 3.10 0005h 80022404 8002199E 80021933 80021933 8
PageFile 2.00 0021h 80036514 80035EB8 800363DF 7
PageSwap 2.10 0007h 8002D028 8002C6D0 7
PARITY 1.00 0008h 8003692C 80036884 0
Reboot 2.00 0009h 800230B8 800224FC 80022825* 0
VDD 2.00 000Ah 8001C580 80016060 80019EC7* 14
VSD 2.00 000Bh 80025078 80024F04 2
VCD 3.10 000Eh 80037960 80036E7F 80036EC7 4
VMD 3.00 000Ch 80016000 80015DBC 80015F49 80015F49* 3
VKD 2.00 000Dh 8001F44C 8001DB28 8001E05E* 15
BLOCKDEV 3.10 0010h 80035E70 80035BB8 7
INT13 3.10 0020h 800151EC 80014F3C 5
VFD 2.00 001Bh 80036818 800366C8 0
PharLap 1.00 80014550 80013E14 0
VMCPD 1.02 0011h 80037D20 80037B84 3
BIOSXLAT 1.00 0013h 80036D80 800369C0 0
DOSMGR 1.00 0015h 80030E6C 8002EB33 8002E977* 12
VMPOLL 3.10 0018h 80031680 80031433 3
Vpfd 1.04 1022h 80014BF4 800147D4 80014B2A 80014B2A 1
VXD 1.01 28C0h 80014ED8 80014C88 80014C8A 80014C8A 0
LANMAN10 3.00 800154FC 800154AC 0
VTDAPI 3.00 0442h 8001FB60 8001FB52 8001F728* 0
COMBUFF 1.00 800380CC 80037D70 0
TDDebug 1.00 001Dh 8001478C 8001466A 8001475A 0
VDMAD 2.00 0004h 80024B88 8002323C 24
V86MMGR 1.00 0006h 8002C67C 8002AD13 21
SHELL 3.00 0017h 80035988 8003429E 80032C88* 6
Figure 2: VXDLIST output under Chicago (August 1993 Prerelease)
Name Vers ID DDB Control V86 API PM API #Srvc
-------- ---- ----- -------- -------- -------- -------- -----
VMM 4.00 0001h 80014020 80004097 80004BE1 80004BE1* 350
VCACHE 3.01 048Bh 8003DEF4 8003DC0C 8003DEAF 8003DEAF 14
CONFIGMG 4.00 0033h 8001FAE0 8001BEBF 8001BE20 8001BE20 57
FAKEIDE 3.10 00FDh 8001B1CC 8001B1B8 0
VPICD 3.10 0003h 800466D4 80044F38 800459C0 800459C0 22
VTD 4.00 0005h 80048DD8 80048080 80230B5F 80230B5F 12
VXDLDR 3.00 0027h 80043CA0 800431C0 800432EB 800432EB 6
ISAPNP 4.00 0051h 80047D00 800475A4 0
IOS 3.10 0010h 80036080 8003495C 802244B8 13
PAGEFILE 4.00 0021h 800442E8 80044130 8004422C 7
PAGESWAP 2.10 0007h 80050EBC 80050DB0 10
PARITY 1.00 0008h 800527C4 80052704 0
REBOOT 2.00 0009h 800493C8 8004902C 802310C2* 0
VDD 2.00 000Ah 800599E8 80053140 80054735 80054735* 15
VSD 2.00 000Bh 8004B6C0 8004B4F0 4
VCD 3.10 000Eh 80027AD4 80026F4F 80026F97 9
VMD 4.00 000Ch 8005B104 8005AA38 802370F0 802370F0* 7
VKD 2.00 000Dh 8005CCA4 8005B380 8005B953* 20
VFBACKUP 4.00 0036h 80022788 80022212 80022405 80022406 5
INT13 3.10 0020h 80043094 80042A8E 5
VMCPD 1.02 0011h 80052F34 80052C34 7
BIOSXLAT 1.00 0013h 80052BC4 80052808 0
VNETBIOS 3.00 0014h 8005E89C 8005D9B8 4
DOSMGR 4.00 0015h 800511A4 80050F04 80232BA4 13
VSHARE 1.00 0483h 80022BC0 80022AA3 80022A78 80022A78 7
VMPOLL 3.10 0018h 800515B8 80051363 3
VXD 2.00 28C0h 8001B144 8001AE64 8001AE8E 8001AE8E 0
VWIN32 1.02 002Ah 80021E68 80020844 80020936* 7
VCOMM 1.00 002Bh 80024EBC 80024554 80223530 80223530* 29
SERIAL 1.00 80025F64 80024FAC 0
LPT 1.00 8002652C 800260AC 0
COMBUFF 1.00 80026C44 800268A0 0
VCOND 1.00 0038h 8003EDD8 8003DFA8 8003DFBB 8003E042* 2
VTDAPI 4.00 0442h 800530EC 800530D6 802363A4* 0
UNIMODEM 1.00 0460h 8005D1E4 8005CF74 8005D025 0
DiskTSD 3.10 81ED4720 81ED44A8 0
voltrack 3.10 0090h 81EDD3A4 81EDCDD4 0
RMM 3.10 81EDE3B8 81EDDD2C 0
NEC 3.10 81EEB95C 81EEABA0 0
ESDI_506 3.10 008Dh 81EECA50 81EEC0A8 0
VDMAD 2.00 0004h 8004B230 800495D8 26
V86MMGR 1.00 0006h 80050C38 8004F42B 23
VFAT 3.00 0486h 8003D688 8003A364 8003A398 8003A398 0
CDFS 3.00 800428D8 800423CB 0
VDEF 3.00 800440F4 80043FD8 0
IFSMgr 3.00 0484h 8002A630 80027BDC 8002A134 61
SHELL 4.00 0017h 8005251C 80051F18 80235F0F 80235F0F* 25
WSHELL 4.00 80044820 800443C8 8022F1E0* 0
Figure 3: The VXDLIST verbose option
C:\DDJ\VXDCHAIN>vxdlist -verbose PageFile
Name Vers ID DDB Control V86 API PM API #Srvc
-------- ---- ----- -------- -------- -------- -------- -----
PageFile 2.00 0021h 8007495C 80074300 80074827 7
Init order: 18000000
Demand Paging Swap Device Services
 210000h @ 80074318h PageFile_Get_Version
 210001h @ 80291002h PageFile_Init_File
 210002h @ 80074316h PageFile_Clean_Up
 210003h @ 8007435Ah PageFile_Grow_File
 210004h @ 80074496h PageFile_Read_Or_Write
 210005h @ 80074709h PageFile_Cancel
 210006h @ 80074337h PageFile_Test_IO_Valid
[LISTING ONE]

#include "vxdchain.h"

DWORD vxd_lin, next;
int err;
if ((err = Get_First_VxD(&vxd_lin)) != 0)
    fail("Can't locate VxD chain");
for (;;)
{
    display(vxd_lin);
    if ((err = Get_Next_VxD(vxd_lin, &next)) == 0)
        vxd_lin = next;
    else if (err == ERROR_END_OF_CHAIN)
        break;  // successfully got to end of VxD chain
    else
        fail("Can't walk VxD chain");
}

// ...

void display(DWORD vxd_lin)
{
    unsigned char name[9];
    DDB far *pddb;
    if (Get_VxD(vxd_lin, &pddb) != 0)
        fail("Can't get VxD");
    _fstrncpy(name, pddb->DDB_Name, 8);  /* printf %8.8Fs not reliable */
    name[8] = '\0';
    printf("%s ", name);
    printf("%u.%02u ",
        pddb->DDB_Dev_Major_Version, pddb->DDB_Dev_Minor_Version);
    if (pddb->DDB_Req_Device_Number)
        printf("%04Xh", pddb->DDB_Req_Device_Number);
    // ... etc.
}


[LISTING TWO]

#pragma pack(1)
typedef struct {
    DWORD DDB_Next;                  // addr of next VxD in chain, or 0
    WORD DDB_SDK_Version;
    WORD DDB_Req_Device_Number;      // the VxD ID number
    BYTE DDB_Dev_Major_Version;
    BYTE DDB_Dev_Minor_Version;
    WORD DDB_Flags;
    BYTE DDB_Name[8];                // padded with spaces
    DWORD DDB_Init_Order;            // also order within list (SHELL last)
    DWORD DDB_Control_Proc;
    DWORD DDB_V86_API_Proc;
    DWORD DDB_PM_API_Proc;
    void (far *DDB_V86_API_CSIP)();  // V86 mode seg:ofs callback addr
    void (far *DDB_PM_API_CSIP)();   // prot mode sel:ofs callback addr
    DWORD DDB_Reference_Data;
    DWORD DDB_Service_Table_Ptr;
    DWORD DDB_Service_Table_Size;
} DDB;                               // from ddk include/vmm.inc


[LISTING THREE]

#include "map_lin.h"

int Get_First_VxD(DWORD *proot)
{
    DWORD vmm_ddb_lin = 0;
    DWORD lin;
    DDB far *pddb;
    char far *vmm_str;
    unsigned char far *fp;

    // try to find string "VMM" in the 4-megabyte VxD region

    // can't start at 0x80000000L: page fault!
    for (lin = 0x80001000L; lin < 0x80400000L; lin += 0x10000L)
    {
        if (! (fp = (char far *) map_linear(lin, 0x10000L)))
            return ERROR_CANT_MAP_LINEAR;
        if ((vmm_str = fmemstr(fp, VMM_STR, 0xFFFF)) != 0) // search for string
        {
            // back up to get linear address of possible VMM_DDB
            vmm_ddb_lin = lin + (vmm_str - fp) - offsetof(DDB, DDB_Name);
            free_mapped_linear(fp);

            // map VMM_DDB (hopefully) into our address space
            if (! (pddb = (DDB far *) map_linear(vmm_ddb_lin, sizeof(DDB))))
                return ERROR_CANT_MAP_LINEAR;

            // make sure it really is VMM_DDB
            if ((_fstrncmp(pddb->DDB_Name, VMM_STR, 8) == 0) &&
                (pddb->DDB_Req_Device_Number == 1) &&
                (pddb->DDB_Init_Order == 0) &&         // VMM_Init_Order
                (pddb->DDB_Next > 0x80000000L) &&
                (pddb->DDB_Control_Proc > 0x80000000L))
            {
                Free_VxD(pddb);
                *proot = vmm_ddb_lin;
                return SUCCESS;
            }
            Free_VxD(pddb);  // false alarm: not VMM's DDB
            continue;        // fp already freed above: keep looking
        }
        // still here: keep looking
        free_mapped_linear(fp);
    }
    // still here: didn't find it
    return ERROR_CANT_FIND_VMM;
}

int Get_Next_VxD(DWORD vxd, DWORD *pnext)
{
    DDB far *pddb;
    DWORD next;
    int err;
    if ((err = Get_VxD(vxd, &pddb)) != 0)
        return err;
    next = pddb->DDB_Next;  // should verify with an Is_VxD() function?
    Free_VxD(pddb);
    *pnext = next;
    return next ? SUCCESS : ERROR_END_OF_CHAIN;
}

int Get_VxD(DWORD vxd, DDB far* *ppddb)
{
    if ((*ppddb = (DDB far *) map_linear(vxd, sizeof(DDB))) != 0)
        return SUCCESS;
    else
        return ERROR_CANT_MAP_LINEAR;
}

int Free_VxD(DDB far *pddb)
{
    free_mapped_linear(pddb);
    return SUCCESS;
}



[LISTING FOUR]: MAP_LIN.C excerpts

#ifdef DPMI_APP
// DPMISHEL.H includes DPMI equivalents for AllocSelector, etc.
#include "dpmishel.h"
#else
#include "windows.h"
#endif

void far *map_linear(DWORD lin_addr, DWORD num_bytes)
{
    WORD sel;
    // allocate a data selector similar to our current DS
    _asm mov sel, ds
    if ((sel = AllocSelector(sel)) == 0)   // INT 31h AX=0 CX=1
        fail("Cannot allocate a selector!");

    // set the base and limit of the new selector
    SetSelectorBase(sel, lin_addr);        // INT 31h AX=7 BX=sel CX:DX=lin_addr
    SetSelectorLimit(sel, num_bytes - 1);  // INT 31h AX=8 BX=sel CX:DX=limit

    return MK_FP(sel, 0);  // turn sel into a far pointer
}



































December, 1993
PROGRAMMER'S BOOKSHELF


Robots Make Their Move




Jonathan Erickson


Mobile Robots: Inspiration to Implementation
Joseph L. Jones and Anita M. Flynn
A K Peters, Ltd., 1993, 349 pp., $39.95
ISBN 1-56881-011-3
Mobile robots, long the high-tech toy of researchers, tinkerers, science
fiction buffs, and like dreamers, may finally be coming into their own. Of
course, stationary robots proved themselves years ago, at least in automated factories
where they play a key role in manufacturing everything from cars to computers.
(Interestingly, Steve Jobs' robot-centric manufacturing facility recently went
on the auction block, a victim of Next's refocus on software.) But except for
critters like R2D2 in Star Wars, mobile robots haven't moved forward the way
their stationary cousins have.
The paucity of real-world mobile-robot applications can be traced in part to
the difficulties of integrating multiple complex technologies: computational
hardware, sensors, machine vision, mechanics, real-time control, motors,
power, programming, recognition, multitasking, learning, and navigation, to
mention a few. Just getting two or three of these working together
harmoniously is difficult; uniting all of them in an affordable, efficient,
and reliable package can be daunting indeed.
But recent reports like the following out of Maryland suggest mobile robots
are moving from the theoretical to the practical. In this case, the police
sent a three-foot-tall, 480-pound remote-controlled mobile robot into an
apartment to disarm and capture a murder suspect. Upon opening a bedroom
closet door and finding a pile of clothes, the robot began plucking at the
pile until the hidden suspect was uncovered. After a brief tussle, the robot
used a high-pressure water gun to knock a shotgun out of the suspect's hands,
enabling police officers to burst in and arrest the suspect. Practical? You
bet. By avoiding a bloody confrontation, someone's life was surely spared. (Of
course, not every mobile robot performs so heroically. When a mobile robot was
recently sent in to disarm a bomb in San Francisco, it moved forward a few
feet, then began spinning in circles instead of grabbing the bomb. Then again,
maybe the robot knew what it was doing all along. That's California for you.)
Over the past few years, much of the work at the MIT Artificial Intelligence
Lab's Mobile Robot Group has focused on how to integrate disparate hardware
and software more smoothly to better cope with
"computational bottlenecks, noisy sensors, and the complexity of reality." In
particular, the Lab has been investigating "new models of intelligence that
would be robust and work in real time." Mobile Robots: Inspiration to
Implementation by Joseph L. Jones and Anita M. Flynn is an outgrowth of that
research. Flynn (who's associated with the Lab) and Jones (of IS Robotics,
makers of research robots and sensor systems) have written a book that's a
trove of information--even if you're not particularly interested in robots. On
one level, Mobile Robots provides you with virtually all the information you
need to build your own mobile robot, including everything from parts lists and
suppliers to control software and schematics. But Mobile Robots is more than a
project cookbook. On another level, it's the application of a new approach to
artificial intelligence referred to as "nouvelle AI." And on yet another
level, the book is a comprehensive treatise on embedded-systems design.
At the heart of the book is a robot called "Rug Warrior" that's designed
around the Motorola 68HC11 microcontroller. (Flynn and Jones also briefly
present TuteBot, a simple non-microprocessor-based mobile robot built from
switches, relays, motors, and discrete electronic components.) Rug Warrior is
significant in that its design is based on a subsumption architecture, a
concept proposed by Rodney Brooks (also of MIT's Mobile Robot Lab) which
organizes intelligence systems by layering arbitration mechanisms (that is,
the priority process) between task-achieving behaviors. In other words, "in a
subsumption architecture, the designer of the intelligence system lays out the
behaviors in such a way that higher-level behaviors subsume lower-level
behaviors when the higher-level behaviors are triggered."
For example, Rug Warrior can exhibit a "follow-light" behavior that would have
higher priority than a "random-wandering" behavior. When Rug Warrior detects a
high-intensity light source, it moves towards it. If the light source were
turned off, the follow-light behavior would deactivate, cease subsuming the
wandering behavior, and random wandering would resume. (Light sensors are one
of several sensors the Rug Warrior can have; others include sensors for tilt,
sound, force, motion, and the like. The authors provide drivers for most of
these.)
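The arbitration scheme Flynn and Jones describe can be sketched in a few lines of C. This is a toy model, not Rug Warrior's actual code: each behavior reports whether its trigger condition holds, and the arbiter lets the highest-priority triggered behavior drive the motors:

```c
#include <assert.h>

/* Toy sketch of subsumption-style arbitration (names invented for
   illustration): a triggered higher-priority behavior subsumes every
   behavior below it. */
enum { CMD_WANDER = 1, CMD_SEEK_LIGHT = 2 };

typedef struct {
    int triggered;  /* is this behavior's sensor condition active? */
    int command;    /* motor command the behavior requests */
} Behavior;

/* behaviors[0] has the highest priority. */
int arbitrate(const Behavior *behaviors, int n)
{
    int i;
    for (i = 0; i < n; i++)
        if (behaviors[i].triggered)
            return behaviors[i].command;
    return 0;       /* no behavior active: motors idle */
}
```

With follow-light at index 0 and wandering at index 1, turning the light off simply deactivates the first entry and the arbiter falls through to wandering, just as described above.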
The beauty of subsumption, say Flynn and Jones, is that it lets you tie
together in a coherent and efficient whole all elements of robot control via
behavior fusion--and do so using modest computational resources.
By contrast, the traditional approach to programming robots--the
modeling/planning paradigm--employs sensor fusion, which is much more
computationally intensive because it uses a sequence of steps to
transform sensory data into a series of actions. Certainly, the modest
processing power of most low-cost microcontrollers lends itself to
subsumption, rather than modeling/planning. Figure 1 illustrates an example of
one way subsumption might be implemented for Rug Warrior. Subsumption has its
roots in the nouvelle AI movement which investigates distributed approaches to
organizing intelligence systems. (For more information on nouvelle AI, Jones
and Flynn point to Designing Autonomous Agents: Theory and Practice from
Biology to Engineering and Back by Pattie Maes, MIT Press, 1991.)
The authors implement subsumption in both pseudocode and a version of C called
"IC" (short for "Interactive C"). In doing so, they introduce and implement
behaviors as finite-state machines. IC, written by Randy Sargent and Fred
Martin of the MIT Media Lab, is an implementation of C for the 68HC11A0
available free-of-charge via anonymous FTP on the Internet
(cherupakha.media.mit.edu, or 18.85.0.47). IC, which runs on the PC, Macintosh,
and UNIX, lets you initiate and terminate processes and execute C statements
immediately. The source code presented in Mobile Robots is written in C and
68HC11 assembler. (Jones and Flynn note that most of their actual lab work is
done in Lisp; however, they switched to C for the book because of its
familiarity among programmers.) As you can see in Figure 2, IC syntax is C-like. Note
that the code snippet in Figure 2 implements, in part, the subsumption network
in Figure 1.
When it comes down to it, Rug Warrior is really nothing more than an embedded
system that can move about and, from this perspective, Mobile Robots is one of
the most complete presentations of embedded-system development you'll come
across. From microcontroller innards to logic components and hardware/software
interfaces, the book provides a design approach that's practical,
comprehensive, and, because of the subject matter, entertaining. If you're new
to embedded-systems development, or a veteran wanting hands-on information
about designing MC68HC11-based systems, Mobile Robots provides one-stop
shopping.
The only downside to the book is the $39.95 price tag, which may keep Mobile
Robots out of the hands of many potential readers. Instead of expensive color
photographs and glossy hardcover, the publisher might have been better advised
to produce a more affordable book. After all, it's the content that makes
Mobile Robots unique and valuable, not the presentation.
It's been over ten years since I last put together a robot, a Rube
Goldberg-like stationary arm controlled by a now-extinct Radio Shack TRS-80
Model I. After reading Mobile Robots, I've decided it's about time to get
moving on another robotics project.
 Figure 1: Implementing a behavior model using a subsumption architecture.
 Figure 2: IC source code that implements a behavior model called cruise.
































December, 1993
Of Interest
COS, an object-oriented extension to C, has been released by Algorithms
Corporation. COS, short for "C-language Object System," does not have a
preprocessor and does not make changes to the C syntax. It supports multiple
inheritance, true dynamic binding, a full metaclass system, enforced
encapsulation, automatic garbage collection, multiple threads, named
pipes, and the like. COS provides a class library that represents
all standard C types as objects; container classes for linked lists,
sets, and dictionaries; a date class; and strong numeric-formatting
capabilities.
The system comes in two flavors for MS-DOS development: Developer and Source
editions. The Developer edition includes documentation; ready-to-run libraries
that are compatible with Microsoft C/C++ 8.0, Watcom C/C++32 9.5 (Visual
C/C++), and Borland C++ 3.1; example programs; and C source for most of the
class libraries. Additionally, the Source edition includes entire C source to
the COS system (required to compile under UNIX) and the COS kernel. Also included
are an entire class library and garbage collector, as well as Pipe, Thread,
and Semaphor classes.
The Developer edition sells for $99.00, and Source edition for $499.00. Reader
service no. 20.
Algorithms Corporation
3020 Liberty Hills Dr.
Franklin, TN 37064
615-791-1636
In support of "enterprise-wide instrumentation" for embedded-systems
developers, Tektronix has introduced an X Windows-compatible (X11/R4)
implementation of its digital-analysis system called "Enterprise DAS." With
Enterprise DAS, software and hardware engineers can work simultaneously at
their respective workstations and, for instance, run a high-level debugger in
one window and see the same code in a real-time trace in another.
Designed for heterogeneous environments where multiple workstations need to
access a single digital-analysis system, Enterprise DAS supports real-time
trace, state and timing analysis, complex triggering, and channel counts.
The Enterprise DAS works with OpenLook or Motif window managers, requires no X
Windows toolkit, and uses X resources to select a variety of options,
including color, window size, and server. Reader service no. 21.
Tektronix Inc.
Test and Measurement Group
P.O. Box 1520
Pittsfield, MA 01202
800-426-2200
LogiCraft's CyberNet Object Database DLL is a database-management system that
implements an object-oriented architecture to minimize data duplication by
automatically indexing objects for fast retrieval, compressing/decompressing
data on-the-fly, and making network access transparent to the user. Object
records can be of variable size and complexity, while the number and size of
each field can change.
The system can be used from any Microsoft Windows application that supports
the registration of DLL functions or from any Windows-compatible programming
language. The DLL can be shared between several applications running
simultaneously on the same machine, and database files can be accessed by
multiple users when needed.
LogiCraft sells the CyberNet Object Database DLL for $250.00. Reader service
no. 22.
LogiCraft Corp.
3303 116 Street
Edmonton, AB
Canada T6J 31
403-435-4049
DBS GmbH has introduced Ad Oculos, a software development kit for digital
image processing under Microsoft Windows. The software gives you access to
image analysis with minimal hardware requirements. Typical tasks include robot
vision, quality control, and analysis of satellite photographs. You can also
use Ad Oculos to process photos, TIFF files, and gray-level images using local
and global operators, region and contour segmentation, Fourier and Hough
transforms, morphing, pattern recognition, and image-sequence analysis.
Ad Oculos includes frame grabbers, more than 50 basic algorithms with C source
code, and documentation.
User-defined algorithms can also be integrated into the program's frame
structure. Algorithms are delivered as DLLs and can be used independently of
Ad Oculos from languages such as C/C++, Visual Basic, or Turbo Pascal.
The price for Ad Oculos is $470.00. Reader service no. 23.
DBS GmbH
Fahrenheitstr. 1
28359 Bremen
Germany
+49-421-2208 161
CompuServe: 100013,115
Data Structures 1.0, released by Natural Systems, is a Turbo Pascal
container-object hierarchy which lets you use optimized and verified data
structures in much the same way you'd declare a variable. Data Structures, which
is divided into three major groups (deques, lists, and trees), is an object
hierarchy of the data structures commonly used for applications such as
spreadsheets and word processors. Also included are structures for sparse
matrices and Huffman encoding. A C++ version is forthcoming.
Data Structures sells for $49.00. Reader service no. 24.
Natural Systems
P.O. Box 968
Brookline, MA 02146
617-232-6951
The Association of Shareware Professionals has published The Shareware
Compendium, a "try before you buy" guide of over 700 cross-referenced and
indexed pages covering about 1100 programs marketed through the shareware concept. Each
listing in this title has descriptions, hardware and software requirements,
registration fees, and benefits of registration. Separate appendices list the
programs by author or company and how to contact them, how to order, and how
to get support. Also listed are BBSs and disk vendors from whom evaluation
copies can be obtained. All evaluation copies are either free or quite
inexpensive. Edited by Rob Rosenberger, The Shareware Compendium is available
in bookstores or directly from the Association for $24.95; ISBN 1-55623-914-9.
Reader service no. 25.
Association of Shareware Professionals
545 Grover Road
Muskegon, MI 49442-9427
616-788-5131
Cygnus Support and Advanced Micro Devices are providing evaluation/development
kits that include Am29205-based hardware and GNU software. The kit includes a
29K board and GNU C/C++ compilers, debugger, assembler, linker, binary
utilities, and documentation. DOS-based tools are available on 3.5-inch disks;
Sun SPARC tools are on CD-ROM.
The evaluation kit is available from AMD for $595.00. Reader service no. 26.
Cygnus Support
1937 Landings Drive
Mountain View, CA 94043
415-903-1400
IntegrAda for Windows from Aetech offers a complete Ada development system for
creating Windows applications. This integrated Ada environment encapsulates a
validated Ada compiler with a Microsoft C-compatible interface, Windows help
and header files, Windows resource and help compilers, linker, and a full set
of Ada windows libraries.
IntegrAda for Microsoft Windows is available for $495.00. Reader service no.
27.
Aetech
5841 Edison Place, Suite 110
Carlsbad, CA 92008
619-431-7714

TCP/IP Illustrated, Volume 1: The Protocols by W. Richard Stevens has been
released by Addison-Wesley. The book provides an inside look at TCP/IP
protocols and explains how the protocols work under a variety of
implementations--SunOS 4.1.3, Solaris 2.2, System V Release 4, BSD/386, AIX
3.2.2, and 4.4BSD--and relates these implementations to the RFC standards.
Stevens also explains the newest features of TCP/IP. Future volumes are
planned to cover other facets of TCP/IP.
The hardcover copy is available for $47.50; ISBN 0-201-63346-9. Reader service
no. 28.
Addison-Wesley
One Jacob Way
Reading, MA 01867
800-238-9682
The Intelligent Tools Library (ITL), a C library for device-control software,
has been released by Intelligent Tools. ITL functions provide high-level
control over hardware-interrupt management, control of DMA hardware, access to
DPMI services, and access to VDS. The DMA routines, for instance, simplify the
implementation of interrupt and/or DMA-driven operations by consolidating
common sequences of operations into a single function call. All ITL functions
are written in assembler; an alternate function-call interface is provided for
applications written in assembly.
The Intelligent Tools Library sells for $195.00. Reader service no. 29.
Intelligent Tools
P.O. Box 6334
Abilene, TX 79608
817-725-7455
21 CenNet has announced a software system called "MobileWare" that lets mobile
users connect with nonmobile systems over wireless and landline telephone
networks. E-mail, faxes, files, and printed documents can be sent and received
in compressed and encrypted form between portable computing devices and
company LANs.
Documents are created using Windows applications and sent via the carrier of
choice to a MobileWare LAN server. Dropped transmissions reconnect
automatically from the point of interruption without loss of information.
Costs are kept down by sending and receiving messages simultaneously and by
using data compression to reduce file size. Messages can also be prioritized,
letting low-priority traffic wait for low night rates. Incorporation of the
Data Encryption Standard secures passwords and information.
The MobileWare software system starts at $500.00. Reader service no. 30.
21 CenNet Inc.
2425 N. Central Expressway, Suite 1001
Richardson, TX 75080-2746
214-690-6181
With Postman's Sort, a general-purpose file-sorting utility developed by
Robert Ramey Software, you can sort a 100,000-record file (consisting of
alphabetical records 15 bytes long) in 25 seconds on a 386/33. Similarly, a
file of 10,000 records takes about 3 seconds.
The program runs standalone or can be called from another program, and there's
no limit on file size. Field types accommodated include alphabetic,
signed/unsigned binary, packed decimal, ASCII numeric, and IEEE floating
point; fields may be fixed-length, variable-length, or delimited. Windows and
DOS protected-mode versions exploit all extended memory to minimize sorting
time. The kit is available for DOS (with standard versions for the 8086, a
16-bit protected-mode version for the 80286, and a 32-bit protected-mode
version for the 80386), for OS/2 1.x and 2.x, and as a stand-alone Windows
version.
The SDK is available for $149.00. Reader service no. 31.
Robert Ramey Software Development
3949 1/2 Foothill Road
Santa Barbara, CA 93110
805-569-3793
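The product's name suggests the way a mail sorter distributes letters into
bins: a most-significant-digit radix (distribution) sort. Ramey's internals
aren't published here, so the following is only a generic C++ sketch of the
technique, with illustrative names:

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <vector>

// One MSD radix pass: distribute strings into 256 bins on the character
// at position `pos`, then recurse on each bin with the next character.
// Strings exhausted at `pos` go into bin 0 and therefore sort first.
void postman_sort(std::vector<std::string>& v, std::size_t pos = 0)
{
    if (v.size() < 2)
        return;
    std::array<std::vector<std::string>, 257> bins;
    for (auto& s : v) {
        std::size_t b = (pos < s.size())
            ? static_cast<unsigned char>(s[pos]) + 1 : 0;
        bins[b].push_back(s);
    }
    v.clear();
    for (std::size_t b = 0; b < bins.size(); ++b) {
        if (b > 0)
            postman_sort(bins[b], pos + 1);  // refine on the next character
        v.insert(v.end(), bins[b].begin(), bins[b].end());
    }
}
```

Because each record is examined one character position at a time rather than
compared against other records, the running time grows linearly with the data
rather than with n log n comparisons, which is consistent with the timings
claimed above.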
WinLite, a compression utility for Windows executable programs, has been
released by Rosenthal Engineering. WinLite compresses all Windows files
(resources, bitmaps, and the like) and enables automatic extraction that's
transparent to users. No source-code modification or additional linking is
required for the compressed, self-loading programs. For example, the program
shrinks the 180K Windows Solitaire game to less than 77K.
WinLite is available for $149.00. Reader service no. 32.
Rosenthal Engineering
P.O. Box 1650
San Luis Obispo, CA 93401
805-541-0910
RGB Spectrum's SynchroMaster 300 is designed for real-time image fusion
applications. In simulation, for instance, SynchroMaster 300 allows the
production of images of greater complexity than a single computer or scene
generator can produce in real time; in medical imaging, you can use it for
comparative analysis of real and synthetic images. One image is used as the
foreground and the other as the background. The background signal is digitized
and written to a 1280x1024 frame buffer, synchronized to and combined with the
foreground signal.
Images are combined by using a chroma key, a luminance key, or a weighted sum
algorithm. The mixer incorporates a frame-store synchronizer to allow the
mixing of asynchronous images from disparate sources. Reader service no. 33.
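Of the three combination methods named, the weighted sum is ordinary per-pixel
arithmetic: out = w*foreground + (1-w)*background. The SynchroMaster performs
this in hardware; the following C++ fragment only illustrates the arithmetic
for one 8-bit channel:

```cpp
#include <cstdint>

// Weighted-sum fusion of one 8-bit channel. The weight is expressed as
// an integer in [0, 256]: 256 yields pure foreground, 0 pure background.
inline std::uint8_t mix(std::uint8_t fg, std::uint8_t bg, unsigned w256)
{
    // Integer blend: (fg*w + bg*(256-w)) / 256, done with a shift.
    return static_cast<std::uint8_t>((fg * w256 + bg * (256 - w256)) >> 8);
}
```

Chroma and luminance keying differ only in how the weight is chosen: per pixel
from the foreground's color or brightness rather than from a fixed value.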
RGB Spectrum
950 Marina Village Parkway
Alameda, CA 94501
510-814-7000
Articulate Systems Inc. and Voice Processing Corp. announced a strategic
alliance to develop advanced desktop communications applications for Apple's
new DSP-based Centris 660AV and Quadra 840AV Macs. The two companies said one
of their first products will be an over-the-telephone speech-recognition
system for any Macintosh. Among the target applications will be a personal
messaging and telephony station, in which voice mail, e-mail, facsimile, and
and telephony station, in which voice mail, e-mail, facsimile, and
speech-recognition will be integrated to offer users automated functionality.
Shipment of the first products, under the brand name "PowerSecretary," is
planned soon. Reader service no. 34.
Voice Processing Corp.
One Main Street
Cambridge, MA 02142
617-494-0100

















December, 1993
SWAINE'S FLAMES


May It Please the Court


Dobbs v. United States, 520 US 126 (1995).
Counsel for petitioner: Lawrence Tribe, Boston, Massachusetts.
Counsel for respondent: Hamilton Burger, Washington, D.C.
Narrator: It's November 9, 1995. Chief Justice William Rehnquist has called for
argument a case that could only have arisen in the digital age. In response to
the Eurogate scandal in early 1995 and the attendant hue and cry for better
security on the Internet, Congress passed the Security Hosting Act, requiring
that all electronic services with access to the Internet be equipped with a
Guardian. Guardians are, in layman's terms, antivirus viruses, capable of
seeking out and destroying invading computer viruses. Howard Curtin Dobbs,
operator of a small computer bulletin board system in Taos, New Mexico, which
he maintained for himself and a few friends, refused to allow a Guardian to be
installed on his BBS, and was arrested and convicted of violating the Security
Hosting Act. He appealed to the Supreme Court, which agreed to hear his case
in an unusually short time. Mr. Tribe appears for the petitioner, and Mr.
Burger for the government.
Tribe: Mr. Chief Justice, and may it please the Court. The issue before the
Court today is whether--
Scalia: A moment, Mr. Tribe. We have your brief, but what isn't clear to me
is, on just what Constitutional point does your argument rest?
Thomas: Yes, that's unclear.
Tribe: It is our contention that enforcement of the Security Hosting Act
violates the Third Amendment.
Scalia: The Third Amendment? Are you serious?
Thomas: Yes, are you serious?
Ginsburg: Goodness, you can't mean the Third Amendment. That has to do with
the quartering of soldiers in private houses.
Tribe: Yes--so a literal reading would suggest.
Scalia: Well, we would hardly expect so pedestrian a reading from you, Mr.
Tribe. I can hardly wait to hear what creative interpretation you have in
mind.
Tribe: It is our contention that the Court must concern itself with mapping
the text and structure of the Constitution onto the texture and topology of
cyberspace.
Scalia: That's lovely. Let me write it down.
Ginsburg: I want to be sure I understand. Mr. Dobbs's bulletin board is his
house, and these computer programs are soldiers?
Tribe: I couldn't have put it better myself.
Ginsburg: You're stretching the language awfully far, it seems to me.
Tribe: The precedent for such an extension is clear. In another Third
Amendment case, Engblom v. Carey, both the terms "soldier" and "house" were
given broad interpretations.
Ginsburg: Oh, but not that broad. "House" still meant a structure and
"soldier" still meant a human being, for goodness sake.
Tribe: But in Katz v. United States the definition of "house" in the Fourth
Amendment was extended beyond the physical. The decision stated that the
Constitution protects people, not places, an inspired phrase, in my opinion.
Furthermore, "soldier" clearly designates a job, not a person, and as we all
know, jobs once held by people are sometimes now performed by hardware or
software.
Burger: May it please the Court! Mr. Tribe is making a mockery of this Court
with his grandstanding tricks! Everything he has said is irrelevant,
incompetent, and immaterial!
Rehnquist: Thank you, Mr. Burger. As it is now nearly lunchtime here in
Barbados, we shall adjourn until 11 A.M., Greenwich Mean Time.
Narrator: One by one, the justices log off the net and the Court goes into
recess.
The use of the names of real justices and a real lawyer in this column is for
satiric purposes only. This column is entirely fictional. Somebody else wrote
it and put my name on it.
Michael Swaine
editor-at-large



























Special Issue, 1993
EDITORIAL


Rumors of War




Michael Floyd


Whenever you see Microsoft and Borland on the same playing field, the first
thought that comes to mind is that "it's war." This time, however, the
so-called "battle" over whose API will bridge the gap between PC databases and
the rest of the relational world may be nothing more than a short-lived
skirmish. At the front lines of this alleged battle are Microsoft's Open
Database Connectivity (ODBC) API and Borland's Integrated Database Application
Programming Interface (IDAPI). Both claim to be extensions of the Call Level
Interface (CLI), a standard proposed by a group of some 40 vendors calling
themselves the SQL Access Group. Both companies are members of this consortium
and both are proposing extensions to CLI.
For the record, the mission of the SQL Access Group is to allow developers
(and users) to get at data stored in different formats residing on
disparate platforms. CLI, which uses SQL to access various database
environments, is that common denominator. Microsoft's ODBC extends CLI's 22
core functions with an additional 29 functions to support things like large
data objects, asynchronous requests, and scrollable cursors. By writing a
single driver that supports ODBC, the database vendor can effectively create a
software IC that plugs into an ODBC socket. According to Microsoft, the ODBC
specification and feature set are documented and openly available to
developers and DBMS providers. Worthy of note is that the specification, which
comes in the form of a Windows SDK, is available free of charge from
Microsoft. (For details, call 206-936-2655.)
Borland counters that although the ODBC specification is publicly available,
its current implementation is in Windows. Therefore, it's up to
operating-system suppliers to support ODBC on their specific platform. IDAPI,
on the other hand, will support a heterogeneous environment and rely on a
common API to communicate with various servers, database engines, and drivers.
Borland claims that this will result in less code redesign for database
implementors going from one platform to another. IDAPI also features a
request/responder mechanism to cooperate in a (Novell) network environment,
and it will integrate ODBC, which will allow developers to automatically hook
into ODBC drivers.
When it's available, IDAPI will initially support OS/2 2.x, Windows, DOS, and
NetWare. But unlike ODBC, an IDAPI SDK has yet to hit the streets. Borland
plans to deliver an early SDK to partners in the second quarter of '93, with
final shipment expected in July.


It's Who You Know


When Microsoft speaks out, it's worth listening. Ultimately, though, what
counts is who falls into line--and on that score, Microsoft is doing quite
well. For example, Digital Equipment Corporation announced that it is jointly
developing an ODBC client driver for DEC's Rdb/VMS database. This follows an
earlier agreement between the two companies to align DEC's Network
Application Support (NAS) with Microsoft's Windows Open Services Architecture
(WOSA). NAS includes APIs, toolkits, and the like so developers can port their
software to other platforms including UNIX, Sun, OpenVMS, and OS/2.
In addition to DEC, Apple announced in the second half of '92 that it would
provide ODBC services within their Data Access Manager (DAM). Apple also plans
to build a Macintosh ODBC client that translates ODBC calls into its SQL-based
language called Data Access Language (DAL). DAL, part of Apple's VITAL
integration framework, currently accesses 12 different relational databases.
And as if that weren't enough, Apple will additionally develop an ODBC/DAL
client for Windows that Microsoft will distribute.
In Borland's corner are some well-known partners such as IBM (which hopes to
promote its OS/2 product as a viable database platform) and Novell (which is
throwing NetWare and Btrieve into the fray). WordPerfect has also announced
support of IDAPI, but then, both WordPerfect and Novell have said they'll
support ODBC, too. In answering the question of why, however, it's unlikely
that these two are just covering their bases. That's because Pioneer Software,
another company that has announced support of both CLI extensions, will be
providing the glue between these two interfaces. Pioneer, which already
provides connectivity products that support nearly 20 database formats, has
developed its Idapter technology that will permit ODBC database drivers to be
used with IDAPI-enabled applications.


New World Order


Interoperability is now the name of the game. It's the thing that deals are
made of, and it's the stuff that forges alliances between the likes of Apple
and Microsoft. And no matter how you see it, we stand to benefit anytime
Microsoft and Borland match up. In this case, developers will be able to
support more platforms, operating systems, and networks with fewer changes to
their source code, while users get what they want: heterogeneous data access
and interoperability amongst database platforms. But is it war? Only as long
as competing vendors think opposing each other is more important than helping
users. Are ODBC and IDAPI mutually exclusive? Not as long as CLI continues to
gain momentum. Will you miss out because you've invested too much in the wrong
technology? Not necessarily. This looks to be one instance where the stakes
are so high that rumors of war remain just that, and competing technologies
and vendors agree to a peaceful middle ground.






























Special Issue, 1993
 OBJECT-ORIENTED DATABASE MANAGEMENT SYSTEMS


Examining a trio of OODBMS tools




Al Stevens


Al is a contributing editor for DDJ and the author of C++ Database Development
(Henry Holt, 1993). He can be contacted through the DDJ offices at 411 Borel
Ave., San Mateo, CA 94402.


Object-oriented database management systems (OODBMS) are coming of age. The
trouble is, no one knows exactly what that means. More precisely, many people
claim to know, but few agree. There is no standard definition because the
technology is too new. If you want to call a DBMS "relational," there are
rules against which you can measure it: Codd's 12 rules. A CODASYL database
manager--network by definition--has a published standard to shoot for,
although not many DBMSs still try to comply with CODASYL. But the
practitioners of object-oriented technology have not yet hammered out a
standard definition of just what constitutes an object-oriented database. Add
to that the total absence of support for persistent objects in the C++
language, and the overwhelming acceptance of C++ as the object-oriented
language of choice, and you have a fertile field for new C++ OODBMS products,
no two of which are alike.
But with no standard, how are you to know if a DBMS is object-oriented, and,
if not, what does it matter anyway? You might wait several years for the
industry to define it and a standard definition to emerge, but you might also
need to write a program now. Why wait? Latch onto something that works and use
it. But which of today's so-called OODBMSs should you use? Without trying to
answer that question with complete authority, this article looks at three
contemporary OODBMS packages and compares them. Each one takes a different
approach to implementing persistent objects, and each has its strengths and
weaknesses. By using all of them to solve a simple problem, I hope to shed
some light on how they work, how easy they are to learn and work with, and
what kinds of problems each of them is particularly suited to solve.
Two of the OODBMS packages support development of Windows as well as DOS
programs. Not wanting to mess with the SDK or a Windows applications
framework, I decided to build a simple DOS command-line program. I wanted at
least two classes, each one supporting object retrieval by a key data value,
with one class related somehow to the other in typical database fashion. At
least one class should have a variable-length data field, perhaps a text
string. If you can do these things with a database manager, you can usually do
anything you want. I designed a simple school-administration system, one that
records teachers, subjects, and teacher/subject assignments. To make it
simple, I decided that a teacher could teach only one subject and only one
teacher could teach any given subject. This is not the kind of problem that
cries out for an object-oriented design (unless you are an OOPS zealot who
believes that every solution should be object oriented), but it is one that
all programmers can relate to, one that any DBMS--object oriented or
not--worth its salt should be able to handle, and one where three different
solutions will fit into an article of this size.
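The design just described can be sketched in plain C++. The article's actual
listings are product-specific; the class and member names below are purely
illustrative, a generic, non-persistent shape of the one-to-one relationship:

```cpp
#include <string>

// A teacher, retrievable by name (the key value).
class Teacher {
    std::string name;
public:
    explicit Teacher(std::string n) : name(std::move(n)) {}
    const std::string& Name() const { return name; }
};

// A subject holds at most one assigned teacher, plus a
// variable-length text field of the kind the problem calls for.
class Subject {
    std::string title;                 // key value
    std::string notes;                 // variable-length text
    const Teacher* teacher = nullptr;  // one teacher per subject
public:
    explicit Subject(std::string t) : title(std::move(t)) {}
    void Assign(const Teacher& t) { teacher = &t; }
    const Teacher* Assigned() const { return teacher; }
    const std::string& Title() const { return title; }
};
```

Each of the three products must express this same shape: two keyed classes and
one relationship, with the pointer made persistent in the product's own way.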
By no means did I wring out any of the three. I did no stress or performance
tests, and I did not exercise all the features. Each product has many features
that the others do not have. My simple application leans toward the features
they share and reflects the database problems I work on. I do not mean for you
to use this article as the only measure for selecting an OODBMS. Instead, I
hope to show three different approaches to what is commonly lumped under the
single vague category of "persistent objects," and encourage you to try each
of them in your own environment against your own requirements.
I found all three products lacking in documentation, although one is clearly
better than the others. All three limit the types of data members that you can
put into a persistent class. Two of the products are difficult to learn and
use, and those are the ones with the poorest documentation. Those were also
the ones for which I had to get technical support from the vendor, not only to
use but to install. You'll read about some of these experiences later on. But
before you get the idea that these criticisms imply fatal flaws, let me add
that I had a compiled, working program in less than one day with each of the
products, and what is more, I felt like I understood how they worked and how
to use them. One product, Object Manager, takes a traditional view of the
database. Another, Code Farms Libraries, takes a revolutionary view, unlike
anything I've seen. The third, POET, is somewhere in the middle. I would
willingly use any of the three to develop an object-oriented database
application.


BKS POET Version 1.2


POET is an object-oriented database-management system that works with C++ and
runs on several platforms, among them DOS and Windows. It implements a
persistent-object database manager by adding the persistent attribute to the
C++ class. POET implements the attribute by extending the C++ language syntax.
POET processes the extensions by passing the class-definition code through a
preprocessing translator, which BKS calls a "pre-compiler."
POET supports DOS, Windows, UNIX, and NeXT. The edition I used is Version 1.2
for DOS and Windows. It includes support for the Borland, Microsoft, and
Zortech C++ compilers. You can specify during SETUP which compiler you want to
use and whether the target programs are for Windows, DOS, or both. Oddly, the
POET SETUP program runs only under Windows, so you'll need Windows to install
POET even though you might want to use only the DOS version.
SETUP has an annoying practice. It modified my AUTOEXEC.BAT without asking my
permission. If I had not been watching closely, I might not have seen it do
so, because a message flashes by only while the modification is going on. I am
using DOS 6.0, which allows you to select from a menu of configurations in
CONFIG.SYS. AUTOEXEC.BAT reacts to the selection by testing the CONFIG
environment variable. I have several possible configurations, each with its
own PATH statement. I was sure that POET's SETUP couldn't possibly get it
right. It didn't. I had to manually correct the error.


The Windows Version


I had some difficulty after the installation getting the package running
properly. I installed both the DOS and Windows versions to compile with
Borland C++ 3.1 and started with the Windows version. POET includes a
Borland-like IDE in Windows. There is an example program already compiled and
installed in a Windows group. I decided to recompile it.
A POET application, like any other C++ application, consists of class
definitions and executable C++ code that uses objects of the classes. POET
persistent-class definitions contain keyword extensions to the language that
describe the persistent properties. The PTXX Precompiler compiles these
definitions into C++ code for input to the C++ compiler. By convention, the
POET class definitions are in files with the .HCD extension. PTXX compiles
them into source modules with the .HXX file extension. The .CXX extension
denotes the C++ source files that contain the persistent class methods.
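A precompiler such as PTXX exists to generate this plumbing for you. As a
hedged, generic illustration of what such machinery must supply, and not
POET's actual output or API, here is the store/load code a programmer would
otherwise write by hand for one class:

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Hand-rolled persistence of the kind a precompiler automates:
// the class knows how to write and read its own members.
class PTeacher {
    std::string name;
    int room = 0;
public:
    PTeacher() = default;
    PTeacher(std::string n, int r) : name(std::move(n)), room(r) {}

    void Store(std::ostream& os) const {
        os << name << '\n' << room << '\n';
    }
    void Load(std::istream& is) {
        std::getline(is, name);
        is >> room;
        is.ignore();   // consume the trailing newline
    }
    const std::string& Name() const { return name; }
    int Room() const { return room; }
};
```

Multiply this by every persistent class, add object identity, indexes, and
relationship bookkeeping, and the appeal of generating it mechanically from a
.HCD-style definition becomes clear.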
When I tried to precompile the example application's class definitions, the
POET IDE reported errors on the first file it processed. The errors made no
sense. I ran it again and watched. Even though I had installed the Borland
version, it was precompiling the Microsoft header files. I had failed to
change the default directories in the IDE's options process, and POET was
using the INCLUDE environment variable, which is normally set for the
Microsoft compiler and which does not usually get in the way of the Borland
compiler. The POET documentation makes no mention of the INCLUDE environment
variable.
The default directories out of the box assume that you have a C:\TMP
subdirectory for temporary files and that the compiler is in C:\BC31, neither
of which is conventional. You'll have to change these. I did and tried to
compile again, and it failed. The changes were posted but had not taken
effect. As it turns out, you have to save the project before the option
changes will associate themselves with it. Poking around in the examples, I
found a Borland-compatible .PRJ file that built the program with Turbo C++ for
Windows. It would save a lot of time if the documentation included a cookbook
tutorial to get a programmer past this first hurdle.


The DOS Version


Next I set out to build the School application program. I began with a simple
.HCD file that described the Teacher and Subject classes, both with the
persistent attribute and with one class containing a reference to the other.
Using the command-line example in the manual, I tried to precompile the .HCD
file with the DOS version of PTXX. The program locked up my computer. The
sample DOS programs that accompany the package use a subdirectory structure
for the include, source, database, and binary files. The documentation is
vague on these matters, but I emulated that structure, and still PTXX locked
up. Then, to be sure that it wasn't something in the code I wrote, I tried it
with the sample application. PTXX locked up there, too. I called tech support,
and we agreed that the culprit was probably DOS 6.0, which is still in beta. I
switched to a machine with DOS 5.0, and the problem went away. I changed my
HIMEM.SYS and EMM386.SYS in the DOS 6.0 computer to those from the Windows 3.1
distribution, and that cured the problem. BKS will need to address this
problem because most users will not want to relinquish the superior memory
management that the DOS 6.0 memory managers provide.
Now underway, I precompiled my first .HCD file. Both classes had character
pointers to represent the names of the objects. POET does not allow persistent
classes to have pointers to nonpersistent objects. I tried references and got
the same restriction. I substituted instances of Borland's String class, and
these, too, were unacceptable to POET because the String class contains a
character pointer. Next, I subclassed a persistent class from the Borland
String class, only to find that a persistent class may not be derived from a
nonpersistent class. These restrictions are documented in the reference guide,
but of course I hadn't read it yet. POET includes the persistent PTString
class, which solved that particular problem, but the underlying concept points
to a larger problem. Your persistent classes may contain instances of
nonpersistent classes if they obey certain rules, but they may not contain
pointers or references to nonpersistent objects, and they may not be derived
from a nonpersistent class. This approach impairs your ability to use existing
class libraries at the foundation of your object-oriented design. POET has a
few compatible classes such as the PTString class and date and time classes,
but a programmer needs a full set of container classes that can be persistent.
I could not put an instance of the Borland String class into my persistent
class because it has what POET calls "hidden semantics," which, in this case,
means that the class has a character pointer, and POET cannot tell whether the
pointer points to a single character or an array of unknown dimension. (For a
detailed discussion of this type of problem, see my article, "Persistent
Objects in C++" in the December 1992 issue of DDJ.) POET includes a process
called the Type Manager, with which you specify how an object's representation
is to be processed in the persistent-object database. Assuming you know the
semantics of a class that you want to include, you would use the Type Manager
to specify them, and then, somehow, POET would accept the object as a member
of a persistent class. The Programmer's and Reference Guide does not tell you
how to use the Type Manager, presuming that most users will not need it, a
presumption with which I flatly disagree. If you are into serious
object-oriented design with persistent objects, and if you are using class
libraries from earlier work--reusability being one of the tenets of
object-oriented design--then you will certainly need the Type Manager. BKS
provides the Type Manager documentation on request.
If you do not know the semantics of a nonpersistent object and POET rejects
it, then you are out of luck because you cannot tell the Type Manager what you
do not know. Another tenet of object oriented design is that encapsulation
hides implementation details from the user. You might find yourself rummaging
around in the class library's source code just to discover its semantics, the
very kind of detail that you never wanted to know.
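The "hidden semantics" problem is easy to demonstrate in plain C++. Given only
the class definitions below, no tool can tell how much data each char* member
owns; the rule lives in the programmer's head, which is exactly what a Type
Manager asks you to write down. These classes are illustrative, not POET's:

```cpp
#include <cstring>
#include <string>

// Two classes with identical layouts but different pointer semantics.
// Nothing in the declarations says which rule applies.
struct Name  { char* p; };  // p points at a NUL-terminated string
struct Block { char* p; };  // p points at a raw buffer of external length

// Serializing a Name is well-defined only because WE know its rule:
std::string store_name(const Name& n) {
    return std::string(n.p, n.p + std::strlen(n.p));  // assumes NUL terminator
}

// For Block, the length lives elsewhere; a generic tool cannot find it,
// so the caller must supply the semantics explicitly.
std::string store_block(const Block& b, std::size_t len) {
    return std::string(b.p, b.p + len);
}
```

A persistence tool handed either class sees only `char* p`; rejecting such
members outright, as POET does by default, is the conservative answer.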
Listing One (page 13) shows the class design. Except for the persistent
specifier and the two template-like typedefs, which support object queries,
the header file in Listing One is typical C++ code.
The subdirectory conventions allow you to specify a name for the database with
a command-line option. Their convention uses the word "base," which creates a
subdirectory named BASE and puts the database files there. Then it creates a
source file named BASE.HXX and puts it in the subdirectory specified by
another command-line option. It also creates a file named, in this case,
SCHOOL.HXX, and puts it in a subdirectory named in yet another command-line
option. The "base" convention is arbitrary, and I renamed it to SCHOOL to
coincide with the name of my database. That procedure caused PTXX to put a
second file named SCHOOL.HXX (instead of BASE.HXX) in the SCHOOL subdirectory,
and the SCHOOL.HXX file includes this statement: #include <school.hxx>.
Of course, I did not notice this at the time. I tried to compile the
SCHOOL.CXX source file (in another subdirectory named by command-line option),
and
the source file's include of itself put the Borland C++ compiler into a
memory-eating loop that continued until the program expired. I changed the
name back
to BASE. The documentation should warn you about that one. Better still, the
PTXX program should issue a warning. The subdirectory structures and the
source files generated are confusing enough as it is, without that kind of
trap. Once again, a better user's guide could clear up the confusion.
With the database name changed, the C++ compilation ran to completion. My
first attempt had the teacher object in the Subject class as a reference to
type Teacher. The C++ compiler reported that the reference to the Teacher
object in the Subject class was not initialized by two constructors that POET
had generated. I hadn't even written constructors yet, and already they had
compile errors. A call to tech support revealed that they had not tried using
references in persistent objects and did not know if it would work. They
thought that if I provided a default constructor and what they call a "class
factory constructor" to override the ones that PTXX creates, it might work. I
decided not to experiment and switched instead to a pointer.
I wrote the member functions and built a simple program that opens the
database and instantiates one of the objects, nothing more. After several
false starts, I got the program compiled and linked. The documentation says
nothing about which libraries you must link with. You have to poke around in
POET's installation directories and guess. Remember that a POET application
consists of a lot of source code that PTXX generates for you. I was curious to
see how well Turbo Debugger would work, what with all that computer-generated
source code to step through. I imagined a horrendous maze of constructors and
destructors. Well, that's not something you should worry about. My little
program, when compiled with debugging information added, is over 500K, which
is way too big for Turbo Debugger to load into memory. I'm not sure how you
are supposed to debug a POET application. The documentation does not address
the problem. Listing Two (page 13) is the program that builds persistent
objects, retrieves them, and relates them to one another.
POET has a full set of object-oriented database-management features, and the
program in Listings One and Two uses only a small subset of them. POET keeps
track of object copies so that only one copy of an object is in memory
regardless of how many times the object is declared. Retrievals are an
interesting
part of POET because the database is not bound to the primary-key paradigm of
the relational database. You retrieve records by building a set and specifying
Boolean query criteria in an object of a class especially built by POET for
each data member in a persistent class. If you do not need to use a particular
member in any queries, no objects of its query class are instantiated, and
there is no overhead. If a class member is used for frequent queries against a
large database of objects, you can tell the precompiler to build an index for
that data member, and the query process will automatically use the index for
searches.
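Stripped of POET's own classes, the concept is query-by-predicate over a
collection, optionally accelerated by an index built on one member. This
standard-C++ sketch shows the idea only; it is not POET's API:

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

struct Rec { std::string name; int year; };

// Linear query: apply a Boolean criterion to every stored object.
std::vector<Rec> query(const std::vector<Rec>& db,
                       bool (*crit)(const Rec&)) {
    std::vector<Rec> out;
    std::copy_if(db.begin(), db.end(), std::back_inserter(out), crit);
    return out;
}

// Indexed query: a map keyed on one member answers equality queries
// without scanning, the way a precompiler-built index would.
using NameIndex = std::multimap<std::string, std::size_t>;

NameIndex build_index(const std::vector<Rec>& db) {
    NameIndex ix;
    for (std::size_t i = 0; i < db.size(); ++i)
        ix.emplace(db[i].name, i);
    return ix;
}
```

The important property carries over: a member that is never queried costs
nothing, while a frequently queried member can pay for an index.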


Strengths and Weaknesses


POET's strengths are its implementation of persistent objects with Boolean
queries, its support for multiple platforms, and its plans for future
versions, which include a client/server version. POET's weaknesses are its
incomplete documentation and the constrained subset of class definitions that
you can make persistent--no pointers to nonpersistent objects, no references,
no nonpersistent base classes, no instances of classes that have any of the
above.

The Programmer's and Reference Guide does a reasonable job of explaining
POET's underlying concepts and how the classes work, but there is no real
user's guide to explain how to set up and run the precompiler and how to
design your first class. There is a tutorial and an example application, but
neither of them has enough information to get you going. The Windows version
is even weaker when it comes to documentation. That could be overcome with
comprehensive, context-sensitive Help screens, but none of the IDE's dialog
boxes have Help command buttons or respond to F1. The IDE itself has a Help
menu, and it provides some information, but the overall documentation for both
the DOS and Windows versions is too weak to be considered even marginally
acceptable. The company is aware of these shortcomings and is committed
to doing a better job in future versions. In the meantime, they spend a lot
of time hand-holding their customers through the first stages of getting up
and running, after which, they tell me, most programmers are comfortable with
the procedures and can proceed without needing much further help. You should
consider this when you decide to use POET. Most programmers will not get POET
running without at least one call to tech support. As the package gets wider
distribution, the company's ability to hold the hands of every user will wane.
In the future you'll spend more and more time waiting for them to call you
back. Too bad, because the problems and the support calls could be
significantly reduced by better documentation.
POET is a good product that needs good documentation. It is difficult to learn
because the documentation is less than what is required for a package like
this. Once past the learning barrier, you will be able to develop POET
applications effectively. The first hurdle is the highest, and BKS needs to do
something to lower it. POET applications also compile into huge executable
files--an impediment, as I learned, to source-level debugging.


Code Farms' C/C++ Libraries Version 3.2


The Code Farms Libraries (CFL) are C and C++ libraries that implement
persistent objects under DOS, UNIX, and the Macintosh. This article is about
OODBMS solutions, so I will discuss only the C++ side of CFL.
CFL implements persistent objects by preprocessing class definitions, which
include macro statements that identify the persistence of the classes. CFL's
view of persistence is unlike that of other OODBMS products. Classes are
defined as belonging to meta-classes, not through inheritance, but in the form
of organizations. Objects exist within hyper-organizations--rings,
collections, aggregations, trees, graphs, links, names, stacks, and
entity-relationship models. Once stored in the database, an object may belong
to many organizations. This architecture underpins the retrieval processes and
interobject relationships supported by CFL.
The organizations themselves resemble basic data structures that most
programmers will recognize. A RING is a singly or doubly linked list, except
that it is circular instead of having a listhead and two ends. A COLLECTION is
a RING with a listhead-like parent object of another type to manage its entry
point, which can vary. An AGGREGATION is a COLLECTION where the RING objects
point to their listhead/parent object. A TREE is a hierarchy of objects of the
same class where the objects at each level form a RING and point to a parent
object of the same class at the next higher level in the hierarchy. A GRAPH is
a network of nodes and paths between nodes, called "edges." A LINK relates two
objects. A NAME is a form of LINK that relates an object to a character
string. STACKs are lists that behave as LIFO stacks or FIFO queues. The
ENTITY-RELATIONSHIP MODEL relates objects in one-to-many relationships between
classes, where the nature of each relationship is different. Some of the
organizations have variants, such as the SINGLE_LINK and the DOUBLE_TREE.
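To make the RING concrete, here is a minimal circular doubly linked list in plain C++. This illustrates only the organization's defining property; it is not CFL's macro-generated code, and the RingNode class and its members are my own invention:

```cpp
#include <cassert>

// A minimal ring: a circular doubly linked list with no listhead
// and no ends; any node is as good an entry point as any other.
struct RingNode {
    int value;
    RingNode *next, *prev;
    explicit RingNode(int v) : value(v), next(this), prev(this) {}
    // Splice n into the ring immediately after this node.
    void insertAfter(RingNode* n) {
        n->next = next;
        n->prev = this;
        next->prev = n;
        next = n;
    }
};

// Walk forward from any entry point; the walk ends when it
// comes back around to where it started.
int ringSum(const RingNode* entry) {
    int sum = entry->value;
    for (const RingNode* p = entry->next; p != entry; p = p->next)
        sum += p->value;
    return sum;
}
```

A COLLECTION would add a parent object of another type holding a pointer into this ring as its variable entry point; an AGGREGATION would add a back pointer from each node to that parent.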
When you design an object-oriented database, you will apply your understanding
of these organizations to organize the objects. The CFL application must be
designed so the data model fits into one or a combination of the
organizations. The architecture of a database may well reflect the background
of its designers, who must fit the data models they prefer into some
combination of the organizations.
The traditional view of a database is not apparent in these organizations. In
many ways, CFL forces you to think differently about data structures. Whether
or not this new perspective is the new, true object-oriented view or simply
the view of the CFL developers remains to be seen. Nonetheless, CFL has the
potential to support many different data architectures in ways that can
surpass traditional database models.
A fundamental difference between CFL and more traditional approaches is that
the application loads the complete CFL database into memory when the program
begins, and, if the program changes the data in any way, the application must
save the complete database back to disk when the program is done. Therefore, a
CFL database must by definition fit into memory. There are virtual paging
operations that use disk or extended memory, and these are mostly transparent
to the programmer, but the performance penalties can be significant. CFL does
not retrieve individual objects on the basis of data-dependent key values
after the fashion of a relational model. Instead, you arrange objects into
organizations and navigate those organizations by using entry classes and
iterator functions.
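That life cycle is easy to sketch in ordinary C++. The code below is a simplified stand-in, not CFL's file format or functions: the whole database is read into memory when the program starts and written back in full when it ends.

```cpp
#include <cassert>
#include <fstream>
#include <string>
#include <vector>

// The entire "database" is just an in-memory container of objects.
struct DemoRecord { int id; double value; };

// Read every record into memory when the program begins.
std::vector<DemoRecord> loadAll(const std::string& path) {
    std::vector<DemoRecord> db;
    std::ifstream in(path);
    DemoRecord r;
    while (in >> r.id >> r.value)
        db.push_back(r);
    return db;
}

// Write the complete database back to disk when the program is done.
void saveAll(const std::string& path, const std::vector<DemoRecord>& db) {
    std::ofstream out(path);
    for (const DemoRecord& r : db)
        out << r.id << ' ' << r.value << '\n';
}
```

Everything between the load and the save operates on memory-resident objects, which is what makes navigation fast and what limits the database to available memory.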
Many diverse applications' database requirements will fit the CFL model, and
CFL is a powerful tool for addressing those requirements, but you will not use
it for very large databases. The need to fit all the objects into memory and
the absence of individual object retrieval would constrain a large,
transaction-based database application.
CFL persistent objects may not contain references. They may contain pointers
to other types, but with exceptions. Character pointers are treated like
null-terminated character strings. Pointers to other types are treated as
pointers to single instances of the class rather than as pointers to arrays of
the object.
Installing CFL on a DOS system is straightforward, although not without a few
glitches. There is an INSTALL program that builds subdirectories into which it
decompresses files. Then it tells you how to compile the libraries with your
compiler. The package supports several compilers and operating platforms, and
when you select a compiler from its menu, it simply tells you which batch
files to run. When I ran the MAKE batch file to build the Borland C++ version,
the system built several programs and then attempted to run one of them, the
ZZCOMB program, which aborted with a SHARE violation on drive C:. Since no
other programs were running, I assumed that ZZCOMB locks a file and then
somehow attempts to open it elsewhere. I removed SHARE.EXE from my
AUTOEXEC.BAT, rebooted, and the ZZCOMB program ran OK.
Another batch file specified by the INSTALL program builds the CFL library.
You select the batch file depending on which compiler and memory model you are
using.
The User's Guide spells out a series of two tests to make sure that the
software is correctly installed. The first one tests the class generator,
which is CFL's preprocessor that converts class definitions into acceptable
C++. The second test compiles a test program. Both tests worked correctly,
although the User's Guide incorrectly names one of the batch files, and you
must modify it if you are using a memory model other than medium.
You can run tests to exercise every feature in the CFL package, and they
promise to run for several hours. I decided not to try them unless I had other
problems later.
The last step of the installation builds the Reference Manual, which does not
come in printed form. The ZZDOCUM program builds the manual by extracting the
information directly from the source code, which assures that the manual is
current. You have to compile the ZZDOCUM program, and my copy of ZZDOCUM.C had
some gibberish in it that looked like the insert command lines from some
text-management utility. I took those out, and the program compiled and ran.
It puts garbage characters in the document under the Date Printed heading, but
the document is otherwise usable. I sent a report of these problems to Code
Farms, and they assured me that they would correct them in the next release.
Code Farms' policy of having you compile its library and print its Reference
Manual is an excellent way to stay on top of their version control and that of
the compiler manufacturers. You always get the most recent Reference Manual,
and if your favorite compiler's new version has object formats incompatible
with earlier versions, CFL automatically adjusts when you recompile.
Documentation consists of the printed and bound User's Guide and the Reference
Manual, which you build as a printable disk file. The User's Guide is at times
a pleasure to read and at other times a deep and confusing document. The
confusion is at its worst when the manual describes some of the more arcane
parts of the CFL procedures. Many of the code examples contain errors or
confusing syntax. The document often breaks lines of code by hyphenating
identifiers, which makes the expression appear to contain two identifiers
separated by the binary minus operator. In other places, the code will declare
uninitialized pointers and then proceed to dereference the pointers,
apparently assuming that the reader will mentally fill in the missing details.
Other parts of the manual's code refer to class members that do not exist.
Apparently the code in the manual has never been compiled. Once you get the
hang of how to organize and design a CFL program, you are better off looking
at the example programs in the installation's TEST subdirectory rather than
those in the manual. I would often use grep to find a sample program that
contained a feature that I wanted to understand, read the example with my
editor, and compile it to see what it did.
The User's Guide is poorly organized and has an incomplete index. I found many
references in the text to items not listed in the index, usually when I needed
more information. There are concepts and rules for which there are no
examples, and the descriptions are less than comprehensive.


Building an Application


You build an application by including in your source program two files: a
header file at the front and a source file at the end, neither of which exists
when you begin. In between the two includes, you declare your classes and the
program's functions, which contain macros that define the persistent nature of
the classes and the organizations into which you will arrange them. Next, you
run a preprocessor named ZZPREP, which reads your code and generates the two
include files. Then you compile your program with the standard C++ compiler.
The include files fill in the necessary information to allow the compiler to
compile the program. Listing Three (page 14) is the School application
rewritten to work with CFL.
CFL deals with runtime errors by writing error messages on the screen. For
example, if you try to open a database that has not been created yet, you get
an error message on the screen. If you were to try to associate more than one
teacher with the same subject in the program in Listing Three, you would get
an error message on the screen. Such situations could be the result of a
user-input error, and the program should be able to deal with them without
cryptic, intrusive error messages that display wherever the cursor happens to
be. The error conditions themselves are posted in a flag that the program can
query. I would like to be able to suppress the error displays to stdout.
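The flag half of that design is the right idea: a library can record errors in a status object that the caller polls, rather than writing to the screen. Here is a minimal sketch of the pattern in plain C++; the classes are my own, not CFL's interface:

```cpp
#include <cassert>
#include <string>

// Library-side error state that the caller polls, instead of the
// library printing wherever the cursor happens to be.
class DbStatus {
    int code = 0;
    std::string message;
public:
    void set(int c, const std::string& msg) { code = c; message = msg; }
    void clear() { code = 0; message.clear(); }
    int lastError() const { return code; }
    const std::string& lastMessage() const { return message; }
};

// A library call records its failure; it does not display it.
// The application decides how, and whether, to tell the user.
bool openDatabase(const std::string& name, DbStatus& status) {
    if (name.empty()) {
        status.set(1, "database has not been created");
        return false;
    }
    status.clear();
    return true;
}
```

A user-input error then becomes something the application can catch and present in its own user interface.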


Strengths and Weaknesses


CFL's strengths are found in its unique approach to an object-oriented
database, one that you should at least consider before you opt for one of the
more traditional methods. Its weaknesses are in the quality of the
documentation, especially in view of its nonconventional approach. Some
programmers will avoid CFL because it requires them to learn a new way of
thinking about their object-design tasks. That is a shame, because many kinds
of applications could benefit from the data organizations of CFL.
Like POET, Code Farms Libraries is a good product that suffers from the lack
of good documentation. The restrictions on what you can put into a persistent
object could mean that you need a different solution as well. Unlike POET, the
CFL executable files are of a manageable size. The example program compiles to
147K, with debugging information included. CFL is worth a look when you
consider your object-database requirements. If you can fit all the objects
into memory at once, if you can describe your persistent objects without the
use of pointers and references, and if you can describe the relationships
between classes and the retrieval requirements by using CFL organizations,
then CFL is a good choice.


Raima Object Manager 1.10


Raima Object Manager is a C++ wrapper around the mature and respected Raima
Data Manager, a C library formerly known as db_Vista. Data Manager implements
a data model that supports relational, network, and direct access of objects.
Object Manager encapsulates that model into an object-oriented database
manager. Object Manager and Data Manager are available in single- and
multiuser versions for DOS, Windows, and OS/2, with support for Borland,
Microsoft, and Zortech C++ compilers. There are also UNIX versions. I worked
exclusively with the single-user DOS version.
A C++ programmer comfortable with relational technology or the network set
organizations of CODASYL databases will have no trouble using and
understanding Object Manager. You design a database by building a traditional
schema definition in a Data Definition Language (DDL) text file. A DDL
compiler reads the DDL and generates a C header file with structure
definitions for the records and #define statements to identify the files,
records, and fields. You include the header file in your source code and
process the database by inheriting classes and using macros and member
functions to manage object persistence, retrieval, and relationships.
You can retrieve objects by key values, from one-to-many sets, or by object
identity. You can relate classes by using sets, by duplicating key values in
related objects, or directly with database pointers. There is a data model
among these to satisfy almost any system architecture.
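To give an idea of what the DDL compiler produces, here is a hand-written approximation of the header it might generate for the School schema: one structure per record and #define identifiers for records and fields. This is not actual DDLP output, and the identifier values and the db_addr typedef are invented:

```cpp
#include <cassert>

// Hand-written approximation of a generated header for the
// School schema; the identifier values below are invented.
typedef long db_addr;            // stand-in for a database address type

struct teacher {
    char    name[30];            // key field
    db_addr subj;                // direct reference to a subject
};
struct subject {
    char    name[30];            // key field
    db_addr tch;                 // direct reference to a teacher
};

#define TEACHER       10000     // record-type identifiers
#define SUBJECT       10001
#define TEACHER_NAME  20000     // field identifiers
#define SUBJECT_NAME  20001
```

Application code includes this header and refers to the structures and identifiers rather than to file offsets or field positions.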
An Object Manager database consists of collections of fixed-length objects
with only primitive data types--int, char, float, arrays, and so on--as
members. There are no provisions for variable-length records other than those
for the BLOB (binary large object) data type. You may not use pointers,
references, or instances of other classes as data members. You must be mindful
of the physical file organization when you design your database. A file that
holds multiple class types uses fixed-length object slots, with each slot
being large enough to hold an object of the biggest class. This behavior would
incline a designer to organize objects in files according to object size
rather than based on functional relationships.
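The cost of that layout is easy to quantify: when a file holds several record types, every slot is padded to the size of the largest. A back-of-the-envelope sketch with invented record types, not Object Manager's actual layout:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Two record types sharing one file: every slot must be big
// enough to hold an object of the larger class.
struct SmallRec { char name[30]; long link; };
struct LargeRec { char text[200]; };

// Slot size for a file that holds both types.
constexpr std::size_t slotSize =
    std::max(sizeof(SmallRec), sizeof(LargeRec));

// Bytes wasted every time a SmallRec occupies a LargeRec-sized slot.
constexpr std::size_t wastePerSmallRec = slotSize - sizeof(SmallRec);
```

With these invented sizes, a large share of every SmallRec slot is padding, which is why a designer would group objects into files by size rather than by function.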
You install Object Manager as an optional feature of or an addition to the
Data Manager installation. The installation adds some libraries and example
programs to what already comes with Data Manager. If you already have Data
Manager, you can upgrade. If not, you can purchase the entire package in one
bundle.
Installation goes without a hitch. Specify a source drive and destination path
and insert diskettes when the program asks for them. All installations should
be this easy.
Compared to the other products addressed in this article, Raima's
documentation rates a literary award. The chapters on database concepts and
object and database design are readable and will be particularly helpful to
programmers not well versed in those design concepts. The documentation is far
from perfect, however. It's often difficult to find the information even when
you're sure it's in there somewhere. The index could be better. Several times
I came across a macro, function, or data type in an example program that was
not in the index, but which I eventually found described somewhere in the
documentation. A few subjects are ignored in the printed documentation. I
searched high and low for information on how to compile and link an Object
Manager application only to find it tucked away in a README file. Such
knowledge belongs in a User's Guide.


Building an Application


The DDL is straightforward, using a C-like syntax to describe the database,
its files, the records in the files, the fields in the records, the keys, and
the sets. Listing Four (page 15) is the DDL for the School application and
Listing Five (page 15) is the school application program rewritten for Object
Manager.
Persistent classes derive from the structure defined in the header file
generated by the Database Definition Language Processor (DDLP) and from
Object Manager's StoreObj class. The Teacher and Subject classes
in Listing Five show how this is done.
The SCHOOL.H file that Listing Five includes is generated by the Object
Manager's DDLP from the DDL statements in Listing Four. The header file
defines the teacher and subject structures as well as a number of #defines to
identify records and fields. The program references some of these, such as
TEACHER and SUBJECT_NAME. Other parts of the program use Object Manager macros
to provide some additional required definition for the classes. For example,
both classes for database objects include the DIRECTREF macro, which tells the
compiler to add member functions to support specific direct references between
classes. In this implementation, I decided to represent the one-to-one
relationships between teachers and the subjects they teach with
direct-reference database-object pointers--not the best database design, but
sufficient to illustrate the technique.
An Object Manager application works from within a task object that you build
by deriving a class from the StoreTask class. This approach allows the
multiuser version to discern between tasks and users. Since I needed a task
object, I encapsulated the menu processes into it. You must also define a
class for the database itself, and I put an instance of it into the task
class. The constructor launches the application, so all the main function must
do is declare an instance of the task class. If the task object were global, we
wouldn't need a main function at all, except that, for some reason, the C++
language specification--as it exists today--insists on one.
Object Manager supports object navigation with overloaded operators. The ++
and -- operators navigate forward and backward in the object's default
accessing sequence, which is defined by the constructor. You can retrieve
objects in the sequence of a key value. The SchoolTask::List function uses the
keys, which, in this simple example, are the teachers' and subjects' names.
The overloaded [] operator fetches objects, too. You can put FIRST, LAST, and
so on inside the brackets, or you can use a key value. (An aside: FIRST and
LAST are Object Manager keywords. They are not in the index of either manual.
This is an example of the quality of the index.) The overloaded >> and <<
operators implement retrieval of objects with direct references to other
objects. Many pundits warn against the use of overloaded operators, and I must
agree that, although I used them in this example, I do not find them to be
intuitive. For those who dislike overloaded operators, Object Manager offers
member functions to perform the same functions.
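The trade-off is visible in a plain-C++ cursor of my own devising, not one of Object Manager's classes: the overloaded operators and the named member functions do exactly the same thing, and you can judge which reads better.

```cpp
#include <cassert>
#include <string>
#include <vector>

// A cursor over an ordered collection, navigable two ways.
class Cursor {
    const std::vector<std::string>& items;
    std::size_t pos = 0;
public:
    explicit Cursor(const std::vector<std::string>& v) : items(v) {}
    // Operator style: ++cur and --cur move through the sequence.
    Cursor& operator++() { if (pos + 1 < items.size()) ++pos; return *this; }
    Cursor& operator--() { if (pos > 0) --pos; return *this; }
    // Named-member style: identical behavior, more explicit intent.
    Cursor& next() { return ++(*this); }
    Cursor& prev() { return --(*this); }
    const std::string& current() const { return items[pos]; }
};
```

Nothing is lost by providing both; the named members simply delegate to the operators.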

Because Object Manager uses Data Manager as its database engine, all the Data
Manager utility programs work with the object database. The db_Query package,
based on SQL, supports ad hoc and program-generated queries. The db_Revise
package assists with database conversion. There are utility programs to unlock
the database after a crash; view and change the contents of database fields
and check their consistency; export and import the database to and from ASCII
files; and pack, inspect, and rebuild indexes.


Strengths and Weaknesses


Object Manager's strengths are in its use of the rugged Data Manager, its
intuitive use of the C++ language to encapsulate database operations, and its
support for the three data models within which a designer will find a solution
to most problems. Its weaknesses are in the restrictions as to what you can
put into classes, a weakness shared by the other two packages discussed in
this article.


Conclusion


If you are concerned with the quality of documentation, Raima is by far the
best product of the three discussed here. If you depend heavily on
documentation that is easy to read, has good tutorials and introductions to
concepts, and covers the product comprehensively, then Raima outshines the
others. But don't let documentation be a litmus test. The suitability of the
data models supported should be just as important. It's difficult for an old
relational-database analyst like me to imagine a data model that you could not
fit into one of Object Manager's relational, network, and direct-reference
accesses.
Nonetheless, the data model will determine what tool works best for you. POET
is the only one of the three that offers a way to describe the hidden
semantics of other classes to the database manager, thus allowing you to
incorporate embedded objects into your persistent classes. The techniques for
doing this are not easy, and they won't work in every case, but it's the
closest any of the products comes to this kind of support. Code Farms
Libraries offers a rich repertoire of object organizations, and you could
probably implement any traditional data structure by using one or more of
them, but you must be able to fit the entire database into memory at once, not
to mention bearing the overhead of reading it and writing it every time you
run a program. Object Manager builds traditional database files with
fixed-length records, putting a C++ wrapper around a C library.
There are models of simulation, imagery, and multimedia data out there waiting
for solutions, and more object-oriented approaches could be more appropriate.
It's just a matter of determining what those really are.

OBJECT-ORIENTED DATABASE MANAGEMENT SYSTEMS
by Al Stevens


[LISTING ONE]

// ----------- school.hcd
// School Database Design for POET

#include <poet.hxx>
#include <iostream.h>

persistent class Teacher {
 PtString name;
public:
 Teacher(char *nm) { name = nm; }
 ~Teacher() {}
 void Display() { cout << (char *) name; }
};
persistent class Subject {
 PtString name;
 Teacher *teacher;
public:
 Subject(char *nm)
 { name = nm; teacher = NULL; }
 ~Subject() {}
 void AddTeacher( Teacher &tch ) { teacher = &tch; }
 void Display() { cout << (char *) name; }
 Teacher *Tchr() { return teacher; }
};
typedef cset<Teacher*> TeacherSet;
typedef cset<Subject*> SubjectSet;






[LISTING TWO]

// -------- schoolp.cpp
// School Application for POET


#include <iostream.h>
#include <poet.hxx>
#include "school.hxx"

PtBase objbase;

// ------- build a new Subject object
void NewSubject()
{
 char nm[50];
 cout << "\n--- New Subject --- ";
 cout << "\nEnter Subject name: ";
 cin >> nm;
 Subject subj(nm);
 subj.Assign(&objbase); // assign object to database
 subj.Store(); // store object in database
}
// ------- build a new Teacher object
void NewTeacher()
{
 char nm[50];
 cout << "\n--- New Teacher --- ";
 cout << "\nEnter Teacher name: ";
 cin >> nm;
 Teacher tch(nm);
 tch.Assign(&objbase); // assign object to database
 tch.Store(); // store object in database
}
// ----------- assign a Teacher object to a Subject object
void Assignment()
{
 char nm[50];
 cout << "\n--- Assignment --- ";
 cout << "\nEnter Subject name: ";
 cin >> nm;
 // ----- build sets to query for Subjects
 SubjectAllSet *allSubjects = new SubjectAllSet(&objbase);
 SubjectSet *SubjectResult = new SubjectSet;
 SubjectQuery SubjQuery;
 // ----- build the query

 SubjQuery.Setname(nm, PtEQ);
 // ----- run the query
 allSubjects->Query( &SubjQuery, SubjectResult );
 // ---- get the first SubjectResult
 int n = SubjectResult->GetNum();
 if (n == 0)
 cout << "\nNo such subject";
 else if (n > 1) {
 // ----- more than one result, query used wild cards?
 cout << "\nEnter specific subject";
 cout << '\n';
 cout << "n = " << n;
 }
 else {
 // ------- found a Subject object
 Subject *subj;
 SubjectResult->Seek(0, PtSTART);
 SubjectResult->Get(subj);

 cout << "\nFound ";
 subj->Display();
 // ------- get a Teacher object to assign to Subject
 char tnm[50];
 cout << "\nEnter Teacher name: ";
 cin >> tnm;
 // ----- build sets to query for Teachers
 TeacherAllSet *allTeachers = new TeacherAllSet(&objbase);
 TeacherSet *TeacherResult = new TeacherSet;
 TeacherQuery TchQuery;
 // ----- build the query
 TchQuery.Setname(tnm, PtEQ);
 // ----- run the query
 allTeachers->Query( &TchQuery, TeacherResult );
 // ----- get the first TeacherResult
 int n = TeacherResult->GetNum();
 if (n == 0)
 cout << "\nNo such teacher";
 else if (n > 1)
 cout << "\nEnter specific teacher";
 else {
 // ------- found a Teacher object
 Teacher *tch;
 TeacherResult->Seek(0, PtSTART);
 TeacherResult->Get(tch);
 cout << "\nFound ";
 tch->Display();
 // ---------- add the Teacher to the Subject
 subj->AddTeacher(*tch);

 // ------- store the Subject object in the database
 subj->Store();
 TeacherResult->Unget(tch);
 }
 delete allTeachers;
 delete TeacherResult;
 SubjectResult->Unget(subj);
 }
 delete allSubjects;
 delete SubjectResult;
}
// ---------- list the Subjects (with assigned Teachers)
// and the Teachers (regardless of assignment)
void List()
{
 cout << "\nSubjects";
 cout << "\n--------";
 // --------- build set to get all Subjects
 SubjectAllSet *allSubjects = new SubjectAllSet(&objbase);
 Subject *thisSubject;
 allSubjects->Seek(0, PtSTART);
 while (allSubjects->Seek(1, PtCURRENT) == 0) {
 // ----- get each Subject in turn
 allSubjects->Get(thisSubject);
 // ------ display the Subject
 cout << '\n';
 thisSubject->Display();

 // ----- if there is a Teacher assigned, display it

 Teacher *t = thisSubject->Tchr();
 if (t != NULL) {
 cout << " taught by ";
 t->Display();
 }
 allSubjects->Unget(thisSubject);
 }
 cout << "\n\nTeachers";
 cout << "\n--------";
 // --------- build set to get all Teachers
 TeacherAllSet *allTeachers = new TeacherAllSet(&objbase);
 Teacher *thisTeacher;

 allTeachers->Seek(0, PtSTART);
 while (allTeachers->Seek(1, PtCURRENT) == 0) {
 // ----- get each Teacher in turn
 allTeachers->Get(thisTeacher);
 cout << '\n';
 thisTeacher->Display();
 allTeachers->Unget(thisTeacher);
 }
}

// -------- menu to select processes
void SchoolMenu()
{
 int sel = 0;
 while (sel != 5) {
 cout << '\n';
 cout << '\t' << "1. New Subject" << '\n';
 cout << '\t' << "2. New Teacher" << '\n';
 cout << '\t' << "3. Assignment" << '\n';
 cout << '\t' << "4. List" << '\n';
 cout << '\t' << "5. Quit" << '\n';
 cout << '\t' << " Select: ";
 cin >> sel;
 switch (sel) {
 case 1:
 NewSubject();
 break;
 case 2:
 NewTeacher();
 break;
 case 3:
 Assignment();
 break;
 case 4:
 List();
 break;
 default:
 break;
 }
 }
}
void main()
{
 // ------- connect to server
 if (objbase.Connect("LOCAL") != 0)
 cout << "Cannot connect";

 else {
 // ----- open database
 if (objbase.Open("..\\base") != 0)
 cout << "Cannot open database";
 else {
 // --------- run the application
 SchoolMenu();
 // ------- close the database
 objbase.Close();
 }
 // ------ disconnect from the server
 objbase.DisConnect();
 }
}







[LISTING THREE]

// -------- schoolc.cpp
// School application for Code Farms Libraries

#include <io.h>
#include <iostream.h>
#include <string.h>
#define ZZmain
#include "zzincl.h" // generated by ZZPREP

// ------ Root class to control entry to others
class Root {
 ZZ_EXT_Root
};

// ---------- persistent Teacher class
class Teacher {
 ZZ_EXT_Teacher
public:
 Teacher(char *name);
 Teacher() {}
 void Display();
};
// ---------- persistent Subject class
class Subject {
 ZZ_EXT_Subject
public:
 Subject(char *name);
 Subject() {}
 void Display();
};
// -------- the Organizations
ZZ_HYPER_SINGLE_COLLECT(subject,Root,Subject);
ZZ_HYPER_NAME(subjname, Subject);
ZZ_HYPER_SINGLE_COLLECT(teacher,Root,Teacher);
ZZ_HYPER_NAME(tchname, Teacher);
ZZ_HYPER_DOUBLE_LINK(assignment,Subject,Teacher);

ZZ_HYPER_UTILITIES(util);

// ---------- Teacher constructor
Teacher::Teacher(char *name)
{
 char *nm = util.strAlloc(name);
 tchname.add(this, nm);
}
// -------- display a Teacher object
void Teacher::Display()

{
 // --- get the NAME linked with this Teacher
 char *nm = tchname.fwd(this);
 cout << nm;
}
// ---------- Subject constructor
Subject::Subject(char *name)
{
 char *nm = util.strAlloc(name);
 subjname.add(this, nm);
}
// -------- display a Subject object
void Subject::Display()
{
 // --- get the NAME linked with this Subject
 char *nm = subjname.fwd(this);
 cout << nm;
}
static Root *rt;

// ------- build a new Subject object
void NewSubject()
{
 char name[50];
 cout << "\n--- New Subject --- ";
 cout << "\nEnter Subject name: ";
 cin >> name;
 Subject *sb = new Subject(name);
 subject.add(rt, sb);
}
// ------- build a new Teacher object
void NewTeacher()
{
 char name[50];
 cout << "\n--- New Teacher --- ";
 cout << "\nEnter Teacher name: ";
 cin >> name;
 Teacher *tc = new Teacher(name);
 teacher.add(rt, tc);
}
// ----------- assign a Teacher object to a Subject object
void Assignment()
{
 char nm[50];
 cout << "\n--- Assignment --- ";
 cout << "\nEnter Subject name: ";
 cin >> nm;
 // ---- iterate through the Subject objects

 subject_iterator sIter(rt);

 Subject *sb;
 while ((sb = sIter++) != NULL) {
 char *snm = subjname.fwd(sb);
 if (strcmp(nm, snm) == 0) {
 // ---- get a Teacher object to assign to Subject
 cout << "\nEnter Teacher name: ";
 cin >> nm;
 // ---- iterate through the Teacher objects
 teacher_iterator tIter(rt);
 Teacher *tc;
 while ((tc = tIter++) != NULL) {
 char *tnm = tchname.fwd(tc);
 if (strcmp(nm, tnm) == 0) {
 // --- associate the two
 assignment.add(sb, tc);
 return;
 }
 }
 cout << "\nNo such teacher";
 return;
 }
 }
 cout << "\nNo such subject";
}
// ---------- list the Subjects and the Teachers
void List()
{
 static char *ln = "\n----------------------";
 cout << ln;
 cout << "\nSubjects";
 cout << ln;
 // ---- iterate through the Subject objects
 subject_iterator sIter(rt);
 Subject *sb;
 while ((sb = sIter++) != NULL) {
 cout << '\n';
 sb->Display();
 // -- get Teacher object associated with this Subject
 Teacher *tch = assignment.fwd(sb);
 if (tch != NULL) {
 cout << " taught by ";
 tch->Display();
 }
 }
 cout << ln;
 cout << "\nTeachers";
 cout << ln;
 // ---- iterate through the Teacher objects
 teacher_iterator tIter(rt);
 Teacher *tc;
 while ((tc = tIter++) != NULL) {
 cout << '\n';
 tc->Display();
 // -- get Subject object associated with this Teacher

 Subject *sbj = assignment.bwd(tc);
 if (sbj != NULL) {

 cout << " teaches ";
 sbj->Display();
 }
 }
 cout << ln;
}
// -------- menu to select processes
void SchoolMenu(void)
{
 int sel = 0;
 while (sel != 5) {
 cout << '\n';
 cout << '\t' << "1. New Subject" << '\n';
 cout << '\t' << "2. New Teacher" << '\n';
 cout << '\t' << "3. Assignment" << '\n';
 cout << '\t' << "4. List" << '\n';
 cout << '\t' << "5. Quit" << '\n';
 cout << '\t' << " Select: ";
 cin >> sel;
 switch (sel) {
 case 1:
 NewSubject();
 break;
 case 2:
 NewTeacher();
 break;
 case 3:
 Assignment();
 break;
 case 4:
 List();
 break;
 default:
 break;
 }
 }
}
static char dbname[] = "school";

void main()
{
 char *v, *t;
 // --------- open the database and load the organizations
 if (access(dbname, 0) == 0) {
 util.open(dbname, 1, &v, &t);
 rt = (Root *) v;
 }
 else {
 // ----- the database has never been built
 rt = new Root;
 v = (char *) rt;

 t = "Root";
 }
 // --------- run the application
 SchoolMenu();
 // --------- save the objects to the database
 util.save(dbname, 1, &v, &t);
}

#include "zzfunc.c" // generated by ZZPREP







[LISTING FOUR]

/* -----------------------------------------------------------------------
 school.ddl -- the Raima Object Manager schema for the School database
 ---------------------------------------------------------------------- */
database school[512] {
 data file "school.dat" contains teacher, subject;
 key file "school.k01" contains teacher.name;
 key file "school.k02" contains subject.name;
 record teacher {
 key char name[30];
 db_addr subj;
 }
 record subject {
 key char name[30];
 db_addr tch;
 }
}





[LISTING FIVE]

// -------- schoolr.cpp
// School Application for Raima Object Manager

#include <iostream.h>
#include <string.h>
#include <storedb.hpp>
#include <storeobj.hpp>
#include <keyobj.hpp>
#include "school.h"

// ------ define the database
class School : public StoreDb {
public:
 School();
 DEFINE_DB_LOCATOR;
};

class Subject;
// ------ Teacher class
class Teacher : public StoreObj, public teacher {
 int RecType() { return TEACHER; }
public:
 Teacher() : StoreObj(KeyObj(TEACHER_NAME))
 { subj = 0; }
 Teacher(char *nm) : StoreObj(KeyObj(TEACHER_NAME))
 { subj = 0; strncpy(name, nm, 30); }

 STOREDIN(School);
 DIRECTREF(Subject, subj);
 void Display() { cout << name; }
};
// ------ Subject class
class Subject : public StoreObj, public subject {
 int RecType() { return SUBJECT; }
public:
 Subject() : StoreObj(KeyObj(SUBJECT_NAME))
 { tch = 0; }
 Subject(char *nm) : StoreObj(KeyObj(SUBJECT_NAME))
 { tch = 0; strncpy(name, nm, 30); }
 STOREDIN(School);
 DIRECTREF(Teacher, tch);
 void Display() { cout << name; }
};
// ------- define the task
class SchoolTask : public StoreTask {
 School SchoolDB; // this is the database
 int sel; // menu selection
 void NewSubject();

 void NewTeacher();
 void Assignment();
 void List();
public:
 SchoolTask();
};
DB_INIT(School); // Initialize DB_LOCATOR

// ------ constructor for the database
School::School() : StoreDb("School", PDB_LOCATOR)
{
 if (Open() != True)
 // --- database probably has not been initialized
 cout << "\nCannot open database";
}
// ------- build a new Subject object
void SchoolTask::NewSubject()
{
 char nm[50];
 cout << "\n--- New Subject --- ";
 cout << "\nEnter Subject name: ";
 cin >> nm;

 Subject sbj(nm); // construct the Subject
 sbj.NewObj(); // add it to the database
}
// ------- build a new Teacher object
void SchoolTask::NewTeacher()
{
 char nm[50];
 cout << "\n--- New Teacher --- ";
 cout << "\nEnter Teacher name: ";
 cin >> nm;

 Teacher tchr(nm); // construct the Teacher
 tchr.NewObj(); // add it to the database
}

// ----------- assign a Teacher object to a Subject object
void SchoolTask::Assignment()
{
 char nm[50];
 cout << "\n--- Assignment --- ";
 cout << "\nEnter Subject name: ";
 cin >> nm;

 Subject sbj;
 KeyObj sky(SUBJECT_NAME, nm); // build a Subject key
 sbj[sky]; // retrieve Subject
 if (sbj.Okay()) {
 // ------- get a Teacher object to assign to Subject
 char tnm[50];

 cout << "\nEnter Teacher name: ";
 cin >> tnm;
 Teacher tchr;
 KeyObj tky(TEACHER_NAME, tnm); // build a Teacher key
 tchr[tky]; // retrieve Teacher
 if (tchr.Okay()) {
 tchr.Ref(sbj); // direct reference to Subject
 sbj.Ref(tchr); // direct reference to Teacher
 }
 else
 cout << "\nNo such teacher";
 }
 else
 cout << "\nNo such subject";
}
// ---------- list the Subjects (with assigned Teachers)
// and the Teachers (with assigned Subjects)
void SchoolTask::List()
{
 Subject sbj;
 Teacher tchr;

 cout << "\nSubjects";
 cout << "\n--------";

 // ------- step through Subjects
 for (sbj[FIRST]; sbj.Okay(); sbj++) {
 cout << '\n';
 sbj.Display();
 sbj >> tchr; // direct reference link
 if (tchr.Okay()) {
 cout << " taught by ";
 tchr.Display();
 }
 }
 cout << "\n\nTeachers";
 cout << "\n--------";

 // ------- step through Teachers
 for (tchr[FIRST]; tchr.Okay(); tchr++) {
 cout << '\n';
 tchr.Display();
 tchr >> sbj; // direct reference link
 if (sbj.Okay()) {

 cout << " teaches ";
 sbj.Display();
 }
 }
}
// -------- task constructor has menu to select processes
SchoolTask::SchoolTask()
{

 sel = 0;
 while (sel != 5) {
 cout << '\n';
 cout << '\t' << "1. New Subject" << '\n';
 cout << '\t' << "2. New Teacher" << '\n';
 cout << '\t' << "3. Assignment" << '\n';
 cout << '\t' << "4. List" << '\n';
 cout << '\t' << "5. Quit" << '\n';
 cout << '\t' << " Select: ";
 cin >> sel;
 switch (sel) {
 case 1:
 NewSubject();
 break;
 case 2:
 NewTeacher();
 break;
 case 3:
 Assignment();
 break;
 case 4:
 List();
 break;
 default:
 break;
 }
 }
}
void main()
{
 SchoolTask st;
}



Special Issue, 1993
 PROTOTYPING AND PROGRAMMING DATABASE SYSTEMS


Program development when time is important




Miles Dempsey


Miles is vice president of ProtoView Professional Services, the consulting arm
of ProtoView Development Corp. He can be contacted at 5920 Roswell Road, Suite
B107-221, Atlanta, GA 30328; 800-572-0988.


Imagine that it's 20 degrees below zero, snow is whipping down, the wind's
blowing 30 miles an hour, and you're trying to write a program in a wall-less
factory on the shores of Lake Michigan. Further imagine that you have only one
week to implement a production-control system for this factory. And just to
make things worse, the only time you have access to the system is during the
graveyard shift, because every minute the production line is down, the company
loses thousands of dollars.
This is exactly how I spent a week in January 1992 when implementing a
client/server application for a major steel company. While my initial design
was indoors, most of the programming occurred during 40 hours of
graveyard-shift hell out in the cold where the software interfaced with vision
equipment and robotic components on a manufacturing line.
This article describes the manufacturing process that was automated, the
architecture of that system, the software tools used, the application
prototyping and coding process, and the database implementation. The entire
development had many constraints, not the least of which was the one-week time
limit. Ultimately, tools for developing the graphical user interface (GUI) and
the structured query language (SQL) database made the difference in meeting
the deadline. The application was completed in the last few hours of the
allotted time, and it's been running 24 hours a day, seven days a week ever
since.


The Process


The steel industry manufactures and sells many types of products in a variety
of shapes, sizes, and elemental compositions. As different products move down
the manufacturing line, it's necessary to stamp each one with an identifying
signature (ID) designating the length, width, weight, and alloy composition.
This enables the hundreds of steel products to be sorted, inventoried, and
tracked.
Before I arrived at the steel mill in question, the process was performed
manually with a special tool that looked like a sideways jackhammer. A
production worker would receive from his supervisor a sheet of paper with the
various IDs for all the products coming down the line. A worker would scan the
paper, find the appropriate ID, and hammer it into the steel product at his
station. My job was to automate the process, storing a copy of each ID along
with secondary information in a database.
Automation would make the process safer and more efficient: The steel coming
down the line was usually several hundred degrees Fahrenheit, so an automatic
process would prevent serious burns, and the automatic stamper could perform
ten times faster than a human. Management would be pleased because efficiency
would increase by an order of magnitude, and labor would be happy because they
were paid bonuses based on total output. The ingredients for a successful
system were present--if I could deliver the system.


The System Architecture


The steel company already had an IBM mainframe computer which performed a
variety of tasks and services for various departments within the company. The
job was to integrate a specialized PC workstation to serve as the master
behind the stamping operation. The IDs were transferred from the mainframe to
the PC over a serial cable. Besides handling the ID, the PC performed several
other functions, such as monitoring the position of the steel products and
clamping the steel products once they got into position. A series of robotic
arms, digital-to-analog converters, and vision devices worked together to
secure the steel products for stamping.
Once a steel product was in place and its ID was displayed on the PC monitor,
the production worker would visually verify that the ID was correct, then
touch the monitor's touchscreen to send the ID to the automatic stamper. When
the stamping was complete, the ID was written to a dBase file on the PC, which
would then tell the robot to release the steel product so the next one in line
could get stamped. Since the process was interactive, the worker could
override the inventory ID at any point if it was incorrect.
The hardware was standard except for the PC designed for the project by
Ziatech Corp. The PC had one 386 CPU and four 286 chips, four standard COM
ports, and a capacitive touchscreen mounted on the monitor. DOS 5.0 and
Microsoft Windows 3.0 were installed on a partition for the 386 chip, and DOS
5.0 was installed on separate partitions for each of the 286 chips.
The 386 chip ran a program that had several responsibilities: receiving the
data from the mainframe via an RS-422 cable, sending the ID to the automatic
stamper via an RS-232 cable, and writing a record to a dBase file. Each of the
286 processors, on the other hand, ran programs which controlled robotic arms
via RS-232 serial cables wired into Opto 22 boards. The robotic arms were
responsible for clamping the steel products. Vision devices wired to the Opto
22 boards provided positioning feedback to the system.
The entire system was enclosed in an industrial-strength housing bolted to the
concrete floor. The stamper, also bolted to the floor, sat between the system
and the manufacturing line.


The Software Tools


The project specification dictated that the interactive stamping program on
the 386 had to run under Microsoft Windows and be written in C. The robotic
programs (running on the 286s) also had to be in C. The database format was
dBase.
I used ProtoView 3.3 Dialog Editor to build the dialog boxes. I then used the
ProtoGen 3.0 Code Generator to generate ANSI C code for the Windows
application, and Pioneer's Q&E Library to make calls to dBase via the code
generated by ProtoGen and ProtoView's SQLView 1.0. I compiled the interactive
stamping program with Microsoft C 6.0 and the Microsoft Windows 3.0 Software
Development Kit (SDK), using Paintbrush to create bitmaps used in the Windows
program. Microsoft C 6.0 was used to compile the robotic C programs, with
Brief 3.0 as my primary editor.
The ProtoView tools utilize and manipulate bitmaps in a Windows application,
generate and regenerate code, and link to a back-end database. In addition,
the ProtoView Dialog Editor supports the use of bitmaps through a custom
bitmap control. This was important, since bitmaps were used to display warning
and danger conditions to the user. As a developer, I was able to easily hide
and show the bitmap pictures via a timer message, thus creating the illusion
of a flashing signal. Bitmaps are visually manipulated with the ProtoView
Dialog Editor; see Figure 1.
The ProtoGen Code Generator generated code for a dialog box in less than two
seconds on a 20-MHz 386. (This is important when you're standing ankle deep in
snow in the middle of the night with a 20-mile-an-hour wind hitting you in the
face.) More importantly, I could add custom source code to the code originally
generated by ProtoGen, make changes to the GUI, and rely on ProtoGen to
regenerate code (with the new changes), while preserving my custom source code
(see Listing One, page 23). My application had about ten dialog boxes; the
average regeneration time was about 30 seconds.
I used the Q&E Library from Pioneer Software because of a group of
ProtoView-supplied custom macros that integrated the Q&E Library into the
ProtoView Dialog Editor. These macros make up SQLView and allow you to
visually assign specific fields on the screen to their database counterparts.
Once the dBase tables had been defined, all I had to do was visually assign
(point-and-click) macros to the fields that were a part of the database
scenario. SQLView handled all the necessary calls for inserting, updating,
selecting, and deleting various records. The SQLView and Pioneer Software DLLs
make a powerful combination for quickly building back-end support.


The Prototyping and Coding Phase


When I entered the factory the first night, all I had was a design document
and my notebook computer. The PC workstation and the automatic stamper had
been delivered that afternoon. My first task was to write the Windows program
which accepted data from the host mainframe. I turned on my notebook computer
and loaded Windows, then loaded the ProtoView Dialog Editor and ProtoGen Code
Generator. With the dialog editor, I populated a dialog with a couple of
string controls and saved the dialog to a resource file. Then, using ProtoGen,
I attached the dialog box as the main window for my application and generated
C code. This process took about three minutes.
Once the code was generated, I used Brief to add a SetTimer function call
under the WM_INITDIALOG case statement to poll the serial port every one-tenth
of a second. Since I had set a timer, I then had to add the WM_TIMER case
statement to the code. This wasn't a problem because ProtoGen generates pairs
of regeneration comments throughout the source code. If you place custom code
between ProtoGen's regeneration comments, then your code will be preserved if
you have to regenerate the application, as in Listing Two (page 23).
The next step was to add code to read data from the serial port. ProtoGen
output lets you add functions to your code as long as you link in the
supporting library. That means you can call any Windows API function or
database functions from third-party database vendors who supply libraries.
Consequently, serial port polling was accomplished using the Windows API
ReadComm function.
ProtoGen also automatically generates variables for every control you place in
a dialog box, regardless of control type. So, if you place a date control in a
dialog box, ProtoGen generates an instance of a tm struct, thus making it easy
to manipulate dates. Since I'd populated my dialog box with string controls,
ProtoGen generated a character array for each string I defined.
ProtoView has its own API, with functions to facilitate data transfers. The
data-transfer mechanism automatically moves the values of each application
variable into their respective screen counterparts or vice versa. I supplied
ReadComm with one of the character pointers ProtoGen had generated, then
invoked the transfer mechanism by calling ProtoView's vwUpdateScreen function.
This was all that was needed to display incoming characters on the dialog box.
ProtoGen invokes either the Microsoft or Borland compiler from a menu option.
The first program I wrote took about ten seconds to compile. ProtoGen also
launches your program once you get a successful compile. I launched my program
and watched as it began to display characters in my newly created string
control.
I spent the first three nights designing the interface, which displayed
information about the positioning of the steel products. Using timers and
bitmaps, I simulated a flashing signal to the production worker on one of the
dialog boxes by simply hiding and unhiding the bitmaps every half second. Once
the vision devices detected the steel was in place, I showed the worker an
"in-place" bitmap, telling him it was all right to send the steel ID to the
automatic stamper. While the vision devices were busy detecting positioning
parameters, data was downloaded from the host. If the ID from the host was
acceptable, the worker touched the touchscreen to send the ID, character by
character, to the automatic stamper.

Since ProtoView is a "work-in-process" toolset, the code regeneration lets
you make small changes to the interface, then regenerate it to support those
changes. Thus, the development process becomes a cycle of incremental
prototyping and production-code solidification, as shown in Figure 2.


The Database Implementation


As stated earlier, SQLView was used to gain back-end support. Every window
designed with the ProtoView Dialog Editor can gain access to a variety of
databases. SQLView is a DLL containing a group of SQL macros that interface
Pioneer's Q&E Library and Microsoft's Open Database Connectivity (ODBC)
Library to the window being designed. All you do is pick the fields belonging
to a database table and the buttons executing SQL actions--SQLView does the
rest. Also, SQLView is designed to connect to multiple tables in multiple
databases, regardless of the database.
SQLView maps all of ProtoView's custom controls to the underlying SQL APIs.
Microsoft and Pioneer support over 20 different relational databases, so
integrating a ProtoView dialog to Oracle, Paradox, DB2, SQL Server, dBase, and
the like is straightforward. ProtoView also provides automatic data
conversion--for example, a dialog with one date control, one currency control,
and one integer control can be mapped to a database table consisting of three
10-byte character fields. SQLView converts each data type to a character
string if necessary or desired.
The ID being stamped and some data that reflected current operating conditions
were stored in two dBase tables. SQLView allowed the same touchscreen button
that sent data to the automatic stamper to send data to the database as well.
SQLView also made it possible to pick exactly which fields on the screen to
associate with a database table. A SQL Execute macro was attached to the
button, and a SQL Field macro was attached to each field. When the button was
touched, the Execute macro scanned the window and inserted every field carrying
a Field macro into the
database. Database connectivity was achieved in less than five minutes; see
Figure 3.


Conclusion


In addition to staying warm, my goal that week was to develop a working
application. My client knew what he wanted, but didn't know how to implement
it. I knew computer programming, but knew very little of the particular
industry. Once the client had defined his needs, it was a simple process of
designing the architecture of the system, choosing the right tools,
prototyping the application, and solidifying the code into a final working
executable.

_PROTOTYPING AND PROGRAMMING DATABASE SYSTEMS_
by Miles Dempsey


[LISTING ONE]

//REGEN_FILEHEADING
//REGEN_FILEHEADING
/*****************************************************************************
* Source File: Warning.c
******************************************************************************/

#include <windows.h>
#include <pv.h>
#include "WARNING.h"

//REGEN_VARIABLES
//REGEN_VARIABLES

VIEW hViewWarning;
HWND hWndWarning;
LONG FAR PASCAL fnWarningWndProc(HWND, WORD, WORD, LONG);

int fnWarning(HWND hParentWnd)
{
 int ReturnCode;

 VIEWPROC lpfnfnWarningWndProc;
 //REGEN_BEGINFUNCTION
 //REGEN_BEGINFUNCTION

 lpfnfnWarningWndProc=(VIEWPROC)MakeProcInstance((FARPROC)
 fnWarningWndProc,hInst);
 //REGEN_INITDLG
 //REGEN_INITDLG
 if(!(hViewWarning = vwCreateView(hInst,
 "Warning",
 hParentWnd,
 lpfnfnWarningWndProc,
 NULL)))
 return FALSE;
 vwSetFieldVar(hViewWarning, ID_WARNING, &nWarning);


 //REGEN_INITVIEW
 //REGEN_INITVIEW
 if((ReturnCode = vwShowModalView(hViewWarning)) == -1)
 {
 MessageBox(NULL, "Unable to display view", "System Error",
 MB_SYSTEMMODAL | MB_ICONHAND | MB_OK);
 return FALSE;
 }
 //REGEN_TERMVIEW
 //REGEN_TERMVIEW
 FreeProcInstance((FARPROC)lpfnfnWarningWndProc);
 return(ReturnCode);
}

long FAR PASCAL fnWarningWndProc(HWND hWnd, WORD wMessage, WORD wParam,
 LONG lParam)

{
 GETVIEW;
 //REGEN_WINDOWPROCVARIABLES
 static int Opto22 = 0;
 static BOOL bFlash = FALSE;
 //REGEN_WINDOWPROCVARIABLES
 switch(wMessage)
 {
 //REGEN_WNDPROC
 case WM_TIMER :
 switch (wParam)
 {
 case TIMER_LOOK_FOR_CLAMP:
 // code which monitored one of the Opto 22 boards went here
 // Flash the bitmap until the steel is clamped
 if (!bFlash)
 {
 vwHideField (View, ID_WARNING);
 bFlash = TRUE;
 }
 else
 {
 vwUnHideField (View, ID_WARNING);
 bFlash = FALSE;
 }
 // When steel is clamped, kill the timer
 if ( Opto22 == FOUND )
 KillTimer (hWnd, TIMER_LOOK_FOR_CLAMP);
 break;
 }
 break;
 //REGEN_WNDPROC
 case WM_INITDIALOG :
 //REGEN_WM_INITDIALOG

 /* Custom code which starts a timer looking for feedback
 * from the vision instruments went here between the regeneration
 * comments which ProtoGen will preserve. */

 SetTimer (hWnd, TIMER_LOOK_FOR_CLAMP, 500, NULL);

 //REGEN_WM_INITDIALOG
 return TRUE;

case WM_COMMAND :
 switch(wParam)
 {
 case ID_WARNING :
 //REGEN_ID_WARNING
 //REGEN_ID_WARNING
 break;

 case ID_CONTINUEOPERATION :
 //REGEN_ID_CONTINUEOPERATION
 //REGEN_ID_CONTINUEOPERATION
 break;

 //REGEN_CUSTOMCOMMAND
 //REGEN_CUSTOMCOMMAND
 }
 break;
 }
 return DefViewProc(hWnd, wMessage, wParam, lParam);
}
//REGEN_CUSTOMCODE
//REGEN_CUSTOMCODE





[LISTING TWO]

long FAR PASCAL fnReadCommWndProc(HWND hWnd, WORD wMessage, WORD wParam,
 LONG lParam)

{
 switch(wMessage)
 {
 //REGEN_WNDPROC
 case WM_TIMER :
 switch (wParam)
 {
 case TIMER_LOOK_FOR_CHARACTER:
 // code to monitor the RS422 port
 // szString was generated by ProtoGen
 ReadComm (...., szString);
 if ( lstrlen(szString) > 0 )
 vwUpdateScreen (View);
 break;
 }
 break;
 //REGEN_WNDPROC

 case WM_INITDIALOG :
 //REGEN_WM_INITDIALOG
 // Set a timer for every 1/10 of a second
 // The third parameter is milliseconds
 // 1000 = 1 second
 SetTimer (hWnd, TIMER_LOOK_FOR_CHARACTER, 100, NULL);
 //REGEN_WM_INITDIALOG
 return TRUE;
 }

 return DefViewProc(hWnd, wMessage, wParam, lParam);
}
//REGEN_CUSTOMCODE
//REGEN_CUSTOMCODE


Special Issue, 1993
BUILDING A DATABASE FILE VIEWER


Programming with the Paradox Engine




Michael Floyd


Michael is executive editor for Dr. Dobb's Journal. He can be reached at the
DDJ offices, on CompuServe at 76703,4057, or via MCI mail at mfloyd.


A database engine is a library that provides--in the form of an API--the core
features of a database management system. One advantage of an engine is that
you can use routines that have been fully optimized and tested to add database
functionality to an application. A database engine also allows your
application to support a common data format, so that there's no need to
develop a proprietary one. This means other applications or tools can access
data created by your program, and vice versa. And, while many applications are
not databases per se, they often have sophisticated data-storage and
-retrieval requirements--CD-ROM applications, for example, usually require an
optimized search engine just to access files.
This article presents a data-file viewer written with Borland's Paradox Engine
3.0 database engine and Borland Pascal 7.0 with Objects. The DOS-based viewer
provides both a means of exploring the Paradox Engine and a useful tool for
development efforts. Although the application uses Borland Pascal, much of the
discussion is relevant to both C and C++ programmers working under DOS or
Windows. In fact, any Windows developer whose development platform can access
DLLs can use the techniques presented here.


Touring the Engine


The Paradox Engine supports Borland's C/C++ and Pascal compilers for both DOS
and Windows.
DOS programmers can also access engine functions using Microsoft C 6.01 (or
later). Because the Paradox Engine is supplied as a dynamic link library
(DLL), Windows programmers can access the functions from any tool that
supports DLLs--Visual Basic, Toolbook, ObjectVision, and the like. In addition
to the procedural libraries, the Paradox Engine comes with a Database
Framework--an object-oriented library that encapsulates the functionality of
the engine and gives C++ and Pascal programmers high-level access to Paradox
Engine functions.
Compared to previous releases of the Paradox Engine, version 3.0 supports
binary large objects (BLOBs), improves on concurrency support, and no longer
requires you to specify a network type when initializing the engine. Although
the Paradox Engine does not currently support dBase file formats (or at least
importing dBase data), 3.0 does support Paradox 3.5 and 4.0 file formats,
allowing you to open and manipulate both 3.5- and 4.0-formatted files, as well
as import 3.5 files to the new format.
The memory requirements of the Paradox Engine are negligible. Under DOS, for
example, you can compile using Borland's overlay manager to swap chunks of up
to 90 Kbytes in and out of RAM. I was able to code and test the database
viewer presented here from within the integrated development environment
(IDE). It was only when I tried to add a Turbo Vision interface that memory
requirements forced me to use the command-line compiler. (This also becomes an
issue when working with the Database Framework.) However, if you're using
Borland Pascal 7.0 and a 386, you can take advantage of the new DPMI support.
Windows is less of an issue since most Windows environments already take
advantage of DPMI. Consequently, the Engine-as-DLL can be shared between
applications, thus making efficient use of memory.


The Engine at Work


There's a classic chicken-and-egg problem in the first stages of database
development. On one hand, the mechanisms to manipulate and view data are
either not yet in place or not fully debugged. On the other hand, maintaining
the integrity of data is critical in testing new functions. Therefore, you may
be tempted to first build the viewer functions. However, opening a database
on, for example, a composite index does not provide an accurate view of the
physical data. In fact, most database applications present views of the
database that do not represent its current state. The solution is to build a
generic data viewer that can quickly display the database.
Listing One (page 28) presents a viewer that can read and display any Paradox
database table that's been described by a "structure file"--a plain ASCII file
that describes the fields for a given table. There are a number of benefits to
using structure files: They document the fields in your database, allow
generic routines to access any Paradox database table, and make it possible to
view Paradox data without your having to own Paradox. Table 1 describes the
syntax used in the structure file to describe Paradox field types. For
example, an ASCII field containing a maximum of 20 characters is described as
A20. One field type not supported by the file viewer is the BLOB type. While
it is a simple matter to read and display, say, an integer, BLOB fields can
contain a memo field, a bitmap image, or even sound data. Therefore, access
routines must be written to read and write each specific BLOB type.
Table 1: Syntax used to describe field types in a table-structure file.

 Syntax Field Description
 _________________________________________________________________________

 Annn Alphanumeric nnn defines the maximum number of characters. A50
 defines an alphanumeric field that contains a
 maximum of 50 characters.

 N Numeric Double-precision floating-point value.

 S Short Number Signed-integer value from -32,767 to 32,767.

 D Date Valid dates using the Gregorian calendar from
 January 1, 100 to December 31, 9999. Date fields
 can be viewed in any of five different formats
 and printed using one of eight additional formats.

 $ Currency Similar to Numeric field, but by default displays
 values rounded to two decimal places and places
 negative numbers in parentheses.

If you're familiar with versions 2 or 3 of the Paradox Engine, you'll notice
some familiar code in Listing One. Some of the support routines such as Strip,
Error, LoadTableStructure, and InputRecord were "borrowed" from the Fondex
example program supplied with the Paradox Engine. (This example provided
inspiration for the generic database viewer.)
To use the viewer, create a new table by first creating a structure file using
your favorite editor, then firing up the viewer and selecting "New Table." The
viewer program then calls NewTable to load the table-structure file and
creates a new table by calling the PXTblCreate engine function. A call to
KeyAdd is also included, but has been commented out. KeyAdd creates a primary
index based on the field specified. To create a primary index, simply
uncomment the call to KeyAdd.
Once a table has been created, you can add a new record, delete or update
existing data, and search the data table. There are also options to open and
close a table. These two options are important because, although you've
created a new table, it has yet to be opened, and a table must be opened
before you can perform any operations on it. OpenStruct, the procedure that
actually opens the table, first ensures that it is not already open. The
Paradox Engine doesn't require this step. In fact, the engine allows for up to
64 file handles and related objects (such as indexes) to be open at any one
time. If you plan on supporting multiple tables, you should store
FirstTableOpen (which is a Boolean value) in some dynamic structure such as an
array. Opening as many tables as possible is probably not good practice,
however. Using PXTblClose to close a table flushes the buffers, thus freeing
them for use elsewhere in the program and guaranteeing the update of disk
files in a timely fashion. With that in mind, the total number of open files
is by default set to five. If you need additional tables, this default value
can be changed using PXSetDefaults prior to calling one of the PXInit
functions.

With the issue of open tables out of the way, OpenStruct calls the PXTblOpen
engine function. As one of its arguments, PXTblOpen takes an index ID value
that specifies the order for records in the table. In this case the ID value
is set to 0, which opens the table in the order of the primary index if one
exists, or in natural order if there's no primary index. If the table is
successfully opened, a call to PXRecBufOpen allocates a record buffer for the
table. GetTableStructure is then called to retrieve the table's field names
and types. The "Close Table" option is provided for completeness. As
previously mentioned, PXTblClose automatically flushes the buffers and should
be used whenever possible to free up table handles and update to disk. You
can, however, safely exit the viewer without closing a currently open
table--the engine automatically closes all open tables before exiting.
Once a table is open, you can begin to add data to it using the AddEntry
function in Listing One. As with OpenStruct, AddEntry first checks whether a
table is open, and exits with an error message if it isn't. AddEntry also
flushes the record buffer with a call to the PXRecBufEmpty engine function.
InputRecord then gets the field values from the user and places them in the
record buffer. Finally, PXRecAppend performs the update to the database.


Searching


Searching with the Paradox Engine is straightforward, requiring only a single
call to perform most searches. PXSrchFld takes a table, a record, and a field
handle as arguments, along with a fourth Mode parameter. Mode specifies the
PXSrchFld search mode. Valid modes are SEARCHFIRST, SEARCHNEXT, and
CLOSESTRECORD. For example, consider the Search procedure in Listing One.
PXSrchFld is placed in a loop with Mode initialized to SEARCHFIRST. If no
matches are found, a message is displayed. Otherwise, the results are read
into the record buffer using PXRecGet and displayed. If the user chooses to
search for additional matches, Mode is set to SEARCHNEXT and the loop repeats.
By default, searches are case sensitive and the viewer relies on this. You can
create a search that is insensitive to case by creating an index for the field
or fields (for a composite index) to search on (PXKeyAdd), opening the table
on this index (PXTblOpen), and calling PXSrchFld as usual. Note, too, that the
rules change when you open a table without specifying an index. As previously
mentioned, PXTblOpen opens the table in the order of the primary index if one
exists, or in natural order if there is no primary index. When the table is
not indexed, PXSrchFld uses the natural order, and the CLOSESTRECORD mode is
not valid.


Database Framework


The Database Framework is a class library that encapsulates the Engine's API.
The idea is that providing a set of high-level objects will reduce the
complexity of application development. Note, however, that Paradox is not an
object-oriented DBMS; the Database Framework simply abstracts Paradox tables
into virtual tables built on Paradox's existing functionality.
All of the source code used to build the Database Framework is included with
the Engine in both Pascal and C++. There are, however, significant differences
between the Pascal and C++ versions. In particular, the C++ Framework defines
a base object, BDbObject, from which all other database engine objects are
derived; see Figure 1(a). The C++ Framework is designed in such a way that it
is independent of Turbo Vision or the Object Windows Library (OWL), allowing
you to use your favorite GUI class library.
The base object in the Pascal version, on the other hand, is TObject--the base
object used in both Turbo Vision and OWL. Four Engine objects are derived from
TObject: TEngine, TDatabase, TCursor, and TRecord; see Figure 1(b). According
to Borland, using TObject as the base object in the database hierarchy allows
them to make use of TCollection to store dynamic collections of records.
Agreed, there's not a plethora of third-party application frameworks available
for Turbo Pascal. However, this strategy of inheriting from TObject ties you
into an application framework that you may never use.
Still, the Database Framework simplifies several aspects of Pascal
programming. For example, the Windows API requires null-terminated strings
(defined as PChar in Turbo Pascal), while the DOS API allows the use of the
standard Turbo String type. On the other hand, both the DOS and Windows
versions of the Database Framework use String variables. The Database
Framework thus resolves the difference in string handling and allows even
Windows programmers to work with the simpler Turbo Pascal String type.
Returning to the hierarchy in Figure 1(b), you'll notice the TEngine object
type. TEngine is to the Database Framework what TApplication is to Turbo
Vision or OWL. There is one and only one instance of TEngine, and it is the
first object instantiated. TEngine methods handle basic initialization of the
engine (in Windows, single-user, or network mode), allowing you to set and get
defaults and handle things like password protection.
Once you've initialized the engine, you can create and open a database table,
which is handled within the framework by the TDatabase object type. TDatabase
methods simplify operations on tables by treating the table as a whole and by
allowing you to reference table names rather than handles. There are methods
to create, copy, rename, and report on tables. In creating a table, you can
establish primary and secondary indexes, and you can easily open tables on
simple, composite, or case-insensitive indexes. Methods to navigate the table
come from TCursor. The TCursor object type encapsulates routines for accessing
indexed and unordered tables. More than one cursor can be open on the same
table, thus allowing multiple views of a given table.
Finally, the TRecord object type handles reading from and writing to record
fields, including BLOB fields. A TRecord is always created in the context of a
TCursor. Most notable about TRecord is its encapsulation of what Borland calls
a "generic record," a record whose structure is not known until run time. In
addition to generic records, you can also create custom records. These custom
records, which are derived from TRecord, allow you to create alternative views
of the record data by combining some or all existing fields with derived
fields. Also, with custom records, you can directly map record fields to
variables in your program. This allows you to update fields without having to
call GetField and PutField.
There is one slight hitch in using custom records: A custom record class must
override some of TRecord's methods. Fortunately, Borland supplies a Generate
utility that, when given a table name and list of source and target fields,
will generate a custom-record object type (or class in the case of C++). This
custom record will be derived from TRecord.


Concurrency


You don't have to be on a network to consider file sharing in your
application. Windows, although non-preemptive, is still a multitasking
operating system. As such, you can open multiple instances of an application
that can, in turn, simultaneously lead to multiple accesses to a given table.
Even DOS now provides a file SHARE mode, which can cause problems. Therefore,
the database developer must always be cognizant of the issues generally
associated with networks. The 12 network API functions cover record, table,
and file locking; table refreshing; error handling; and a Goto function that
returns you to a previously locked record.
For those of you on a network, Paradox Engine 3.0 supports Novell Netware,
3Com, 3Com3+Open, IBM PC LAN, AT&T StarGroup, Banyan Vines, local-share net
types, and other DOS 3.1-compatible networks. Prior to version 3.0 of the
engine, the developer had to specify which network the application was to run
on.
This was not a big deal, but it did force the developer to add some #ifdef
statements. Under 3.0, however, you no longer have to specify the network--the
Paradox Engine handles it for you.


Conclusion


There is one small problem I've noticed when deleting records from databases
in apps built with the Paradox Engine. Whenever you delete an entry in the
database, PXRecDelete removes the pointers to that record, effectively marking
the data for overwriting. But this operation does not actually delete the
data. As a result, your database can grow much larger than the apparent number
of records in the database. Therefore, you'll need to write a packing function
that creates a new table with the same structure as your current table and
copies only active records to this new table. According to Borland, packing a
database with a primary index and then adding records requires a significant
amount of table reorganization as the new records are inserted. Borland
provides a technical-information note (TI1004) that describes the problem and
its solution. A copy of the technical note is available on CompuServe in the
Borland Development Tools forum (GO DBEVTOOLS) as TI1004.ASC.
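Borland's suggested workaround--create a fresh table with the same structure
and copy only the live records into it--can be sketched abstractly. The Python
model below is purely illustrative (it is not Paradox Engine code, and the
record layout and tombstone flag are invented): deletion merely marks a
record, and packing copies the survivors into a new table.

```python
# Illustrative model of pack-by-copy: deletion only marks a record
# (much as PXRecDelete marks space reusable), so the table keeps its
# old size until it is packed into a fresh table of the same structure.

def pack(table):
    """Return a new table holding only records not marked deleted."""
    return [rec for rec in table if not rec.get("_deleted", False)]

table = [
    {"id": 1, "name": "Floyd"},
    {"id": 2, "name": "Smith", "_deleted": True},  # logically deleted
    {"id": 3, "name": "Jones"},
]
packed = pack(table)
print(len(table), len(packed))  # 3 2: the old table keeps the dead slot
```

As the TI1004 note observes, if the packed table has a primary index,
subsequent inserts will pay for the reorganization the copy postponed.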

_BUILDING A DATABASE FILE VIEWER_
by Michael Floyd


[LISTING ONE]

{$N+,E+}
Program DbViewer;
{ DbViewer is a simple Paradox table viewer. DbViewer can display most, but
 not all, data field types. DbViewer includes options to create a new
 table, open, close and search existing tables, and to add, edit and
 display records in a given table. Note: this program was adapted from
 the FONDEX example program included with the Paradox Engine. }
Uses DOS, Crt, PXEngine, PXMSG;

Const
 Success = True;
 Failure = False;
 MaxFields = 40;
 MaxFieldSize = 50;
Var
 RecHandle : RecordHandle;
 NdxHandle, TblHandle : TableHandle;

 FirstTableOpen, SecondTableOpen, ThirdTableOpen : Boolean;
 TextFile : Text;
 NFields : Integer;
 Names, Types : NamesArrayPtr;
 StructureFile, DataFile : String;

Type
 IntArray = array[1..18] of Integer;

{--- Strip a string of leading and trailing white space ---}
procedure Strip(var S: String );
 var
 L1, L2: Byte;
 begin
 L1 := 1;
 while (L1 < Length(S)) and (S[L1] in [#9..#13, ' ']) do
 Inc(L1);
 L2 := Length(S);
 while (L2 > 0) and (S[L2] in [#9..#13, ' ']) do
 Dec(L2);
 S := Copy(S, L1, L2 - L1 + 1);
 end; { Strip }

{--- Write error message if an error has occurred ---}
function Error(RC: Integer): Boolean;
 begin
 if RC <> PXSUCCESS then
 WriteLn('DbViewer: ', PXErrMsg(RC));
 Error := RC <> PXSUCCESS;
 end; { Error }

{--- trap error, but ignore ---}
procedure ErrIgnore(RC: Integer);
 begin
 if Error(RC) then; { ignore error return code }
 end; { ErrIgnore }

{--- Load a table structure from an ASCII disk file ---}
function LoadTableStructure: Boolean;
 var
 F: Text;
 FldName, FldType, Help: String;
 begin
 { Open the structure file }
 Assign(F, StructureFile);
{$I-}
 Reset(F);
{$I+}
 if IoResult <> 0 then
 begin
 WriteLn('can''t open Structure file');
 LoadTableStructure := FAILURE;
 Exit;
 end;
 { Read in the structure }
 NFields := 0;
 New(Names);
 New(Types);
 while not Eof(F) and (NFields < MAXFIELDS) do

 begin
 { read data, watching for unexpected EOF }
{$I-}
 ReadLn(F, Help);
{$I+}
 if (IoResult = 0) and (Help <> '') then
 begin
 Inc(NFields);
 Strip(Help);
 FldType := Copy(Help, 1, Pos(' ', Help) - 1);
 FldName := Copy(Help, Pos(' ', Help), Length(Help));
 { remove trailing and leading white space from name }
 Strip(FldName);
 Names^[NFields] := FldName;
 Types^[NFields] := FldType;
 end; { THEN }
 end; { WHILE }
 Close(F);
 { Return error if no fields were found }
 if NFields = 0 then
 LoadTableStructure := FAILURE
 else
 LoadTableStructure := SUCCESS;
 end; { LoadTableStructure }

{--- Frees memory associated with table structure ---}
procedure FreeTableStructure;
 begin
 Dispose(Names);
 Dispose(Types);
 end; { FreeTableStructure }

{--- Retrieve table field names and types ---}
function GetTableStructure(THandle: TableHandle) : Boolean;
 var
 FldName, FldType: NameString;
 I: Word;
 begin
 NFields := 0;
 New(Names);
 New(Types);
 if Error(PXRecNFlds(THandle, NFields)) then; { ignore Result }
 for I := 1 to NFields do
 if not Error(PXFldName(THandle, I, FldName)) and
 not Error(PXFldType(THandle, I, FldType)) then
 begin
 Names^[I] := FldName;
 Types^[I] := FldType;
 end
 else
 begin
 GetTableStructure := FAILURE;
 Exit;
 end;
 GetTableStructure := SUCCESS;
 end; { GetTableStructure }

procedure ShowFiles(Filename : String);
var
 Found : SearchRec;
begin
 Assign(TextFile, Filename);
 {$I-}
 Reset(TextFile);
 {$I+}
 While IOResult <> 0 do
 begin
 ClrScr;
 Writeln('-------- Available Files ----------');
 FindFirst('*.dat', AnyFile, Found);
 While DosError = 0 do
 begin
 Writeln(Found.Name);
 FindNext(Found);
 end;
 Writeln;
 Write('Enter Filename (less extension): ');
 Readln(Filename);
 { retry the open with the name just entered }
 Assign(TextFile, Filename);
 {$I-}
 Reset(TextFile);
 {$I+}
 end;
 Close(TextFile); { the file was only opened as a probe }
end;

{--- Open a structure file and associated table ---}
procedure OpenStruct(StructType : String; var THandle : TableHandle);
var
 FileN, S : String;
begin
 { Attempt to open the table and allocate a record buffer. If table
 is open, indicate an error }
 If StructType = 'author' then
 if FirstTableOpen then
 begin
 WriteLn('table already opened');
 Exit;
 end;
{ uncomment this section to support additional tables }
{ If StructType = 'foo' then
 if SecondTableOpen then
 begin
 WriteLn('table already opened');
 Exit;
 end;
 If StructType = 'bar' then
 if ThirdTableOpen then
 begin
 WriteLn('table already opened');
 Exit;
 end;
}
 { Now try and open the table }
 if Error(PXTblOpen(StructType, TblHandle, 0, False)) then
 Exit;
 { Allocate a record buffer }
 if Error(PXRecBufOpen(TblHandle, RecHandle)) then
 Exit;
 if GetTableStructure(TblHandle) = FAILURE then
 Exit;

 THandle := TblHandle;
 If StructType = 'author' then FirstTableOpen := True;

{ uncomment this section to support additional tables }
{ If StructType = 'foo' then SecondTableOpen := True;
 If StructType = 'bar' then ThirdTableOpen := True;
}
 end; { OpenStruct }

{--- Close the table if opened ---}
procedure CloseStruct(TblName : String; THandle : TableHandle);
 begin
 If TblName = 'author' then
 if not FirstTableOpen then
 begin
 WriteLn('table not open');
 Exit;
 end;
{ uncomment this section to support additional tables }
{ If TblName = 'foo' then
 if not SecondTableOpen then
 begin
 WriteLn('table not open');
 Exit;
 end;
 If TblName = 'bar' then
 if not ThirdTableOpen then
 begin
 WriteLn('table not open');
 Exit;
 end;
}
 { Free the record buffer }
 if Error(PXRecBufClose(RecHandle)) then
 Exit;
 { Close the table }
 if Error(PXTblClose(THandle)) then
 Exit;
 FreeTableStructure;
 If TblName = 'author' then FirstTableOpen := False;

{ Uncomment this to support additional tables}
{ If TblName = 'foo' then SecondTableOpen := False;
 If TblName = 'bar' then ThirdTableOpen := False;
}
 end; { CloseStruct }

{--- Retrieve in a string format any valid Paradox type ---}
function GetData(FH: FieldHandle;
 var S: String ): Boolean;
 var
 TheDate: TDate;
 Month, Day, Year: Integer;
 TheValue: Double;
 TheShort: Integer;
 IsBlank: Boolean;
 Help: String ;
 begin
 { if this field is blank, we want to return a blank string }

 GetData := SUCCESS;
 if not Error(PXFldBlank(RecHandle, FH, IsBlank)) then
 if IsBlank then
 S := ''
 else
 case UpCase(Types^[FH][1]) of
 'A':
 if Error(PXGetAlpha(RecHandle, FH, S)) then
 GetData := FAILURE;
 'D':
 if not Error(PXGetDate(RecHandle, FH, TheDate)) then
 begin
 ErrIgnore(PXDateDecode(TheDate, Month, Day, Year));
 Str(Month, S);
 Str(Day, Help);
 S := S + '/' + Help;
 Str(Year, Help);
 S := S + '/' + Help;
 end
 else
 GetData := FAILURE;
 'N':
 if not Error(PXGetDoub(RecHandle, FH, TheValue)) then
 Str(TheValue: 5: 0, S)
 else
 GetData := FAILURE;
 '$':
 if not Error(PXGetDoub(RecHandle, FH, TheValue)) then
 Str(TheValue: 6: 2, S)
 else
 GetData := FAILURE;
 'S':
 if not Error(PXGetShort(RecHandle, FH, TheShort)) then
 Str(TheShort, S)
 else
 GetData := FAILURE;
 end { case }
 else
 GetData := FAILURE; { an error occurred in PXFldBlank }
 end; { GetData }

{--- Store a string in any valid Paradox type ---}
function PutData(FH: FieldHandle;
 S: String ): Boolean;
var
 TheDate: TDate;
 Month, Day, Year: Integer;
 TheValue: Double;
 TheShort: Integer;
 Code: Integer; { needed for VAL }

 function GetNextWVal(var S: String ): Word;
 const
 Delim = '/';
 var
 L: Byte;
 Help: Word;
 Code: Integer;
 begin

 L := Pos(Delim, S);
 if L = 0 then
 L := Length(S) + 1;
 Val(Copy(S, 1, L - 1), Help, Code);
 S := Copy(S, L + 1, Length(S));
 if Code = 0 then
 GetNextWVal := Help
 else
 GetNextWVal := 0;
 end; { GetNextWVal }
 begin
 PutData := SUCCESS;
 case UpCase(Types^[FH][1]) of
 'A':
 if Error(PXPutAlpha(RecHandle, FH, S)) then
 PutData := FAILURE;
 'D':
 begin
 Month := GetNextWVal(S);
 Day := GetNextWVal(S);
 Year := GetNextWVal(S);
 if Error(PXDateEncode(Month, Day, Year, TheDate)) or
 Error(PXPutDate(RecHandle, FH, TheDate)) then
 PutData := FAILURE;
 end;
 '$', 'N':
 begin
 Val(S, TheValue, Code);
 if Error(PXPutDoub(RecHandle, FH, TheValue)) then
 PutData := FAILURE;
 end;
 'S':
 begin
 Val(S, TheShort, Code);
 if Error(PXPutShort(RecHandle, FH, TheShort)) then
 PutData := FAILURE;
 end;
 end; { case }
 end; { PutData }

{--- Edit existing record buffer and let user accept, cancel, or re-edit ---}
function InputRecord: Boolean;
 var
 C: Char;
 I: Word;
 Buf: String ;
 begin
 InputRecord := FAILURE;
 { Keep attempting to input until user selects DONE or CANCEL }
 while True do
 begin
 { Go through all fields }
 for I := 1 to NFields do
 begin
 { translate the current value into the input buffer }
 if GetData(I, Buf) <> SUCCESS then
 Exit;
 WriteLn(Buf);
 { ask for the new value }

 Write(Names^[I], ': ');
 ReadLn(Buf);

 { Now translate it back into the record buffer unless old value
 is kept by just hitting return. }
 if Length(Buf) > 0 then
 if PutData(I, Buf) <> SUCCESS then
 Exit;
 end; { for }
 { Ask what to do with this input }
 WriteLn('S)ave, C)ancel, R)edo:');
 repeat
 C := UpCase(ReadKey);
 until C in ['S', 'C', 'R'];
 case C of
 'S':
 begin
 InputRecord := SUCCESS;
 Exit;
 end;
 'C': Exit;
 end; { case }
 end; { while }
 end; { InputRecord }

{--- Add a new record to the table ---}
procedure AddEntry;
 begin
 if not FirstTableOpen then
 begin
 WriteLn('Table not opened');
 Exit;
 end;
 { Empty the current record buffer }
 if Error(PXRecBufEmpty(RecHandle)) then
 Exit;
 { get the fields unless input is cancelled by user }
 if InputRecord = FAILURE then
 Exit;
 { Attempt to append the record }
 ErrIgnore(PXRecAppend(TblHandle, RecHandle));
 end; { AddEntry }

{--- Displays and accepts a legal field number ---}
function InputField(var FieldNumber: FieldHandle): Boolean;
 var
 Buf: String ;
 begin
 { Get the field number as an integer }
 FieldNumber := Ord(ReadKey) - Ord('0');
 if (FieldNumber < 1) or (FieldNumber > NFields) then
 begin
 WriteLn('illegal field number');
 InputField := FAILURE;
 Exit;
 end;

 { Input the field }
 Write(Names^[FieldNumber], ': ');

 ReadLn(Buf);
 { And translate it }
 if PutData(FieldNumber, Buf) <> SUCCESS then
 begin
 InputField := FAILURE;
 Exit;
 end;
 InputField := SUCCESS;
 end; { InputField }

procedure FindAndUpdate;
 begin
 if InputRecord <> FAILURE then { Update it }
 if PXRecUpdate(TblHandle, RecHandle) <> PXSUCCESS then
 Exit;
 end; { FindAndUpdate }

{--- Processes search, allowing user to Delete and Update records ---}
procedure DoSearch(FieldNumber: FieldHandle);
 var
 Mode: Integer;
 I: Integer;
 Done: Boolean;
 Buf: String ;
 begin
 Mode := SEARCHFIRST;
 Done := True;
 while True do
 begin
 { If no match found, get out }
 if PXSrchFld(TblHandle, RecHandle, FieldNumber,
 Mode) <> PXSUCCESS then
 begin
 WriteLn('No Matches');
 Exit;
 end;
 { Get the record found }
 if Error(PXRecGet(TblHandle, RecHandle)) then
 Exit;
 { Print the record }
 for I := 1 to NFields do
 begin
 if GetData(I, Buf) <> SUCCESS then
 Exit;
 WriteLn(Names^[I], ': ', Buf);
 end;
 WriteLn('N)ext, D)elete, U)pdate, E)xit Search:');
 repeat
 case UpCase(ReadKey) of
 'N':
 begin { Search for the next occurrence }
 Mode := SEARCHNEXT;
 Done := True;
 end;
 'D':
 begin
 ErrIgnore(PXRecDelete(TblHandle));
 Exit;
 end;

 'U':
 begin
 FindAndUpdate;
 Exit;
 end;
 'E': Exit;
 else Done := False;
 end; { case }
 until Done;
 end; { while }
 end; { DoSearch }

{--- Search functions ---}
procedure Search;
 var
 FieldNumber, I: Word;
 begin
 if not FirstTableOpen then
 begin
 WriteLn('table not open');
 Exit;
 end;
 { List the fields to search on }
 WriteLn('Select Field');
 for I := 1 to NFields do
 WriteLn(I, ' ', Names^[I]);
 { Get the input field to search on }
 if InputField(FieldNumber) = FAILURE then
 Exit;
 { Perform Search Options }
 DoSearch(FieldNumber);
 end; { Search }

procedure NewTable(Filename : String);
begin
 if LoadTableStructure = Failure then
 Writeln('NewTable: cannot open structure file')
 else
 if not Error(PXTblCreate(DataFile, NFields, Names, Types)) then
 FreeTableStructure;
end;

procedure KeyAdd;
var
 FldHandles : FieldHandleArray;

begin
 FldHandles[1] := 1; FldHandles[2] := 2;
 If not Error(PXKeyAdd(DataFile, 2, FldHandles, Primary)) then
 Writeln('Key field inserted');
end;

procedure Menu;
var
 C : Char;
begin
 StructureFile := 'author.dat';
 DataFile := 'author';
 repeat

 Writeln;
 Writeln('1 - New Table');
 Writeln('2 - Open Table');
 Writeln('3 - Add Entry');
 Writeln('4 - Search Entry');
 Writeln('5 - Close Table');
 Writeln('6 - Quit');
 C := ReadKey;
 Case C of
 '1' : Begin
 Writeln('Enter File Name to Save As: ');
 Readln(StructureFile);
 NewTable(StructureFile);
{ KeyAdd;}
 end;
 '2' : Begin
 ShowFiles(DataFile);
 OpenStruct(DataFile, NdxHandle);
 end;
 '3' : AddEntry;
 '4' : Search;
 '5' : CloseStruct(DataFile, NdxHandle);
 '6' : ;
 else
 Writeln('Invalid Option');
 end;
 until
 C = '6';
end;

{--- Main ---}
Var
 PxErr : Integer;
Begin

 ClrScr;
 FirstTableOpen := False;
 PxErr := PXSetDefaults(32, 5, 10, 1, 10, SortOrderAscii);
 If PxErr = PxSuccess then
 PxErr := PxInit;
 If PxErr = PxSuccess then
 Menu
 Else
 begin
 Writeln('problem');
 readln;
 Halt(1);
 end;
 ErrIgnore(PXExit);
End.



Special Issue, 1993
DATABASE TUNING: PRINCIPLES AND SURPRISES


Strategies for database optimization




Dennis Shasha


Dennis is an associate professor at NYU's Courant Institute, where he does
research on transaction processing, real-time algorithms, and pattern
matching. He also consults at UNIX System Laboratories. He can be reached at
shasha@cs.nyu.edu.


Database tuning is the activity of making a database system run faster. The
tuner may have to change the way applications are constructed, select new
indexes, tamper with the operating system, or buy hardware. Understanding how
to do this requires a broad knowledge of the interaction among different
components of a database management system (DBMS).
This article attempts to lay a principled foundation for tuning and then
illustrates some surprising interactions. My approach to tuning comes from a
number of sources: my own experience as a member of a team that designed and
implemented an embedded DBMS for AT&T Bell Labs, my DBMS consulting experience
for Wall Street firms, and finally from tapping the expertise of tuning
consultants affiliated with DBMS vendors such as Oracle, IBM, Sybase, Ingres,
Servio, and O2 Technology.


Toward a Rational Approach to Tuning


If you consult the commercial DBMS product manuals for tips on tuning, you
will probably get some useful advice, but that advice is presented as
disjointed rules of thumb--like so many fortune cookies. This jumble of
system-specific facts is hard to manage and remember, in the same way that
it's hard to recall the last ten fortune-cookie messages you've read.
My tuning strategy rests on a few common principles that, while they may lack
the generative force of mathematical axioms, can put order among the rules of
thumb. These principles have the added benefit of explaining interactions
among various levels of the system--the connection between index choices,
concurrent contention, and buffer management. When reduced to sound bites, the
following principles may sound obvious. Even so, they are often ignored. It's
therefore worth your while to make a mental checklist and explicitly consider
each tuning maxim:
Think globally, fix locally.
Partitioning cures bottlenecks.
Starting is expensive, continuing is cheap.
Render unto the server what is due unto the server.


Think Globally, Fix Locally


Say you're presented with a slow system, and you check the running time of
each compiled query and discover that one of them is slow. Should you create
indexes or take other action to make the query run faster? Well, maybe. First,
you should check the accounting statistics to make sure the query runs often
enough to be important in the global scheme of things. This seems elementary,
but many people waste a lot of time tuning infrequently executed queries and
then wonder why their efforts don't bear fruit.
As a second example, suppose you discover that all your disks are saturated.
Should you buy a new one to reduce the load? Again, the answer is maybe. Many
cheaper alternatives may work just as well. For example, a frequently executed
query that performs an equality selection may be scanning instead of searching
through an index. Or perhaps a query that performs a SELECT * can be written
to select just the attributes it needs and can thereby be answered completely
within some dense index. Thus, a local fix to a single query may reduce global
expenses.


Partitioning Cures Bottlenecks


A bottleneck occurs when too much work is thrown at too few resources.
Partitioning means spreading the work among more resources. Surprisingly, this
may or may not entail the replication of physical resources.
For example, suppose a bank has many branches and overwhelms the resources of
a mainframe cluster. Physical partitioning may help in this case, since most
account activity occurs at the home branch of each depositor. So, creating a
computing resource for each branch and partitioning the accounts so each
depositor's account information is placed on his or her home branch's
computing site may eliminate the bottleneck.
Now, consider a situation in which a long batch update transaction occurs
concurrently with many short online transactions, causing lock and resource
contention. In this case, you should ask whether the batch transaction can run
at a time when there is little online transaction activity. Such temporal
partitioning requires no additional physical resources--only a more even use
of existing resources.
Finally, suppose you discover excessive lock contention on the free lists of
your database buffer. Free lists are managed as follows: A transaction thread
dequeues a page from a free list after acquiring the semaphore on that free
list. Each of the several free lists has its own semaphore. Lock contention on
the free lists results from lock contention on those semaphores. A good
approach is to increase the number of free lists in order to increase the
number of semaphores and thereby decrease the amount of lock contention. This
form of partitioning is known as logical partitioning, because the only
resources replicated are locks, a purely soft resource.
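As a sketch of this logical partitioning, consider the following illustrative
Python model (not actual DBMS buffer-manager code): one contended lock is
split into several, and each thread acquires only the semaphore guarding its
own free list.

```python
import threading

NUM_LISTS = 4  # more lists => more semaphores => less contention on each

# Partition the free pages round-robin among the lists, one lock apiece.
free_lists = [list(range(i, 100, NUM_LISTS)) for i in range(NUM_LISTS)]
locks = [threading.Lock() for _ in range(NUM_LISTS)]

def dequeue_page(thread_id):
    """Grab a free page, locking only one of the partitioned lists."""
    i = thread_id % NUM_LISTS
    with locks[i]:  # each semaphore sees only 1/NUM_LISTS of the traffic
        return free_lists[i].pop() if free_lists[i] else None
```

Raising NUM_LISTS replicates nothing but locks, which is exactly why this
form of partitioning costs so little.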


Starting is Expensive, Continuing is Cheap


It takes about as long to read or write a track of disk as to read or write a
small part of a track. Similarly, it takes little more time to send one
kilobyte across a network than it does to send one byte. The general lesson is
that setup costs often dominate running costs.
Consider an application that inherently requires scans of files--for example,
it performs range queries on nonclustered attributes. The scan will take a lot
less time if each read retrieves a large portion of a disk track rather than a
single page. This requires that the file be laid out sequentially on disk and
that the prefetching factor be set to fetch eight or more pages. The idea is
to amortize the setup costs over several accessed pages.
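The amortization argument is easy to quantify with a back-of-envelope model.
The timings below are assumed figures chosen for illustration, not
measurements from any particular drive:

```python
# Model: each disk read pays a fixed setup cost (seek plus rotational
# delay) plus a per-page transfer cost. Figures are illustrative.
SETUP_MS = 15.0          # assumed seek + rotational delay per read
XFER_MS_PER_PAGE = 1.0   # assumed transfer time per page

def scan_time(num_pages, pages_per_read):
    """Total time to scan a file, prefetching pages_per_read at a time."""
    reads = -(-num_pages // pages_per_read)  # ceiling division
    return reads * (SETUP_MS + pages_per_read * XFER_MS_PER_PAGE)

print(scan_time(1000, 1))  # 16000.0 ms: setup cost paid on every page
print(scan_time(1000, 8))  # 2875.0 ms: setup amortized over 8 pages
```

With these assumed numbers, an eight-page prefetch makes the scan more than
five times faster, even though the disk moves exactly the same data.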


Render Unto the Server What is Due Unto the Server


Suppose that an important application must take some action every time data is
inserted into a table. One approach is to check this table periodically from
the client to see whether new tuples have been inserted since the last time
the client looked.
This polling approach has two complementary problems. If the client polls too
frequently, then it may incur useless overhead by issuing polling queries even
when no insertions have taken place. Conversely, if the client polls too
seldom, it may miss tuples that have been inserted and then subsequently
deleted before the next polling query takes place.

A better approach is to embed a program in the server that will execute
exactly when tuples have been inserted. This trigger-based approach (triggers
are analogous to hardware interrupts) will neither incur unnecessary overhead
nor miss insertions. Not all DBMS packages offer triggers, but more and more
do all the time.
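The contrast can be modeled in a few lines. This Python sketch is a toy,
in-process analogy (real triggers execute inside the DBMS server, not in
client code), but it shows why a callback fired at insert time neither wastes
polling queries nor misses short-lived tuples:

```python
# Toy contrast of polling vs. a trigger-style callback. Illustrative
# only; a real trigger is a program registered with the server.

class Table:
    def __init__(self):
        self.rows = []
        self.on_insert = None      # trigger: fires exactly on insert

    def insert(self, row):
        self.rows.append(row)
        if self.on_insert:
            self.on_insert(row)    # no idle polling, no missed rows

seen = []
t = Table()
t.on_insert = seen.append          # register the "trigger"
t.insert({"id": 1})
t.insert({"id": 2})
print(seen)                        # every insertion observed, once
```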
As a second example, consider a circuit-design application which must traverse
the circuit-graph structure many times, to check for power consumption,
current noise, testability, and so on. Accessing the necessary pages from the
client site across the network to the server site may be hopelessly
inefficient. Some object-oriented database management systems give users the
option of placing database buffers on the client site, thereby reducing the
number of network messages.
The rule of thumb that follows from this principle is that user interaction
and compute-intensive tasks should occur at the client, and data-dependent
tasks should occur at the server.


Pitfalls for the Unwary


A common approach to teaching database internals is to teach query processing
as if the system were single-user and then to teach concurrency control and
recovery as a nearly independent topic. This approach works well, since query
processing and index selection do not commonly take concurrent activity into
account. In fact, it is common for one part of the development team to
implement query processing and for a different part to implement concurrency
control and recovery. This separation of concerns may be good pedagogy,
perhaps even good software engineering, but it is bad tuning policy. The
following sections present two extended examples that illustrate the kinds of
interactions that can take place at different levels of a system.


The Case of Contending Locks


You have discovered that concurrent inserts into a table encounter severe lock
contention. In fact, the insert transactions appear to occur serially, one
after another. Your system uses page-level locking, and you have no clustering
indexes, but several nonclustering ones.
The diagnosis: The absence of any clustering index implies that all new
insertions will go to the last page of the file of tuples. This means that any
insert transaction will have to hold a lock on the last page of the file until
the transaction ends.
A necessary condition for avoiding a bottleneck is to distribute the inserts
among the data tuples. Recall that a clustering index based on a B-tree on an
attribute (or attributes) X will impose an organization on the data tuples, so
all tuples having values near X will be near one another.
How can this help? Provided that most concurrent inserts won't have near-X
values, a clustering index will disperse the inserts across the data file,
thereby eliminating the lock-contention bottleneck.
But there can still be a problem if a sequential key is being used. A
sequential key is one whose value increases with the time of insertion, such
as a timestamp. If X is a sequential key, then newly inserted tuples will
have the
largest X values of any tuples in the data table. So, there will still be a
lock-contention bottleneck on the last data page.
One solution is to choose a non-sequential key on which to cluster. That will
eliminate the lock contention on the data pages. If you have a nonclustering
index based on a B-tree on a sequential key, however, then there will be lock
contention on the last page of the B-tree whenever there are many concurrent
inserts. If the data structure were a hash structure, however, then nearby but
unequal X values would be widely dispersed in both the data structure and the
data table. So, hashing is a good strategy if you must index sequential keys.
Thus, key type, insert frequency, and data-structure type interact in ways
that may seem obvious only in retrospect. If many inserts occur on your table,
then either form a clustering index on a non-sequential key using a B-tree
data structure, or form a clustering index based on a hash structure. If you
want a clustering or nonclustering index on a sequential key, then consider a
hashing data structure, if one is available.
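A small model makes the dispersion point concrete. Assume 100 inserts with a
timestamp-like key into a 10-page structure (figures invented for
illustration). Clustering on the key concentrates every insert on one page,
while hashing spreads the same keys across all pages:

```python
# Illustrative model of insert dispersion under a sequential key.
NUM_PAGES = 10
seq_keys = range(1000, 1100)  # sequential: each new key is the maximum

# Clustering B-tree on the key: every new maximum lands on the last
# page, so every insert transaction locks the same page.
btree_pages = {NUM_PAGES - 1 for _ in seq_keys}

# Hash structure: nearby but unequal keys are widely dispersed.
hash_pages = {key % NUM_PAGES for key in seq_keys}

print(len(btree_pages), len(hash_pages))  # 1 vs 10 distinct locked pages
```

One hot page versus ten is precisely the lock-contention bottleneck the
hashing strategy removes.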


The Case of Slow Accesses


Our second example has to do with a bank that keeps track of its depositors
using the relation:
 account(id, balance, name, street, city, zip)
There are essentially two types of accesses: accesses to the balance
attribute, which result from deposits, withdrawals, and queries; and accesses
to the entire relation in order to send out monthly statements.
Accesses of the first type may occur up to several times a day and are, on the
average, much more frequent than accesses of the second kind. This suggests a
vertically partitioned design of the form:
 accountbal(id, balance) accountrest(id, name, street, city, zip)
This makes the second style of query more expensive because it requires a
join. The first style, however, may become more efficient for two reasons:
The original account tuples are much larger than the accountbal tuples, by a
factor of 4 to 10. Thus, the likelihood of finding a random accountbal tuple
in the database buffer is likely to be at least four times as large as the
likelihood of finding an account tuple in the buffer. If both likelihoods are
very small, then this will not help much, but if 20 percent of the account
tuples would have been found in the buffer, then approximately 80 percent of
the accountbal tuples will now be in the buffer (and perhaps more, because of
the vagaries of the least-recently used algorithm used to manage the buffer).
Second, a sparse clustering index is a data structure having one pointer per
page, as opposed to one pointer per record. Thus, a sparse clustering index on a table
with small records will have far fewer pointers than one with larger records.
This may mean that the sparse index will be one level shallower in accountbal,
thus saving a disk read on each record access.
Thus, the benefits of vertical partitioning depend critically on the relative
access frequencies to the different partitions and on the relative tuple
sizes. The benefits increase significantly if your system offers sparse
indexes and if your buffer is large.


Troubleshooting Lessons


Database tuning is based on a few principles and a body of knowledge. Some of
that knowledge depends on the specifics of particular DBMS packages, such as
which index types each system offers. But in actual practice, much tuning is
independent of vendor, version number, and even data model (for example,
hierarchical, relational, or object oriented).
When troubleshooting a system, the tuner must decide whether to attack global
problems first, such as concurrent contention or slow logging, or to dispense
with local ones, such as tuning a certain important query. Either way, you
will eventually look at lock statistics, disk load, operating-system
priorities, relative query frequencies, and query plans. Before deciding on a
course of treatment, you must formulate a correct diagnosis. Even without a
good bedside manner, you can use the principles here to get a sick database
system back on its feet, earning you the ephemeral gratitude of its many
users.


References


Shasha, Dennis. Database Tuning: A Principled Approach. Englewood Cliffs, NJ:
Prentice-Hall, 1992.














Special Issue, 1993
EXTENDING FOXPRO


Understanding the API is the key




Michael L. Brachman


Michael is senior systems analyst for Micro Endeavors, where he develops
applications for Fortune 500 companies and conducts FoxPro training seminars.
He's published articles in Data Based Advisor, FoxTalk, Science, Brain
Research, Journal of the Acoustical Society of America, and other
publications. Michael can be contacted at 215-449-4680 or on CompuServe at
71045,1217.


After developing a database application in FoxPro, you do a quick benchmark
and discover to your horror that performance is going to be too slow by a
factor of five. Luckily, FoxPro's application programming interface (API) lets
you add functions and features that can, among other things, improve
performance anywhere from three- to fifty-fold, depending upon what you are
trying to do.
Let's say that your application requires counting the number of occurrences of
a string in a number of files. This could be the forerunner to a global
search-and-replace utility for a document index generator, for example.
Several string-counting strategies come to mind. One might be to place every
word or every line of a document into a database, but we'll assume this isn't
practical. Another strategy might be to use low-level file I/O to read in the
document and check the incoming stream against the comparison string and count
the number of matches. Listing One (page 37) shows FoxPro code for this
method. This may not be the best method of performing this function; it's
simply presented as a slightly artificial demonstration of how to use the API.
You would call it from FoxPro with ?search1("testfile.prg", "cor") and it
would count the occurrences of cor in the file TESTFILE.PRG. This may not
yield the desired count because the string cor is also part of the words corn,
rancor, and incorrect. To account for this, the program accepts the % symbol
as a wildcard. Without the % sign, the routine only counts the occurrences of
the desired string delimited with nonalphabetic characters. Our example would
reject corn, rancor, and incorrect as invalid occurrences. If the % sign is
the first character, it relaxes the rule at the beginning and will include
rancor as an occurrence. If the % sign is the last character, the routine
relaxes the rule at the end and will include corn as a valid occurrence. If
the comparison string includes a % sign at the beginning and end, any
occurrence of cor will be included. This is the same as the $ operator in
FoxPro.
Now for the critical test. Assume that you implemented this routine and tested
it on a particular file on a particular machine--sort of a benchmark. Say that
the response took about 2.5 seconds, which is not unreasonable. The only
problem is, your application hinges on this taking under one second for a file
this size. Your choices are simple: Get a faster machine or do it a different
way. We'll assume that you don't have the option of getting a faster machine
since your client has already made a substantial investment in hardware and is
unwilling to upgrade 400 machines just to make one routine run faster. The
next option would be to not use FoxPro and rewrite the whole application in
another language. Because FoxPro offers a rich application-development
environment and a powerful database manager, abandoning it is the last thing
you want to do. There must be another way.


FoxPro Backgrounder


There are two kinds of computer languages in the world: compiled and
interpreted. An interpreted language reads in each statement and interprets it
into machine language every time the statement is encountered. This is
somewhat inefficient, but makes application development particularly easy.
Compiled languages introduce a compiler, which translates the program down
to machine language one time. This boosts performance tremendously, but it
takes much longer to develop programs because every time a change is made, the
program must be recompiled.
FoxPro's procedural language is a superset or dialect of Xbase. Traditionally,
Xbase has been an interpreted language, making application development easy
but limiting performance. FoxPro is a hybrid. At its heart, it is still an
interpreter but it does not interpret your original source code. It
precompiles each program module the first time it encounters it into a compact
code called "p-code" and thereafter interprets the p-code. Performance is much
better than an ordinary interpreted language, but there is still some overhead
involved with interpreting each line of p-code. Do not confuse this with
performance of an individual command. Each FoxPro high-level command launches
a procedure written in C or assembler. Performance here is the best it can be.
What we're talking about are looping constructs and the overhead associated
with the loop itself. In the string-search example, we use a FOR/ENDFOR loop
to work our way through the test code.


Accessing the API


So the solution becomes clear. Why not rewrite the specific section in C and
call it from FoxPro? Listing Two (page 37) and Listing Three (page 38) provide
a means to do just that. This time we take advantage of the knowledge that
none of our source files will be all that large, and we read the file into
memory first. We use malloc() to allocate as much memory as we need to hold
the entire file. The search now proceeds memory to memory. This code was
compiled using the Watcom C compiler. The reason for that will be explained
later. Once compiled and linked, the program called SEARCH2.EXE was tested
from DOS using the command line search2 testfile.prg cor.
A quick test with a stopwatch reveals that this program executes so fast, it
can't even be timed properly. That means an execution time of under 0.1
seconds. Things are looking good. The program stores the answer in a file
called SEARCH2.ANS, which reveals the correct answer when typed to the screen.
All we need to do now is go back to FoxPro and issue the command RUN search2
testfile.prg cor. Disaster! This one line of code takes over four seconds to
execute, not counting the time it takes to open and read in the answer. What
happened? The mechanics of the RUN command dictate that FoxPro open a new copy
of COMMAND.COM. This takes time. FoxPro must tuck away certain registers and
memory locations before launching COMMAND.COM, and this takes time also. We
assume that the program executes as quickly as it did before. Returning from
DOS, FoxPro must purge the second copy of COMMAND.COM and restore all the
memory locations and registers. The overhead for all of this is about four
seconds on the particular machine tested.
You can reduce the time it takes to swap out to DOS by placing a copy of
COMMAND.COM on a RAMDISK and changing COMSPEC=. You can also reduce the time a
small amount by creating SEARCH2.ANS on the RAMDISK. There are various other
tricks you can perform within FoxPro, like turning off the resource file,
forcing FoxPro's swap file to the RAMDISK, and so on. But even after all these
tricks, the total execution time is still around three seconds. So what have
we gained?
The solution, of course, is to take the string-search routine and attach it to
FoxPro using the API, bypassing the need to swap out to DOS. This isn't just a
matter of recompiling and relinking, but it is very nearly so. There are
certain things you cannot do from within the API, but in every case a
substitute is available. Listing Four (page 38) is the port of SEARCH2.C,
called SEARCH3.C, which is compatible with FoxPro's API. Note that a new
header file has been added: PRO_EXT.H. This file contains all the structures
and function predefinitions you will need to attach routines to FoxPro. It is
available in the Library Construction Kit (LCK) described further on.
Several general rules must be observed. First, FoxPro controls all of memory.
You cannot use malloc(). Instead, you ask FoxPro for a block of memory, which
is then assigned a memory handle using the function _AllocHand(). The memory
handle can be converted to a pointer using the function _HandToPtr(). Next,
because you cannot use a file pointer, you instead use a file channel, which
again is assigned by FoxPro. You cannot use low-level file routines such as
fopen() and fread(). Instead, you use the equivalent routines _FOpen() and
_FRead(). Calls to string routines such as strlen() and memmove() must be
replaced by calls to _StrLen() and _MemMove(). You cannot use printf(), but
there are equivalent routines for that also.
Parameters are passed via a special structure called a ParamBlk, which is a
union of two "faces." The first face, called the Value structure, is used when
parameters are passed using call by value. The other face is called a Locator
structure and is used for call by reference. The ParamBlk structure is a
subscripted structure that permits multiple arguments to be passed. At the
bottom of Listing Four, you will see a structure labeled FoxInfo. This is the
table used by FoxPro when you attach a library file. The first argument
(always in upper case!) is the name of the function FoxPro will use to call
your routine. The second argument is the function to be called. The third item
is the number of arguments that will be passed. The last table entry tells
FoxPro which types of arguments should be passed. In this particular case,
both arguments must be of type Character.
Converting a FoxPro-style string to a null-terminated string is demonstrated
in Section 2 of Listing Four. You must retrieve the string length from an
element of the Value face using the notation shown in Example 1.
Example 1: Retrieving the string length from an element of the Value face.

 parm->p[0].val.ev_length

 parm-> pointer to the ParamBlk structure
 p[0]. first argument
 val. Value face
 ev_length integer length element

The actual string is not contained in the Value structure, but rather in its
memory handle, which is found in parm->p[0].val.ev_handle. You convert this to
a real pointer using _HandToPtr(), then copy the bytes to a local array. You
cannot count on FoxPro to null terminate strings; you must supply the null
termination explicitly.
The rest of the code parallels the stand-alone DOS function. Once things have
been set up properly, the function contained in SENGINE.C is called, which is
identical to the function called from DOS. This is not an accident. I
purposely set up an interface which would be compatible under both
circumstances to reduce my debugging time. Not all situations lend themselves
to this organization, but you should try to do this if you can. Create a
routine and test it from DOS, then replace the front end with one written for
FoxPro, if possible. If you go back and check SEARCH2.C, you'll see that I
created definitions for _StrLen and _MemMove so that I wouldn't have to change
SENGINE.C when I switched to FoxPro.
Once the function has calculated the number of matches, you will want to
return the value back to FoxPro. Unfortunately, you cannot use a standard
return because all functions invoked from FoxPro must be of type Void.
Instead, you use special _RetXXXX() functions that format the data properly
for FoxPro. In this particular example, we use _RetInt() to return an integer.
The first argument of _RetInt() is the value itself; the second is the width
the argument would take on screen.
After all this, you compile the function using the Watcom compiler with the
command line wcc /s /zu /zW /ml /fpc search3. This invokes the Watcom compiler
WCC.EXE. The command-line options are: /s, remove stack overflow checks; /zu,
decouple the stack segment from the data segment; /zW, use Microsoft Windows
naming conventions; /ml, use the large model; and /fpc, use only the
floating-point library, not the coprocessor. You then link the output of the
compiler (SEARCH3.OBJ) with the command line wlink file api_l, search3 lib
proapi_l, clibl.
API_L.OBJ is a Microsoft-supplied file that contains the interface to FoxPro.
PROAPI_L.LIB contains all the functions used by the API. Both files are
available for purchase in the LCK sold by Microsoft. Also contained in the
FoxPro 2.0 version of the LCK is a copy of the Watcom compiler, the only
compiler that works at present, due to its method of passing parameters. Other
compilers such as Microsoft C 7.0 will likely also be supported when FoxPro
2.5 is released, but this was not certain at the time this article was
written.
After linking, you will need to copy or rename the file API_L.EXE to
SEARCH3.PLB. PLB is the default extension for API add-on routines. You attach
the function within FoxPro using the command: SET LIBRARY TO search3. If you
do a LIST STATUS, you will see that the function SSEARCH has been added. You
would invoke it with the command: ?ssearch("testfile.prg", "cor"). A quick
benchmark reveals an average execution time of 0.066 seconds, a performance
improvement of about 40-fold! Other examples may not be quite as dramatic, but
the fact remains: The API can be used to boost the performance of I/O-bound
routines or computationally intensive loops by at least a factor of three up
to as much as 50-fold.
The API can be used to add graphics, serial communications, encryption,
compression, network, and other functions to FoxPro. Many of these functions
have already been written by third parties and are available for purchase, or
you can write your own routines.


Conclusion



In summary, FoxPro provides an exceptional application-development environment
and superior database-management tools. Under all circumstances, you should
write your application in FoxPro and only consider using the API if you need a
function it cannot provide, such as access to new hardware or when a
particular section of code (most likely a looping construct) needs to run
faster. Working in C is like walking a tightrope without a net. No one is
going to be able to help you if you crash. However, with a little bit of care,
you can achieve great results. Good luck!

_EXTENDING FOXPRO_
by Michael L. Brachman


[LISTING ONE]

* PROCEDURE search1
* single file string search
* notes: case-sensitive. word must match exactly unless '%' relaxes rule on
* that end. e.g. "%int" means char preceding can be alpha
* searches default directory only
* Copyright 1993 by Micro Endeavors, Inc. 3150 Township Line Road,
* Drexel Hill, PA 19026, (215) 449 - 4680

PARAMETER l_infile,l_srchstrg
* l_infile = file specification
* l_srchstrg = string to search for
* returns number of occurrences
*
PRIVATE ALL LIKE l_* && protect other variables
l_count = 0
l_ifh = FOPEN(l_infile) && open the file
IF l_ifh > 0 && meaning valid file
 DO CASE
 CASE l_srchstrg = '%' AND RIGHT(l_srchstrg,1) = '%'
 STORE .T. TO l_relaxr
 STORE .T. TO l_relaxl
 l_srchstrg = SUBSTR(l_srchstrg,2,LEN(l_srchstrg)-2)
 CASE l_srchstrg = '%' && can begin with any character
 STORE .F. TO l_relaxr
 STORE .T. TO l_relaxl
 l_srchstrg = SUBSTR(l_srchstrg,2)
 CASE RIGHT(l_srchstrg,1) = '%' && can end with any character
 STORE .T. TO l_relaxr
 STORE .F. TO l_relaxl
 l_srchstrg = LEFT(l_srchstrg,LEN(l_srchstrg)-1)
 OTHERWISE && strict checking
 STORE .F. TO l_relaxr
 STORE .F. TO l_relaxl
 ENDCASE
 l_srchleng = LEN(l_srchstrg)
 l_fileng = FSEEK(l_ifh,0,2) && determine the file length
 = FSEEK(l_ifh,0) && put file pointer back to the beginning
 l_prevchar = " " && will represent character just to the left
 FOR l_char = 0 TO l_fileng
 l_curstrg = FREAD(l_ifh,l_srchleng+1) && read in length + 1
 IF l_curstrg = l_srchstrg AND ;
 (l_relaxr OR NOT ISALPHA(RIGHT(l_curstrg,1))) AND ;
 (l_relaxl OR NOT ISALPHA(l_prevchar))
 ** A match!
 l_count = l_count + 1
 l_char = l_char + l_srchleng && bump loop index by this amount
 l_prevchar = RIGHT(l_curstrg,1) && take snapshot of last char
 ELSE
 l_prevchar = LEFT(l_curstrg,1) && take snapshot of next char
 = FSEEK(l_ifh,l_char+1) && put fp back to where it belongs
 ENDIF
 ENDFOR
 = FCLOSE(l_ifh)
ELSE
 l_count = -1 && meaning couldn't open file
ENDIF
RETURN l_count






[LISTING TWO]

/* S E A R C H 2 . C
* single file search
* notes: case-sensitive. word must match exactly unless '%' relaxes rule on
* that end. e.g. "%int" means char preceding can be alpha
* searches default directory only
* Copyright 1993 by Micro Endeavors, Inc., 3150 Township Line Road,
* Drexel Hill, PA 19026, (215) 449 - 4680 */

#include <stdio.h>
#include <stdlib.h>
#include <dos.h>

#define _StrLen strlen
#define _MemMove memmove
#define _MemCmp memcmp

extern int
main(int argc,char *argv[])
{
 /* SECTION 1 - DEFINE VARIABLES */
 FILE *fp; /* File Pointer (structure) */
 char *buff,fname[40];
 char ans[20];
 long filen;
 int retval=0,icount;
 struct find_t finfo;

 /* SECTION 2 - SETUP NAME OF FILE */

 /* not needed because argv[1] points to name already */

 /* SECTION 3 - GET SIZE OF FILE */
 _dos_findfirst(argv[1],_A_NORMAL,&finfo);

 /* SECTION 4 - SEE IF FILE EXISTS */
 fp = fopen(argv[1], "r" );

 if (fp == NULL) { /* illegal file name */
 if (argv[1][0] < 32)
 icount = -1;
 else
 icount = -2;
 }

 else { /* file is legal */
 /* SECTION 5 - CREATE BUFFER FROM MEMORY POOL */
 buff = malloc(finfo.size);
 if (buff == NULL)
 icount = -3;
 else {
 /* SECTION 6 - LOAD BUFFER FROM DISK FILE */
 fread(buff,1,finfo.size,fp);

 /* SECTION 7 - CALL THE "ENGINE" */
 icount = nomatches((char far *)buff,finfo.size-1,argv[2]);

 /* SECTION 8 - CLEAN UP */
 free(buff);
 fclose(fp);
 }
 }
 /* SECTION 9 - RETURN ANSWER */
 fp = fopen("search2.ans","w");
 if (fp != NULL) {
 sprintf(ans,"%d",icount);
 fputs(ans,fp);
 fclose(fp);
 }
 return 0;
}
#include "sengine.c"







[LISTING THREE]

/* THE STRING SEARCH "ENGINE" */
#include <ctype.h> /* for isalpha() */
extern void strncopy(char *,char *,int);

extern int
nomatches(char far *bf, long filelength, char far *sstring)
{
 char l_srchstrg[500];
 char l_prevchar = ' '; /* char just left of match; starts nonalpha */
 char l_curstrg[500];

 int l_count = 0;
 int l_relaxr,l_relaxl;
 int l_srchleng;
 int l_fileng = filelength;
 int l_char;
 int sstrlen = _StrLen(sstring);
 switch ( 2 * (sstring[0] == '%') + (sstring[sstrlen-1] == '%') ) {
 case 3: /* both left and right are % */
 l_relaxr = 1;
 l_relaxl = 1;
 strncopy(l_srchstrg,sstring+1,sstrlen-2);
 break;
 case 2: /* only left is % */
 l_relaxr = 0;
 l_relaxl = 1;
 strncopy(l_srchstrg,sstring+1,sstrlen-1);
 break;
 case 1: /* only right is % */
 l_relaxr = 1;
 l_relaxl = 0;
 strncopy(l_srchstrg,sstring,sstrlen-1);
 break;
 default: /* neither are % */
 l_relaxr = 0;
 l_relaxl = 0;
 strncopy(l_srchstrg,sstring,sstrlen);
 break;
 } /* end of switch */
 l_srchleng = _StrLen(l_srchstrg);

 for (l_char=0; l_char<(l_fileng-l_srchleng); l_char++) {
 strncopy(l_curstrg, (char *)(bf + l_char), l_srchleng+1);
 if ( ( _MemCmp(l_curstrg,l_srchstrg,l_srchleng) == 0 ) &&
 ( l_relaxr || !isalpha(l_curstrg[l_srchleng]) ) &&
 ( l_relaxl || !isalpha(l_prevchar) ) ) {
 l_count++;
 l_char += l_srchleng;
 l_prevchar = l_curstrg[l_srchleng];
 }
 else
 l_prevchar = l_curstrg[0];
 }
 return l_count;
}
extern void
strncopy(char *dst, char *src, int count)
/* same as strncpy but guarantees null-termination */
{
 _MemMove(dst,src,count);
 dst[count] = 0;
}





[LISTING FOUR]

/* S E A R C H 3 . C
* single file search
* notes: case-sensitive. word must match exactly unless '%' relaxes rule on
* that end. e.g. "%int" means char preceding can be alpha
* searches default directory only
* Copyright 1993 by Micro Endeavors, Inc., 3150 Township Line Road,
* Drexel Hill, PA 19026, (215) 449 - 4680 */

/* FoxPro 2.0 PLB-version--adds function ssearch(filevar,sstring)--
 returns # of words */
#include <stdlib.h>
#include <dos.h>
#include <pro_ext.h>

/* FUNCTION SSEARCH

 called from FoxPro as:
 ssearch(filevar,sstring) filevar can be a constant or char. variable
 returns: integer number of matches; -1 means "File Not Found";
 -2 means "Insufficient Memory" */
void far
ssearch(ParamBlk FAR *parm)
{
 /* SECTION 1 - DEFINE VARIABLES */
 FCHAN fh; /* File Handle (int) */
 MHANDLE bmh;
 char far *buff,fname[40];
 char far *srchstrg;
 int retval,filen;
 struct find_t finfo;

 /* SECTION 2 - SETUP NAME OF FILE */
 filen = parm->p[0].val.ev_length;
 _MemMove(fname,_HandToPtr(parm->p[0].val.ev_handle),filen);
 fname[filen] = 0; /* add null-terminator */

 /* SECTION 3 - GET SIZE OF FILE */
 _dos_findfirst(fname,_A_NORMAL,&finfo);


 /* SECTION 4 - SEE IF FILE EXISTS */
 fh = _FOpen(fname,FO_READONLY);
 if (fh < 1) { /* illegal file name */
 retval = -1; /* return of -1 means can't find it */
 }
 else { /* file is legal */
 /* SECTION 5 - CREATE BUFFER FROM MEMORY POOL */
 bmh = _AllocHand(finfo.size);
 buff = (char far *)_HandToPtr(bmh);

 if (bmh < 1)
 retval = -2; /* return of -2 means not enough memory */
 else {
 /* SECTION 6 - LOAD BUFFER FROM DISK FILE */
 _FRead(fh,buff,finfo.size);

 /* SECTION 7 - CALL THE "ENGINE" */
 srchstrg = _HandToPtr(parm->p[1].val.ev_handle);
 srchstrg[parm->p[1].val.ev_length] = 0; /* a little dangerous */
 retval = nomatches(buff,finfo.size-1,srchstrg);

 /* SECTION 8 - CLEAN UP */
 _FreeHand(bmh);
 _FClose(fh);
 }
 }
 /* SECTION 9 - RETURN THE ANSWER */
 _RetInt(retval,10);
}
#include "sengine.c"

FoxInfo myFoxInfo[] = {
 {"SSEARCH", ssearch, 2, "C,C"}
};
FoxTable _FoxTable = {
 (FoxTable FAR *)0, sizeof(myFoxInfo) / sizeof(FoxInfo), myFoxInfo
};




























































Special Issue, 1993
HYPERCARD DATABASE TUNING


Linked lists for greater performance




Jeff Elliott


Jeff, an engineer and freelance writer whose work has appeared in EIR magazine
and the San Francisco Chronicle, can be contacted at 6945 Hutchins Ave.,
Sebastopol, CA 95472.


Everybody knows that large HyperCard stacks are supposed to be slow. It's
common knowledge, part of the same musty storehouse of wisdom that reminds you
never to swim after a full meal, run while holding scissors, or believe a
politician during an election year. If everybody knows it, it must be true,
right?
Common knowledge isn't always true, of course, except perhaps the bit about
politicians. It certainly isn't true that large HyperCard stacks have to be
sluggish. HyperCard may not match the speed of a true database engine like
FoxBase, but HyperCard's find is no slouch either, taking only 20 seconds to
search my database of over 3600 cards and find 190 matches--that's an
impressive speed of about 180 cards per second. Not bad, but faster is always
better, and searches can be up to 20 times faster if you use linked lists.


What are Linked Lists?


Linked lists (as in links of a chain) tie together information scattered
through a database. This information can be text in a particular field, the
highlight of a button, or anything else important enough to be needed frequently. A
recipe database might have a "main ingredient" field that specifies whether a
recipe uses chicken, broccoli, or other meats or vegetables. Each unique entry
in this field would be linked together: fried chicken linked to General Tso's
chicken, and so on. Another set of links may connect a popup that indicates
the cuisine type: Chinese, Italian, or Mexican. As the different links thread
their way through the stack they will sometimes overlap, as Chinese and
chicken do in the recipe for General Tso's chicken.
In addition to reducing search time because fewer cards have to be examined,
linked lists can also be a powerful navigational tool. Each item linked
provides an opportunity for the user to branch off in a new direction
following a link trail. And, to give the browser as much control as possible
(after all, this is a Macintosh), there can be a second link for each item,
providing the freedom to move backward through the chain as well as forward.
The benefits of linked lists--faster searches and easier browsing--can be lost
if it is hard to find one of the cards with the link you need. A good solution
is to maintain an index of all the linked strings. This index can be kept in a
scrolling field from which the user may select a linked topic. The recipe
stack, for example, would benefit from both a cuisine index and
main-ingredient index, allowing simple and quick access to linked lists
covering different topics.


HyperLinks


Unfortunately, the only linking capability built into HyperCard is the linkTo
option found in the button dialog, and it's unusable for linked lists. For
starters, linkTo is one of the few things that cannot be used in a HyperTalk
script. To link, therefore, the user must change to the button tool,
double-click on a card-level button to invoke the dialog, click on linkTo,
find the card to link with, click OK, and finally choose the browse tool
again.
Together, the link function and insertLink subroutine (Listing One, page 41)
provide a better solution. These routines make the linking process invisible
to the user and create links for each topic on-the-fly. The link function also
builds and maintains an index of all linked topics. The linking is fast,
requiring only about 15 ticks to insert a new card into the chain, quick
enough that it's probably not necessary to change the cursor or otherwise
notify the user.
The links are stored in one or more hidden fields on each card. The card ID is
used for the links because it is guaranteed not to change if cards are added
or sorted. All of the links can be stored on different lines of a single
field, but there are some performance trade-offs when HyperCard has to work
harder to find the link. Figure 1 compares the speed of fetching a link when
it is alone in a field, when it is one of several items on a line, and when it
is on a separate line. When speed is important, it's best just to have a
single link in the field.
Figure 1 also compares the performance of HyperCard's find when there are only
two characters in a search string vs. three or more characters. It's well
known that triplet searches are quicker, but I was surprised to discover that
they're over 20 times faster. This is another advantage to using linked lists:
Because we're using card IDs instead of find, there's no performance
difference if there are fewer than three characters in the search string.


Linking Your Stack


To build your own linked lists, drop the link and insertLink functions into
the stack script of your HyperCard stack. As mentioned previously, there are
advantages to using the linked lists with an index. They provide fast search
access to a card with the desired link, and also a simple way for the user to
select a linked topic.
These routines expect you to have a card named index with both a background
field hiddenIndex and a card field index. Field index only has the selectable
text strings, and hiddenIndex (which obviously should be a hidden field) has
three items on each line: the string that the user sees, the ID of the first
card in the link, and the count of how many cards share the link. The index
field is not technically necessary, but it keeps things neat. Both fields must
have the "don't wrap" and "lock text" options turned on.
When the user clicks on a line in the index field, the script looks up the
string in hiddenIndex and takes the user to the first card in the linked list.
Listing Two (page 41) is the script for field index.
The link handler has four parameters: name, linkf, n, and i. The name
parameter is a string to search for in field hiddenIndex on the index card. If
this name is not found, it is added to the index. Otherwise, this card is
linked into the chain of other cards that share the name string in this field.
The next three parameters, linkf, n, and i, all refer to the same background
field on the card being linked. linkf is the name of the field that holds the
links, n is the line number, and i the item number for the forward link. The
backward link follows as item i + 1.
If you're only linking a single field, you may want to simply have an Add Link
button that prompts for a string (new or existing) and creates a new linked
card for the user. Listing Three (page 41) shows what the script for this
button might look like.
If you have more than one object to link, you should call link whenever data
has been entered; you don't want to clutter the cards with a multitude of
special buttons. When you are linking text fields, the closeField handler is
an ideal time to call link because this message is sent only when something
has been entered or changed in the field.
Once the links are in place, you can use them for navigation. In my HyperCard
stacks, each field and group of buttons that is linked has an adjoining pair
of arrows that allow the user to branch off and follow a particular link. The
scripts for these navigational arrows should read something like:
 go card ID item 1 of line 3 of field links
where the item 1 of line 3 is the next link in that particular chain.

_HYPERCARD DATABASE TUNING_
by Jeff Elliott


[LISTING ONE]

-- insertLink: Subroutine for Link. Links this card into the chain, both
-- forwards and backwards. Does not modify the index.
-- Arguments:

-- root card id of the first card in the chain
-- linkf name of the bg field on this card that holds the links
-- n line number for links in linkf
-- i item number for FORWARD link in line n
-- returns: 0: noErr, -1: no forward link, -2: no back link
function insertLink root,linkf,n,i
 -- BACKWARD link always follows FORWARD link as the next item in line n of field linkf
 put i + 1 into j
 put the short ID of this card into myself
 put item i of line n of field linkf of card id root into forward
 -- set vars; who was the
 put item j of line n of field linkf of card id root into backward
 -- root pointing to?
 if there is not a card id forward then return -1
 if there is not a card id backward then return -2
 put myself into item j of line n of field linkf of card id root
 -- root's back is always myself
 if forward is myself then -- it's just the 2nd card
 put myself into item i of line n of field linkf of card id root
 -- so my forward is also root
 else -- no more root changes
 put myself into item i of line n of field linkf of card id backward
 -- his forward is me
 end if
 -- finally, make links for myself
 put backward into item j of line n of field linkf
 -- my back is previously oldest
 put root into item i of line n of field linkf -- my next card is the root
 return 0
end insertLink
-- Link: called from the card that's being linked. Before a link is added,
-- we must first lookup the name in the index. If this name isn't found,
-- then we create an index entry. Otherwise, we link this card to the others.
-- Expects: card "index" with background field "hiddenIndex" and card field
-- "index"
-- Arguments: name, string to lookup in (or add to) the index;
-- linkf, n, and i are the same as for subroutine insertLink.
-- returns: a success or failure message
function link name, linkf, n, i
 put the short ID of this card into myself
 go card "index"
 find whole name in field "hiddenIndex" -- is this name in the index?
 if the result is "not found" then -- no, so add it to index
 -- first in the link, so create index entries
 -- format for hiddenIndex: string, card ID, count
 get the number of lines in field "hiddenIndex"
 put name & "," & myself & "," & 1 ¬
 into line it + 1 of field "hiddenIndex" -- tack it onto the end
 -- card field index only has the strings
 get the number of lines in cd field "index"
 put name into line it + 1 of cd field "index"
 sort cd field "index" -- make it pretty
 go back
 -- as the only card, it is just linked to itself
 put myself & "," & myself into line n of field linkf
 else -- the name's in the index, so we link to other cards
 add 1 to item 3 of line word 2 of the foundLine ¬
 of field "hiddenIndex" -- increase link count in index
 put item 2 of the value of the foundline into root
 go back -- return to card being linked

 -- we're ready to link up with the other cards
 if insertLink (root,linkf,n,i) is not 0 then
 return "The links are damaged or missing." -- error exit
 end if
 end if
 return "Link successful." -- everything OK, normal exit
end link






[LISTING TWO]

on mouseUp
 -- works only in HyperCard 2.0 or later
 find whole the value of ¬
 the clickLine in field "hiddenIndex"
 go card ID item 2 of ¬
 the value of the foundLine
end mouseUp





[LISTING THREE]

on mouseUp
 ask "Name to add:"
 if it is not empty then
 doMenu "New Card"
 put it into fld "myName"
 put link (it,"links",1,1)
 end if
end mouseUp

























Special Issue, 1993
EVENT-DRIVEN DATABASE PROGRAMMING IN C++


Can OODBMSs replace relational DBMSs?


 This article contains the following executables: EVENT.EXE


Dirk Bartels


Dirk is founder and president of BKS Software. He has a degree in computer
science from the Technical University in Berlin. Dirk is the chief designer of
the object-oriented database system POET, and he can be contacted at One
Kendall Square, Cambridge, MA 02139 or Foßredder 12, 2000 Hamburg 67,
Germany.


Concurrent processes and event-driven environments like those used with
Windows, Macintosh, or X/Motif require a database system that's an active part
of the entire environment. What happens, though, with information stored in
databases on a network? Do the old database management systems (DBMSs) based
on records and relations fit into these systems? I believe they don't.
Conventional database management systems do not contain event-driven concepts.
Moreover, their data model--based on records, tables, and a fixed set of data
types--is very simple. The result is that database designs are often abstract
and don't fit in with complex information models. This comes as no surprise
since the design of conventional database systems dates back to the '70s when
PCs, workstations, and local area networks didn't even exist. The objective of
these earlier DBMS designs was to support mid-range and host computers with
relatively simple database applications such as online transaction processing
(OLTP). The paradigm shift to object-oriented GUIs in the '80s will usher in a
shift to object-oriented database systems in the '90s.
Object-oriented database systems (OODBMSs) fulfill the new demands of modern
software systems. They have a rich data (class) model based on the same design
approach as OO languages such as C++ and Smalltalk. Most of these systems are
smoothly integrated into an OOP language and provide relatively easy database
programming. Some of the tools available today also offer special features for
concurrency control based on event-driven mechanisms. Let us take a look at a
sample application that uses the features of the rich data model and
event-driven concurrency control.


The Requirements of the Sample Application


A typical workgroup computing application is a personal-information management
(PIM) system like Lotus Notes, where a group of people who share information
among themselves are connected to a network. A simplified database model that
meets the needs of a workgroup might include a calendar, to-do list, and
database for people, companies, and notes.
For the purposes of our example, I'll implement an object-oriented design for
the application and the database model because it helps implement a system
that's extendible, suitable for GUIs, and embodies an elegant and realistic
database schema that supports a 1:1 mapping from the programming model into
the database. While this database model is simplified, it nonetheless typifies
an OODBMS and illustrates its benefits. Due to space constraints, the complete
system is only available electronically; see "Availability," page 3.


The Class Hierarchy


Figure 1 shows a simple class hierarchy with classes for people and companies,
and several tasks, notes, and address types. If you adopted the relational
model for a hierarchy like this, you'd have to use a table for every class and
subclass and join them with keys and relations because the relational model
has no mechanism for expressing class hierarchies. This isn't efficient and
destroys the idea of objects in the database. An OODBMS, on the other hand,
will support these class hierarchies.
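In plain C++ (a sketch using the names from Figure 1, not POET's persistent-class syntax), such a hierarchy is expressed directly, whereas a relational mapping would scatter it across joined tables:

```cpp
#include <string>
#include <vector>

// Sketch of the Figure 1 hierarchy in plain C++ (not POET syntax).
// An OODBMS can store Meeting and Mail directly as subclasses of Task;
// a relational schema would need one table per class, joined by keys.
struct Task {
    std::string description;
    virtual ~Task() = default;
    virtual std::string kind() const { return "Task"; }
};

struct Meeting : Task {
    std::string location;
    std::string kind() const override { return "Meeting"; }
};

struct Mail : Task {
    std::vector<std::string> cc_list;   // carbon-copy recipients
    std::string kind() const override { return "Mail"; }
};
```

The OODBMS stores each object once, with its concrete class; no reassembly via joins is needed at retrieval time.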


Object Identity, Containers, and Polymorphism


In addition to the relationships between base and derived classes (Task and
Mail in Figure 1, for instance), it's also important to consider how classes
use one another--that is, how they are embedded inside each other.
I do this in C++ by simply using a pointer or a list of pointers to express
the relationships between objects; see Figure 2. A C++ pointer provides a kind
of object identity (that is unique during the program's execution). It also
provides polymorphism, which in this case means that a pointer to a particular
object type may also point to other types of objects. For example, a pointer
to a Task object (a base class) could also reference (point to) either a
Meeting object or a Mail object. Lists of objects or lists of references
(pointers) to objects are quite standard in C++, and are more or less
generally available in foundation-class libraries. How does this influence our
database model?
The object-oriented model provides a mechanism called "object identity"--in
C++, the memory address of an object, available as this. If you use object
identifiers in a database, you can avoid both creating and managing synthetic
numbers and joining objects (which you must do in relational database
systems). You simply reference objects and build frameworks and hierarchies of
objects. Additionally, you can avoid the use of foreign keys because object
identifiers provide polymorphism. This makes it possible to add a new class
(for instance, an "Important Deadline" class) without changing the class
Person and the rest of the application.
In the base class BasePerson, a list of Note objects can be managed in the
object (remember the object identifier?). Notes can be simple (for example,
after a visit) or specific to a certain phone call. In addition to the object
identity, I use a container--a popular OOD concept--to model complex
information. With the container, I can assign several objects (or object
references) to an object and store them within the object. This is a basic
feature of an OODBMS that can't be used in an RDBMS.
Many people entered into the database may belong to the same company. With a
flat-file database, you would have to retype the company's address again and
again. With this model, I just reference the existing company object. Also, I
may build a direct (and very straightforward) relationship between the company
and its employees by placing a container of Persons in the company object.
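A minimal C++ sketch of this sharing (hypothetical types, far simpler than the article's full schema): two Person objects reference the same Company object, so a change made once to the company's address is visible through both.

```cpp
#include <string>

// Sketch (hypothetical types): an object reference replaces retyped data.
struct Company {
    std::string name;
    std::string address;
};

struct Person {
    std::string name;
    Company* employer;   // reference to the shared object, not a copy
};
```

With a flat-file design, the address would be duplicated per person and each copy updated separately; here the shared object is the single point of truth.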
The Person and Company classes inherit a set of Notes from their base class
BasePerson.
The Person object contains a list of Task pointers. The Task container
provides a list of all tasks (meetings, phone calls, or mails) related to a
particular person.
Finally, a Person object includes a set of addresses. We have chosen this
model for people who have more than one residence in addition to their
business address, which is already managed in the referenced company object.
The Task schema (see Figure 3) completes our database class model and
highlights many of the benefits of OODBMS design discussed above:
The base class Task contains a reference to a Person object so that tasks may
be delegated.
Mail objects contain a distribution list ("CC=Carbon Copy") of persons.
Meeting objects contain a list of all participants and the meeting's location.
The Phonecall object contains a pointer to a person to whom the phone call is
to be directed.
The simple class model presented up to this point leads to a complex object
framework; see Figure 4. If you're wondering why calendar and to-do list
classes haven't been designed in the database, keep in mind that there's no
real reason for these classes because a calendar (or to-do list) is only a
specific view (based on navigation and queries) on top of the database model.
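To illustrate that last point, a calendar can be nothing more than a filter over the stored Task objects. A sketch in plain C++ (hypothetical types, with dates simplified to day numbers):

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

// Sketch: a "calendar" is just a query over Task objects, not a stored
// class of its own. Dates are simplified to integer day numbers.
struct Task {
    std::string description;
    int day;
};

// Return all tasks scheduled on the given day -- the calendar "view".
std::vector<Task> calendarView(const std::vector<Task>& all, int day) {
    std::vector<Task> out;
    std::copy_if(all.begin(), all.end(), std::back_inserter(out),
                 [day](const Task& t) { return t.day == day; });
    return out;
}
```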


The Event-driven Database Approach


The object-oriented class model provides many advantages for the application
(and application developer). For example, you can navigate from one object to
another without time-consuming queries; the OODBMS reconstructs the pointer
via the object identifier. Besides the navigation, there's still the
opportunity to form a query such as searching for a person named "Miller." The
simplified class definitions are in the header files (.hcd) in Listing One
(page 47).
What are the requirements for such a system within an interconnected
workgroup? Imagine a small workgroup of five people: Peter, the boss; Mary,
his secretary; Kim, a developer; John, a developer; and Cheryl, a salesperson.

Peter and Mary are working with Macintosh computers, while the programmers and
salesperson work in Windows. The team has different tasks that overlap:
Mary is scheduling tasks for Peter and Cheryl, so she needs access to their
schedules (to-do lists). She is also writing mail for Peter.
Peter is arranging meetings with his development team (Kim and John), his
sales organization (Cheryl and others), customers, and consultants. He is
delegating mail to Mary.
Kim and John are doing customer support and writing reports (notes) into the
customer database.
Cheryl is calling customers. She needs information (notes) from the support
group.
The workgroup database requires permanent access and updates, and the database
system must fulfill specific requirements to guarantee the integrity of
information distributed between the clients. This includes: object
distribution; concurrency control on objects; keeping track of updates to
objects; informing clients if a state of an object has changed; informing
clients if new objects are available; and delivering new object states.


Object Distribution


OODBMSs understand object structure and behavior. The object structure
consists of the class hierarchy, attributes, embedded and referenced objects,
and containers. The object behavior is expressed by the methods of the class.
All true OODBMSs support structural object orientation, and--depending on the
language binding and the implementation language--they more or less support
object-oriented behavior. The approach with languages that use static binding
(like C++) is to copy the object to the client (without its methods) and
construct it with its normal constructor or a similar factory mechanism that
combines the data part and the methods. Languages with dynamic binding (like
Smalltalk or CLOS) can also process object methods on the server side, as
shown in Figure 5 and Figure 6.
Object copying (Figure 6) has an advantage over object sharing (Figure 5)
because methods are processed in the local environment--the overhead for a
method call is far smaller. On the other hand, the OODBMS server has more
tasks because it has to synchronize the different copies of an object within
the network.


Concurrency Control on Objects


As with conventional database systems, it's necessary that an object be
updated in a controlled way. This is referred to as "object locking." OODBMSs
can perform lock requests on an object because OODBMSs know the structure.
Lock requests can be performed on a deep (that is, with all subobjects) or a
shallow (the class hierarchy only) level. Deadlock detection must be
performed, and other clients should be informed that the state of the object
has changed (it's locked).


OODBMS Events


As soon as the server confirms a lock request, it should inform other clients
that have a copy of this object about this new state. The server sends a
message to the client, and the application program can react by changing the
color of the object, for instance, or simply avoiding an update. Other events
that should be propagated by the OODBMS server are object updates, object
deletions, and the creation of objects that belong to a set (container) of a
related object. Finally, there should be two-way communication between the
database server and the client. For instance, the server informs the client
about the state of a query and the client displays the state in a dialog box.
Another example would be the client interrupting a query with the Cancel
button in the dialog box.


Optimizing OODBMS Events


OODBMS event handling can be optimized in many ways. For instance, it might
not always be necessary or desirable to get numerous lock and update messages
about all the objects a client has loaded. The solution is a configurable
messaging concept based on several levels of information:
The client can suppress any messages to an object.
The client can select specific messages to an object.
The client can specify an update mode where new object states are
automatically delivered from the server.


The Programming Environment


To implement the sample program presented here, I've chosen a standard C++
environment under Microsoft Windows. I'll use Borland C++ 3.1 and the
ObjectWindows Library (OWL), the POET/Windows OODBMS 2.0 on the client side,
and the
POET/Novell OODBMS server on the server side.
I've simplified the sample program by developing only the persistent class
Task, a dialog box to insert or update objects of this class, and a browser to
show all instances of this class. Event handling takes place between the Task
dialog box and the Task browser. The application class is called DrDobbs. It
constructs an MDI desktop class, Desktop. This class drives the entire
application. Three different kinds of events will be handled by the
application: insert, lock, and update events.
POET's event-handling mechanism is based on callbacks that occur when a client
method is called after the POET system encounters a specific insert, lock, or
update event. When working with callbacks, keep in mind that the class that
contains the callback method must be derived from the base class PtCallback,
and that the callback method must bind to the event. Every possible POET event
has a kind of predefined slot with a default callback method that does
nothing. This slot can be reset with the user's callback method by calling a
set method or overloading the PtCallback::Notify() method.
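The slot-with-default-callback idea can be sketched in portable C++ (these names are illustrative; POET's actual PtCallback interface differs): every event starts with a do-nothing handler that the application may replace.

```cpp
#include <functional>
#include <vector>

// Sketch of the event-slot idea: every event begins with a do-nothing
// default handler that the application can replace with its own.
enum Event { InsertEvent = 0, LockEvent = 1, UpdateEvent = 2 };

class EventSlots {
    std::vector<std::function<void()>> slots_;
public:
    EventSlots() : slots_(3, [] { /* default: do nothing */ }) {}
    void set(Event e, std::function<void()> cb) { slots_[e] = std::move(cb); }
    void notify(Event e) { slots_[e](); }   // fire the bound callback
};
```

An unbound event is safe to deliver; only slots the client explicitly resets have any visible effect.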


Handling Insert Events


In this example, we'll assume that two processes (Windows clients) are working
with the same database. Process 1 inserts task objects, and process 2 displays
the task browser. Every time a new object is stored in the database, process 2
should redraw or update the Task browser. Process 1 creates the Task object,
and the user fills it in via the Task dialog box. The Task class is derived
from PtObject, a tool class from the POET environment; from it, Task inherits
the methods PtObject::Assign() and PtObject::Store(), which make the object
persistent.
Process 2 constructs an object of the class TaskAllSet. Every persistent class
in the POET programming environment is associated with a template class
<class>AllSet that contains all stored objects of this <class>. From the base
class PtAllSet, the derived class inherits methods such as PtAllSet::Seek(),
PtAllSet::Get(), and PtAllSet::Query() for accessing the persistent objects. The
task objects should be displayed in a browser window called AllsetTaskBrowser.
The task-browser object should be notified when the state of the TaskAllSet
changes--that is, when a new task object is inserted into the database.
We implement a method CallbackClass::NewObject() that is called when this
event occurs. To simplify the implementation, call the method
BWindow::Redraw(), which paints the whole browser again. (Note that BWindow is
the base class for AllsetTaskBrowser and provides basic browsing
capabilities.)
The next step is to create the watchpoint object. A database object, a
callback method, a watch mode, and a depth mode must be specified. In this
case, the database object is the TaskAllSet and the callback method is
TaskBrowser::NewObject(). The watch mode tells the database system which
events should be cached: store, update, delete, lock, and unlock. These events
are represented as bits that can be shared. Finally, the depth mode specifies
the way the object is treated. The possible modes are shallow, deep, and flat.
Once the watchpoint object is created, the Watch() method of the database
object is called and everything is set. The handling of update and delete
events is identical and we haven't included a specific treatment in our
example.
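The watch-mode bits and depth mode described above can be sketched like this (the names are hypothetical stand-ins, not POET's real API):

```cpp
#include <functional>

// Sketch: watch modes are OR-able bits; the depth mode controls how
// much of the object graph the watchpoint covers.
enum WatchMode {
    WatchStore  = 1 << 0,
    WatchUpdate = 1 << 1,
    WatchDelete = 1 << 2,
    WatchLock   = 1 << 3,
    WatchUnlock = 1 << 4
};
enum DepthMode { Shallow, Deep, Flat };

struct Watchpoint {
    unsigned modes;                        // OR-combined WatchMode bits
    DepthMode depth;
    std::function<void(unsigned)> callback;

    // Invoke the callback only for events whose bit is being watched.
    bool fire(unsigned event) const {
        if ((modes & event) == 0) return false;
        callback(event);
        return true;
    }
};
```

Combining the bits lets one watchpoint subscribe to exactly the events the client cares about, which is the filtering discussed under "Optimizing OODBMS Events."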


Handling Queries


Queries normally take more time, and the application has to provide some kind
of response--an hourglass or a message box, for instance. POET offers a
three-step mechanism to give the application the control over the state of the
request. After the request is sent, the server sends back a confirmation event
to the client. During the process, the server periodically sends a pending
event to the client, say every two seconds. Finally, the server sends a
completion event to the client when the request terminates. You can define
callbacks for these events. Our sample application has specific callbacks for
each event. We define a simple query asking for a specific deadline of a task.
The query is implemented in the method QueryAbortDialog::Query(), displaying a
dialog box with an input field. The callback methods are part of the desktop.
When the request is started, the first callback CallbackHdl::StartQuery()
displays a message box with a Cancel button. The Cancel button can be pressed
by the user and the pending event callback CallbackHdl::PendingQuery() handles
it and aborts the request. Finally, the dialog box is closed and destroyed by
the terminate callback CallbackHdl::TerminateQuery().
Unlike the watch-and-notify mechanism (see Figure 7), these events are easier
to implement because they do not have to deal with specific objects, watch
modes, and depth modes. Three methods are available in the class PtBase, which
represents an open database: PtBase::SetActionStarting(),
PtBase::SetActionPending(), and PtBase::SetActionTerminated() are responsible
for setting the callback functions in the manner just discussed. The callbacks
are set in the QueryAbortDialog::Query() method. The result of the query is a
set of task objects that we display in a browser.
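The three-step protocol can be sketched as follows (illustrative names, not POET's actual PtBase API); a pending callback returning false models the user pressing Cancel:

```cpp
#include <functional>

// Sketch of the starting/pending/terminated query protocol. The server
// side is simulated by run(); a pending callback that returns false
// cancels the request, as the Cancel button does.
struct QueryProgress {
    std::function<void()> onStarting   = [] {};
    std::function<bool()> onPending    = [] { return true; };
    std::function<void()> onTerminated = [] {};

    // Simulate a query that reports `ticks` pending events; returns
    // true if it ran to completion, false if it was cancelled.
    bool run(int ticks) const {
        onStarting();
        for (int i = 0; i < ticks; ++i) {
            if (!onPending()) { onTerminated(); return false; }
        }
        onTerminated();
        return true;
    }
};
```

Note that the terminate callback fires on both paths, so cleanup (closing the dialog box) happens whether the query completes or is aborted.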



Handling Lock Events


Receiving events indicating that other processes on the network are using or
modifying objects currently in use can have a major impact on the user
interface. For instance, a sophisticated application could gray all lines of a
browser that are in edit mode on other workstations.
For example, every object we read in and display in the browser should be
assigned a callback function to handle this event. Our simplified solution to
these kinds of events is to display a message box that mentions the name of
the task object and the event state (lock or unlock). The mechanism to define
this callback is identical to that of the insert event. You can see the
implementation in the method CallbackHdl::LockEvent().


Conclusion


What is database event-handling all about? First, and most important, I think
it is a necessary extension to databases because it helps the user get the
most accurate and up-to-date information. Second, it helps the programmer
implement systems for delivering this information. It would be extremely time
consuming to develop customized event handling on the application side. It
might be possible on a local machine, but it seems to be impossible in
multiplatform networks. Until now, we always had to take a step back and live
with restrictions like simplified data models or snapshots of databases. With
OODBMSs, we can overcome these restrictions.
The workstation model is based on event-driven applications that should work
transparently between concurrent processes. Conventional database systems are
not designed to meet these demands. They don't support event handling and
their data model is too simple to manage the complex and dynamic information
of new power applications. OODBMSs can deliver the required functionality.
Eventually, OODBMSs will likely become the next integral extension of modern
operating systems, as common as today's GUIs and network operating systems.

_EVENT-DRIVEN DATABASE PROGRAMMING IN C++_
by Dirk Bartels


[LISTING ONE]


/////// INFO.HCD

#include <bks.hxx>
#include <ptcomp.hxx>
#include <ptstring.hxx>

#include <_defs.h>

_CLASSDEF(TWindowsObject);

// ZWInfo
/////////////////////////////////////////////////////////////////////////////

enum InfoType { ZWTASK, ZWMAIL, ZWPHONECALL, ZWMEETING, ZWNOTE, ZWREPORT,
 ZWPHONEREPORT };
// forward declaration
persistent class ZWPerson;
persistent class ZWInfo
{
protected:
 PtString subject; // optional
// cset<ZWPerson*> relations; // optional
 PtDate date; // last change
 PtTime time; // last change
 PtDate expiration; // optional, if Info recycled
public:
 ZWInfo(void) {;}
 void SetSubject (PtString t) { subject = t; }
 void SetDate (PtDate &Date, PtTime &Time) {date=Date;time=Time;}
 PtDate &GetDate (void) { return date; }
 PtTime &GetTime (void) { return time; }
 PtString &GetSubject (void) { return subject; }
 char *GetSubject (PtString &t) { t = subject; return (char*)t; }

 virtual InfoType GetInfoType(void) { return ZWTASK; }
// virtual int DisplayYourself(PTWindowsObject) {;}
};



/////// NOTE.HCD

#include <bks.hxx>
#include <ptcomp.hxx>
#include <ptstring.hxx>
#include <_defs.h>

_CLASSDEF(TWindowsObject);

#ifdef PTXX
#include "info.hcd"
#endif

// ZWMemo
/////////////////////////////////////////////////////////////////////////////

class ZWBasePerson;
class ZWNote : public ZWInfo
{
protected:
 PtString text;
transient int dirty;
public:
 ZWNote(void);
 void SetText (char *Text) { text = Text; }
 void SetDirty (int Dirty) { dirty = Dirty; }
 PtString &GetText (void) { return text; }
 int IsDirty (void) { return dirty; }
 virtual InfoType GetInfoType (void) { return ZWNOTE; }
// virtual int DisplayYourself(PTWindowsObject);
};
typedef cset<ZWNote*> ZWNoteResult;

// ZWReport
/////////////////////////////////////////////////////////////////////////////
persistent class ZWBasePerson;
class ZWReport : public ZWNote
{
protected:
 ZWBasePerson* person; // person associated with this report
public:
 ZWReport(void);
 void SetPerson (ZWBasePerson *Person) { person = Person; }
 ZWBasePerson *GetPerson (void) { return person; }
 virtual InfoType GetInfoType (void) { return ZWREPORT; }
// virtual int DisplayYourself(PTWindowsObject);
};
typedef cset<ZWReport*> ZWReportResult;

// ZWPhoneReport
/////////////////////////////////////////////////////////////////////////////
class ZWPhoneReport : public ZWNote
{
protected:
 ZWPerson* person; // Called person
 PtDate date_called;
 PtTime time_called;
public:

 ZWPhoneReport(void);
 void SetPerson (ZWPerson *Person) { person = Person; }
 ZWPerson *GetPerson (void) { return person; }
 virtual InfoType GetInfoType(void) { return ZWPHONEREPORT; }
// virtual int DisplayYourself(PTWindowsObject);
};
typedef cset<ZWPhoneReport*> ZWPhoneReportResult;


/////// PERSON.HCD

#include <bks.hxx>
#include <ptcomp.hxx>
#include <ptstring.hxx>

#include <_defs.h>

_CLASSDEF(TWindowsObject);

// Address
/////////////////////////////////////////////////////////////////////////////

persistent class ZWAddress
{
protected:
 PtString street;
 PtString countrycode;
 PtString plz;
 PtString city;
 PtString postbox;
 cset<PtString> phone;
 PtString fax;
 cset<PtString> communication;
public:
 ZWAddress(void){;}

 void SetStreet (char *szStreet) { street = szStreet; }
 void SetCountry (char *szCountry) { countrycode = szCountry; }
 void SetPlz (char *szPlz) { plz = szPlz; }
 void SetCity (char *szCity) { city = szCity; }
 void SetPostBox (char *szPostBox) { postbox = szPostBox; }

 PtString &GetStreet (void) { return street; }
 PtString &GetCountry(void) { return countrycode;}
 PtString &GetPlz (void) { return plz; }
 PtString &GetCity (void) { return city; }
 PtString &GetPostBox(void) { return postbox; }
 PtString &GetFax (void) { return fax; }
 cset<PtString> &GetPhone (void) { return phone; }
 cset<PtString> &GetCommunication(void) { return communication; }

// int DisplayYourself(PTWindowsObject);
};

// Forward declaration
persistent class ZWNote;

// ZWBasePerson
/////////////////////////////////////////////////////////////////////////////


persistent class ZWBasePerson
{
/*?
friend class _CLASSTYPE BasePersonDialog;
?*/
protected:
 cset<ZWNote*> information; // list of all available information
 PtString name; // Must be completed
 cset<PtString> keywords; // list of search items

public:
 ZWBasePerson(void);

 void SetName(char *szName) { name = szName; }

 PtString &GetName(void) { return name; }
 cset<PtString> &GetKeywords(void) { return keywords; }
 cset<ZWNote*> &GetInfos(void) { return information; }

// virtual int DisplayYourself(PTWindowsObject);
};
typedef cset<ZWBasePerson*> ZWBasePersonResult;

// ZWCompany
/////////////////////////////////////////////////////////////////////////////
persistent class ZWPerson;
class ZWCompany : public ZWBasePerson
{
/*?
friend class _CLASSTYPE CompanyDialog;
?*/
protected:
 ZWAddress adr;
 cset<ZWPerson*> employees;
public:
 ZWCompany(void);

 cset<ZWPerson*> &GetEmployees(void) { return employees; }
 ZWAddress &GetAddress (void) { return adr; }

// virtual int DisplayYourself(PTWindowsObject);
};
typedef cset<ZWCompany*> ZWCompanyResult;

// ZWPerson
/////////////////////////////////////////////////////////////////////////////
class ZWPerson: public ZWBasePerson
{
/*?
friend class _CLASSTYPE PersonDialog;
?*/
protected:
 PtString title;
 cset<PtString> firstnames; // name is in BasePerson
 short gender; // gender
 cset<ZWAddress*> homeadrs;
 PtDate birthday;
 short check_birthday;

 ZWCompany *employer;
 PtString department;
 PtString function;
 PtString phone;
 PtString fax;
public:
 ZWPerson(void);
 ZWPerson(char* nm) { name = nm; employer = 0; }
 PtString &GetTitle (void) { return title; }
 cset<PtString> &GetFirstNames (void) { return firstnames; }
 short &GetGender (void) { return gender; }
 cset<ZWAddress*>&GetAddresses (void) { return homeadrs; }
 PtDate &GetBirthday (void) { return birthday; }
 short GetCheckDay (void) { return check_birthday; }
 ZWCompany *GetEmployer (void) { return employer; }
 PtString &GetDepartment (void) { return department; }
 PtString &GetFunction (void) { return function; }
 PtString &GetPhone (void) { return phone; }
 PtString &GetFax (void) { return fax; }

 void SetTitle (char *szTitle) { title = szTitle; }
 void SetGender (short Gender) { gender = Gender; }
 void SetDepartment (char *szDepartment) {department=szDepartment;}
 void SetFunction (char *szFunction) { function=szFunction; }
 void SetEmployer (ZWCompany *pCompany){ employer = pCompany; }
 void SetBirthDay (PtDate &Date) { birthday = Date; }
 void SetCheckDay (short check) {check_birthday = check; }
 void SetPhone (char *szPhone) { phone = szPhone; }
 void SetFax (char *szFax) { fax = szFax; }

// virtual int DisplayYourself(PTWindowsObject);
};

///////// TASK.HCD

#include <bks.hxx>
#include <ptcomp.hxx>
#include <ptstring.hxx>

#include <_defs.h>

_CLASSDEF(TWindowsObject);

#ifdef PTXX
#include "info.hcd"
#endif

// ZWTask
/////////////////////////////////////////////////////////////////////////////
class ZWTask: public ZWInfo
{
/*?
friend class _CLASSTYPE TaskDialog;
?*/
protected:
 PtString description;
 short done; // TRUE = done
 ZWPerson* delegated_to; // Person responsible for task
 PtDate deadline; // optional

 PtTime minTime; // Time of task start
 PtTime maxTime; // Time of task end
 PtDate minDate; // Date of task start
 PtDate maxDate; // Date of task end

 useindex ZWTaskIndex; // query optimization, see below
public:
 ZWTask();
 void SetDescription (char *Desc) { description = Desc; }
 void SetDone (short Done) { done = Done; }
 void SetDelegatedTo (ZWPerson *Person) { delegated_to = Person; }

 PtString &GetDescription (void) { return description; }
 short GetDone (void) { return done; }
 ZWPerson *GetDelegatedTo (void) { return delegated_to; }
 PtDate &GetMinDate (void) { return minDate; }
 PtDate &GetMaxDate (void) { return maxDate; }
 PtTime &GetMinTime (void) { return minTime; }
 PtTime &GetMaxTime (void) { return maxTime; }
 PtDate &GetDeadline (void) { return deadline; }

 virtual InfoType GetInfoType (void) { return ZWTASK; }
 virtual int DisplayYourself(PTWindowsObject);
};
typedef cset<ZWTask*> ZWTaskResult;

// ZWTaskIndex
/////////////////////////////////////////////////////////////////////////////
indexdef ZWTaskIndex : ZWTask
{
 minDate;
 maxDate;
};

// ZWReport
/////////////////////////////////////////////////////////////////////////////
enum ZWMailType { LETTER, MEMO, CONCEPT };
class ZWMail: public ZWTask
{
/*?
friend class _CLASSTYPE ReportDialog;
?*/
protected:
 cset<ZWPerson*> cc_list; // carbon copy list
 ZWMailType type;
public:
 ZWMail();
 void SetMailType(ZWMailType Type) { type = Type; }
 ZWMailType GetMailType(void) { return type; }
 cset<ZWPerson*> &GetCCList(void) { return cc_list; }

 virtual InfoType GetInfoType(void) { return ZWMAIL; }
// virtual int DisplayYourself(PTWindowsObject);
};
typedef cset<ZWMail*> ZWMailResult;

// ZWPhonecall
/////////////////////////////////////////////////////////////////////////////
enum ZWCallState { HAVENT_CALLED, TRIED, LEFT_MESSAGE, DONE };

class ZWPhonecall: public ZWTask
{
/*?
friend class _CLASSTYPE PhonecallDialog;
?*/
protected:
 ZWPerson* person; // Pointer to person to call
 ZWCallState status; // e.g. LEFT_MESSAGE
 short attempts; // unsuccessful attempts to call the person
 PtString notes;
public:
 ZWPhonecall();

 void SetContact(ZWPerson *Contact) { person = Contact; }
 void SetStatus (ZWCallState Status) { status = Status; }

 ZWPerson * GetContact (void) { return person; }
 ZWCallState GetStatus (void) { return status; }

 virtual InfoType GetInfoType(void) { return ZWPHONECALL; }
// virtual int DisplayYourself(PTWindowsObject);
};
typedef cset<ZWPhonecall*> ZWPhonecallResult;

// ZWMeeting
/////////////////////////////////////////////////////////////////////////////
persistent class ZWAddress; // Forward reference
persistent class ZWBasePerson; // Forward reference
class ZWMeeting : public ZWTask // appointments
{
/*?
friend class _CLASSTYPE MeetingDialog;
?*/
protected:
 cset<ZWPerson*> participants; // e.g. Miller, Smith, Bartels, Witt
 ZWAddress *location; // optional
 ZWBasePerson *host; // optional
public:
 ZWMeeting();
 void SetLocation(ZWAddress *szLocation) { location = szLocation; }
 void SetHost (ZWBasePerson *pPerson) { host = pPerson; }

 cset<ZWPerson*> &GetParticipants(void) { return participants; }
 ZWAddress *GetLocation(void) { return location; }
 ZWBasePerson *GetHost(void) { return host; }

 virtual InfoType GetInfoType(void) { return ZWMEETING; }
// virtual int DisplayYourself(PTWindowsObject);
};
typedef cset<ZWMeeting*> ZWMeetingResult;












Special Issue, 1993
EDITORIAL


You Can Never Be Too Thin, Too Rich, or Have Too Much Information on Windows
Programming


Few software systems have generated the mountains of books, forests of
articles, and swamps of manuals that Windows has--and, if there's ever such a
thing as a sure bet, you can wager there's more on the way. As much a
testament to the success of Windows in capturing the hearts, minds, and mice
of end users, the feast of Windows-related information speaks to the
complexity of Windows software development. How complex is it? For starters,
no one--not even Microsoft--can tell you the total number of Windows function
calls available to programmers. Granted, the Windows 3.1 SDK sports upwards of
700 documented calls, and Win32 even more.
But when you stir in other Windows-related APIs, such as the multimedia or
Asian SDK (which supports Kanji/Korean/Chinese), the Win32s subset, the
unreleased Win32c, Modular Windows, and the future Cairo environment, the old
head starts spinning, as the eyes roll back and glaze over. Toss in a bushel
or two of undocumented calls (yes, they really do exist), DLLs from
third-parties (like Borland's custom controls), emerging standards like ODBC,
class libraries and application frameworks, and suddenly we're talking about a
serious soufflé, API-wise.
Considering the sheer volume of absolutely critical information, even a
company with as many cooks in its coding kitchen as Microsoft can choke when
it comes to serving up that information in a timely manner. Updating the
Visual C++ pre-releases was hampered more by the need to photocopy manuals
than by fixing bugs and duping disks. By the time NT betas rolled around,
Microsoft proved it had learned its lesson, employing the Heimlich-like
maneuver of distributing the Win32 SDK, sample programs, documentation, books,
magazines, and more on CD-ROM.
What all this boils down to is that, when it comes to Win16 and Win32
development, you can never have too much in-depth, technical information. And
just as importantly, that information needs to come from programmers who've
earned their silver spatulas on the stovetops of both platforms--and that's
where Dr. Dobb's Sourcebook of Windows Programming comes in through the
swinging doors.
This special edition of Dr. Dobb's Journal focuses exclusively on Windows 3
and Windows NT programming. As with the regular issue of Dr. Dobb's Journal,
all of the articles in this issue are written by battle-scarred Windows
programmers who are graciously sharing with you some of what they've
learned--techniques they would have liked to have read about before they
launched into serious development projects.
There's no question that you'll find this special edition an invaluable
addition to your programming bookshelf. We look forward to hearing about the
Windows problems the Sourcebook helps you solve--and maybe even turning those
experiences into articles.
Jonathan Erickson, editor-in-chief














































Special Issue, 1993
A Multitool Approach to Windows Development


The process is just as important as the product




Al Stevens


Al is a contributing editor to DDJ and can be contacted at 411 Borel Ave.,
San Mateo, CA 94402.


It's no secret that programming for GUIs like Windows is difficult. It takes a
long time to become expert in each GUI. And because no two are alike, even if
you are proficient in one GUI, you can't transfer much of that experience to
another.
Consider the learning curve for a new Windows programmer. The first hump is
the event-driven programming model. It's not quite object oriented, and it's
not quite traditional procedural programming. Once you understand the model,
you must learn all of the messages, functions, and nuances of each. The API
has almost 1000 functions. Few programmers can remember them all, and you must
develop an associative memory about them so that a given need reminds you of
the message or function that will solve the problem at hand. Then there's the
folklore. Windows programmers learn tricks, workarounds, and undocumented
features to get over the hurdles and toward a working application. Debugging a
Windows application is another issue. Windows debugging can be art in itself,
sometimes involving multiple monitors, perhaps even two computers connected on
a network or by a serial cable. Only recently have Windows-hosted debuggers
become available.
The approach I'll discuss here is based on Borland C++, tested with Turbo
Debugger under DOS. The version of the program I present is for testing only,
designed to verify the application-specific algorithms without regard to the
user interface, which is a stubbed iostream module. Once I've debugged the
application, I convert it to a Windows DLL and integrate a Visual Basic UI
front end. The result is a complete Windows application, developed without a
lot of Windows programming knowledge and retaining the strength of a
platform-independent C++ design.


Visual Development Environments


Visual development environments (VDEs) and class libraries are emerging to
address the difficulties in programming Windows. They encapsulate much of the
API and provide visual UI composition tools that take much of the burden off
the programmer. Though not the first of its kind, Visual Basic changed the
nature of software development for GUI platforms when it combined the
traditional Interactive Development Environment (IDE) with visual
resource-design tools and defined what visual programming should be.
In a visual development environment, a programmer uses interactive tools to
build windows and dialog boxes, constructing the visual parts of the
application on the screen as they will appear in the running program. Then the
programmer attaches fragments of code to user events: menu selections, button
presses, mouse movements, and so on. Visual Basic programmers use a structured
superset of Basic for this code. Visual Basic caught on because it is easy to
learn, easy to use, and lets non-Windows programmers write Windows
applications.
The success of Visual Basic was followed by cries from programmers for visual
versions of other languages. Loudest of these calls came from C and C++
programmers. Now there are tools that support visual programming environments
with C++ as the host language. Visual C++ is Microsoft's answer to that cry.
Application::ctor (pronounced "constructor"), from Compass Point Software, is
a similar product that aims a smaller dart at the same big target. (See the
accompanying textboxes entitled, "Visual C++" and "Application::ctor,"
respectively.)
If you are looking for Visual Basic with a C or C++ dialect, forget it.
Neither of these products is to C and Windows what Visual Basic is to Basic
and Windows. Visual Basic is an interpreted language that seamlessly
integrates user-interface design and application development. Visual C++ and
Application::ctor are applications generators. They generate C++ source code
from your visual design and compile it for you. The advantages of an
applications generator over an interpreter are that the compiled applications
run faster and do not need the interpreter's dynamic link library, which means
that you don't have to distribute and install a third-party DLL with your
application. Furthermore, the application's source programs are complete,
stand-alone modules that you can modify and recompile outside of the visual
programming environment. As such, they have access to all of the features
available to a traditional Windows program. There is no new wall to hit.
However, certain disadvantages become apparent during development. An
interpreted program runs slower than a compiled one, but the interpreter
builds a testing application faster. A C/C++ interpreter is possible, but
neither Visual C++ nor Application::ctor uses one. The development model is
different, too. With Visual Basic you click a control on your design, and
instantly you are editing the code that processes the control's command. Make
a small change, tell the interpreter to run the program, and the program runs
without delay. Stop in the middle, change something else, and resume running.
Everything is integrated and fast. A programmer hammers out an application in
the shortest possible time. A C/C++ applications generator could work that
way, but these two do not.
Both Visual C++ and Application::ctor loosely associate, rather than tightly
integrate, design and development. You design the windows; the generators then
build a skeletal source program. Next, you modify and add source code.
However, instead of clicking a button on your design to go directly to the
button's event code in an editor, you navigate to the code through lists.
First you view the class that represents the window that gets the event. Then
you select the event member function from a list. It's a bit of a trek from
the window with the button on it to the C++ code that executes when the user
clicks that button. Once you get there and add or change some code, you have
to compile and link the program. Both systems automate that process, but they
take longer than Visual Basic to rebuild an application after you've made even
the smallest change. That's the price you pay for compiled code.
If you are looking for a visual programming environment that frees you from
the drudgery of the Windows API, forget that, too. An application developed
with either Visual C++ or Application::ctor needs code fragments associated
with user events. You must write the appropriate C++ code within the member
functions of the window classes. With either package, this process requires a
thorough understanding of Windows programming. The class libraries do not
encapsulate everything. There are things you will want to do for which no
member functions exist.
And if you are a C programmer who does not want to learn C++, neither of these
products is for you. Both are bound to class libraries that encapsulate some
of the Windows API. You must know how to read and use C++ classes, and you
must become an expert on the particular class library that each product uses.
To add functionality to an application, you modify the derived classes that
the generators build, adding data members and member functions. To communicate
between windows you must know how to send C++ messages to instantiated objects
of classes. To know what messages are available to a derived class, you must
understand inheritance, function overriding, and virtual member functions. All
of that requires C++ skills.
Rather than reducing the knowledge required from a would-be Windows
programmer, these packages add to it. You need to know the API; you need to
know C++; you need to know the class libraries. That's not necessarily bad,
however. If you want to be a power Windows programmer, you need those skills
anyway. You have to immerse yourself in the hundreds of API functions and
messages and all the underground folklore about undocumented features and
such. You must know C++ intimately and have a decent class library that
encapsulates as much of the Windows API as possible. However, if all you want
is to knock out an occasional custom Windows application without joining the
priesthood, there is no C or C++ solution for you, so you'd better stick with
Visual Basic.
The main difference between Visual Basic and these new programs is, therefore,
the length and pitch of the learning ramp: short and flat vs. long and
vertical. Other differences are aesthetic and relative to what you were doing
before. A C++ Windows programmer coming to Visual C++ or Application::ctor
from the SDK/API environment will be impressed. A C programmer who has been
using Visual Basic while waiting for a visual C environment will be
disappointed.
But VDEs tend to bind the application to the platform. A Windows program
developed with a VDE--or any other tool, for that matter--is anything but
portable.
The result is that applications written for today's GUI platforms--Windows in
particular--tend to be nonportable. The algorithms that support most
applications should be virtually platform independent, yet porting
applications between GUIs usually involves a complete rewrite.
How important is portability to an application? In an interview in the C++
Special Supplement to Dr. Dobb's Journal (December 1992), Bjarne Stroustrup
said, "Portability is an economic issue. You're portable if it's cheaper to
modify the program [than to] build a new program from scratch."
The issue is cost. The secret is to foresee, project, and manage that cost
during the design and development of a program's first version. If you're
hammering out a custom application for a user, portability is not an immediate
concern. But if the user changes systems and wants to take the application
along, portability becomes an instant and important issue. If you planned for
it, you're one step ahead; if not, you must start over.


Designing Windows Apps


The approach to Windows development presented here involves five stages of
design and development:
1. Develop the application for DOS in generic C or C++. Don't worry about the
UI.
2. Build a functional API into the application. Use a DOS/Windows porting
layer with compile-time conditionals that translate the API into either C++
functions or Windows-exportable DLL functions.
3. Test as a DOS application with a command-line UI stub and a DOS
source-level debugger. Get the application-specific algorithms working in the
least intrusive environment. The UI stub calls the API as C++ functions.
4. Convert the application to a Windows DLL. Properly planned, this is a
simple matter of recompiling the application and allowing the porting layer to
establish the API as exportable DLL functions.
5. Develop the UI in Visual Basic. The Visual Basic program interfaces with
the now-debugged application DLL by calling the exportable API functions.
The advantages of this approach are as follows:
You don't need to be a Windows-programming expert to develop a Windows
application. A strength of Visual Basic is that it encapsulates the Windows
API in ways that hide it from the programmer.
The substance of the application is not bound to the performance limitations
of interpreted Visual Basic. Interpreted code does not execute as efficiently
as compiled native code. By offloading the application-specific,
processor-intensive processes to the C++ DLL, you put the power of the
compiled language where it is needed. Interpreted Visual Basic is used only
for the interactive parts.
Having been written in C/C++, the application is essentially platform
independent. A project can leverage its investment in the C/C++ application
code from an earlier version on a different platform.
You can take your time rewriting the UI with a more-efficient C/C++ class
library such as MFC or OWL with few or no changes to the application code.
This approach encourages a team effort, in which the UI team develops the
Visual Basic stuff, perhaps working with a dummy applications DLL. The
applications team develops the application, using DOS to simulate the user
interface. The interface between the two efforts is clearly defined:
exportable C/C++ functions that exchange strings and numbers between the two
parts of the application.
If the stubbed DOS user interface is usable, a by-product is a DOS
application. If not, a third team can integrate the DLL with a DOS
user-interface tool.
The rapid prototype nature of this approach allows you to field a working
system in the shortest possible time.
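The compile-time conditional of step 2 can be sketched roughly as follows. The macro names match the article's listings, but the function body here is a hypothetical stand-in, not Memoir's real database search:

```cpp
#include <cstring>

// Pretend we are building the DOS test version (step 3); a Windows DLL
// build would leave DOS_VERSION undefined and take the other branch.
#define DOS_VERSION

#ifdef DOS_VERSION
#define APPLAPI /* expands to nothing in the DOS build */
#else
#include <windows.h>
#define APPLAPI FAR _export PASCAL  // qualifiers the Visual Basic caller needs
#endif

// One representative API function compiled through the macro.
// The body is a hypothetical stand-in, not Memoir's actual search.
extern "C" int APPLAPI FindDocument(char *key, char *Subj, char *Text)
{
    (void)Subj;                          // unused in this sketch
    return std::strstr(Text, key) != 0;  // 1 if key occurs in Text
}
```

Because APPLAPI vanishes under DOS_VERSION, the same declaration serves both the DOS test build and, later, the exportable DLL build.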



Memoir: A Case Study


To illustrate this approach, I'll use a small Windows application called
"Memoir" that resembles the Windows Cardfile application, but is more suitable
for storing short notes. Each note has a subject and text, and the user can
search the database for a key word or phrase match.
The Memoir application code is written in Borland C++, and its UI in Visual
Basic. The source code is available electronically; see "Availability," page
3.
The first part of the program design considers which functions will be written
in C++ and which are deferred for the UI. All file storage and retrieval is
written in C++. The search algorithm is in C++, and everything related to
screen displays and user input is in Visual Basic. This dividing line clearly
separates the GUI-dependent code from code unique to the application--and
that's the objective.
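The article does not show the search algorithm itself, but a C++ key word or phrase match of the kind the search module performs might look like this (a hedged sketch; the function name and case-insensitive behavior are my assumptions):

```cpp
#include <cctype>

// Case-insensitive substring match: returns 1 if key occurs anywhere
// in text, so that "windows" also finds "Windows".
int MatchesKey(const char *text, const char *key)
{
    for (const char *p = text; ; ++p) {
        const char *t = p;
        const char *k = key;
        while (*k != '\0' &&
               std::tolower((unsigned char)*t) == std::tolower((unsigned char)*k)) {
            ++t;
            ++k;
        }
        if (*k == '\0')
            return 1;   // whole key matched starting at position p
        if (*p == '\0')
            return 0;   // ran off the end of the text
    }
}
```

A routine like this, having no screen or keyboard dependencies, is exactly the kind of code that belongs on the C++ side of the dividing line.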
To develop the application, I use the Borland C++ 3.1 compiler, the Brief 3.1
programmer's editor, and Turbo Debugger. You can use any
compiler/editor/debugger package, including an integrated development
environment (IDE), as long as it supports Windows DLL development.
With the responsibilities of both parts of the system defined, it's now
necessary to establish the interface between the two. In its final version,
the Memoir application code will be in a DLL. Visual Basic programs can call
DLL functions with some small restrictions. Therefore, the interface consists
of API function calls that comply with the Visual Basic/DLL specification.
Initially, however, the program will be tested from DOS, so the API must be
compatible with DOS C++ programs, too. Listing One (page 13) is the API as
taken from the memoir.h header file. Note the #define statement for APPLAPI.
In the DOS build, the token expands to nothing. Later, that token helps to
make the program a DLL.
The API consists of functions to open and close the database and add, find,
change, and delete documents. The functions are representative of
UI-independent, application-specific algorithms. The functions pass string
parameters by reference, which is a Visual Basic convention. It's also a C/C++
convention, so the API should adapt well to porting to other GUIs or other
Windows user-interface tools.
To test the application, use a simple DOS command-line UI. The source-code
module that simulates the UI won't be a part of the DLL. It should, however,
have enough functionality to exercise all of the features of the application.
Listing Two (page 13) is an abstract of the Memoir UI stub. Along with a few
supporting functions that read user input and display output, this stub is all
that is needed to test the essence of the application. Listing Three (page 13)
is the Memoir makefile that builds the DOS test version.
The application itself consists of three source files--applapi.cpp,
database.cpp, and search.cpp--plus the memoir.h header file. The details of
how they work aren't important to this discussion. It's enough to know that
they implement the storage, search, and management of documents in the
database. The uistub.cpp program is the DOS UI stub. Note the DOS_VERSION
conditional that is defined by a command-line switch.
Because this is a DOS C++ program, you can use Turbo Debugger to test it. The
program uses the fstream class to read and write the database. It uses the C++
new and delete operators to allocate memory from the heap. Other than for the
unconventional APPLAPI usage, there's nothing in the code to indicate that it
will be included in a DLL after it is debugged.
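The fstream usage might be sketched as follows. The fixed-size record layout and the file-position arithmetic are my assumptions; the article says only that the database module reads and writes with the fstream class:

```cpp
#include <cstring>
#include <fstream>

// Hypothetical record layout for one Memoir document.
struct DocRecord {
    char subj[30];
    char text[200];
};

// Append one fixed-size record to the database file.
void WriteDocument(const char *dbname, const DocRecord &doc)
{
    std::ofstream out(dbname, std::ios::binary | std::ios::app);
    out.write(reinterpret_cast<const char *>(&doc), sizeof doc);
}

// Read the record at position n; returns true if a full record was read.
bool ReadDocument(const char *dbname, long n, DocRecord &doc)
{
    std::ifstream in(dbname, std::ios::binary);
    in.seekg(n * (long)sizeof doc);
    in.read(reinterpret_cast<char *>(&doc), sizeof doc);
    return in.good();
}
```

Fixed-size records make GetNextDocument-style sequential access a simple matter of advancing the file position by sizeof(DocRecord).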
The DOS_VERSION compile-time conditional specifies that the code is being
compiled as a DOS program. It's used in memoir.h as in Listing Four (page 13).
When DOS_VERSION is defined, the APPLAPI token is an empty comment. When it is
not defined, the program includes windows.h and defines the token to add the
FAR _export PASCAL qualifiers to the API functions. These qualifiers are
required for DLL functions that will be called from Visual Basic.
All of the API functions are compiled with the extern "C" linkage
specification so that their names are not mangled by the C++ compiler and are
accessible to the caller.
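A minimal illustration of that linkage specification (the function names here are hypothetical, not Memoir's):

```cpp
#include <cstring>

// Without extern "C", a C++ compiler mangles the name, encoding the
// parameter types into it, so a caller such as Visual Basic's Declare
// statement could not find the function by its plain name.
extern "C" void MakeGreeting(char *Subj)
{
    std::strcpy(Subj, "hello");   // visible under the unmangled name
}

// The same source file can still contain ordinary C++-linkage
// functions; only declarations inside extern "C" are affected.
int Twice(int n) { return n * 2; }
```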


Windows Development


After the application program is fully tested as a DOS program with the UI
stub, it can be built as a DLL. I use Turbo C++ for Windows (TCW) to do that.
I could use the Borland C++ command-line compiler, but the Windows-hosted TCW
IDE is more convenient for making small changes to the DLL during Windows
testing.
The DLL substitutes a small source module for the user-interface stub. Listing
Five (page 13) is the source module, dll.cpp. The LibMain and WEP functions
are the standard entry and exit points for a DLL. RegisterAppl allows the
Visual Basic program to determine if a prior instance of the application is
running. The Memoir program does not want multiple instances of itself running
because it uses a constant database filename. Subsequent instances of the
program would run into sharing problems.
The DLL project file includes the module-definition file, the same
applapi.cpp, database.cpp, and search.cpp source files that the DOS makefile
includes, and the dll.cpp file that contains the DLL-specific code.
The UI code is developed in Visual Basic. Memoir is a single-document
interface application with a menu and some text boxes and buttons. Memoir
communicates with the application DLL by declaring the API functions in
Listing Six (page 13) in the MODULE1.BAS source file.
Listing Seven (page 13) is taken from the Visual Basic part of the
application. This code executes when the application window is first loaded.
It calls RegisterAppl to see if an earlier instance of the program is running.
If so, this program calls a Windows API function to give the focus to the
original instance. Otherwise the program calls the API to open the database,
read the first document, and put its text into the application window.
I keep all the code under the main subdirectory named MEMOIR, which has two
subdirectories named VB and DOS. The DOS subdirectory code contains the C++
source files, the DOS makefile, the DLL definition file, and the TCW project
file. The VB subdirectory has the Visual Basic source files.
When TCW builds the DLL, it writes the .OBJ and .DLL files into the VB
subdirectory, where they can be found by the Visual Basic runtime. When the
bcc command-line compiler builds the DOS executable, those .OBJ files and the
.EXE are written into the DOS subdirectory. With this setup, the two make
processes do not confuse dependent object modules.
I set up a Program Manager group window for Memoir development. It contains
icons to execute Visual Basic, BCW++, MS-DOS, and the Memoir executable file
itself. By building a specialized Program Manager group, I can cause the
different development environments to start up in the correct subdirectory and
to execute with the relevant file on the command line. This procedure saves
time. Each execution of Visual Basic, for example, does not need to first load
the default empty application and all the VBX controls.


Conclusion


Using the tools and techniques presented here, you can develop a Windows
application without possessing extensive knowledge about Windows programming.
You'll be able to use existing C/C++ applications code, and the applications
code itself will be relatively portable to other platforms.


Visual C++


Visual C++ 1.0 is Microsoft C/C++ 7.0 with the Foundation Class library (MFC
2.0), surrounded by a set of visual development tools; the compiler itself is
labeled version 8.0.
To build an application, you begin in the Visual Workbench, an IDE-like
multiple document interface (MDI) program that manages an application as a
project. You start with nothing, and must first define a project to manage
your application.
The AppWizard tool builds skeleton source code for the application. First, you
name the project and specify the subdirectory for all the source files. You
then decide whether the application will use the single or multiple document
interface, if it will have a tool bar, if there will be print commands, and so
on. You look at the classes that AppWizard creates and change the source-code
filenames if you want to. Make all of these decisions carefully. It's your
only chance. When you are done, AppWizard builds a makefile for the project
and the source files for the skeleton application. If you must make changes,
you will need to learn the format of that makefile, change it yourself with an
editor, maybe delete or rename some source files and change their contents,
and rebuild everything, all outside of the visual environment, all definitely
not visual.
Having described an application to AppWizard, you build it from the Visual
Workbench. Without adding any code, you can compile an impressive, running
Windows program with a menu, a tool bar, a status bar, and document windows.
Certain dialog boxes will retrieve and store documents; others will print
documents, preview printed output, and select printer options. You can move
the application window around, change its size, and minimize, maximize,
restore, and exit from it, all in the standard CUA way. Most commands have
buttons on the tool bar, too. The application doesn't do anything yet, but a
lot of the code that you wrote for yourself in the old days has been written
by the AppWizard tool.
You modify the source code in the Workbench's editor windows. When you change
something, the next Build command senses the change and, in typical,
well-behaved makefile fashion, rebuilds the dependent targets. Some of the
changes might require a Rebuild All command, which recompiles everything, but
the Workbench doesn't do that automatically. The symptom might be a breakpoint
at a source line other than the one you thought you selected, or it might be a
cryptic error message saying that the project database is out of sync. When
things don't look quite right, run the Rebuild All command yourself. That
should straighten things out.
The Class Wizard tool provides a list of the classes in your application and
the events that they process. At first, there are four classes: the
application, the main frame window, the document, and the view of the
document. As you add document types, views, dialog boxes, and so on, the
number of classes increases. Each class has a list of object identifiers. For
example, a dialog box will have an object for itself and one for each of its
controls. Each object identifier has a list of Windows messages that it can
receive. Each message may have an associated class-member function. You can
add and modify member functions from the Class Wizard to define the behavior
of the application.
The App Studio tool builds resources: menus, dialog boxes, strings, bitmaps,
icons, and accelerators. The AppWizard has already built a menu and a bitmap
for the tool bar, and there are already some default dialog boxes. You use App
Studio to add to and modify these resources. When you add a menu selection,
for example, you assign it properties, such as whether it is enabled, checked,
and so on, and you give it an object-identifier code. The menu is associated
with the identifier of an already-defined frame-window class. To add code to
process the new menu selection, you exit from App Studio, go into Class
Wizard, select the frame window class, find the new object-identifier code,
select from the two messages that it can receive, select the Add Function
command, and approve or change the function identifier that Class Wizard
assigns. You are then in the editor, ready to add code. Compare that to Visual
Basic, where you click the menu selection and edit the code.
Visual C++ supports the advanced VBX controls that Visual Basic Professional
programmers have. These controls have the three-dimensional look with beveled
boxes, buttons that stay down, gauges, grids, and other specialized controls.
--A.S.


Application::ctor


Application::ctor (pronounced "constructor") from Compass Point Software is an
applications generator that generates C++ source code from your visual design,
then compiles it for you. The advantages of an applications generator over
interpreters (like Visual Basic) are that the compiled applications run faster
and don't need the interpreter's DLL, which means that you don't have to
distribute and install a third-party DLL with your application. Furthermore,
the application's source programs are complete, stand-alone modules that you
can modify and recompile outside of the visual programming environment. As
such, they have access to all of the features available to a traditional
Windows program.
Application::ctor is not self-contained. You need a C++ compiler. It supports
the Borland, Microsoft, and Zortech C++ compilers. There is no integrated
debugger. To debug Application::ctor programs, you modify the generated
makefile and exit from the environment to debug with the debugger that comes
with your compiler.
When you run Application::ctor, it opens an empty MDI frame window. When you
start to build a new application, the main window gets four document windows,
representing the four kinds of objects you can build. An application consists
of window objects, menu objects, graphic objects, and string objects. You
design your application windows and dialog boxes in the Window Objects
document window. The menu bar, pull-down menus, and floating menus are in the
Menu Objects document. Graphics are in the Graphics Objects document, and
strings are in the Strings Objects document.
You can build a prototype of your application's user interface without writing
any code. For example, the Window Objects document contains an icon for each
of the windows that you define. Double click on an icon, and you're running
the View Editor for that window. The View Editor is where you define the size
and position of the window and where you add controls: buttons, text, edit
boxes, list boxes, and so on. You drag control icons from a tool bar to the
window. Then you modify the appearance and behavior of each window and control
by changing the values of window and control properties and by assigning
functions to events.
A window or control exhibits behavior in response to events. Each window and
control is an object of a class. The Application::ctor View Editor associates
events with class member functions, and the system includes enough prototype
member functions to permit you to build most of the application as a visual
mockup. You add custom behavior by deriving classes from Application::ctor
classes and adding member event functions. These derived classes are called
"window managers." The Application::ctor program adds your custom C++ manager
functions to the list of functions that you can associate with events. You
write these class definitions and implementations in a prescribed format, and
then you run the C++ Source Browser to incorporate those source files into
Application::ctor's list of programs to build.
When you design a window, you're looking at it on the screen. As you change
the design, the representation changes. Certain aspects of that design are not
so easy to use, however. For example, to change the window's size and
position, you enter the coordinates and dimensions into a dialog box. You
don't really see the menu bar on the mockup window. You see the name of a
menu-bar object that you build in the Menu Objects window.
You can design dialog boxes and open them automatically using the prototype
function that Application::ctor generates. But these default dialog boxes are
modeless, which means that the user can select other windows and menus in the
application while the dialog box is active. The user can even select the
command that opened the dialog box and open another copy of the same one. To
build a modal dialog box requires some tricks in a window manager.

There are procedures common to every application that you have to use to build
and run the application. You have to tell Application::ctor to build the .DEF
file and the makefile. If you have changed your custom source code, you have
to run the Source Browser and rebrowse your source. If you forget, your only
indication is that the system does not build the .EXE file and won't run your
application.
--A.S.
[LISTING ONE] (Text begins on page 7.)

#define APPLAPI /* */

void APPLAPI OpenDocuments(char *dbname);
void APPLAPI CloseDocuments(void);
void APPLAPI AddDocument(char *Subj, char *Text);

void APPLAPI DeleteDocument(void);
void APPLAPI ChangeDocument(char *Subj, char *Text);
int APPLAPI FindDocument(char *key, char *Subj, char *Text);
void APPLAPI GetFirstDocument(char *Subj, char *Text);
void APPLAPI GetNextDocument(char *Subj, char *Text);

[LISTING TWO]

#include <iostream.h> /* cout (pre-Standard header, as published in 1993) */
#include <conio.h>    /* getch */

/* The Listing One prototypes and the Subject, Text, and Keyword buffers
   are assumed to be declared elsewhere in the program. */
void main()
{
 OpenDocuments("test.dat");
 int done = 0;
 while (!done) {
 cout << "\n ----------------------------------";
 cout << "\n Add Srch Fst Nxt Chg Del List Quit ";
 cout << "\n ----------------------------------\n";
 switch (getch()) {
 case 'a':
 AddDocument(Subject, Text);
 break;
 case 's':
 if (FindDocument(Keyword, Subject, Text))
 DisplayDocument();
 break;
 case 'f':
 GetFirstDocument(Subject, Text);
 DisplayDocument();
 break;
 case 'n':
 GetNextDocument(Subject, Text);
 DisplayDocument();
 break;
 case 'c':
 ChangeDocument(Subject, Text);
 break;
 case 'd':
 DeleteDocument();
 break;
 case 'l':
 GetFirstDocument(Subject, Text);
 while (*Subject) {
 ShowSubject();
 GetNextDocument(Subject, Text);
 }
 break;
 case 'q':
 done = 1;
 break;
 default:

 break;
 }
 }
 CloseDocuments();
}

[LISTING THREE]

 bcc -ml -c -v -DDOS_VERSION {$*.cpp }

OBJS=uistub.obj applapi.obj database.obj search.obj

memoir.exe : $(OBJS)
 bcc -v -ml -ememoir.exe $(OBJS)

[LISTING FOUR]

#ifdef DOS_VERSION
#define APPLAPI /* */
#else
#include <windows.h>
#define APPLAPI FAR _export PASCAL
#endif

[LISTING FIVE]

#include "memoir.h"

static HINSTANCE hInst = NULL;


extern "C" {
int FAR PASCAL LibMain(HINSTANCE hInstance, WORD wDatSeg,
 WORD cbHeapSize, LPSTR lpCmdLine)
{
 if (hInst == NULL) {
 hInst = hInstance;
 if (cbHeapSize > 0)
 UnlockData(0);
 }
 return 1;
}
int FAR PASCAL WEP(int nParameter)
{
 return 1;
}
int FAR _export PASCAL RegisterAppl(HANDLE hWnd)
{

 static HANDLE semaphore = 0;
 HANDLE sem = semaphore;
 if (semaphore == 0)
 semaphore = hWnd;
 return sem;
}
}

[LISTING SIX]


Declare Sub OpenDocuments Lib "memoir.dll" (ByVal db As String)
Declare Sub CloseDocuments Lib "memoir.dll" ()
Declare Sub GetFirstDocument Lib "memoir.dll" (ByVal Subj As String, ByVal Text As String)
Declare Sub GetNextDocument Lib "memoir.dll" (ByVal Subj As String, ByVal Text As String)
Declare Sub AddDocument Lib "memoir.dll" (ByVal Subj As String, ByVal Text As String)
Declare Sub DeleteDocument Lib "memoir.dll" ()
Declare Sub ChangeDocument Lib "memoir.dll" (ByVal Subj As String, ByVal Text As String)
Declare Function FindDocument Lib "memoir.dll" (ByVal Key As String, ByVal Subj As String, ByVal Text As String) As Integer

[LISTING SEVEN]

Sub Form_Load ()
 Dim handle As Integer
 handle = RegisterAppl(form1.hWnd)
 If handle <> 0 Then
 n = SetFocusAPI(handle)
 shutdown
 Else
 ' ------ initialization code
 OpenDocuments ("memoir.dat")
 GetFirstDocument SubjBuffer, TextBuffer
 DocumentSubject.Text = SubjBuffer
 DocumentText.Text = TextBuffer
 End If
End Sub
End Listings

Special Issue, 1993
Memory-mapped File I/O


Fast executables and reduced development time for Windows NT




Doug Huffman


Doug, who is co-owner of FlashTek, was both the author of the 32-bit Zortech
DOSX extender and the project leader for the FlashTek X-32VM DOS extender. He
can be contacted at 208-476-4781 or doug@proto.com.


The full power of the 80386/486 processor is just beginning to be tapped by
the operating systems available today. Multitasking, virtual memory, a huge
address space, multiple levels of protection, and more are available either
through completely new operating systems such as Windows NT and OS/2 2.0, or
through DOS extenders.
So much emphasis has been placed on the 386 features available to
operating-system designers that we often overlook the features of the
processor that can make life easier for application designers. One such
feature is the ability to remap the linear address space so that reading and
writing memory becomes equivalent to reading and writing a device such as a
disk drive. This feature of the 386 was primarily intended for implementing
virtual memory, whereby the operating system emulates additional memory by
swapping portions of memory to disk and remapping pages of RAM to the
locations the application is attempting to read or write.


Virtual Memory


Writing an application that runs under a well-optimized virtual-memory (VM)
manager can be a real high for the programmer first experiencing it. Suddenly,
the free memory available is roughly equal to free disk space, and physical
RAM is remapped, shuffled about and reinitialized from disk, transparent to
the application. As the huge array of numbers in "memory" is crunched,
altered, and processed by your application, the machine takes on a life of its
own: The hard drive is busy retrieving and storing data from the less-commonly
accessed portions of the linear address space, while the more-commonly
accessed portions of code, data, and stack usually stay in memory and are
seldom, if ever, swapped out.
It seems that we may never get enough memory to make us content, but we must
sometimes stop and take a good look at just what we are doing with all of this
memory. Where does all the information come from that fills up memory? In most
applications, nearly all information originates on disk. Memory space is used
to hold the executable code and the data the application is processing, both
of which must be loaded from disk prior to the application utilizing it. With
this in mind, it is apparent that providing an efficient link between the
application and the disk drive is essential.
Virtual memory is one such link between the disk drive and the computer's
memory. The application's main link to disk is typically in the form of reads
from disk to memory or writes from memory to disk. In many situations, the
application's efforts at reading and writing files are not well coordinated
with the operating system's attempt to provide efficient virtual memory. For
example, suppose the application is reading a large file into memory in
preparation for processing data. If the size of the read exceeds the available
physical memory, the VM manager will start swapping data to a swap file on
disk. The net result is often silly: while the application is reading data
from a file into memory, the VM manager is busy swapping the same data back
out to a swap file. The data may thus be read from the original file, written
to the swap file, and then, when the application finally accesses it, read
back in from the swap file. This is obviously not optimum for either execution
time or efficient use of disk space. If an
application attempts to load files totaling several Mbytes in size into VM on
a system with a nearly full hard drive, there simply won't be enough free disk
space available for swap space. When used as just described, a conventional VM
manager essentially stores duplicates of files read from disk in the swap
file, soaking up disk space and execution time. Many databases today operate
on machines on which the disk drive(s) are nearly filled with huge files, and
it is clearly impractical to read all required files into VM since there isn't
space for a swap file of sufficient size. Even if there were, the execution
time would be severely impacted with this simple approach.
Because of the aforementioned limitations, most programmers working with large
data files are forced to read in a small portion of each file as needed. To
optimize the speed of this type of application, the programmer must keep
records of which portions of which files have been read into memory, and must
keep commonly accessed portions of files in memory permanently while
overwriting less-commonly needed portions with other data. The program must
keep records indicating which portions need to be written back to disk, and so
on. This can be a challenging, time-consuming problem, and many an hour has
been spent trying to optimize and debug programs of this nature.


MMFIO


A much simpler approach is memory-mapped file I/O (MMFIO). This is supported
by modern 386 operating systems such as Windows NT and some versions of UNIX,
and is also available to DOS programmers through 32-bit DOS extenders such as
Phar Lap's 386VMM and FlashTek's X-32VM DOS extender. MMFIO is a feature
whereby a file can be mapped into a linear address space with a simple call to
the operating system. It represents a cooperative effort between the VM
manager and the application to optimize the speed of the application, reduce
total disk I/O, and conserve valuable disk space. Perhaps the most important
advantage of MMFIO is that it usually results in the simplest implementation
of a disk I/O intensive application.
Once a file is mapped into memory, the application can read and write from
that file simply by reading and writing to memory. The operating system's VM
manager deals with the task of reading only those portions of the file that
are accessed and writing back only those portions that are altered. Most 386
VM managers work with 4-Kbyte chunks of memory called pages. Commonly accessed
pages of a file usually stay in memory continuously, while seldom-accessed
pages are swapped to disk if required. A well-optimized virtual-memory
manager provides runtime optimization, controlling which pages of a file are
kept in memory and which are swapped to disk. This takes into account the
amount of physical RAM available on the machine and which portions of a file
are frequently accessed. It provides a bias towards not swapping modified
pages, since those require the extra step of writing the page to disk while
unmodified pages already exist on disk. (The 386 processor automatically marks
which pages are modified, or "dirty," in the operating system's page tables.)
MMFIO not only saves the application designer many hours of hard work when
attempting to process files larger than physical RAM, it can often result in
faster execution time than is possible with conventional solutions to the
problem.


An Example


To illustrate, I'll examine a hypothetical application called prog.exe--a
5-Mbyte executable that manages a customer list, reads and writes to two files
on disk, and is operating on a system with 2 Mbytes or less of RAM. I'll map
in a file called customer.dat, a large list of customers with names,
addresses, phone numbers, and other data contained in thousands of structures
(see Listing One, page 18). I'll also map in state.dat, which contains
configuration information indicating how the application was last used: which
menu was in use, which screen colors were selected, what macros were set up,
and so on. state.dat is used essentially as nonvolatile RAM to store the state
of the application.
I tested the code discussed here with both the Zortech and Watcom 32-bit
compilers in combination with the FlashTek X-32VM DOS extender. The functions
_x32_mmfio_open and _x32_mmfio_flush are provided with the X-32VM DOS extender
but similar functions exist for other operating systems supporting MMFIO. For
example, refer to the Windows NT functions MapViewOfFile and FlushViewOfFile
in the accompanying textbox entitled, "Memory-mapped Files for Windows NT."
When the operating system is initializing, it sets up a swap file to support
the memory allocated by malloc and the stack space; I'll refer to it as
swapfile.dat. X-32VM also keeps the .exe file open for VM purposes and uses it
as a read-only file to access pages of code and static data that the
application doesn't modify. It will automatically switch any pages modified by
the application over to the swapfile, which is a read/write file. prog.exe
calls the function map_files (see Listing Two, page 18) during initialization.
This function opens the files and maps them into the linear address space. The
function _x32_mmfio_open is passed an open file handle and the size of the
address space to map in. Note that the requested address space can be larger
than the actual file size. This allows the file to be expanded if more
customers are to be added to the list. In this case, both files are opened in
read/write mode, but a read-only file handle can be used if the application
treats that region of memory as read only. _x32_mmfio_open returns a near
pointer to the address space in which the file is mapped; this pointer is
stored for future references to the file.
Once map_files returns, we have a situation similar to the one shown in Figure
1. The operating system has four files open, supporting over 165 Mbytes of
memory space. If this is running on a system with only 2 Mbytes of total RAM,
it can't even hold the entire executable in physical memory. Therefore, we
obviously have only a small portion of these files loaded into memory at any
given time. In fact, when map_files returns, the operating system has only
loaded a small portion of the exe, the portion containing initialization code,
and has not yet loaded any portion of the files state.dat and customer.dat.
The function prnt_name in Listing Three (page 18) outputs a customer name to
stdout using printf. The customer name is a zero-terminated ASCII string
contained in the data structures within the file customer.dat. The function is
passed an index number that indicates which customer we are currently working
with, and it merely prints the customer's name and returns.
When the application first attempts to read or write to a page of memory in
the MMFIO address space, the 386 issues a page fault to inform the operating
system that the application has attempted to access a location that is "not
present." The operating system handles the page fault by mapping a page of RAM
into the desired address space and initializing the RAM with information from
disk if it exists on disk. The same thing happens as the application branches
to pages of executable code that have not yet been accessed: The operating
system reads code into memory only as required. Once the operating system has
all available physical RAM in use, it has to start swapping pages of RAM each
time a page fault occurs.
The RAM containing initialization code that is normally only accessed once is
a prime candidate to be swapped out. The operating system monitors how often
each page of code and data is accessed, and attempts to optimize performance
based on that information. By reading from disk only when required, only
writing information back to disk when it has been modified, and only flushing
data from memory when a page of RAM is needed elsewhere, applications based on
MMFIO can sometimes run with startling speed. The first time I saw an
application requiring over 1 Mbyte of address space run on a system with no
available extended memory, I was shocked. It ran to completion in the wink of
an eye--it normally takes longer than that just to load an exe file of that
size. It took a few minutes for me to realize that only a handful of 4-Kbyte
pages had actually been accessed in that particular run, and so only a small
portion of the executable file had been loaded from disk, thus saving
considerable time.
Before prog.exe terminates, all new information needs to be written to the
MMFIO space. Since many of the changes made to the data in the MMFIO space may
exist only in RAM, operating systems supporting MMFIO must provide a means of
flushing any modified pages to disk. Windows NT provides this feature via the
function FlushViewOfFile while X-32VM provides the function _x32_mmfio_flush.
These functions update specific MMFIO files but do not close or unmap the
files. Only the modified pages are written to disk and then only those not
already swapped to disk during normal VM swapping activity. Critical changes
can be flushed to disk as they are made, thus ensuring file integrity in case
of unexpected termination such as power failure.
MMFIO is not useful only in large applications handling large files. It can
also be used in small and medium-sized applications, where enough memory is
often available to read the entire file into RAM. If there is sufficient RAM
to hold all required data, nothing is ever swapped out, and yet only the pages
actually accessed are read, so MMFIO still often yields an optimal
application.


Conclusion


Using MMFIO, an application running on a system with a few Mbytes of physical
memory and a nearly full hard drive can map hundreds of Mbytes of files into
linear memory and then read and write to each file as simply as reading and
writing from an array in memory. This results in a fast and efficient
application with a minimum of development effort.
 Figure 1: MMFIO can be viewed as a link between memory and the system's hard
drives.


Sorting Files with NT's Memory-mapped File I/O





Eric Bergman-Terrell




Eric is a freelance programmer and can be contacted at 8547 E. Arapahoe Road,
Suite J-147, Greenwood Village, CO 80112.


Memory-mapped file I/O lets Windows NT programmers associate a region of
virtual memory with a disk file, such that any access of the memory will
access the associated file data. The program presented here, mmf_sort, uses
memory-mapped file I/O to sort files containing fixed-length records. Mmf_sort
(see Listing Four, page 19), which was compiled and tested with Version
0.00.3043d of the Microsoft NT C/C++ compiler and the March pre-release of
Windows NT, requires a sort definition file that describes the fields of the
file being sorted. For example, to sort a file containing DATA records as in
Example 1(a), you could use the sort definition file in Example 1(b). Comments
have the # character in the first column. Each non-comment line specifies a
field name, sort order, size, alignment, count, a/d flag, and the name of an
optional comparison function.
The field name is an arbitrary identifier for a record field. The sort order
specifies the order in which fields are compared. A field with a sort order of
1 is compared first, followed by a field with a sort order of 2, and so on.
Consider a file containing the records in Figure 2(a). If chr had a sort order
of 1 and lng had a sort order of 2, the file would be sorted as in Figure
2(b). If lng had a sort order of 1 and chr had a sort order of 2, the file
would be sorted as in Figure 2(c).
Many computer systems require that 2-byte integers be stored at even
addresses and that 4-byte integers be stored at addresses that are multiples
of 4. For example, Table 1 shows how the DATA structure in Example 1 is stored
in memory on a 386/486 using the Microsoft NT C/C++ compiler. Notice that two
padding bytes were inserted in the structure to force lng, a 4-byte integer,
to be stored on an address that is a multiple of 4.
Use an alignment value of 2 for fields that must be stored in even addresses,
an alignment value of 4 for fields that must be stored in addresses that are
multiples of 4, and so on. Use an alignment value of 1 for fields that can be
stored in any address. Most computer systems can store 1-byte quantities in
any address. After padding the individual fields, mmf_sort automatically pads
the record size to be a multiple of the largest alignment value.
The record fields sorted by mmf_sort can be arrays. The count entry specifies
the number of elements in the field. For example, in Example 1 the count
entry for chr is 2, since chr is a two-element array.
Fields can be sorted in ascending or descending order. Use an a/d entry of A
to sort fields in ascending order, and an entry of D to sort fields in
descending order. The a/d entry can be upper or lowercase.
Mmf_sort is run from the NT command prompt. The first argument specifies the
name of the file to be sorted, and the second specifies the name of the sort
definition file. The file to be sorted is opened and its size is calculated.
After the sort definition file is read the record size is known and the
program verifies that the file size is a multiple of the record size.
Calls to CreateFileMapping() and MapViewOfFile() associate the file_addr
pointer variable with the contents of the file to be sorted. Any access
through file_addr touches the corresponding data in the file being sorted. The
standard C library function qsort() sorts the file using compare(), a
comparison function based on the sort definition file.
The call to UnMapViewOfFile() breaks the association between file_addr and the
file being sorted and also flushes all changes to the file being sorted. After
the records-sorted-per-second figure is calculated and output, the program exits.
In sort_def.h (Listing Five, page 19), the FIELD_INFO structure contains all
the data specified in one line of the sort definition file. The SORT_DEF
structure contains all the data specified in the entire sort definition file
as well as the number of fields and the record size. The SORT_DEF structure is
used by the compare() function to sort records.
In sort_def.c (Listing Six, page 19) read_sort_def() stores the data from each
sort definition file entry in the SORT_DEF structure. Lines beginning with #
are comments. The comparison function entry in the sort definition file is
optional. If specified, a pointer to the function is found by function
get_cmp_fcn_ptr() and stored in the SORT_DEF structure. Otherwise a NULL
function pointer is stored.
The max_alignment variable keeps track of the largest alignment value. C
structures are padded to be a multiple of the largest alignment value of any
structure field. Consequently, mmf_sort uses the max_alignment variable to pad
the record size.
After the contents of the SORT_DEF structure are printed, its field_info
array is sorted using qsort() so that fields will be compared in the correct
order.
Listing Seven, page 19 (compare.h) and Listing Eight, page 20 (compare.c)
contain functions that compare signed and unsigned characters, short integers,
and long integers. Each comparison function returns:
 -1 if the first argument is less than the second argument.
 0 if the arguments are equal.
 1 if the first argument is greater than the second argument.
Since the comparison functions may be used to compare arrays, each function
uses its count argument to compare every element of the field.
Compare() compares record fields based on the sort order specified in the sort
definition file. Fields lacking a comparison function are ignored. If a field
is sorted in descending order, the result of the comparison is reversed.
To add new comparison functions, add the function to compare.c and the
cmp_fcn_info[] table.
error() is used to output an error message and then exit the program (see
Listings Nine and Ten, page 20 and xx, respectively). In globvars.h (Listing
Eleven, page 20) and globvars.c (Listing Twelve, page 20), the SORT_DEF data
structure is a global variable since it is an implicit parameter to the
compare() function called by qsort().
The makefile (Listing Thirteen, page 20) is used to compile mmf_sort. The
makefile supports the Microsoft NT C/C++ compiler. It will have to be modified
to support a different compiler.
Example 1: (a) Sample DATA records to be sorted; (b) sort definition file.

(a)

typedef struct
{
char chr[2];
long lng;
} DATA;

(b)

# field-name sort-order size alignment count a/d optional-comparison-function
chr 1 1 1 2 a uns_chr_cmp
lng 2 4 4 1 a uns_lng_cmp

Figure 2: (a) Sample file containing records; (b) the file after it's sorted with
chr having a sort order of 1 and lng a sort order of 2; (c) the file sorted
with lng having a sort order of 1 and chr a sort order of 2.
(a)

chr lng

ZX 100
AB 100
AT 200
AA 200



(b)

chr lng

AA 200
AB 100
AT 200
ZX 100


(c)

chr lng


AB 100
ZX 100
AA 200
AT 200



Table 1: How the DATA structure in Example 1 is stored in memory on 386/486
systems using the Microsoft NT C/C++ compiler.
Address Contents
0 chr[0]
1 chr[1]
2 padding
3 padding
4 lng's least significant byte
5 _
6 _
7 lng's most significant byte
[LISTING ONE] (Text begins on page 14.)
struct cust
{
 char name[40]; /* Both first and last names. */
 char address1[30];
 char address2[30];
 char address3[30];
 char address4[30];
 char phone1[20];
 char phone2[20];
};

[LISTING TWO]

#include <io.h>
#include <fcntl.h>
#include <structs.h> /* defines structure "cust" */

struct cust *customer; /* pointer to customer.dat file */
void *state; /* pointer to state.dat file */
int c_handle,s_handle;

void *_cdecl _x32_mmfio_open(int fd,int size);

int map_files()
{

 if ((c_handle = open("customer.dat",O_RDWR)) == -1)
 {
 return -1; /* return failure if unable to open file */
 }
 if ((s_handle = open("state.dat",O_RDWR)) == -1)
 {
 return -1; /* return failure if unable to open file */
 }
/* We will allow 150 megabytes of space for customer.dat even if the file
 is smaller than this. This sets a maximum limit for the address space
 available for accessing this file. If the file is currently smaller
 than 150 megabytes, this will allow the operating system to
 automatically expand the file up to 150 megabytes if more customers
 are added to the list.
*/
 if ((customer = _x32_mmfio_open(c_handle,150*1024*1024)) == 0)
 {
 return -1; /* return failure if unable to map file */
 }

/* State.dat is a fixed length file of 500 kilobytes */
 if ((state = _x32_mmfio_open(s_handle,500*1024)) == 0)
 {
 return -1; /* return failure if unable to map file */
 }

/* The pointers customer and state have been successfully initialized,
 return success.
*/
 return 0;
}

[LISTING THREE]

#include <io.h>
#include <structs.h> /* defines structure "cust" */

extern struct cust *customer; /* pointer to customer.dat file */

/* This function uses printf to output a customer's name specified by
the customer index number passed to the function. This character
string resides in a mmfio region and will automatically be read from the file
customer.dat if it does not already exist in memory.
*/

void prnt_name(int cust_num)
{
 printf("\nCustomer name: %s",customer[cust_num].name);
 return;
}

[LISTING FOUR]

#include <windows.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include "compare.h"
#include "sort_def.h"

#include "error.h"
#include "globvars.h"


int main (int argc, char *argv[])
{
time_t start = time(NULL);
HANDLE hinput_file, hinput_map;
void *file_addr;
DWORD file_size;
size_t recs_in_file;
int secs;
if (argc != 3)
 fatal_err("usage: mmf_sort <input filename> <def filename>");
printf("\nSorting file %s...\n\n", argv[1]);
/* Open file to be sorted. */
hinput_file = CreateFile(argv[1], GENERIC_READ | GENERIC_WRITE, 0, NULL,
 OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hinput_file == INVALID_HANDLE_VALUE)
 fatal_err("cannot open file %s", argv[1]);
file_size = GetFileSize(hinput_file, NULL);
read_sort_def(argv[2]);

/* Make sure file is the correct size. */
if (file_size % Sort_def.rec_size != 0)
 fatal_err("file %s does not contain a whole number of %d byte records",
 argv[1], Sort_def.rec_size);
recs_in_file = file_size / Sort_def.rec_size;
hinput_map = CreateFileMapping(hinput_file, NULL, PAGE_READWRITE, 0, 0, NULL);
if (hinput_map == NULL)
 fatal_err("cannot create mapping for file %s", argv[1]);
file_addr = MapViewOfFile(hinput_map, FILE_MAP_WRITE, 0, 0, 0);
if (file_addr == (void *) NULL)
 fatal_err("cannot map view of %s", argv[1]);
/* Sort the file. */
qsort(file_addr, recs_in_file, Sort_def.rec_size, compare);
UnmapViewOfFile(file_addr);
CloseHandle(hinput_map);
CloseHandle(hinput_file);
secs = (int) (time(NULL) - start);

printf("Sorted %ld records per second\n", secs > 0 ? recs_in_file / secs :
 recs_in_file);
exit(EXIT_SUCCESS);
}

[LISTING FIVE]

#define MAX_FIELDS 100
#define FIELD_NAME_LEN 8

#define CMP_FCN_NAME_LEN 16
typedef struct
{
char field_name[FIELD_NAME_LEN + 1],cmp_fcn_name[CMP_FCN_NAME_LEN + 1];
size_t size, align_size, offset;
int sort_order, alignment, count, ascending;
CMP_FCN_PTR cmp_fcn;
} FIELD_INFO;

typedef struct
{
int num_fields;
size_t rec_size;
/* first num_keys entries are keys, in order */
FIELD_INFO field_info[MAX_FIELDS];
} SORT_DEF;
void read_sort_def(const char *def_filename);

[LISTING SIX]

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h> /* toupper */
#include "compare.h"
#include "sort_def.h"
#include "error.h"
#include "globvars.h"

int sort_def_cmp(const void *a, const void *b)

/* Compare the sort order fields of the SORT_DEF records. */
{

return ((FIELD_INFO *) a)->sort_order -
 ((FIELD_INFO *) b)->sort_order;
}
void read_sort_def(const char *def_filename)
/* Read the contents of the sort definition file and store them in Sort_def.
*/
{
FILE *input;
char line[1024], *ptr;
int i, line_nbr = 0, max_alignment = 0;
FIELD_INFO *cur;
const char *delims = " \n\t";
size_t offset = 0;
if ((input = fopen(def_filename, "r")) == (FILE *) NULL)
 fatal_err("read_sort_def: cannot open file %s", def_filename);
Sort_def.num_fields = Sort_def.rec_size = 0;
cur = Sort_def.field_info;
/* Get fields from sort definition file. */
while (fgets(line, sizeof(line), input) != (char *) NULL)
 {
 /* Discard comment lines. */
 if (line[0] == '#')
 continue;
 line_nbr++;
 if (Sort_def.num_fields >= MAX_FIELDS)
 fatal_err("read_sort_def: too many entries in sort definition file");
 /* Get field name. */
 if ((ptr = strtok(line, delims)) == (char *) NULL)
 fatal_err("error in field name on line %d of sort def. file", line_nbr);
 memset(cur->field_name, '\0', sizeof(cur->field_name));
 strncpy(cur->field_name, ptr, sizeof(cur->field_name) - 1);
 /* Get sort order. */
 if ((ptr = strtok(NULL, delims)) == (char *) NULL)
 fatal_err("error in sort order on line %d of sort def. file", line_nbr);
 cur->sort_order = atoi(ptr);
 /* Get size. */

 if ((ptr = strtok(NULL, delims)) == (char *) NULL)
 fatal_err("error in size on line %d of sort def. file", line_nbr);
 if ((cur->size = atoi(ptr)) <= 0)
 fatal_err("invalid size on line %d of sort def. file", line_nbr);
 /* Get alignment. */
 if ((ptr = strtok(NULL, delims)) == (char *) NULL)
 fatal_err("error in alignment on line %d of sort def. file", line_nbr);
 if ((cur->alignment = atoi(ptr)) <= 0)
 fatal_err("invalid alignment on line %d of sort def. file", line_nbr);
 /* Keep track of maximum alignment. */
 if (cur->alignment > max_alignment)
 max_alignment = cur->alignment;
 /* Get count. */
 if ((ptr = strtok(NULL, delims)) == (char *) NULL)
 fatal_err("error in count on line %d of sort def. file", line_nbr);
 if ((cur->count = atoi(ptr)) <= 0)
 fatal_err("invalid count on line %d of sort def. file", line_nbr);
 /* Get ascending/descending flag. */
 if ((ptr = strtok(NULL, delims)) == (char *) NULL)
 fatal_err("error in a/d flag on line %d of sort def. file", line_nbr);
 *ptr = toupper(*ptr);
 if (*ptr != 'A' && *ptr != 'D')
 fatal_err("invalid a/d flag on line %d of sort def. file", line_nbr);
 cur->ascending = (*ptr == 'A');
 /* Get comparison function. */
 if ((ptr = strtok(NULL, delims)) != (char *) NULL)
 {
 memset(cur->cmp_fcn_name, '\0', sizeof(cur->cmp_fcn_name));
 strncpy(cur->cmp_fcn_name, ptr, sizeof(cur->cmp_fcn_name) - 1);
 cur->cmp_fcn = get_cmp_fcn_ptr(ptr);
 }
 else
 {
 /* This field will not be compared. */
 strcpy(cur->cmp_fcn_name, "none");
 cur->cmp_fcn = (CMP_FCN_PTR) NULL;
 }
 while (offset % cur->alignment != 0)
 offset++;
 cur->offset = offset;
 cur->align_size = cur->size;
 while (cur->align_size % cur->alignment != 0)
 cur->align_size++;

 Sort_def.num_fields++;
 offset += cur->count * cur->align_size;
 cur++;
 }
Sort_def.rec_size = offset;
/* Entire record must be padded to maximum alignment. */
while (Sort_def.rec_size % max_alignment != 0)
 Sort_def.rec_size++;
/* Print out contents of sort definition file. */
printf("%-*.*s Order Size Offset Alignment Count A/D "
 "Cmp. Fcn.\n\n",
 FIELD_NAME_LEN, FIELD_NAME_LEN, "Field");
for (i = 0, cur = Sort_def.field_info; i < Sort_def.num_fields; i++, cur++)
 printf("%-*.*s %6d %6d %6d %6d %6d %s %s\n",
 FIELD_NAME_LEN, FIELD_NAME_LEN, cur->field_name,
 (int) cur->sort_order, (int) cur->size, (int) cur->offset,
 (int) cur->alignment,
 (int) cur->count, cur->ascending ? "asc " : "desc",
 cur->cmp_fcn_name);
printf("\nRecord Size: %d bytes\n\n", (int) Sort_def.rec_size);
/* Sort the fields based on the user-specified sort order. */
qsort(Sort_def.field_info,Sort_def.num_fields,sizeof(FIELD_INFO),sort_def_cmp);
fclose(input);
}

[LISTING SEVEN]

/* compare.h */
int compare(const void *a, const void *b);

typedef int (*CMP_FCN_PTR)
 (const char *a, const char *b, size_t align_size, int count);
CMP_FCN_PTR get_cmp_fcn_ptr(const char *cmp_fcn_name);


[LISTING EIGHT]
/* compare.c */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "compare.h"
#include "sort_def.h"
#include "error.h"
#include "globvars.h"

#define COMPARE(a, b) (((a) > (b)) ? 1 : ((a) < (b)) ? -1 : 0)
int uns_chr_cmp(const char *a, const char *b, size_t align_size, int count)
/* unsigned comparison of characters */
{
int i, cmp = 0;
for (i = 0; i < count && cmp == 0; i++, a += align_size, b += align_size)
 cmp = COMPARE(*(unsigned char *) a, *(unsigned char *) b);
return cmp;
}
int sig_chr_cmp(const char *a, const char *b, size_t align_size, int count)
/* signed comparison of characters */
{
int i, cmp = 0;
for (i = 0; i < count && cmp == 0; i++, a += align_size, b += align_size)
 cmp = COMPARE(*(signed char *) a, *(signed char *) b);
return cmp;
}
int uns_shr_cmp(const char *a, const char *b, size_t align_size, int count)
/* unsigned comparison of short integers */
{
int i, cmp = 0;
for (i = 0; i < count && cmp == 0; i++, a += align_size, b += align_size)
 cmp = COMPARE(*(unsigned short int *) a, *(unsigned short int *) b);
return cmp;
}
int sig_shr_cmp(const char *a, const char *b, size_t align_size, int count)
/* signed comparison of short integers */
{
int i, cmp = 0;

for (i = 0; i < count && cmp == 0; i++, a += align_size, b += align_size)
 cmp = COMPARE(*(signed short int *) a, *(signed short int *) b);
return cmp;
}
int uns_lng_cmp(const char *a, const char *b, size_t align_size, int count)
/* unsigned comparison of long integers */
{
int i, cmp = 0;
for (i = 0; i < count && cmp == 0; i++, a += align_size, b += align_size)
 cmp = COMPARE(*(unsigned long int *) a, *(unsigned long int *) b);
return cmp;
}
int sig_lng_cmp(const char *a, const char *b, size_t align_size, int count)
/* signed comparison of long integers */
{
int i, cmp = 0;
for (i = 0; i < count && cmp == 0; i++, a += align_size, b += align_size)
 cmp = COMPARE(*(signed long int *) a, *(signed long int *) b);
return cmp;
}
int compare(const void *a, const void *b)
/* Compare all fields in order. Return: -1 if a < b, 1 if a > b, 0 if a = b.*/
{
int i, result = 0;
FIELD_INFO *cur;
const unsigned char *aa = a, *bb = b;
for (i = 0, cur = Sort_def.field_info; i < Sort_def.num_fields; i++, cur++)
 {
 /* Make sure current field should be compared. */
 if (cur->cmp_fcn != (CMP_FCN_PTR) NULL)
 {
 result = (*cur->cmp_fcn) ((char *) &aa[cur->offset],
 (char *) &bb[cur->offset],
 cur->align_size, cur->count);
 /* Invert result if field is sorted in descending order. */
 if (!cur->ascending)
 result = -result;
 if (result != 0)
 return result;
 }
 }
return 0;
}
typedef struct
{
char *cmp_fcn_name;
CMP_FCN_PTR cmp_fcn_ptr;
} CMP_FCN_INFO;
static const CMP_FCN_INFO cmp_fcn_info[] =
{
 {"uns_chr_cmp", uns_chr_cmp},
 {"sig_chr_cmp", sig_chr_cmp},
 {"uns_shr_cmp", uns_shr_cmp},
 {"sig_shr_cmp", sig_shr_cmp},
 {"uns_lng_cmp", uns_lng_cmp},
 {"sig_lng_cmp", sig_lng_cmp}
};
CMP_FCN_PTR get_cmp_fcn_ptr(const char *cmp_fcn_name)
/* Return a pointer to the named function. */

{
int i;
for (i = 0; i < sizeof(cmp_fcn_info) / sizeof(cmp_fcn_info[0]); i++)
 if (stricmp(cmp_fcn_name, cmp_fcn_info[i].cmp_fcn_name) == 0)
 return cmp_fcn_info[i].cmp_fcn_ptr;
fatal_err("cannot find function %s", cmp_fcn_name);
return (CMP_FCN_PTR) NULL;  /* not reached; fatal_err() exits */
}

[LISTING NINE]

/* error.h */
void fatal_err(const char *fmt, ...);

[LISTING TEN]

/* error.c */
#include <stdio.h>
#include <stdlib.h>
#include <stdarg.h>
#include "error.h"

void fatal_err(const char *fmt,...)
/* Give a fatal error message and exit. */
{
char buffer[1024];
va_list argptr;
va_start(argptr, fmt);
vsprintf(buffer, fmt, argptr);

va_end(argptr);
fprintf(stderr, "** FATAL ERROR ** %s\n", buffer);
exit(EXIT_FAILURE);
}

[LISTING ELEVEN]

/* globvars.h */
extern SORT_DEF Sort_def;

[LISTING TWELVE]

/* globvars.c */
#include <stdlib.h>
#include "compare.h"
#include "sort_def.h"
#include "globvars.h"

SORT_DEF Sort_def;

[LISTING THIRTEEN]

# makefile
!include <ntwin32.mak>

OBJS=mmf_sort.obj error.obj globvars.obj sort_def.obj compare.obj

all: mmf_sort.exe

.C.obj:
 $(cc) $(cflags) -Ge $(cvars) $*.c


mmf_sort.exe: $(OBJS)
 $(link) $(conflags) -out:mmf_sort.exe shell32.lib $(OBJS) $(conlibs)

End Listings

Special Issue, 1993
Multitasking Fortran and Windows NT


Calling the Win32 API directly from Fortran




Shankar Vaidyanathan


Shankar is with Microsoft Corporation and can be reached at One Microsoft Way,
Redmond, WA 98052. His interests include multiprocessor and remote
procedure-call programming technologies.


A Windows NT application can consist of more than one process, and a process
can consist of more than one thread. The Win32 API supports multitasking,
which allows the simultaneous execution of multiple threads and processes. In
a single-processor system, multitasking is achieved by dividing the CPU time
among all threads competing for it. With systems having multiple processors
and symmetric multiprocessing, more than one thread or process can be executed
simultaneously, resulting in a dramatic improvement in application
performance.
However, the NT applications that typically mesh well with such environments
are written in C and C++, because the Win32 APIs involve C-style character
strings, null pointers, pointers to various data types, structures, arrays of
structures, cyclic/recursive structures, pointers to structures, and dynamic
allocation of memory. Developing Fortran apps that make use of these APIs can
be a challenging and arduous task.
Numerically intensive Fortran apps, both existing and new, are suited to
Windows NT because they naturally yield to subdivision of computational tasks.
Matrix computations, solutions of linear algebraic equations, partial
differential equations, interpolations and extrapolations, integration and
evaluation of functions, eigensystems, Fourier and fast Fourier
transformations, and statistical simulation and modeling are typical of this
divide-and-conquer paradigm. Some of these functions are inherently
parallelizable and traditionally run on mainframes and supercomputers. With a
32-bit, flat memory-model operating system like Windows NT, however, all these
applications can run on a PC. With the guidelines and interface-statement file
provided in this article, you can write Fortran apps that call the Win32 API
directly, gaining all the benefits of its multitasking and multiprocessing
abilities.


Processes and Threads


A process can be considered as a program loaded into memory and prepared for
execution. Each process has a private virtual-address space and consists of
code, data, and other system resources. Threads, on the other hand, are the
basic entity to which the operating system allocates CPU time. Each process is
started with a single thread, but additional, independently executing threads
can be created. Each thread maintains a set of structures for saving its
context while waiting to be scheduled for processing time. The context
includes the thread's set of machine registers, the kernel stack, a
thread-environment block, and a user stack in the address space of the
thread's process. The most important feature of threads is that all threads of
a process share the same virtual-address space, and can access global
variables (like the Fortran common block) and system resources of the process.
This makes communication between threads easy and cheap. Furthermore, the
system can create and execute threads more quickly than it creates processes.
The code for threads has already been mapped into the address space of the
process, whereas the code for the new process must be loaded during run time.
In addition, all threads of a process can use open handles to resources such
as files and pipes. Hence, it's usually more efficient for an application to
implement multitasking by distributing tasks among the threads of one process
rather than by creating multiple processes.


Time Slicing


The Win32 API in Windows NT is designed for preemptive multitasking. Under
preemptive multitasking, the system allocates small slices of CPU time among
the competing threads. The currently executing thread is suspended when its
time slice elapses, allowing another thread to run. When the system switches
from one thread to another, it saves the context of the suspended thread and
restores the saved context of the next thread in queue. To the application
developer, the advantage of multitasking is the ability to create applications
that use more than one process and to create processes that use more than one
thread of execution.
If, for example, you make a simple Fortran app, like matrix multiplication,
multithreaded, you can create separate threads for multiplying every row with
a particular column. Because each time slice is small, it may appear that
multiple threads are multiplying the subcomponents of the matrix
simultaneously. This is true on multiprocessor systems, where the executable
threads are distributed among the available processors.


Thread Creation


The Win32 API CreateThread creates a new thread for a process. Example 1(a)
shows how this API function is prototyped in winbase.h (shipped with the NT
SDK). Looking at the listing of kernel32.lib, you'll notice that this function
is listed as _CreateThread@24. This Win32 API is invoked with the __stdcall
convention, which means that all the function arguments are pushed on the
stack, and the stack is cleaned up by the callee. The __stdcall function names
are prefixed by an underscore and suffixed with @<number> when decorated. The
number is the number of bytes in decimal used by the widened arguments pushed
on the stack.
CreateThread returns a HANDLE, which is an integer*4 (double-word) entity in
Fortran. The creating thread must specify the starting address of the code
that the new thread is to execute. The loc function in Microsoft Fortran can
provide the address of variables as well as functions. By default, all
parameters are passed by value in C, and by reference in Fortran. Since all
the functions are external by specification in Fortran, declaring the function
in Example 1(a) as external isn't necessary. A process can have multiple
threads simultaneously executing the same function. The arguments specifying
the stack size of the new thread and the Creation Flags are double words in C,
and they are once again integer*4 data types in Fortran. In the function
prototype in Example 1(a), the argument to the thread function is passed
through a long pointer. On the Fortran side, this object can be passed by
reference; this will pass a long pointer (integer*4) to that object.
CreateThread returns the identifier of the thread through a long pointer to a
double word, and, on the Fortran side, that parameter can be specified as
integer*4 with the reference attribute. The first argument to CreateThread is
a structure prototyped in winbase.h, as in Example 1(b). This structure can be
implemented using STRUCTURE/END STRUCTURE statements in Fortran, as in Example
1(c). Note that BOOL in C is a logical*4 in Fortran capable of taking either a
.TRUE. or .FALSE. value. Since the parameter in the C-function prototype is a
long pointer to the structure, the structure itself can be passed by
reference, or the loc of the structure can be passed by value in Fortran. The
same is true for character strings. Passing the loc of the structure or
character string has a distinct advantage because if I want to pass a C null
pointer, I can simply pass a 0 in Fortran.
With all the arguments of CreateThread squared away, the interface statement
can be specified as in Example 2.


Synchronization


In a multitasking environment, it's sometimes necessary to coordinate the
execution of multiple processes or multiple threads within a process. Win32
provides a set of synchronization objects for this. A synchronization object
is essentially a data structure whose current state is signaled or
not-signaled. A thread can interact with any of these objects either by
modifying its state or by waiting for it to be in a signaled state. When a
thread waits for an object, the execution of the thread is blocked as long as
the state of the object is not-signaled. Typically, a thread will wait for a
synchronization object before performing an operation that must coordinate
with other threads; it will also wait when using a shared resource such as a
file, shared memory, or a peripheral device.
There are four types of synchronization objects: critical section, mutual
exclusion (mutex), semaphores, and events. Two generic functions,
WaitForSingleObject and WaitForMultipleObjects, are used by threads to wait
for the state of a waitable object to be signaled. In addition to event,
mutex, and semaphore objects, these functions may be used to wait for process
and thread objects. The prototypes for WaitForSingleObject and
WaitForMultipleObjects are provided in winbase.h in the NT SDK; the interface
statements for them are provided in mt.fi, Listing One (page 25).


Critical Section


A critical section is a synchronization object that can be owned by only one
thread at a time, enabling threads to coordinate mutually exclusive access to
a shared resource. The restriction on this object is that it can only be used
by threads of a single process. The critical-section object is a cyclic data
structure, which makes its representation interesting and challenging in
Fortran; winnt.h (in the NT SDK) declares the structure, as in Example 3(a).
Example 3(b) is the Fortran implementation of the cyclic structure in Example
3(a). The loc function points the first structure to the second, and the
second structure back to the first. Although the LIST_ENTRY item in the C
typedef statement could be complex, I don't need to go into the implementation
details in Fortran, because all that's required is a 4-byte space for the
address of that data structure.
To illustrate, I'll develop code for finding the sum of the first 50 whole
numbers, and apply various facets of multitasking to it. I'll start by
generating 50 threads, each passing a particular value to ThreadFunc. Each of
the threads adds its value to a global variable, result, which is inside a
common block. Since you shouldn't allow simultaneous access to the global
variable by all the threads, I protect this resource inside a critical
section. This calls for an initialization of the critical-section object (done
by InitializeCriticalSection), and the modification of the global variable
result is enclosed within EnterCriticalSection and LeaveCriticalSection.
However, if the primary thread exits before the completion of all the other
threads, the child threads are "orphaned," and hence we wait for all the
threads to complete through WaitForMultipleObjects. I've made the function
wait on the handle to all the threads indefinitely until all the threads
complete their execution. The critical section object, GlobalCriticalSection,
is also inside the common block so that it need not be passed as a parameter
to ThreadFunc. The code is given in Listing Three (page 26). Also refer to the
include file mt.fd (Listing Two, page 26) for data-type declarations.


Mutex, Semaphore, and Events



A mutex (mutual exclusion) is similar to a critical-section object except that
it can be used by the threads belonging to more than one process. A semaphore
object is used as a resource gate and maintains a count between 0 and some
maximum value, thus limiting the use of a resource by counting threads as they
pass in and out of the gate. The Win32 API calls associated with mutex are
CreateMutex, OpenMutex, and ReleaseMutex; there's a similar set for
semaphores. The semaphore functions typically take an additional set of
parameters that manipulate the semaphore count. The semaphores are quite
powerful, since they are mutexes with the additional ability to control the
number of threads. The Fortran prototype for these APIs are provided as
interface statements in mt.fi (Listing One).
As another example, I'll modify the previous example to incorporate semaphores
and mutex objects. I'll also try to save space by not requiring that you save
the handles of all the threads waiting on them. Here, I generate 50 threads as
before, and enclose the global common-variable result within the mutex region.
Instead of waiting on all the threads to complete, however, I wait for the
last thread to complete; this is an indication of all the threads having
completed. To this end, a semaphore object is created with an initial count of
0. Since this is a not-signaled state of the semaphore, the call to
WaitForSingleObject blocks the main thread until the last spawned thread
releases the semaphore by incrementing the semaphore count by 1. The handles
to the mutex and semaphore objects are hMutex and hSemaphore, respectively,
and they're inside the common block so that they need not be passed as
parameters. ThreadCounter is an additional parameter in the common block to
keep track of the number of threads that have modified the global result
variable. See Listing Four (page 27).
You can use an event object to trigger execution of other processes or other
threads within a process. This is useful if one process provides data to many
other processes. Using an event object frees the other processes from the
trouble of polling to determine when new data is available. CreateEvent
creates either a manual reset event or an auto reset event, depending on the
value of one of its parameters. CreateEvent also sets the initial state of the
event to either signaled (True) or not-signaled (False) state. When an event
is not-signaled, any thread waiting on the event will block. You can set an
event to the signaled state by calling SetEvent, and reset to the not-signaled
state by calling ResetEvent. PulseEvent sets the event to the signaled state
and then immediately resets it to the not-signaled state. (I've used some of
the APIs related to events in the following process-creation example.)


Process Creation


CreateProcess creates a new process that runs independently of the creating
process. CreateProcess allows you to name the program to execute by specifying
either the pathname of the image file or a command line. This particular API
call is prototyped, as in Example 4.
In the kernel32 library, there are two occurrences of this function:
_CreateProcessA@40 and _CreateProcessW@40. All calls that take a character
string as at least one of their parameters are decorated with the trailing A
(for ASCII) or W (for wide character, or Unicode). The Unicode implementation
addresses the problem of multiple-character coding schemes and accommodates a
more comprehensive set of characters.
CreateProcess takes a long pointer to a C string as two of its arguments. In
Fortran, the loc values of the string can be passed to this function, and the
arguments can be declared in the interface statement as being passed by value.
Since these two are C strings, they should have a null terminator or a char(0)
at the end of the Fortran string. The creation-flags argument to CreateProcess
can control the way in which the process is created, for instance, whether it
is a detached process or a suspended process. It is a DWORD in C, and an
integer*4 value in Fortran. The last two parameters of this function call are
long pointers to structures. The structures are STARTUPINFO and
PROCESS_INFORMATION, and they are defined via typedef in winbase.h; I've
transliterated them into Fortran structures in mt.fd (Listing Two). The
STARTUPINFO structure requires initialization, and one of the members of this
structure is initialized to the size of that structure. The C sizeof function
can be implemented in Fortran by dynamically creating a two-element array of
the structure and subtracting the loc value of the first element from that of
the second. However, I simply counted the number of bytes in that structure
and specified it in the program.
A child process can inherit the following properties and resources from its
parent:
Open handles that were opened with the inherit flag set to TRUE. The functions
that create or open object handles (CreateEvent, CreateFile, CreateMutex,
CreateNamedPipe, CreatePipe, CreateProcess, CreateThread, ...) take a
security-attributes argument that includes this inherit flag. The mt.fd file
(Listing Two) declares this structure in Fortran.
Environment variables.
Current directory.
The I/O buffers for console applications (stdin and stdout).


Using Named Objects


CreateProcess may allow sharing of its object handles through their names. In
the following example, the parent process creates a couple of handles to event
objects with ReadEvent and WriteEvent as object names, and passes these names
as command-line arguments to the child process. The child process retrieves
these arguments using the Microsoft Fortran Getarg runtime function and uses
the same names to open the handles to these objects. The names for each type
of object exist in their own flat address space, and so a semaphore object
could have the same name as a mutex object without collision. The child
process usually specifies the desired access to the object. In this case, the
child accesses the object with the attribute EVENT_ALL_ACCESS. This value is
calculated by calling IOR (the Microsoft Fortran Inclusive OR function) on
STANDARD_RIGHTS_REQUIRED, SYNCHRONIZE, and 3h (0x3 in C and #3 in Microsoft
Fortran).
The parent and child processes execute simultaneously after the CreateProcess
API call. However, the child process blocks at the WriteEvent until the parent
writes the question on to the file named file.out. The parent then sets the
WriteEvent, which is a green light for the child process. Subsequently, the
parent process blocks at ReadEvent and waits for the cue from the child. The
child opens the file, reads the question, writes its reply to the same file,
and then sets the ReadEvent object, thus activating the parent process. The
parent process then opens the file to read the answer given by the child
process and writes it on the screen. The parent program is in Listing Five
(page 27) and the child in Listing Six (page 27).


Inheriting Handles


A child process can inherit an open handle to a synchronization object if the
InheritHandle attribute (in the security-attribute parameter) was set when the
handle was created. The handle inherited by the child process has the same
access as the parent's handle. The code fragment in Listing Seven (page 27)
describes this aspect and provides the required initialization for the
security-attribute parameter. Note that the child process has no OpenEvent
calls, since the handles are inherited from the parent. To share an unnamed
object between unrelated processes, the creating process must communicate the
information necessary for the other process to duplicate the handle. Using
DuplicateHandle, the duplicating process can then open its handle with the
same or more restricted access than the original handle.


Conclusion


The C prototypes for the Win32 API can be found in the header files winbase.h
and winnt.h shipped with Microsoft Win32 SDK for Windows NT. The functions are
actually defined in kernel32.lib and ntdll.lib. The description for some of
these APIs can be found in the Programmer's Reference: Overviews manual and
the api32wh.hlp file shipped with NT SDK. All the programs listed here were
compiled from the command line by invoking fl32.exe. This automatically links
the object modules with the required libraries: libf.lib, libc.lib, ntdll.lib,
and kernel32.lib.
In mt.fi, I've provided the interface statements for almost the entire set of
Win32 APIs related to processes, threads, and synchronization, and the
corresponding data-structure declarations are in mt.fd. This includes
DuplicateHandle and other calls associated with attributes, priority,
suspension, resumption, and termination of threads and processes. I've also
written interface statements for all the APIs associated with thread local
storage (TLS). With TLS, one thread can allocate an index that can be used by
any thread of the process to store and retrieve a different value for each
thread.
With mt.fi and other pointers provided in this article, you should be able to
roll up your sleeves and create a killer
multithreading/multitasking/multiprocessing Fortran application under Windows
NT.
Example 1: (a) Prototype of CreateThread; (b) structure for security
attributes; (c) implementing the security-attributes structure using
STRUCTURE/END STRUCTURE.
(a)

HANDLE WINAPI CreateThread (
 LP_SECURITY_ATTRIBUTES lpThreadAttributes,
 DWORD dwStackSize,
 LPTHREAD_START_ROUTINE lpStartAddress,
 LPVOID lpParameter,
 DWORD dwCreationFlags,
 LPDWORD lpThreadId
 );


(b)

typedef struct _SECURITY_ATTRIBUTES {
 DWORD nLength;
 LPVOID lpSecurityDescriptor;
 BOOL bInheritHandle;
} SECURITY_ATTRIBUTES, *LPSECURITY_ATTRIBUTES;



(c)

STRUCTURE /SECURITY_ATTRIBUTES/
 integer*4 length
 integer*4 lpSecurityDescriptor
 logical*4 bInheritHandle
END STRUCTURE


Example 2: Interface statement for CreateThread.
interface to integer*4 function CreateThread
+ [stdcall, alias: '_CreateThread@24']
+ (security, stack, thread_func, arguments, flags, thread_id)
 integer*4 security, stack [value]
 integer*4 thread_func [value] ! loc(thread_func) is passed by value
 integer*4 arguments [reference]
 integer*4 flags [value]
 integer*4 thread_id [reference]
 end


Example 3: (a) winnt.h (in the NT SDK) declares the cyclic data structure; (b)
Fortran implementation of the cyclic structure.
(a)

typedef struct _RTL_CRITICAL_SECTION_DEBUG {
 WORD Type;
 WORD CreatorBackTraceIndex;
 struct _RTL_CRITICAL_SECTION *CriticalSection;
 LIST_ENTRY ProcessLocksList;
 DWORD EntryCount;
 DWORD ContentionCount;
 DWORD Depth;
 PVOID OwnerBackTrace[ 5 ];
} RTL_CRITICAL_SECTION_DEBUG, *PRTL_CRITICAL_SECTION_DEBUG;

typedef struct _RTL_CRITICAL_SECTION {
 PRTL_CRITICAL_SECTION_DEBUG DebugInfo;
 LONG LockCount;
 LONG RecursionCount;
 HANDLE OwningThread;  // from the thread's ClientId->UniqueThread
 HANDLE LockSemaphore;
 DWORD Reserved;
} RTL_CRITICAL_SECTION, *PRTL_CRITICAL_SECTION;


(b)

STRUCTURE /RTL_CRITICAL_SECTION_DEBUG/
 integer*4 Type
 integer*4 CreatorBackTraceIndex
 integer*4 Address
 integer*4 ProcessLocksList
 integer*4 EntryCount
 integer*4 ContentionCount
 integer*4 Depth
 integer*4 OwnerBackTrace(5)
END STRUCTURE


STRUCTURE /RTL_CRITICAL_SECTION/
 integer*4 Address
 integer*4 LockCount
 integer*4 RecursionCount
 integer*4 OwningThread
 integer*4 LockSemaphore
 integer*4 Reserved
END STRUCTURE
record /RTL_CRITICAL_SECTION/ GlobalCriticalSection
record /RTL_CRITICAL_SECTION_DEBUG/ AuxCriticalSection

GlobalCriticalSection.Address = loc(AuxCriticalSection)
AuxCriticalSection.Address = loc(GlobalCriticalSection)


Example 4: CreateProcess prototype.

BOOL WINAPI CreateProcessA(
 LPCSTR lpApplicationName,
 LPCSTR lpCommandLine,
 LPSECURITY_ATTRIBUTES lpProcessAttributes,
 LPSECURITY_ATTRIBUTES lpThreadAttributes,
 BOOL bInheritHandles,
 DWORD dwCreationFlags,
 LPVOID lpEnvironment,
 LPSTR lpCurrentDirectory,
 LPSTARTUPINFOA lpStartupInfo,
 LPPROCESS_INFORMATION lpProcessInformation
);


[LISTING ONE] (Text begins on page 21.)
 interface to integer*4 function CreateEvent
+ [stdcall, alias: '_CreateEventA@16']
+ (security, reset, init_state, string)
 integer*4 security [value]
 Logical*4 reset [value]
 Logical*4 init_state [value]
 integer*4 string [value]
 end

 interface to integer*4 function CreateMutex
+ [stdcall, alias: '_CreateMutexA@12']
+ (security, owner, string)
 integer*4 security [value]
 Logical*4 owner [value]
 integer*4 string [value]
 end

 interface to logical*4 function CreateProcess
 + [stdcall, alias: '_CreateProcessA@40']
 + (lpApplicationName, lpCommandLine, lpProcessAttributes,
 + lpThreadAttributes, bInheritHandles, dwCreationFlags,
 + lpEnvironment, lpCurrentDirectory, lpStartupInfo,
 + lpProcessInformation)
 integer*4 lpApplicationName [value]
 integer*4 lpCommandLine [value]
 integer*4 lpProcessAttributes [value]
 integer*4 lpThreadAttributes [value]

 logical*4 bInheritHandles [value]
 integer*4 dwCreationFlags [value]
 integer*4 lpEnvironment [value]
 integer*4 lpCurrentDirectory [value]
 integer*4 lpStartupInfo [value]
 integer*4 lpProcessInformation [value]
 end

 interface to integer*4 function CreateSemaphore
 + [stdcall, alias: '_CreateSemaphoreA@16']
 + (security, InitialCount, MaxCount, string)
 integer*4 security [value]
 integer*4 InitialCount [value]
 integer*4 MaxCount [value]
 integer*4 string [value]
 end

 interface to integer*4 function CreateThread
 + [stdcall, alias: '_CreateThread@24']
 + (security, stack, thread_func,
 + argument, flags, thread_id)
 integer*4 security [value]
 integer*4 stack [value]
 integer*4 thread_func [value]
 integer*4 argument [reference]
 integer*4 flags [value]
 integer*4 thread_id [reference]
 end

 interface to subroutine DeleteCriticalSection
 + [stdcall, alias: '_DeleteCriticalSection@4'] (object)
 integer*4 object [value]
 end

 interface to logical*4 function DuplicateHandle
 + [stdcall, alias: '_DuplicateHandle@28']
 + (hSourceProcessHandle, hSourceHandle,
 + hTargetProcessHandle, lpTargetHandle,
 + dwDesiredAccess, bInheritHandle, dwOptions)
 integer*4 hSourceProcessHandle [value]
 integer*4 hSourceHandle [value]
 integer*4 hTargetProcessHandle [value]
 integer*4 lpTargetHandle [reference]
 integer*4 dwDesiredAccess [value]
 logical*4 bInheritHandle [value]
 integer*4 dwOptions [value]
 end

 interface to subroutine EnterCriticalSection
 + [stdcall, alias: '_EnterCriticalSection@4'] (object)
 integer*4 object [value]
 end

 interface to subroutine ExitProcess
 + [stdcall, alias: '_ExitProcess@4'] (ExitCode)
 integer*4 ExitCode [value]
 end

 interface to subroutine ExitThread
 + [stdcall, alias: '_ExitThread@4'] (ExitCode)
 integer*4 ExitCode [value]
 end

 interface to integer*4 function GetCurrentProcess
 + [stdcall, alias: '_GetCurrentProcess@0'] ()
 end

 interface to integer*4 function GetCurrentProcessId
 + [stdcall, alias: '_GetCurrentProcessId@0'] ()
 end

 interface to integer*4 function GetCurrentThread
 + [stdcall, alias: '_GetCurrentThread@0'] ()
 end

 interface to integer*4 function GetCurrentThreadId
 + [stdcall, alias: '_GetCurrentThreadId@0'] ()
 end

 interface to logical*4 function GetExitCodeProcess
 + [stdcall, alias: '_GetExitCodeProcess@8']
 + (hProcess, lpExitCode)
 integer*4 hProcess [value]
 integer*4 lpExitCode [reference]
 end

 interface to logical*4 function GetExitCodeThread
 + [stdcall, alias: '_GetExitCodeThread@8']
 + (hThread, lpExitCode)
 integer*4 hThread [value]
 integer*4 lpExitCode [reference]
 end

 interface to integer*4 function GetLastError
 + [stdcall, alias: '_GetLastError@0'] ()
 end

 interface to integer*4 function GetThreadPriority
 + [stdcall, alias: '_GetThreadPriority@4'] (hThread)
 integer*4 hThread [value]
 end

 interface to logical*4 function GetThreadSelectorEntry
 + [stdcall, alias: '_GetThreadSelectorEntry@12']
 + (hThread, dwSelector, lpSelectorEntry)
 integer*4 hThread [value]
 integer*4 dwSelector [value]
 integer*4 lpSelectorEntry [value] ! Pass loc of the struct
 end

 interface to subroutine InitializeCriticalSection
 + [stdcall, alias: '_InitializeCriticalSection@4'] (object)
 integer*4 object [value]
 end

 interface to subroutine LeaveCriticalSection
 + [stdcall, alias: '_LeaveCriticalSection@4'] (object)
 integer*4 object [value]
 end

 interface to integer*4 function OpenEvent
 + [stdcall, alias: '_OpenEventA@12']
 + (dwDesiredAccess, bInheritHandle, lpName)
 integer*4 dwDesiredAccess [value]
 logical*4 bInheritHandle [value]
 integer*4 lpName [value]
 end

 interface to integer*4 function PulseEvent
 + [stdcall, alias: '_PulseEvent@4'] (hEvent)
 integer*4 hEvent [value]
 end

 interface to Logical*4 function ReleaseMutex
 + [stdcall, alias: '_ReleaseMutex@4'] (handle)
 integer*4 handle [value]
 end

 interface to Logical*4 function ReleaseSemaphore
 + [stdcall, alias: '_ReleaseSemaphore@12']
 + (handle, ReleaseCount, LpPreviousCount)
 integer*4 handle [value]
 integer*4 ReleaseCount [value]
 integer*4 LpPreviousCount [reference]
 end

 interface to integer*4 function ResumeThread
 + [stdcall, alias: '_ResumeThread@4'] (hThread)
 integer*4 hThread [value]
 end

 interface to integer*4 function SetEvent
 + [stdcall, alias: '_SetEvent@4'] (handle)
 integer*4 handle [value]
 end

 interface to subroutine SetLastError
 + [stdcall, alias: '_SetLastError@4'] (dwErrorCode)
 integer*4 dwErrorCode [value]
 end

 interface to logical*4 function SetThreadPriority
 + [stdcall, alias: '_SetThreadPriority@8'](hThread, nPriority)
 integer*4 hThread [value]
 integer*4 nPriority [value]
 end

 interface to integer*4 function SuspendThread
 + [stdcall, alias: '_SuspendThread@4'] (hThread)
 integer*4 hThread [value]
 end

 interface to logical*4 function TerminateProcess
 + [stdcall, alias: '_TerminateProcess@8']
 + (hProcess, uExitCode)
 integer*4 hProcess [value]
 integer*4 uExitCode [value]
 end

 interface to logical*4 function TerminateThread
 + [stdcall, alias: '_TerminateThread@8']
 + (hThread, dwExitCode)
 integer*4 hThread [value]
 integer*4 dwExitCode [value]
 end

 interface to integer*4 function TlsAlloc
 + [stdcall, alias: '_TlsAlloc@0'] ()
 end

 interface to logical*4 function TlsFree
 + [stdcall, alias: '_TlsFree@4'] (dwTlsIndex)
 integer*4 dwTlsIndex [value]
 end

 interface to integer*4 function TlsGetValue
 + [stdcall, alias: '_TlsGetValue@4'] (dwTlsIndex)
 integer*4 dwTlsIndex [value]
 end

 interface to logical*4 function TlsSetValue
 + [stdcall, alias: '_TlsSetValue@8'] (dwTlsIndex, lpTlsVal)
 integer*4 dwTlsIndex [value]
 integer*4 lpTlsVal [value]
 end

 interface to integer*4 function WaitForMultipleObjects
 + [stdcall, alias: '_WaitForMultipleObjects@16']
 + (Count, LpHandles, WaitAll, Mseconds)
 integer*4 Count [value]
 integer*4 LpHandles [reference]
 logical*4 WaitAll [value]
 integer*4 Mseconds [value]
 end

 interface to integer*4 function WaitForSingleObject
 + [stdcall, alias: '_WaitForSingleObject@8']
 + (handle, Mseconds)
 integer*4 handle [value]
 integer*4 Mseconds [value]
 end

[LISTING TWO]

PARAMETER (MAX_THREADS = 50)
PARAMETER (WAIT_INFINITE = -1)
PARAMETER (STANDARD_RIGHTS_REQUIRED = #F0000)
PARAMETER (SYNCHRONIZE = #100000)

STRUCTURE /PROCESS_INFORMATION/
 integer*4 hProcess
 integer*4 hThread
 integer*4 dwProcessId
 integer*4 dwThreadId
END STRUCTURE


STRUCTURE /RTL_CRITICAL_SECTION_DEBUG/
 integer*4 Type
 integer*4 CreatorBackTraceIndex
 integer*4 Address
 integer*4 ProcessLocksList
 integer*4 EntryCount
 integer*4 ContentionCount
 integer*4 Depth
 integer*4 OwnerBackTrace(5)
END STRUCTURE

STRUCTURE /RTL_CRITICAL_SECTION/
 integer*4 Address
 integer*4 LockCount
 integer*4 RecursionCount
 integer*4 OwningThread
 integer*4 LockSemaphore
 integer*4 Reserved
END STRUCTURE

STRUCTURE /SECURITY_ATTRIBUTES/
 integer*4 nLength
 integer*4 lpSecurityDescriptor
 logical*4 bInheritHandle
END STRUCTURE

STRUCTURE /STARTUPINFO/
 integer*4 cb
 integer*4 lpReserved
 integer*4 lpDesktop
 integer*4 lpTitle
 integer*4 dwX
 integer*4 dwY
 integer*4 dwXSize
 integer*4 dwYSize
 integer*4 dwXCountChars
 integer*4 dwYCountChars
 integer*4 dwFillAttribute
 integer*4 dwFlags
 integer*2 wShowWindow
 integer*2 cbReserved2
 integer*4 lpReserved2
END STRUCTURE

[LISTING THREE]

c Program to demonstrate thread creation and critical section object
 include 'mt.fi'

c Thread function as a subroutine
 subroutine ThreadFunc (param)
 include 'mt.fd'
 integer*4 param, result
 record /RTL_CRITICAL_SECTION/ GlobalCriticalSection
 record /RTL_CRITICAL_SECTION_DEBUG/ AuxCriticalSection
 common result, GlobalCriticalSection

c Critical section region begins...
 Call EnterCriticalSection ( loc(GlobalCriticalSection))

 result = param + result

c Critical section region ends...
 Call LeaveCriticalSection ( loc(GlobalCriticalSection))
 Call ExitThread(0)
 return
 end

c Main program begins here
 program test
 include 'mt.fd'
 external ThreadFunc
 integer*4 ThreadHandle(MAX_THREADS), inarray(MAX_THREADS)
 integer*4 CreateThread, threadId
 integer*4 waitResult, WaitForMultipleObjects
 integer*4 loop, result
 record /RTL_CRITICAL_SECTION/ GlobalCriticalSection
 record /RTL_CRITICAL_SECTION_DEBUG/ AuxCriticalSection
 common result, GlobalCriticalSection

c Creating the cyclic structure for the critical section object
 GlobalCriticalSection.Address = loc(AuxCriticalSection)
 AuxCriticalSection.Address = loc(GlobalCriticalSection)

 result = 0

c Initializing critical section...
 Call InitializeCriticalSection(loc(GlobalCriticalSection))

 do loop = 1, MAX_THREADS
 inarray(loop)= loop
 write(*, '(1x, A, I3)') 'Creating Thread # ', loop
 ThreadHandle(loop) = CreateThread( 0, 0, loc(ThreadFunc),
 + inarray(loop), 0, threadId)
 end do

 write(*,*) 'Waiting for all the threads to complete ...'
 waitResult = WaitForMultipleObjects
 + (MAX_THREADS, ThreadHandle, .TRUE. , WAIT_INFINITE)
 write(*, '(1x, A, I6, A, I10)')
 + 'The sum of the first ', MAX_THREADS, ' #s is ', result
 end

[LISTING FOUR]

c Program to demonstrate the semaphore and mutual exclusion objects
 include 'mt.fi'

c The thread function begins here
 subroutine ThreadFunc (param)
 include 'mt.fd'
 integer*4 param, waitResult, WaitForSingleObject
 integer*4 ThreadCounter
 integer*4 result, hMutex, hSemaphore, PreviousCount
 logical*4 release, ReleaseMutex, ReleaseSemaphore
 common result, hMutex, hSemaphore, ThreadCounter

c Mutual exclusion region begins here
 waitResult = WaitForSingleObject(hMutex, WAIT_INFINITE)

c Modifying the global variables
 result = param + result
 ThreadCounter = ThreadCounter + 1

c Release the semaphore if this is the last thread
 if (ThreadCounter .EQ. MAX_THREADS)
 + release = ReleaseSemaphore(hSemaphore, 1, PreviousCount)

c Mutual exclusion region ends here
 release = ReleaseMutex(hMutex)
 return
 end

c Main program begins here
 program test
 include 'mt.fd'
 external ThreadFunc
 integer*4 ThreadHandle, threadId
 integer*4 CreateSemaphore, CreateThread, CreateMutex
 integer*4 waitResult, WaitForSingleObject
 integer*4 loop
 integer*4 result, hMutex, hSemaphore, ThreadCounter
 integer*4 inarray
 dimension inarray(MAX_THREADS)
 common result, hMutex, hSemaphore, ThreadCounter

c Initializing the global variables
 ThreadCounter = 0
 result = 0
 hMutex = CreateMutex(0, .FALSE. , 0)
 hSemaphore = CreateSemaphore(0, 0, 1, 0)

 do loop = 1, MAX_THREADS
 inarray(loop)= loop
 write(*,*) "Generating Thread #", loop
 ThreadHandle = CreateThread( 0, 0, loc(ThreadFunc),
 + inarray(loop), 0, threadId)
 end do

 write(*,*) 'Waiting for the semaphore release...'
 waitResult = WaitForSingleObject(hSemaphore, WAIT_INFINITE)
 write(*, '(1x, A, I4, A, I8)')
 + 'The sum of the first ', MAX_THREADS, ' #s is', result
 end

[LISTING FIVE]

c Parent program (process) passing names of event objects to child process
 include 'mt.fi'

 program Parent
 include 'mt.fd'
 logical*4 procHandle, CreateProcess
 integer*4 CreateEvent, hReadEvent, hWriteEvent, SetEvent
 integer*4 waitResult, WaitForSingleObject
 character*255 buffer
 character*10 strReadEvent, strWriteEvent, FileName

 record /PROCESS_INFORMATION/ pi
 record /STARTUPINFO/ si

c Initializing the strings
 strReadEvent = 'ReadEvent'
 strWriteEvent = 'WriteEvent'
 FileName = 'file.out'
 buffer = "child "//strReadEvent//strWriteEvent//FileName//" "C
 strReadEvent(10:10) = char(0)
 strWriteEvent(10:10) = char(0)

c Initializing the STARTUPINFO structure
 si.cb = 56 ! sizeof (STARTUPINFO)
 si.lpReserved = 0
 si.lpDeskTop = 0
 si.lpTitle = 0
 si.dwFlags = 0
 si.cbReserved2 = 0
 si.lpReserved2 = 0

c Creating Read and Write Event objects
 hReadEvent = CreateEvent(0, .FALSE., .FALSE., loc(strReadEvent))
 hWriteEvent = CreateEvent(0, .FALSE., .FALSE.,loc(strWriteEvent))

c Spawning the child process
 procHandle=CreateProcess(0,loc(buffer),0,0,.TRUE.,0,0,0,loc(si),loc(pi))

c Providing a question for the child
 open (10, file= FileName)
 write(10, '(A)') "What issue of Dr. Dobb's is this?"
 close (10)

 write(*,*) 'Providing the green signal for child to continue...'
 waitResult = SetEvent(hWriteEvent)
 write(*,*) 'Waiting for the child to answer the question - '
 waitResult = WaitForSingleObject (hReadEvent, WAIT_INFINITE)

c Writing the reply from the child on to the screen
 open (10, file= FileName)
 read(10, '(A)') buffer
 close (10)
 write(*,*) buffer
 end

[LISTING SIX]

c Child program (process) accepting named objects from the parent
 include 'mt.fi'

 program ChildProcess
 include 'mt.fd'

 character*255 buffer
 character*100 filename, strReadEvent, strWriteEvent
 integer*4 hReadEvent, hWriteEvent, OpenEvent, SetEvent
 integer*2 status
 integer*4 EVENT_ALL_ACCESS
 integer*4 waitResult, WaitForSingleObject

c Retrieving the first command line parameter, the name of the ReadEvent
 Call Getarg (1, buffer, status)
 strReadEvent(1:status) = buffer(1:status)
 status = status+1
 strReadEvent(status:status) = char(0) ! to make it a C string

c Retrieving the second command line parameter, the name of the WriteEvent
 Call Getarg (2, buffer, status)
 strWriteEvent(1:status) = buffer(1:status)
 status = status+1
 strWriteEvent(status:status) = char(0) ! to make it a C string

c Setting the access privilege for the child
 EVENT_ALL_ACCESS = IOR (STANDARD_RIGHTS_REQUIRED, SYNCHRONIZE)
 EVENT_ALL_ACCESS = IOR (EVENT_ALL_ACCESS, #3)

c Opening handles for event objects passed from parent as named objects
 hReadEvent=OpenEvent(EVENT_ALL_ACCESS, .FALSE., loc(strReadEvent))
 hWriteEvent=OpenEvent(EVENT_ALL_ACCESS, .FALSE., loc(strWriteEvent))

c Wait until the parent signals the WriteEvent
 waitResult = WaitForSingleObject(hWriteEvent, WAIT_INFINITE)

c Retrieve the file name which is the third argument
 Call Getarg (3, buffer, status)
 filename (1:status) = buffer(1:status)

c Read the parent's question and then reply
 open (11, file= filename, mode ='readwrite')
 read(11, '(A)') buffer
 print *, buffer
 rewind 11
 write(11, '(A)') 'September 1993 issue'
 close (11)

c Signal the parent to continue
 waitResult = SetEvent(hReadEvent)
 end

[LISTING SEVEN]

c A fragment of the parent program

 ...

c Initialization of security attributes for Read and Write Events
 record /SECURITY_ATTRIBUTES/ saR
 record /SECURITY_ATTRIBUTES/ saW

 saR.nLength = 12
 saR.lpSecurityDescriptor = 0
 saR.bInheritHandle = .TRUE.

 saW.nLength = 12
 saW.lpSecurityDescriptor = 0
 saW.bInheritHandle = .TRUE.

c Creating events whose handles can be inherited
 hReadEvent = CreateEvent(loc(saR), .FALSE., .FALSE., 0)
 hWriteEvent = CreateEvent(loc(saW), .FALSE., .FALSE., 0)
 ...
-----------------------------------------------------------------------------
 ...

c A fragment of the child program.
c Retrieve the handles to the Read and Write Events from the command line
c using Getarg, and assign them to integer variables through an internal read
 CALL GETARG(1, buffer, status)
 read(buffer(1:status), '(i4)') hReadEvent
 CALL GETARG(2, buffer, status)
 read(buffer(1:status), '(i4)') hWriteEvent

 waitResult = WaitForSingleObject(hWriteEvent, WAIT_INFINITE)
 ...















































Special Issue, 1993
VWinL: A Virtual Window Library


Automatic window management for Windows 3.1, Windows NT, and Win32S




Al Williams


Al is the author of several books, including DOS 6: A Developer's Guide
(M&TBooks, 1993) and Commando Windows Programming (forthcoming from
Addison-Wesley). You can reach him at 310 Ivy Glen Court, League City, TX
77573 or on CompuServe at 72010,3574.


Part of the difficulty in writing for Windows lies in a typical program's
architecture. Under Windows' event-driven model, programs don't manipulate
their display directly in response to user input (or other events). Instead,
they update an internal application model to reflect the program's current
state. Upon request from Windows, the program (via its WM_PAINT handler)
renders a representation, or view, of the model in a window.
Consider, for example, a program that dumps system information to a true
window (not a dialog or edit control). You can't simply draw the text to the
window as you would under DOS. Instead, you must store the data you want to
display internally. When Windows sends you a WM_PAINT message, you have to
calculate what part of the text is visible and draw it. You may be asked to
draw it repeatedly because of events beyond your control. If all the text
won't fit, you'll also need to manage scroll bars and change how you display
your text based on them. Most GUI systems encourage you to use this style of
programming for one simple reason: resources. A model of your data should
require much less space than a bitmapped image of the screen. After all, an
800x600x256 display takes about one-half of a megabyte to store. With the
traditional GUI model, all of that memory is in the display adapter. In the
system-information example, the text strings to display take up much less
space than the dots that draw the characters on the screen.
To simplify this process, I've created VWinL, the Virtual Window Library.
VWinL automatically manages your Windows 3.1, Windows NT, or Win32S windows.
(See the accompanying textbox, "Porting to Win32.") When you want to display
something, you draw it with the usual Windows calls once. VWinL makes sure it
stays there, and can automatically manage scroll bars, scaling, and other
common tedious tasks.


VWinL


With the advent of Windows 3.1 and Windows NT, PC GUI programming has hit the
big leagues. Windows 3.1 no longer supports real mode--you always have access
to extended memory. Windows in 386 Enhanced mode and Windows NT provide
virtual memory--you can convert part of your hard disk into usable memory.
Since today's typical Windows programs have access to two or more megabytes of
memory, why be stingy? Why not use some of that memory to make programming
easier?
That's the basic philosophy behind VWinL. VWinL creates virtual drawing
surfaces (VMAPs) that you can draw on with standard Windows GDI calls. You can
optionally associate a VMAP with one or more physical windows. You can ask
VWinL to scale the VMAP to fit in the window, or show as much of the VMAP as
will fit. If the VMAP is too large to fit in the window, VWinL will
automatically manage scrollbars for you, if you desire.
Each physical window has an independent view of its VMAP. You can have two (or
more) windows on the screen viewing the same VMAP in different ways. For
example, one window might show the VMAP scaled to fit, while two others are
scrolled to show different areas of the VMAP without scaling.
Once you draw something to a VMAP with a window attached, you won't need to
draw it again. If you iconify the window, or cover it up and expose it, the
image stays in place with no action on your part. As an extra bonus, screen
updates are unusually fast--often faster than with traditional Windows
programs.
Listing One (page 32) shows a simple program that uses VWinL. (You'll find a
summary of VWinL calls in Table 1.) Notice that it includes the VWINL.H file
(Listing Two, page 32) and compiles with MAKEFILE.SIM, SIMPLE.RC, and
SIMPLE.DEF. (All three are available electronically, see "Availability," page
3.) VWinL programs have a main() function (which is more like a WinMain()
function in form) and window callbacks like normal Windows programs. They
don't have WM_PAINT routines or event loops like ordinary Windows programs.
You'll find VWIN.C in Listing Three (page 33).
Although a VWinL callback looks like a conventional callback, there are
several important differences:
- You don't need to export a VWinL callback.
- You don't need a WM_PAINT case.
- You handle the WM_VCREATE message instead of WM_CREATE.
- You need to take special steps if you don't want a WM_DESTROY message to
  terminate your application.
The main() function is the place to create your primary application window.
Don't draw to it from inside main(), though--VWinL hasn't properly initialized
the window yet. Use the WM_VCREATE message processing in your callback routine
if you want to draw to the new window. VWinL callbacks don't receive WM_CREATE
messages at all--only WM_VCREATE messages.
Most VWinL main() functions are just calls to Vcreate_window() (see Figure 1).
This call mimics CreateWindow() for the most part. One difference between
Vcreate_window() and CreateWindow() is the menu parameter. CreateWindow()
expects a handle to a menu. Vcreate_window() takes an ASCII string or resource
ID (the same as LoadMenu()). If you create a child window, cast the integer
child id to an LPCSTR and pass it as though it were a menu name.
The other major difference between Vcreate_window() and CreateWindow() is the
addition of a parameter for VWinL flags. These flags control VWinL's
operation; see Table 2. For example, the V_SCALE flag causes VWinL to scale a
VMAP to fit the window. The flags are set separately for each window. By
default, Vcreate_window() creates a window and a VMAP simultaneously. However,
you can specify the V_NOMAP flag to create a bare window. You'll then need to
use Vselect_map() to associate a VMAP with the window.
During window creation, VWinL looks for a resource named VAPPICON to specify
your application's icon. If you want to add accelerators, name the table
VACCEL so that VWinL can find it.
When you want to draw to a VMAP, you obtain a device context using Vget_mdc()
or Vget_vdc(). Use Vget_mdc() if you have a pointer to the VMAP, and
Vget_vdc() if you have a window handle and want the underlying VMAP. You can
use the device context freely with any GDI call. Don't call ReleaseDC() or
DestroyDC(), however. If you want to release the resources associated with a
VMAP, call Vdestroy_map(). If the VMAP is in use by more than one window, the
call will not do anything, so be careful to destroy VMAPs at the proper time.
(VWinL attempts to destroy a window's VMAP when the window closes--more about
that later.)
When you draw to a VMAP associated with a window, the changes may not be
immediately visible. You can force the drawing to appear by calling
Vcommit_draw(). Vselect_map() also forces the window to update.
Don't draw to a VMAP when you want to draw something transient (like dragging
a selection box or stretching an object in sync with the mouse). Instead, get
the window's real DC (using GetDC(), or some other Windows call) and draw with
it. Then to restore the window to its original state, you can call
Vcommit_draw().
Be careful to use a solid brush for your window backgrounds if you use the
V_SCALE mode. A patterned brush will look strange when VWinL scales it to fit
in the window.


Calling it Quits


When Windows sends your program a WM_DESTROY message, VWinL intercepts it. It
then sends your callback routine a WM_DESTROY message. If you want to allow
the program to end, you don't need to do anything. If you want the program to
continue, call the Vdont_quit() function.
When VWinL detects a WM_DESTROY message, it will try to delete the window's
VMAP (if it has one). Still, you should try to clean up any VMAPs you have
open in your main WM_DESTROY routine. If you destroy a VMAP, detach it from
its window (using Vselect_map(w,NULL)) so VWinL will not try to destroy it
again.
Since VMAPs can be large, make sure your clean-up routine (or VWinL's)
executes. For example, don't call PostQuitMessage() in response to an Exit
menu command. The application will terminate immediately and you will lose
memory. Instead, pass your main-application window to DestroyWindow(). This
will close the window, causing VWinL to terminate your program cleanly.


Life in the Fast Lane


Although VWinL repaints the entire window on each WM_PAINT message, it is
still fast. You may notice that many VWinL programs are faster than comparable
ordinary programs when you resize them or restore them from an icon. To
display text, for example, an ordinary program must redraw the text in the
selected font each time it processes a WM_PAINT message. Windows must
calculate the position of each pixel every time. VWinL programs only calculate
these coordinates once when you first draw the text. On subsequent paints, the
BitBlt() function rapidly transfers the pixels directly to the screen. This
makes VWinL programs faster than their conventional counterparts in many
cases.
Be careful if you use the V_SCALE flag to force VMAPs to fit in a window. The
StretchBlt() call VWinL uses to do scaling is much slower than the ordinary
BitBlt(). This is especially true when the window is much larger than the
VMAP. You might consider making the VMAP larger than the maximum window size,
or restricting the window's size by intercepting the WM_GETMINMAXINFO message.
Of course, there is no free lunch. VWinL's increased speed and ease of use
come at the expense of memory--lots of memory. If your application doesn't
need color, you should consider calling Vset_monomode() in your main routine
before calling Vcreate_window() or Vcreate_map(). This will considerably
reduce the number of bytes VWinL uses to store VMAPs (unless you are on a
monochrome display anyway--then it won't make any difference).



A Practical Example


To illustrate the use of VWinL, the Freeshow program continuously displays the
percentage of free system resources available textually and graphically; see
Figure 2. (The files FREESHOW.C, FREESHOW.H, FREESHOW.RC, and FREESHOW.DEF are
available electronically, see "Availability," page 3.) You can compile this
program with the Borland or Microsoft compilers (makefiles for each are
on-line). Since Freeshow uses TOOLHELP.DLL, you can't create it as a Win32
program.
FREESHOW.C has more menu options than necessary for such a simple program, but
it illustrates several key VWinL features. For example, you can use the Scale
menu choice to flip between VWinL scaling and clipping. If the window is too
small to contain the clipped display, VWinL automatically provides scroll
bars.
The heart of Freeshow is the get_free_info() routine. This function does all
the drawing in response to a WM_TIMER message (which the WM_VCREATE handler
sets up to occur once per second). The SystemHeapInfo() function (from
TOOLHELP.DLL) returns the percentage of free space in the USER heap and the
GDI heap. Freeshow displays the smaller of these two numbers.
Freeshow's callback only handles a creation message (WM_VCREATE), menu
messages (WM_COMMAND), timer messages (WM_TIMER), and the WM_CLOSE message. If
Freeshow didn't require timer activation, it wouldn't have to process anything
but the WM_COMMAND messages.


How Does It Do That?


VMAPs take advantage of two special Windows features: bitmaps and memory
device contexts. All GDI (drawing) functions operate on a device context (DC).
Typically, output to a DC appears on a window. VWinL uses the
CreateCompatibleDC() function to create a memory device context. A memory DC
must have a bitmap associated with it (via SelectObject()). Drawing operations
you perform against the memory DC don't appear anywhere on the screen.
Instead, the drawing operations act on the associated bitmap.
Windows limits bitmaps to 65,535x65,535 pixels--VMAPs can't exceed this size.
If you use the autoscroll feature, you must restrict your VMAPs to
32,767x32,767, because Windows doesn't allow scrollbar ranges to exceed 32K.
The key to VWinL is its default WM_PAINT handler, do_paint(). This routine
copies the bitmap from the window's VMAP to the client area. If the V_SCALE
flag is set, VWinL uses StretchBlt() to scale the image as it copies it.
Otherwise, BitBlt() simply copies the bitmap.
If the bitmap is smaller than the window's client area, do_paint() erases the
region outside the bitmap using the PatBlt() function. This ensures a
consistent background when you resize the window.
Windows may send your program a WM_PAINT message for many different reasons.
If you iconify your window and restore it, you'll get a WM_PAINT message.
You'll also get a WM_PAINT when another window obscures yours and then moves
to expose it again. The Vcommit_draw() function is a macro that calls
InvalidateRect(). The InvalidateRect() call also generates WM_PAINT messages.


Other Uses


Almost any Windows program can use the VWinL library. Programs that need a
quick-and-dirty display of text and graphics work especially well. Although
you can implement sophisticated programs like word processors and spreadsheets
using VWinL, you may not want to do so. For these programs, you need to create
a data model anyway. Therefore you might as well use the more conventional
Windows program architecture.
Be careful to consider VWinL's memory usage. A word processor that might have
many windows open at once would consume a disproportionate amount of memory.
Only you can decide how much memory is too much. Of course, you can mix VWinL
windows and normal windows in the same program. You might use VWinL windows
only where appropriate.


Conclusion


VWinL can simplify many types of Windows programs. One day, Windows (or
another GUI) may support VMAP-style programming. With built-in support, the
VMAPs could be stored as a sparse array, and perhaps compressed. Until then,
you can use VWinL to experiment with this technique. You will notice that
VWinL programs resemble ordinary DOS graphics code more than Windows
applications.
You might like to enhance VWinL. A function to print a VMAP would be useful,
as would a more sophisticated scrolling algorithm. You might also consider
allowing VWinL windows to be MDI children or place VWinL in a DLL.
VWinL will work with Windows NT and Win32S. Since these 32-bit environments
offer improved memory management, VWinL makes even more sense for them. Next
time you write a Windows program, try VWinL and see how simple a Windows
program can be.
Table 1: VWinL calls.
Function/Description
void Vcommit_draw(HWND w)
    Forces the contents of the window's VMAP to appear in the window. Until
    you call Vcommit_draw(), any output to the VMAP may or may not be
    visible. Actually a macro.
HDC Vget_mdc(VMAP *map)
    Returns the DC associated with the specified VMAP. Actually a macro.
int Vget_stretchmode(VMAP *map)
    Returns the stretch mode for the specified VMAP. For more about stretch
    modes, see SetStretchBltMode() in the Windows API reference. Actually a
    macro.
void Vget_info(HWND w, MEMWINFO *info)
    Returns a read-only structure of information pertaining to the window.
VMAP *Vget_map(HWND w)
    Returns a pointer to the VMAP associated with the window.
VMAP *Vcreate_map(int width, int height)
    Creates a VMAP of the specified width and height matching your current
    display, unless you have set monochrome mode (see Vset_monomode()).
void Vdestroy_map(VMAP *map)
    Releases a VMAP's resources. When a window closes, VWinL attempts to
    free its VMAP unless the V_NOFREEMAP flag is set.
VMAP *Vselect_map(HWND w, VMAP *new)
    Changes the VMAP associated with a window. If the VMAP pointer is NULL,
    the window will have no VMAP. Returns a pointer to the previously
    selected VMAP.
unsigned long Vset_flags(HWND w, unsigned long flags, int cmd)
    Changes a VWinL window's flags. You may need to call Vcommit_draw()
    after changing some flags. The cmd argument specifies how VWinL
    interprets the flags argument: if cmd is VF_STO, VWinL copies the flags
    to the window; VF_SET sets the specified flags, leaving the other bits
    unchanged; VF_CLR clears them; VF_TOG toggles them. Returns the
    previous flag value.
void Vset_offset(HWND w, int x, int y)
    Sets the offset of the specified window. When VWinL draws the VMAP to
    the window, it uses the offset as the VMAP's starting point (unless
    V_SCALE is set). The x and y parameters are in pixels.
void Vget_offset(HWND w, int *x, int *y)
    Returns the window's offset (see Vset_offset()).
int Vcreate_window(char *title, DWORD style, int x, int y, int width,
int height, HWND parent, LPCSTR menu, long (*callback)(), unsigned vflags,
HDC *dc, HWND *win, int show)
    Mimics CreateWindow(). The menu parameter is actually a resource name
    or id. The vflags field holds VWinL flags. The window handle returns
    via the win pointer, and the VMAP DC (if any) goes to the dc pointer
    (unless the dc pointer is NULL). Returns zero upon success; any other
    value indicates failure.
HDC Vget_vdc(HWND w)
    Returns the VMAP DC associated with the given window.
int Vresize_winmap(HWND w, int width, int height)
    Resizes the VMAP associated with the specified window. Automatically
    adjusts the window's scroll bars and handles other details.
int Vresize_map(VMAP *m, int wid, int hi)
    Changes the size of a VMAP. If the VMAP is attached to a window, you
    will usually want to use Vresize_winmap() instead.
void Vdont_quit(void)
    During a WM_CLOSE message, you may call Vdont_quit() to prevent VWinL
    from terminating the application.
void Vset_scroll(VMAP *m, int xstep, int ystep, int xpage, int ypage)
    Sets the scroll increments for a VMAP. By default xstep and ystep equal
    1, and the page variables equal 10. This causes smooth scrolling when
    you click the scroll-bar arrows; when you scroll a page, ten pixels go
    by.
void Vclear_map(VMAP *m)
    Erases the entire drawing surface of a VMAP using the background color.
void Vclear_win(HWND *w)
    A macro that clears the VMAP associated with a window.
int Vset_stretchmode(VMAP *m, int mode)
    Sets the VMAP's stretch mode (used when V_SCALE is set). For more about
    stretch modes, see SetStretchBltMode() in the Windows API
    documentation. Returns the previous stretch mode.
int Vset_monomode(int mode)
    Sets or clears VWinL's monochrome mode. When monochrome mode is set,
    all Vcreate_window() and Vcreate_map() calls create monochrome bitmaps.
    These bitmaps may take up less space, but support only two colors.


Porting to Win32



As you may have guessed, VWinL started life as a conventional Windows program
and only recently moved to Win32. Porting to Win32 was fairly straightforward,
but there are some major issues to consider. First, the wParam variable in the
window callback is 32 bits wide. Note that all callbacks use UINT instead of
WORD for this parameter. The UINT type is a WORD under Windows 3 and a DWORD
(32 bits) under Win32. Also, Win32 packs the WM_COMMAND, WM_HSCROLL, and
WM_VSCROLL messages differently. Win32 changes which parameters are in wParam
and which are in lParam for several messages. The same data is there, it's
just been rearranged. VWinL has special code to get at the right values.
Another issue is that calls that return dimensions have changed. Under Windows
3, calls like GetBitmapDimension() and SetWindowOrg() return a 32-bit word
with the 16-bit x and y dimensions packed into it. Since each dimension
requires a full 32-bit word under Win32, these functions now take a pointer to
a SIZE or POINT structure. Since this is radically different, the functions
have an Ex suffix (for example, GetBitmapDimensionEx()).
Although integers are now 32 bits wide, this is usually not a problem. VWinL,
however, used an integer cast in the Vget_info() and save_info() routines.
Until I changed this to a short cast (16 bits), the CreateWindow() routine
would fail mysteriously. (Apparently, I was corrupting some important memory
locations.) I compiled and tested VWinL and its companion programs using Phar
Lap's free QuickStart package and the October Win32 SDK beta tools. QuickStart
allows you to run the Win32 tools under DOS (or in a Windows DOS box). The
resulting executables will run under Windows NT or Win32S (Microsoft's
extension to Windows 3.1 that allows you to run many Windows NT programs).
Since Freeshow uses toolhelp.DLL (no longer supported under Win32), it won't
work with Win32. Also MAIN5 (available electronically, see "Availability,"
page 3) may not work on your Win32S system due to some floating-point
emulation problems. It does work under regular Windows 3.1. Despite these
problems, however, Win32 is the wave of the future. With the additional memory
and improved efficiency Win32 offers, VWinL makes even more sense for NT
programs.
--A.W.
Figure 1: The Vcreate_window() function.
int Vcreate_window(char *title,DWORD style,int x,int y, int wid, int hi, HWND
parent,
 LPCSTR menu, long (*cb)(HWND,UINT,UINT,LONG), unsigned long vflags,
 HDC *dc,HWND *win, int show);
Parameters
title -- Title that appears in the caption bar.
style -- The same style bits used by CreateWindow.
x -- X coordinate for the window; often CW_USEDEFAULT.
y -- Y coordinate for the window.
wid -- Width of the window; often CW_USEDEFAULT.
hi -- Window's height.
parent -- Handle to the window's parent window. If NULL, create a top-level
window.
menu -- If parent is NULL, a string that identifies the window's menu (or NULL
if there is no menu). If parent is not NULL, this is the child-window id (see
CreateWindow).
cb -- Pointer to your callback function. Unlike a normal callback, you don't
need to export this function or call MakeProcInstance() to get the pointer.
vflags -- VWINL flags (see Table 2).
dc -- Pointer to the new window's VMAP DC (use NULL if you don't need this
value).
win -- Pointer to an HWND that receives the new window handle. You must supply
this pointer.
show -- Same as nShow in CreateWindow.
Returns: Zero if successful, nonzero on failure.
Table 2: VWINL flags. (Note: V_SCALE is incompatible with V_RESIZE,
V_AUTOHSCROLL, or V_AUTOVSCROLL. The V_RESIZE flag is not compatible with
V_AUTOHSCROLL or V_AUTOVSCROLL.)
FLAG/Description
V_SCALE -- Causes VWINL to scale the window's VMAP to fit the window's client
area. If this flag is not set, VWINL clips the VMAP to the window. When
clipping, VWINL can offset the VMAP (see Vset_offset()) or automatically
manage scroll bars.
V_RESIZE -- Causes the window's VMAP to automatically resize when the window
resizes, so the VMAP's size always matches the window's size.
V_AUTOHSCROLL -- When set, VWINL will automatically manage horizontal scroll
bars for this window. When passed to Vcreate_window(), this flag forces the
window to use the WS_HSCROLL style.
V_AUTOVSCROLL -- When set, VWINL will automatically manage vertical scroll
bars for this window. When passed to Vcreate_window(), this flag forces the
window to use the WS_VSCROLL style.
V_NOMAP -- Pass this flag to Vcreate_window() to prevent VWINL from
automatically creating a VMAP with the window. Presumably, you will use
Vselect_map() to use a VMAP from another window or from Vcreate_map(). V_NOMAP
is only meaningful during Vcreate_window().
V_NOQUIT -- Ordinarily, closing a VWINL window will cause the entire
application to terminate. If V_NOQUIT is set for a window, you may close it
without disturbing your application.
V_NOFREEMAP -- This flag prevents VWINL from automatically freeing the
window's VMAP when you close the window. You are responsible for calling
Vdestroy_map() yourself. This is useful when more than one window shares a
VMAP.
V_KSCROLL -- Allows VWINL to intercept scrolling keys and translate them into
scroll-bar events. This is especially useful in conjunction with V_AUTOHSCROLL
and V_AUTOVSCROLL.
V_INIT -- An internal flag used by VWINL. Don't set this flag at home.
 Figure 2: The Freeshow program illustrates how VWinL can be used.
[LISTING ONE] (Text begins on page 28.)

/* Very simple VWINL program */
#include "vwinl.h"

/* User's callback */
long usr_cb(HWND hWnd, UINT Message,
 UINT wParam, LONG lParam)
 {
 if (Message==WM_VCREATE)
 {
 Vresize_winmap(hWnd,150,100);
/* Why limit ourselves? */
 TextOut(Vget_vdc(hWnd),5,50,"Hello Universe",14);
 Vcommit_draw(hWnd);
 }
 return DefWindowProc(hWnd,Message,wParam,lParam);
 }
/* Start here */
int main(HANDLE hInstance, HANDLE hPrevInstance,
 LPSTR lpszCmdLine, int nCmdShow )
 {

 HWND hWnd;
/* Create window or die */
 if (Vcreate_window("Simple Test Program",
 WS_OVERLAPPEDWINDOW,CW_USEDEFAULT,0,
 CW_USEDEFAULT,0,NULL,NULL, usr_cb,V_SCALE,
 NULL,&hWnd,nCmdShow))
 {
 MessageBox(NULL,"Can't create window",NULL,MB_OK);
 return 0;
 }
 return 1;
 }

[LISTING TWO]

/* Header for VWINL -- Williams */
#ifndef _VWINL_H
#define _VWINL_H
#include <windows.h>

#ifndef WIN32
#define APIENTRY FAR PASCAL
/* Check for message-cracker definitions -- if one is there, assume they all are */
#ifndef GET_WM_VSCROLL_CODE
#define GET_WM_VSCROLL_CODE(w,l) (w)
#define GET_WM_HSCROLL_CODE(w,l) (w)
#define GET_WM_VSCROLL_HWND(w,l) ((HWND)HIWORD(l))
#define GET_WM_HSCROLL_HWND(w,l) ((HWND)HIWORD(l))
#define GET_WM_VSCROLL_POS(w,l) (LOWORD(l))
#define GET_WM_HSCROLL_POS(w,l) (LOWORD(l))
#endif

#endif /* End of non-WIN32 definitions */

/* Flags */
/* V_SCALE doesn't make sense with V_RESIZE, V_AUTOHSCROLL, or V_AUTOVSCROLL.
V_RESIZE doesn't make sense with any of the V_AUTOxSCROLL flags. V_NOMAP is
only valid during window creation. V_INIT is reserved for internal use. */

#define V_SCALE 1L
#define V_RESIZE 2L
#define V_AUTOHSCROLL 4L
#define V_AUTOVSCROLL 8L
#define V_NOMAP 0x10L
#define V_NOQUIT 0x20L
#define V_NOFREEMAP 0x40L
#define V_KSCROLL 0x80L
#define V_INIT 0x80000000L

/* Flags for Vset_flags() */
#define VF_STO 0
#define VF_SET 1
#define VF_CLR 2
#define VF_TOG 3

#define WM_VCREATE WM_USER
#define Vcommit_draw(w) InvalidateRect(w,NULL,FALSE)
/* Get VMAP dc */
#define Vget_mdc(m) ((m)->dc)

#define Vget_stretchmode(m) ((m)->stretch_mode)
#define Vclear_win(w) Vclear_map(Vget_map(w))

long APIENTRY VWndProc (HWND, UINT, UINT, LONG) ;
int main(HANDLE hInstance, HANDLE hPrevInstance,
 LPSTR lpszCmdLine, int nCmdShow );
typedef struct
 {
 HBITMAP bitmap;
 HDC dc;
 HBITMAP defbitmap;
 int xstep,ystep,xpage,ypage;
 unsigned refct;
 int stretch_mode;
 } VMAP;
typedef struct
 {
 VMAP *map;
/* dimensions of bitmap (not window) */
 unsigned int width;
 unsigned int height;
/* flags */
 unsigned long flags;
/* display offset */
 unsigned int xoff;
 unsigned int yoff;
 long (*cb)(HWND,UINT,UINT,LONG);
 } MEMWINFO;
void Vget_info(HWND w,MEMWINFO *info);
VMAP *Vget_map(HWND w);
VMAP *Vcreate_map(int wid,int hi);
void Vdestroy_map(VMAP *map);
VMAP *Vselect_map(HWND w,VMAP *new);
unsigned long Vset_flags(HWND w,unsigned long flags,int cmd);
void Vset_offset(HWND w,int x,int y);
void Vget_offset(HWND w,int *x,int *y);
int Vcreate_window(char *title,DWORD style,int x,int y, int wid,int hi,
 HWND parent,LPCSTR menu, long (*cb)(HWND,UINT,UINT,LONG),
 unsigned long vflags,HDC *dc,HWND *win,int show);
HDC Vget_vdc(HWND w);
int Vresize_winmap(HWND w,int wid,int hi);
int Vresize_map(VMAP *m,int wid,int hi);
void Vdont_quit(void);
void Vset_scroll(VMAP *m,int xstp,int ystp,int xpg,int ypg);
void Vclear_map(VMAP *m);
int Vset_stretchmode(VMAP *m,int mode);
int Vset_monomode(int mode);

#ifndef __BORLANDC__
#define main vwin_main
int vwin_main(HANDLE,HANDLE,LPSTR,int);
#else
int main(HANDLE,HANDLE,LPSTR,int);
#endif

#endif

[LISTING THREE]


/* VWIN virtual window package -- Al Williams */
#include "vwinl.h"
#include <windowsx.h>
#include <string.h>

/* Local prototypes */
static void do_paint(HWND);
long WINAPI VWndProc (HWND w, UINT Message, UINT wParam, LONG lParam);
static void save_info(HWND w,MEMWINFO *info);
static void set_sb(HWND w,MEMWINFO *minfo,UINT wid, UINT hi,int save);
static void scrollit(HWND w,MEMWINFO *minfo,int type, WORD code,HWND sb,
 WORD pos);
static void key_scroll(HWND w,UINT key);

/* Global variables */
static HANDLE Hinstance; /* Our instance */
static int monomode; /* Make mono bitmaps? */
/* Flag to tell us if user allows us to quit */
static int V_quit=0;
/* VWINL WinMain -- this calls our main() function.
 If main() returns 0, then we abort. */
int PASCAL WinMain(HANDLE hInstance, HANDLE hPrevInstance,
 LPSTR lpszCmdLine, int nCmdShow )
{
 WNDCLASS wndClass;
 MSG msg;
 HACCEL haccel;
 /* Register window class style if first instance of this program. */
 Hinstance=hInstance;
 if ( !hPrevInstance )
 {
/* NOTE: VWINL assumes the window will have a common DC! Don't use CS_OWNDC or
 CS_PARENTDC unless you know what you are getting into! */
 wndClass.style=CS_HREDRAW|CS_VREDRAW|CS_DBLCLKS;
 wndClass.lpfnWndProc=(WNDPROC)VWndProc;
 wndClass.cbClsExtra=0;
 wndClass.cbWndExtra=
 sizeof(MEMWINFO)+(sizeof(MEMWINFO)%2);
 wndClass.hInstance=hInstance;
 wndClass.hIcon=LoadIcon(hInstance,"vappicon");
 wndClass.hCursor=LoadCursor(NULL, IDC_ARROW );
 wndClass.hbrBackground=GetStockObject(WHITE_BRUSH);
 wndClass.lpszMenuName=NULL;
 wndClass.lpszClassName="VWINL";
 if (!RegisterClass(&wndClass))
 return FALSE;
 }
 if (!main(hInstance,hPrevInstance, lpszCmdLine, nCmdShow )) return FALSE;
/* Try to load an accelerator -- no big deal if it isn't there */
 haccel=LoadAccelerators(hInstance,"VACCEL");
/* Sorta ordinary message loop -- will translate accelerators if appropriate
*/
 while (GetMessage(&msg, NULL, 0, 0))
 {
 if (!haccel ||
 !TranslateAccelerator(msg.hwnd, haccel, &msg))
 {
 TranslateMessage(&msg);
 DispatchMessage(&msg);
 }

 }
 return 0;
 }
/* Main Window procedure */
long APIENTRY VWndProc (HWND w, UINT Message, UINT wParam, LONG lParam)
 {
 MEMWINFO minfo;
 Vget_info(w,&minfo);
 switch(Message)
 {
/* Don't let user see WM_CREATE -- we aren't ready for him yet */
 case WM_CREATE:
 return DefWindowProc(w,Message,wParam,lParam);
/* Always do painting and don't tell user */
 case WM_PAINT:
 do_paint(w);
 return 0;
 case WM_SIZE:
/* Catch 1st size */
 if (minfo.flags&V_INIT)
 {
 if ((LOWORD(lParam)!=minfo.width ||
 HIWORD(lParam)!=minfo.height)&&minfo.map)
 {
 Vresize_winmap(w,LOWORD(lParam),HIWORD(lParam));
 }
 minfo.flags&=~V_INIT;
 save_info(w,&minfo);
 break;
 }
/* Set scrollbars */
 if (minfo.map)
 {
/* Every time you toggle a scroll bar from on to off or off to on you get a
WM_SIZE message! This little state machine turns off bars without getting into
an endless loop. */
 static int sizelock=0;
 RECT r;
 if (sizelock==1) return 0;
 if (sizelock!=2)
 {
/* Turn off both scroll bars so set_sb() can use the whole
 client area if that's what it needs */
 sizelock=1;
 if (minfo.flags&V_AUTOVSCROLL)
 SetScrollRange(w,SB_VERT,0,0,FALSE);
 if (minfo.flags&V_AUTOHSCROLL)
 SetScrollRange(w,SB_HORZ,0,0,FALSE);
/* Size might have changed, so reset it */
 GetClientRect(w,&r);
 lParam=MAKELONG(r.right-r.left,r.bottom-r.top);
 }
/* Turn scroll bars on or off */
 sizelock=2;
 set_sb(w,&minfo,LOWORD(lParam),HIWORD(lParam),TRUE);
/* Size might have changed again, so reset it */
 GetClientRect(w,&r);
 lParam=MAKELONG(r.right-r.left,r.bottom-r.top);
 sizelock=0;

 }
/* Handle V_RESIZE if active */
 if (minfo.flags&V_RESIZE&&minfo.map)
 {
 Vresize_winmap(w,LOWORD(lParam),HIWORD(lParam));
 }
 break;
/* Scroll cases */
 case WM_VSCROLL:
 if (minfo.flags&V_AUTOVSCROLL)
 scrollit(w,&minfo,SB_VERT,
 GET_WM_VSCROLL_CODE(wParam,lParam),
 GET_WM_VSCROLL_HWND(wParam,lParam),
 GET_WM_VSCROLL_POS(wParam,lParam));
 break;
 case WM_HSCROLL:
 if (minfo.flags&V_AUTOHSCROLL)
 scrollit(w,&minfo,SB_HORZ,
 GET_WM_HSCROLL_CODE(wParam,lParam),
 GET_WM_HSCROLL_HWND(wParam,lParam),
 GET_WM_HSCROLL_POS(wParam,lParam));
 break;
 case WM_KEYDOWN:
 if (minfo.flags&V_KSCROLL) key_scroll(w,wParam);
 break;
 case WM_DESTROY:
/* pass to user if V_NOQUIT set */
 if (minfo.flags&V_NOQUIT) break;
 V_quit=1;
/* Pass to user, if V_quit is set, go ahead and kill ourselves */
 if (!minfo.cb(w,Message,wParam,lParam)&&V_quit)
 {
 /* Clean up window's resources here... */
 if (minfo.map&&minfo.map->refct&&!(minfo.flags&V_NOFREEMAP))
 {
 minfo.map->refct--;
 Vdestroy_map(minfo.map);
 }
 PostQuitMessage(0);
 return 0;
 }
 return 0;
 }
/* pass to user's callback */
 if (minfo.cb)
 return minfo.cb(w,Message,wParam,lParam);
 else
 return DefWindowProc(w,Message,wParam,lParam);
 }
/* Create a VWIN -- see text for description */
int Vcreate_window(char *title,DWORD style,int x,int y,
 int wid,int hi,HWND parent,LPCSTR menu,
 long (*cb)(HWND,UINT,UINT,LONG),
 unsigned long vflags,HDC *dc,HWND *win,int show)
 {
 HWND w;
 HMENU hMenu=NULL;
 RECT r;
 MEMWINFO minfo;

 if (menu&&!parent)
 hMenu=LoadMenu(Hinstance,menu);
 else
 hMenu=(HMENU)menu;
/* Auto set scroll style */
 if (vflags&V_AUTOHSCROLL) style|=WS_HSCROLL;
 if (vflags&V_AUTOVSCROLL) style|=WS_VSCROLL;
 memset(&minfo,0,sizeof(MEMWINFO));
 minfo.flags=vflags|V_INIT;
 minfo.cb=cb;
 w=*win=CreateWindow("VWINL",title,style,x,y,
 wid,hi,parent,hMenu,Hinstance,NULL);
 if (!*win) return 1;
 save_info(*win,&minfo);

 SetScrollRange(*win,SB_HORZ,0,0,TRUE);
 SetScrollRange(*win,SB_VERT,0,0,TRUE);
 GetClientRect(w,&r);
 /* create DC the same size */
 minfo.xoff=minfo.yoff=0;
 minfo.width=r.right-r.left;
 minfo.height=r.bottom-r.top;
 if ((minfo.flags&V_NOMAP)==0)
 {
 minfo.map=Vcreate_map(minfo.width,minfo.height);
 if (!minfo.map) return 2;
 if (dc) *dc=minfo.map->dc;
 }
 else
 minfo.map=NULL;
 /* Store cb and other data in extra words */
 save_info(w,&minfo);
/* finish up */
 ShowWindow(*win,show);
/* Call user's callback with WM_VCREATE */
 minfo.cb(*win,WM_VCREATE,0,0);
 UpdateWindow(*win);
 return 0;
 }
/* Make a VMAP -- respects monomode flag */
VMAP *Vcreate_map(int wid,int hi)
 {
 HDC dc;
 VMAP *bm;
 HWND w;
 w=GetDesktopWindow(); /* any window will do */
 dc=GetDC(w);
 if (!dc) return NULL;
/* temp use of malloc */
 bm=(VMAP *)LocalAlloc(LPTR,sizeof(VMAP));
 if (!bm) return NULL;
 bm->dc=CreateCompatibleDC(dc);
/* release Desktop DC */
 ReleaseDC(w,dc);
 if (!bm->dc)
 {
 LocalFree((HLOCAL)bm);
 return NULL;
 }

/* Make the bitmap */
/* NOTE: Windows won't let you make an arbitrary colored
 bitmap. You must have a device that corresponds to
 the color-size of the bitmap */
 if (!monomode)
 bm->bitmap=CreateBitmap(wid,hi,GetDeviceCaps(dc,PLANES),
 GetDeviceCaps(dc,BITSPIXEL),NULL);
 else
 bm->bitmap=CreateBitmap(wid,hi,1,1,NULL); /* mono */
 if (!bm->bitmap)
 {
 DeleteDC(bm->dc);
 LocalFree((HLOCAL)bm);
 return NULL;
 }
/* Note: This is supposed to be in .1 mm units, but since no
 one else uses it, who cares! */
#ifndef WIN32
 SetBitmapDimension(bm->bitmap,wid,hi);
#else
 SetBitmapDimensionEx(bm->bitmap,wid,hi,NULL);
#endif
 bm->defbitmap=SelectObject(bm->dc,bm->bitmap);
 if (!bm->defbitmap)
 {
 DeleteDC(bm->dc);
 LocalFree((HLOCAL)bm);
 return NULL;
 }
/* Set default stuff */
 bm->xstep=bm->ystep=1;
 bm->xpage=bm->ypage=10;
 bm->refct=1;
 bm->stretch_mode=BLACKONWHITE;
 Vclear_map(bm);
 return bm;
 }
/* Free up a valid VMAP if its refct is 1 or 0 */
void Vdestroy_map(VMAP *map)
 {
 if (map->refct>1)
 return;
 SelectObject(map->dc,map->defbitmap);
 DeleteObject(map->bitmap);
 DeleteDC(map->dc);
 LocalFree((HLOCAL)map);
 }
/* Associate a new map (or NULL for no map) with a window--returns old map */
VMAP *Vselect_map(HWND w,VMAP *new)
 {
 MEMWINFO minfo;
 VMAP *rc;
#ifndef WIN32
 DWORD dims;
#else
 SIZE dims;
#endif
 RECT r;
 GetClientRect(w,&r);

 Vget_info(w,&minfo);
 rc=minfo.map;
 minfo.map=new;
 if (rc) rc->refct--;
/* NULL is OK here in which case we don't do much */
 if (new)
 {
 new->refct++;
#ifndef WIN32
 dims=GetBitmapDimension(minfo.map->bitmap);
 minfo.width=LOWORD(dims);
 minfo.height=HIWORD(dims);
#else
 GetBitmapDimensionEx(minfo.map->bitmap,&dims);
 minfo.width=dims.cx;
 minfo.height=dims.cy;
#endif
 }
 else
 {
 minfo.width=minfo.height=0;
 }
 save_info(w,&minfo);
 set_sb(w,&minfo,r.right-r.left,r.bottom-r.top,TRUE);
 Vcommit_draw(w);
 return rc;
 }
/* Set stretch mode for map -- returns old mode */
int Vset_stretchmode(VMAP *m,int mode)
 {
 int rv;
 rv=m->stretch_mode;
 m->stretch_mode=mode;
 return rv;
 }
/* Sets global monomode flag which causes VWINL to create
 monochrome bitmaps to save space.
 Returns old value (of course) */
int Vset_monomode(int mode)
 {
 int rv;
 rv=monomode;
 monomode=mode;
 return rv;
 }
/* Get window's map */
VMAP *Vget_map(HWND w)
 {
 MEMWINFO minfo;
 Vget_info(w,&minfo);
 return minfo.map;
 }
/* Erase a map's surface */
void Vclear_map(VMAP *m)
 {
 HBRUSH brush;
 unsigned int wid,hi;
#ifndef WIN32
 DWORD dims;

#else
 SIZE dims;
#endif
/* We store bitmap dimension this way.
 Units are pixels contrary to the .1mm convention */
#ifndef WIN32
 dims=GetBitmapDimension(m->bitmap);
 wid=LOWORD(dims);
 hi=HIWORD(dims);
#else
 GetBitmapDimensionEx(m->bitmap,&dims);
 wid=dims.cx;
 hi=dims.cy;
#endif
/* make background brush */
 brush=CreateSolidBrush(GetBkColor(m->dc));
 brush=SelectObject(m->dc,brush);
/* Brush area */
 PatBlt(m->dc,0,0,wid,hi,PATCOPY);
 DeleteObject(SelectObject(m->dc,brush));
 }
/* Change a window's VWINL flags -- this only makes sense for some flags. For
example, V_NOMAP is meaningless here. If you ever plan to set V_AUTOHSCROLL
or V_AUTOVSCROLL, make sure to set the scroll bar style flags during
Vcreate_window() (this happens automatically when you set V_AUTOxSCROLL during
the create). Returns old flag value. */
unsigned long Vset_flags(HWND w,unsigned long flags,int cmd)
 {
 unsigned long rv;
 MEMWINFO minfo;
 RECT r;
 Vget_info(w,&minfo);
 GetClientRect(w,&r);
 rv=minfo.flags;
 switch (cmd)
 {
 case VF_SET:
 minfo.flags|=flags;
 break;
 case VF_CLR:
 minfo.flags&=~flags;
 break;
 case VF_TOG:
 minfo.flags^=flags;
 break;
 default:
 minfo.flags=flags;
 break;
 }
 save_info(w,&minfo);
 if (((rv&V_AUTOHSCROLL)^(minfo.flags&V_AUTOHSCROLL)) ||
 ((rv&V_AUTOVSCROLL)^(minfo.flags&V_AUTOVSCROLL)))
 {
/* scroll changed */
 if (!(minfo.flags&V_AUTOHSCROLL))
 {
 minfo.xoff=0;
 SetScrollRange(w,SB_HORZ,0,0,TRUE);
 }

 if (!(minfo.flags&V_AUTOVSCROLL))
 {
 minfo.yoff=0;
 SetScrollRange(w,SB_VERT,0,0,TRUE);
 }
 save_info(w,&minfo);
 if (minfo.flags&(V_AUTOHSCROLL|V_AUTOVSCROLL))
 set_sb(w,&minfo,r.right-r.left,r.bottom-r.top,TRUE);
 Vcommit_draw(w);
 }
 return rv;
 }
/* Set the VMAP offset in pixels */
void Vset_offset(HWND w,int x,int y)
 {
 MEMWINFO minfo;
 Vget_info(w,&minfo);
 minfo.xoff=x;
 minfo.yoff=y;
 save_info(w,&minfo);
 }
/* Read the VMAP pixel offsets */
void Vget_offset(HWND w,int *x,int *y)
 {
 MEMWINFO minfo;
 Vget_info(w,&minfo);
 if (x) *x=minfo.xoff;
 if (y) *y=minfo.yoff;
 }
/* Get window's VMAP dc */
HDC Vget_vdc(HWND w)
 {
 MEMWINFO minfo;
 Vget_info(w,&minfo);
 return minfo.map->dc;
 }
/* Resize a map
 Returns 0 if OK */
int Vresize_map(VMAP *m,int wid,int hi)
 {
 VMAP *newmap;
 int oldstate;
 HBITMAP oldbm=m->bitmap;
 newmap=Vcreate_map(wid,hi);
 if (!newmap) return 1;
 oldstate=SetMapMode(m->dc,MM_TEXT);
 BitBlt(newmap->dc,0,0,wid,hi,m->dc,0,0,SRCCOPY);
/* copy the right parts from newmap */
 m->bitmap=newmap->bitmap;
 SetMapMode(m->dc,oldstate);
/* Except for bitmap! */
 SelectObject(newmap->dc,newmap->defbitmap);
 SelectObject(m->dc,m->bitmap);
/* Delete old dc and bitmap */
 DeleteDC(newmap->dc);
 DeleteObject(oldbm);
 LocalFree((HLOCAL)newmap);
 return 0;
 }

/* Resize VMAP attached to window -- returns 0 if OK */
int Vresize_winmap(HWND w,int wid,int hi)
 {
 MEMWINFO minfo;
 RECT r;
 GetClientRect(w,&r);
 Vget_info(w,&minfo);
 if (Vresize_map(minfo.map,wid,hi)) return 1;
 minfo.width=wid;
 minfo.height=hi;
 save_info(w,&minfo);
 set_sb(w,&minfo,r.right-r.left,r.bottom-r.top,TRUE);
 return 0;
 }
/* Clear quit flag for user */
void Vdont_quit()
 {
 V_quit=0;
 }
/* Get VWIN info -- public */
void Vget_info(HWND w,MEMWINFO *info)
 {
 int i;
 for (i=0;i<sizeof(MEMWINFO);i+=2)
 *(unsigned short *)(((unsigned char *)info)+i)=GetWindowWord(w,i);
 }
/* Save VWIN info (local use only) */
static void save_info(HWND w,MEMWINFO *info)
 {
 int i;
 for (i=0;i<sizeof(MEMWINFO);i+=2)
 SetWindowWord(w,i,*(unsigned short *) (((unsigned char *)info)+i));
 }
/* Scroll handler (local) */
static void scrollit(HWND w,MEMWINFO *minfo,int type,
 WORD code,HWND sb,WORD pos)
 {
 unsigned int *offset;
/* Store as long to avoid unsigned underflow */
 long newoffset;
 int step,page;
 RECT r;
 GetClientRect(w,&r);
/* Set up offset and steps */
 if (type==SB_VERT)
 {
 offset=&minfo->yoff;
 step=minfo->map->ystep;
 page=minfo->map->ypage;
 }
 else
 {
 offset=&minfo->xoff;
 step=minfo->map->xstep;
 page=minfo->map->xpage;
 }
 newoffset=*offset;
/* Process scroll command */
 switch (code)

 {
 case SB_TOP:
 newoffset=0;
 break;
 case SB_BOTTOM:
 if (type==SB_VERT)
 {
 newoffset=minfo->height-(r.bottom-r.top);
 }
 else
 {
 newoffset=minfo->width-(r.right-r.left);
 }
 break;
 case SB_LINEUP:
 step=-step;
/* fall thru */
 case SB_LINEDOWN:
 newoffset+=step;
 break;
 case SB_PAGEUP:
 page=-page;
/* fall thru */
 case SB_PAGEDOWN:
 newoffset+=page;
 break;
 case SB_THUMBPOSITION:
 newoffset=pos;
 break;
/* No SB_THUMBTRACK processing; a big hi-res VMAP takes too long to paint */
 }
 if (newoffset<0) newoffset=0;
/* Update the offset */
 if (type==SB_VERT)
 {
 if (newoffset+(r.bottom-r.top)>minfo->height)
 newoffset=minfo->height-(r.bottom-r.top);
 }
 else
 {
 if (newoffset+(r.right-r.left)>minfo->width)
 newoffset=minfo->width-(r.right-r.left);
 }
/* Update position */
 SetScrollPos(w,type,(unsigned)newoffset,TRUE);
 *offset=newoffset;
 save_info(w,minfo);
 Vcommit_draw(w);
 }
/* Set scroll parameters. Currently you can't read them back unless you call
 Vget_info(). If you must have a Vget_scroll call you can write it! */
void Vset_scroll(VMAP *m,int xstep,int ystep,int xpage,int ypage)
 {
 m->xstep=xstep;
 m->ystep=ystep;
 m->xpage=xpage;
 m->ypage=ypage;
 return;
 }

/* Process "scroll" keys */
static void key_scroll(HWND w,UINT key)
 {
 switch (key)
 {
 case VK_HOME:
 SendMessage(w,WM_VSCROLL,SB_TOP,0L);
 break;
 case VK_END:
 SendMessage(w,WM_VSCROLL,SB_BOTTOM,0L);
 break;
 case VK_PRIOR:
 SendMessage(w,WM_VSCROLL,SB_PAGEUP,0L);
 break;
 case VK_NEXT:
 SendMessage(w,WM_VSCROLL,SB_PAGEDOWN,0L);
 break;
 case VK_UP:
 SendMessage(w,WM_VSCROLL,SB_LINEUP,0L);
 break;
 case VK_DOWN:
 SendMessage(w,WM_VSCROLL,SB_LINEDOWN,0L);
 break;
 case VK_LEFT:
 SendMessage(w,WM_HSCROLL,SB_LINEUP,0L);
 break;
 case VK_RIGHT:
 SendMessage(w,WM_HSCROLL,SB_LINEDOWN,0L);
 break;
 }
 }
/* Local function to set scroll bars up */
static void set_sb(HWND w,MEMWINFO *minfo,UINT wid, UINT hi,int save)
 {
 RECT r;
 if (minfo->flags&V_INIT) return;
 if (minfo->flags&V_AUTOHSCROLL)
 {
 if (minfo->xoff&&minfo->width<=wid)
 {
/* If bitmap will fit in client area, make it do so */
 minfo->xoff=0;
 if (save) save_info(w,minfo);
 }
/* Set up H bar */
 SetScrollPos(w,SB_HORZ,minfo->xoff,FALSE);
 SetScrollRange(w,SB_HORZ,0, (unsigned)
 max(0L,(long)minfo->width-(long)wid) ,TRUE);
/* Recompute size -- may have changed if scroll bar enabled by above step */
 GetClientRect(w,&r);
 wid=r.right-r.left;
 hi=r.bottom-r.top;
 }
 if (minfo->flags&V_AUTOVSCROLL)
 {
 if (minfo->yoff&&minfo->height<=hi)
 {
/* If bitmap will fit in client area, make it do so */
 minfo->yoff=0;

 if (save) save_info(w,minfo);
 }
/* Set up V bar */
 SetScrollPos(w,SB_VERT,minfo->yoff,FALSE);
 SetScrollRange(w,SB_VERT,0, (unsigned)max(0L,(long)minfo->
 height-(long)hi), TRUE);
 }
 }
/* Magic paint routine */
static void do_paint(HWND w)
 {
 HDC hdc;
 PAINTSTRUCT ps;
 MEMWINFO minfo;
 RECT r;
 int oldmode;
#ifndef WIN32
 DWORD oldworg,oldvorg;
#else
 POINT oldworg,oldvorg;
#endif
 hdc=BeginPaint(w,&ps);
 GetClientRect(w,&r);
 Vget_info(w,&minfo);
 if (minfo.map==NULL || minfo.map->dc==0)
 {
 ReleaseDC(w,hdc);
 EndPaint(w,&ps);
 return;
 }
/* Set up DC the way we like it */
 oldmode=SetMapMode(minfo.map->dc,MM_TEXT);
#ifndef WIN32
 oldworg=SetWindowOrg(minfo.map->dc,0,0);
 oldvorg=SetViewportOrg(minfo.map->dc,0,0);
#else
 SetWindowOrgEx(minfo.map->dc,0,0,&oldworg);
 SetViewportOrgEx(minfo.map->dc,0,0,&oldvorg);
#endif
/* Do something different for scale window */
 if (minfo.flags&V_SCALE)
 {
 int oldmode;
 oldmode=SetStretchBltMode(hdc,minfo.map->stretch_mode);
 StretchBlt(hdc,0,0,r.right-r.left,r.bottom-r.top,
 minfo.map->dc,minfo.xoff,minfo.yoff,
 minfo.width,minfo.height,SRCCOPY);
 SetStretchBltMode(hdc,oldmode);
 }
 else
 {
/* if VMAP doesn't entirely cover window, clear first */
 if (r.right-r.left>minfo.width-minfo.xoff ||
 r.bottom-r.top>minfo.height-minfo.yoff)
 {
 HBRUSH brush;
 brush=CreateSolidBrush(GetBkColor(minfo.map->dc));
 brush=SelectObject(hdc,brush);
 /* erase "under" bitmap */

 PatBlt(hdc,0,minfo.height,
 r.right-r.left,r.bottom-r.top,PATCOPY);
 /* erase to "right" of bitmap */
 PatBlt(hdc,minfo.width,0,
 r.right-r.left,r.bottom-r.top,PATCOPY);
 DeleteObject(SelectObject(hdc,brush));
 }
/* Draw it */
 BitBlt(hdc,0,0,minfo.width-minfo.xoff, minfo.height-minfo.yoff,
 minfo.map->dc, minfo.xoff,minfo.yoff,SRCCOPY);
 }
 SetMapMode(minfo.map->dc,oldmode);
#ifndef WIN32
 SetWindowOrg(minfo.map->dc,LOWORD(oldworg), HIWORD(oldworg));
 SetViewportOrg(minfo.map->dc,LOWORD(oldvorg), HIWORD(oldvorg));
#else
 SetWindowOrgEx(minfo.map->dc,oldworg.x,oldworg.y,NULL);
 SetViewportOrgEx(minfo.map->dc,oldvorg.x,oldvorg.y,NULL);
#endif
 ReleaseDC(w,hdc);
 EndPaint(w,&ps);
 }

End Listings


Special Issue, 1993
Faking DDE with Private Servers


An alternative to the protocol from hell




Joseph M. Newcomer


Dr. Joseph M. Newcomer received his PhD in the area of compiler optimization
from Carnegie Mellon University in 1975. He can be contacted at 610 Kirtland
St., Pittsburgh, PA 15208.


Client/server architecture is an elegant solution to a number of
application-design problems. In Windows, Microsoft provides a protocol for
doing client/server systems called dynamic data exchange (DDE), which is
bundled with the 3.1 SDK. DDE is a complex protocol, as evidenced by the
Microsoft Systems Journal article that referred to it as "the protocol from
hell." Even the DDEML library doesn't make it easy to use or understand. Part
of the complexity comes from the generality: This protocol is designed to
allow clients, potentially on multiple machines, to talk to a central server.
OLE, on the other hand, is a very sophisticated protocol built on top of DDE.
Like DDE, it provides for having multiple servers (perhaps on multiple
machines), but handles the details of initiating the servers. The techniques
described here are used for intratask servers. I don't deal with linked or
embedded objects, just with distributed control.
However, on one project, I needed something much simpler than the fully
general case--I needed a client/server architecture within the application
itself. The application consisted of several modeless dialog boxes that had
interacting state and had to maintain a consistent state. The number of dialog
boxes present would vary. As each dialog box started up, it needed to
determine the distributed state that affects its display, then either track
the changing state or inform other boxes of the state it had changed.
In short, the application popped up a number of modeless dialog-box windows,
each of which provided for database access under a different control scenario.
(For the sake of example, I'll consider only a few controls.) Each window had
a set of controls labeled << (go backward one record), >> (go forward one
record), and First (go to top of database). These controls had to be enabled
only when they were valid:
When the database wasn't open, no control was valid.
When EOF was hit, >> was to be disabled.
When BOF was hit, << and First were to be disabled.
When not at BOF or EOF, all motion buttons should be enabled.
Additionally, each window had a static text-display box which displayed the
current record number that needed to be updated. Finally, some boxes displayed
the contents of the actual record being examined, and one of the boxes allowed
the user to make changes in this record.
Trying to keep track of who got notified by what means, when the buttons were
updated, and so forth, was overly complex. Each new modeless dialog box
required notification when the file status changed so it could be updated. It
wasn't acceptable to update the windows only when they received the input
focus, because the windows were small and could be displayed simultaneously;
having one display one form of information and another display different
information would be misleading to the user. Since the application was
intended to be a single-user, single-application, single-instance server, I
didn't worry about database locking (although it's easy to add).
Although Windows is multitasking, I've decided as a matter of programming
practice that life is too short for local/global allocation, small/medium
models, and other incidental distractions; I program exclusively in the large
model. Because certain C compilers do not allow two instances of the same
large-model program to run under Windows, I was able to avoid locking, knowing
that only a single instance of the program could be active at any time. (If
I'd chosen the small or medium model and wanted to avoid locking, I could have
used the hPrevInstance parameter to WinMain to determine whether an instance
was already running, informing the user accordingly.)
Finally, the program was a single-application environment; that is, it was the
only program that can or will access the data. Although I use a commercially
available database format, I assumed that end users wouldn't actually own the
database engine. Still, I used the OF_SHARE_EXCLUSIVE flag to add a single,
global, file-lock operation when the file is opened to ensure that no two
applications can access the data.
What I ended up building was what I call a "private server," which is
implemented as a window that accepts messages I send it and returns values to
the sender. Initially, the window is created and made invisible. It accepts
messages such as DB_OPEN, DB_GET_NEXT_RECORD, and DB_CLOSE, which perform
obvious operations. It also accepts messages to insert, delete, and modify
database records. Field modification is done by sending a message such as
DB_WRITE_FIELD, in which wParam is the field index and lParam is an LPSTR to
the text to be put into the field. The details don't matter much, and you can
apply this technique to any database or other application library. A nice
feature of this technique is that the database engine is now separated from
the application by this server, allowing you to use it with just about any
engine.
The model I am using here is "centralized knowledge/distributed control." The
"server" has all the "knowledge" about the state of the database, but has no
"control." The many modeless dialog boxes have no knowledge of the state of
the database, but are the initiators of actions (and hence have control). The
problem arises because these distributed control points, the "clients," in
fact need to have knowledge of the database (such as the current record
number, what operations are valid, the actual data record, and so forth). The
way they obtain this information is that the server distributes its knowledge
to the clients when it knows the information has changed. The clients do not
need to take responsibility for asking for information; they are told when
things have changed. This is analogous to the "polling vs. interrupt" paradigm
we have used for years: The peripheral device knows its state, and
occasionally needs to inform the processor about it.
An interesting dual is that this is also a system that implements "distributed
knowledge/centralized control." Each of the client windows possesses knowledge
about its own state, and uses the server as a control mechanism for
distributing this knowledge to other interested windows. In addition, each
client window that needs knowledge of distributed state (such as the
visibility of another client window) uses the server as a control mechanism
for making queries about this distributed state.


The Event Registry


The real power comes from the notions of an event registry and of event
notification. Any modeless dialog window I pop up knows how to find the
database window (dbwind); once it does, it sends a message in its
WM_INITDIALOG handler registering itself: SendMessage(dbwind, DB_REGISTER,
hDlg, 0L). This causes the database window to add the window handle passed as
wParam to its registry. Certain important events will now notify the
registered windows by sending them messages. When a modeless dialog box is
destroyed, it sends, as part of its WM_DESTROY handler, a DB_UNREGISTER
message to ask the registry to remove it. The implementation of the
DB_REGISTER and DB_UNREGISTER handlers is trivial; see Listing Four (page 44).
I could have caused the database window to broadcast to all windows using the
HWND_BROADCAST designator for the destination window (0xFFFF). However, this
meant that every window in the system would receive these messages. Although
the WM_USER identifier is supposed to be the user-defined message base, both
existing Windows classes (such as ListBox) and other user-defined classes in
applications unknown to me would be using numbers in the WM_USER and higher
range. Therefore, I would have to use RegisterWindowMessage to obtain
guaranteed-unique identifiers for all the messages (there is no way to
unregister them; they persist for the Windows session). This would
unnecessarily complicate what was already turning into a significant effort.
Therefore, I decided to force each of the modeless dialog boxes to register
their desire instead.


Creating the Server


For the server window, I wanted to use a modeless dialog box but needed to
return FAR pointers, 32-bit record values, and other 32-bit values as results.
A dialog box normally returns True, False, or a specific type of value such as
a brush handle (that is, for WM_CTLCOLOR messages). Returning anything else in
the return statement for any other message normally causes the dialog box to
truncate the value to a 16-bit BOOL.
There are three alternatives:
Don't use the CreateDialog family of API calls and "roll it by hand."
Use the technique from Petzold for his HEXCALC program (Petzold chapter 10, p.
454ff) and live with 16-bit results.
Use the undocumented SetDlgMsgResult macro in windowsx.h.
I combined the last two: the HEXCALC technique plus SetDlgMsgResult yields a
dialog window with a 32-bit result. The SetDlgMsgResult macro uses
SetWindowLong to set the DWL_MSGRESULT word to a 32-bit result.
Each dialog window has a 32-bit value--the DWL_USER long word--that can be
accessed by GetWindowLong and SetWindowLong. During the WM_INITDIALOG
processing, I create dbfile, a data structure, and store a pointer to it in
this word. Subsequently, I set the local variable db to reference this value
each time the procedure is activated. (Before WM_INITDIALOG processing, this
pointer is NULL but is not used.)
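Both per-window long words can be modeled without Windows. In the sketch below, the struct members stand in for the DWL_MSGRESULT and DWL_USER slots; all names are illustrative, not the real API.

```c
#include <stdint.h>

/* A Windows-free model of a dialog's two per-window long words. */
typedef struct {
    long     msg_result;  /* stands in for the DWL_MSGRESULT word        */
    intptr_t user_data;   /* stands in for the DWL_USER "self" slot      */
} FakeDialog;

typedef struct { long recno; } DbFile;  /* per-window instance data */

/* A dialog proc can only return a BOOL, so a 32-bit answer is stored in
   the result slot (SetWindowLong(hDlg, DWL_MSGRESULT, value)) and the
   proc itself returns TRUE. */
int answer_query(FakeDialog *dlg, long value)
{
    dlg->msg_result = value;
    return 1;
}

/* WM_INITDIALOG-time setup: stash the "self" pointer in the user word. */
void init_dialog(FakeDialog *dlg, DbFile *db)
{
    db->recno = 0;
    dlg->user_data = (intptr_t)db;
}

/* Every later message recovers "self" from the window handle alone. */
DbFile *self_of(FakeDialog *dlg)
{
    return (DbFile *)dlg->user_data;
}
```

This is the whole trick behind multiple independent server windows: the window handle is the only thing a client ever passes around.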
I use an object-oriented programming technique here: the DWL_USER word is the
equivalent of "self," so I only need to pass around the window handle to have
access to the "self" pointer; I can also have multiple windows, each with its
own private data pointer. By using a pointer to objects in your own heap, you
avoid using "window extra" words, which consume precious USER heap. For modal
dialog boxes, you can even have the data object referenced by this pointer on
the stack. An example of the creation and deletion of these objects is in the
WM_INITDIALOG and WM_DESTROY handlers in Listing Two (page 43).
The modeless dialog box associated with the server would normally not be seen
by the end user; therefore, in specifying it to the dialog editor I remove the
check mark from the Visible box, which removes the WS_VISIBLE property from
the resource declaration. Figure 1 is an example of my server window.


The Server-window Procedure


The basic data structures for the server are shown in Listing One (page 43).
The registry is a simple, doubly linked list structure. The dbfile structure
holds all of the critical state information about the database file: filename
(which is shown), an object used by the underlying database engine (which I
give the placeholder DATABASE_APPLICATION_THING dbref), and similar
information. I omit most of these because the use of this technique to
construct a database-server window isn't as important as the details of the
generic "server window" architecture, which could be used for purposes other
than a database server.
When a WM_CLOSE message comes in, I request that the window be destroyed. I
could also interact with the user for confirmation before calling
DestroyWindow, but this is handled by the top-level window in my app.

Finally, WM_DESTROY sends a DB_CLOSE message to close the database file, frees
up any memory or resources in use by the dbfile object, and finally frees the
dbfile object itself, setting the reference in the dialog window to NULL just
for completeness.
WM_COMMAND reports the events from the child controls. The only control I've
illustrated here is the event from the Hide button (IFS_HIDE in Listing Two),
which sends a message to the window telling it to hide itself.
The remaining messages are DB_ messages that implement the actual operations
of the private server. A DB_ message is sent to the window by one of its
clients. The server can either return a 32-bit value as its result or, if it
must return longer information (such as a string), it can accept in lParam a
pointer to the location of the result.
As an example, the message DB_GET_RECORD_NUMBER queries the database library
for the current record number (via get_recno) and returns the 32-bit record
number as the result of the SendMessage. DB_GET_FIELD takes a field number in
wParam and a destination in lParam. Because the database has a limit of 256
bytes for a field, this code will handle the maximum field width, setting the
value of the field into the control ID_DATA in one of the client dialog
windows.


The Event Registry Redux


The real heart of the server is the event registry, a doubly linked list of
windows which would like to receive notification events when various
"interesting" things happen in the server. For example, the event of hitting
end-of-file means that the Next Record button must be disabled or, if the file
is closed, all the record-movement buttons must be disabled.
Each window to receive notification registers its intent with the server by
sending the DB_REGISTER message. When the window is destroyed, it sends a
DB_UNREGISTER message. The handlers for these are trivial; see Listing Three
(page 43). The code in each dialog window is shown in Listing Five (page 44).
A notification message can be sent in several ways. Normally, whenever
something "interesting" happens in the server, a notification is sent. This is
a DBN_NOTIFICATION message, and it is sent to all registered windows. Each
DBN_NOTIFICATION message reports the state of the database, using the protocol
shown in Listing Six (page 44).
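The lParam packing described in Listing Six can be checked in isolation. The macros below follow the windows.h definitions of MAKELONG, LOWORD, and HIWORD; the 16-bit WORD standing in for the window handle reflects Win16, where an HWND fits in the low word. The DB_NOTIFY_EOF value mirrors Listing Six.

```c
/* Win16-style packing macros, following the windows.h definitions. */
typedef unsigned long  DWORD;
typedef unsigned short WORD;
#define MAKELONG(lo, hi) ((DWORD)(((WORD)(lo)) | (((DWORD)((WORD)(hi))) << 16)))
#define LOWORD(l)        ((WORD)(DWORD)(l))
#define HIWORD(l)        ((WORD)((((DWORD)(l)) >> 16) & 0xFFFF))

#define DB_NOTIFY_EOF 3  /* one of the Listing Six notification codes */

/* Build the lParam of a DBN_NOTIFICATION: LOWORD = server window handle
   (16 bits under Win16), HIWORD = notification code. */
DWORD make_notification(WORD server_hwnd, WORD code)
{
    return MAKELONG(server_hwnd, code);
}
```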
Whenever appropriate, any window can send a DB_NOTIFY message to the server,
and all registered windows will receive a set of DBN_NOTIFICATION messages
that accurately reflect the state of the database. The handler for this is
shown in Listing Seven (page 44). An event, particularly DB_NOTIFY, can create
more than one DBN_NOTIFICATION message. For example, if the file is positioned
in an "internal" record (not the first or last record) and that record is
deleted, DB_NOTIFY will send a DB_NOTIFY_MID (in middle of file) and
DB_NOTIFY_DELETED. The response of a receiving window is to enable the Next
Record, First Record, and Previous Record buttons, and set the text of the
Delete button to Undelete. For debugging purposes, I added a Notify button to
the server dialog window that forces it to send a DB_NOTIFY message to itself.
The DB_NOTIFY_ME message takes the window handle of the client as its wParam.
To handle the initialization of a new modeless dialog box that is a client of
the server, DB_NOTIFY_ME notifies that single window. This typically takes
place during the WM_INITDIALOG handler for a client.
Whenever a field is changed, a DBN_FIELD_CHANGE message is sent with
HIWORD(lParam) being the index of the changed field. This is used only when an
individual field--not an entire record--is changed, and its purpose is to
allow the clients to update their displays of that one particular field. If
all clients are to be kept up to date on a character-by-character basis as the
user types, you have to respond to EN_CHANGE or EN_UPDATE messages in the
client that has the window with the input focus.
DB_USER_NOTIFY handles the interaction of window state that is outside the
scope of the database and is defined instead by the application.
DB_USER_NOTIFY can take a 16-bit notification code as its wParam, and it sends
this to every registered window using the DBN_USER_NOTIFICATION message. For
example, in my application one window contains a pushbutton that toggles the
visibility of another window. If the target window is closed or minimized, the
window(s) containing a Show button must be notified, so the legend can be
changed from Hide to Show. However, the target window doesn't have to know
which window(s) actually can control it; as part of its response to a minimize
request, it sends a DB_USER_NOTIFY message with the appropriate code to
indicate it is closing, and all the windows in the registry are notified that
it has closed. They may modify their local controls appropriately; see Listing
Nine (page 44).
Finally, there's a protocol for passing state around. Any window whose state
may be interesting is expected to define a set of DBN_USER_NOTIFICATION
queries which will cause it to report its state (by means of DB_USER_NOTIFY
messages). Thus, a window which wants to know the state of the debug window
(wherever it is), knows that if it sends out a DB_USER_NOTIFY request with the
code UDB_QUERY_DEBUG_STATE it will eventually receive one (possibly a series)
of DBN_USER_NOTIFICATION messages revealing the debug state. These might be
defined as telling if the window is hidden or visible, if single stepping is
on, or whatever is appropriate. What makes this interesting is that the window
making the query only needs to know the query code and the server window
handle; it doesn't need to know the window handle of the debug window. The
debug window does not need to know who placed the query; it responds with a
DB_USER_NOTIFY message to the server, which undertakes to deliver the
notification to all registered windows. Once a window has defined the queries
it will respond to, it does not need to know which window has made the query.
An interesting architectural feature is that you can later reimplement the
"debug window" as three windows, and none of the clients need to know that the
reimplementation has taken place.
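As a sketch of this anonymous query protocol, the fragment below replaces windows with plain function pointers: the "bus" plays the server's registry, the query and response codes are invented, and neither the asker nor the responder ever learns the other's identity.

```c
#define MAX_CLIENTS 8
#define UDB_QUERY_DEBUG_STATE 40  /* illustrative code numbers */
#define UDB_DEBUG_VISIBLE     41

typedef struct Bus Bus;
typedef void (*Client)(Bus *bus, int code, void *self);

struct Bus {                      /* the server's registry */
    Client handler[MAX_CLIENTS];
    void  *state[MAX_CLIENTS];
    int    count;
};

void bus_register(Bus *b, Client c, void *state)
{
    if (b->count >= MAX_CLIENTS) return;
    b->handler[b->count] = c;
    b->state[b->count] = state;
    b->count++;
}

/* DB_USER_NOTIFY: fan a code out to every registered client. */
void bus_notify(Bus *b, int code)
{
    int i;
    for (i = 0; i < b->count; i++)
        b->handler[i](b, code, b->state[i]);
}

/* The "debug window": answers the query by broadcasting its state. */
void debug_window(Bus *bus, int code, void *self)
{
    (void)self;
    if (code == UDB_QUERY_DEBUG_STATE)
        bus_notify(bus, UDB_DEBUG_VISIBLE);
}

/* A client that only listens for the answer. */
void listener(Bus *bus, int code, void *self)
{
    (void)bus;
    if (code == UDB_DEBUG_VISIBLE)
        *(int *)self = 1;         /* remember: the debug window is visible */
}
```

Because everything funnels through the registry, the "debug window" could later be split into three windows and the listener would never know.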


Window Visibility


The message DB_SHOW_STATUS takes SW_SHOW or SW_HIDE as a wParam. This causes
the window to be displayed or hidden. (SW_SHOW opens iconic windows.) The
message DB_GET_SHOW_STATUS returns SW_SHOW or SW_HIDE to indicate the current
state and is typically used in the main-menu processing to set the check mark
properly during WM_INITMENUPOPUP processing, as shown in Listing Ten (page
44). The DB_GET_SHOW_STATUS handler is in Listing Eight (page 44).
The visibility of the server window was a bit tricky. The server window is a
child window of the main window. No Windows protocol says, "Make this window
the topmost window as long as it has the focus, then when it loses the focus,
revert its topmostness to whichever window was topmost before." Without the
ability to make this window the topmost, it can be partially obscured by, or
even totally hidden behind, some other window which is already the topmost
window. Additionally, if it is topmost--but doesn't have the focus--it can
obscure other important windows such as a file-open dialog box.
To handle this, I adopted a protocol for my application whereby any window
that wishes to be topmost must handle WM_SETFOCUS messages and make itself the
topmost. It does not have to worry about determining which window was topmost.
This is probably not optimal, but is the best I could do without the
additional complication of doing EnumWindows. What I really want is a
task-topmost window.
To make the window the topmost window, I use the handler in Listing Eleven
(page 44). The SWP_NOACTIVATE flag is important; otherwise, the SetWindowPos
will (as an additional side effect) cause focus to return to the window,
resulting in the window again being forced topmost.
In addition, because focus always returns to the server window, it becomes
impossible to gain control of any other window, including the main window, so
the menu bar becomes inaccessible! Unfortunately, Windows does not appear to
have any way to force a clean "close" on such a runaway process; the only way
to regain control is to use Ctrl-Alt-Del and force task termination, which
can leave the USER and GDI heaps cluttered with unrecoverable objects and
reference counts on DLLs artificially high. Eventually, you will have to
restart Windows or reboot the machine to get into a clean state.


Field Changes


I've used several dBase III/IV compatible engines, the Paradox Engine, and a
number of "proprietary" engines. One common feature is the ability to
represent a field by a small integer and name, and map between the two
representations. I chose to use the field number in all operations on fields
because it is a compact representation (although I use 16 bits, most databases
are limited to under 256 unique fields so an 8-bit number is actually
sufficient). Fields are handled by the following messages:
DB_GET_NFIELDS returns the number of fields in the database, numbered 1
through the number returned.
DB_TYPE_BY_NUMBER returns a character code indicating the field type given the
field number.
DB_WIDTH_BY_NUMBER returns the field width, in characters, given the field
number.
DB_GET_FIELD takes a field number as wParam and a pointer to a buffer as
lParam, and copies the contents of the field into the buffer.
DB_SET_FIELD takes a field number as wParam and a pointer to a buffer as
lParam, and copies the text of the buffer into the field. It also sends out a
DBN_FIELD_CHANGE message to all registered users.
DB_NAME_FROM_NUMBER takes a field number as wParam and a pointer to a buffer
as lParam, and copies the name of the field into the buffer.
DB_FIELDNO_BY_NAME takes a pointer to a field name as lParam and returns as
its value the field number.
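The number/name mapping behind these messages is small enough to sketch directly. The three-field schema below is invented; a real engine supplies these tables itself.

```c
#include <string.h>

/* A toy schema: field numbers are 1-based, as in the article. */
static const char *field_names[] = { "NAME", "ADDRESS", "CITY" };
#define NFIELDS (sizeof field_names / sizeof field_names[0])

/* DB_GET_NFIELDS */
int get_nfields(void) { return (int)NFIELDS; }

/* DB_NAME_FROM_NUMBER: copy the name of field n (1..NFIELDS) into buf. */
int name_from_number(int n, char *buf)
{
    if (n < 1 || n > (int)NFIELDS) return 0;
    strcpy(buf, field_names[n - 1]);
    return 1;
}

/* DB_FIELDNO_BY_NAME: return the 1-based field number, or 0 if unknown. */
int fieldno_by_name(const char *name)
{
    int i;
    for (i = 0; i < (int)NFIELDS; i++)
        if (strcmp(field_names[i], name) == 0)
            return i + 1;
    return 0;
}
```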
There are variations, of course. For example, DB_SET_FIELD_DATE takes a
canonical 32-bit date representation as lParam and sets this in the date field
in the appropriate form. (This isolates the application from the date
representation used by the underlying engine.) DB_SET_FIELD_LOGICAL takes a
Boolean value in lParam and sets a Boolean field value to the code appropriate
for the database (in dBase IV, this is T or F). DB_SET_FIELD_LONG takes a
32-bit integer as lParam and stores it in an integer field. (In dBase IV, this
is a right-justified text string, but in other engines it could simply be a
copy of the lParam value.) The corresponding DB_GET_ operations perform the
inverse mapping, so DB_GET_FIELD_LONG returns as its value a 32-bit integer,
DB_GET_FIELD_LOGICAL returns a BOOL, and so on.
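The dBase-style conversions behind DB_SET_FIELD_LOGICAL and DB_SET_FIELD_LONG can be sketched as below. The helper names are invented; the T/F characters and the right-justified numeric text follow the article's description of dBase IV and would differ for other engines.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* DB_SET_FIELD_LOGICAL: dBase stores logicals as the characters T or F. */
void set_field_logical(char *field, int value)
{
    field[0] = value ? 'T' : 'F';
    field[1] = '\0';
}

/* DB_SET_FIELD_LONG: dBase stores numerics as right-justified text of the
   declared field width. */
void set_field_long(char *field, int width, long value)
{
    sprintf(field, "%*ld", width, value);
}

/* DB_GET_FIELD_LONG: the inverse mapping back to a 32-bit integer. */
long get_field_long(const char *field)
{
    return strtol(field, NULL, 10);
}
```

Keeping these conversions inside the server is what isolates every client from the engine's storage formats.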
The handling of these messages is illustrated in Listing Two. For example, the
db structure is assumed to have an array of field-width values in this sketch.
Note that this code does not check that wParam is valid; in the actual code,
the database engine provides this service for me. Operations such as CopyField
and get_recno are other illustrations of calls to the underlying database
engine.


The System Menu


The dialog box is of a fixed size; resizing is not supposed to be permitted.
However, there is a system-menu box associated with this window, and it's
necessary to disable the SC_SIZE menu option. Also, I didn't want to allow the
user to actually close the window, so I had to disable the SC_CLOSE option
too. The system-menu handling is shown in the server's pop-up menu handler in
Listing Twelve (page 44).


Record Handling


Records are accessed by sequential or index order. DB_SKIP takes a signed
lParam value indicating the number of records forward (positive) or backward
(negative) to move; for an indexed database, a skip of 1 moves to the next
record as determined by the index (next logical record), and for an unindexed
database it moves to the next physical record. DB_REWIND and DB_GOTO_EOF
position the record pointer just before the first, or just after the last
physical or logical record. An arbitrary record can be found by using DB_SEEK,
whose lParam is a pointer to a text key. I plan to add DB_SEEK_LONG and
DB_SEEK_DATE for those (for me) rare cases in which a numeric or date value is
used. This application required only a text-key seek.
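The positioning rules reduce to a cursor with two sentinel positions. In this sketch (names invented), position 0 is "just before the first record" and nrecs+1 is "just after the last," matching DB_REWIND and DB_GOTO_EOF; DB_SKIP clamps at those sentinels.

```c
/* Record-pointer model: positions 1..nrecs are records; 0 is the BOF
   sentinel (DB_REWIND) and nrecs+1 the EOF sentinel (DB_GOTO_EOF). */
typedef struct { long pos, nrecs; } Cursor;

void db_rewind(Cursor *c)   { c->pos = 0; }
void db_goto_eof(Cursor *c) { c->pos = c->nrecs + 1; }

/* DB_SKIP: move by a signed count, clamping at the sentinels. */
long db_skip(Cursor *c, long delta)
{
    c->pos += delta;
    if (c->pos < 0) c->pos = 0;
    if (c->pos > c->nrecs + 1) c->pos = c->nrecs + 1;
    return c->pos;
}
```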
One problem with the distributed control paradigm is that it's sometimes
necessary for a client modeless dialog to do some processing if the active
record position is about to be changed; for example, to make sure the values
in edit boxes on the screen are properly copied to the record. In addition, it
may not be permissible to change the record position (for example, if an edit
is underway but a constraint is not met). A typical case might be that an
illegal date or illegal number has been typed into an edit control.
Consequently, all client windows honor the DBN_QUERY_CHANGE message, which
asks, "Are you willing to let the server move to another record?" If a window
doesn't care, or successfully performs the operations it wishes to perform, it
returns True; if it doesn't wish the current record to move, it returns False.
In the latter case, DB_SEEK, the DB_GOTO_ operations, or DB_SKIP return an
error code
indicating that some other window prevented the operation from taking place.
If the operation is permitted, the record pointer is changed. After the
successful completion, a DBN_NOTIFICATION message is sent to all registered
clients indicating the position of the file: beginning, middle, or end. This
allows the client windows to enable or disable their "forward" or "backward"
controls. A DBN_POS message is sent out to indicate the current record number.
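The veto protocol can be sketched with function pointers standing in for the registered windows' DBN_QUERY_CHANGE handlers (all names invented): any single refusal cancels the move, and only after unanimous consent does the record pointer change.

```c
#define MAX_REG 8

/* A client's answer to DBN_QUERY_CHANGE: nonzero means "OK to move". */
typedef int (*QueryChange)(void *self);

typedef struct {
    QueryChange query[MAX_REG];
    void       *self[MAX_REG];
    int         count;
} Registry;

/* Before moving the record pointer, ask every registered client; any
   single refusal vetoes the move, as for DB_SEEK and DB_SKIP. */
int try_move(Registry *r, long *recno, long target)
{
    int i;
    for (i = 0; i < r->count; i++)
        if (!r->query[i](r->self[i]))
            return 0;             /* some window prevented the operation */
    *recno = target;              /* permitted: move, then notify */
    return 1;
}

int always_yes(void *self)   { (void)self; return 1; }
int edit_pending(void *self) { return *(int *)self == 0; } /* veto mid-edit */
```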
To add a record, DB_APPEND_BLANK is sent to the server. This first sends out a
DBN_QUERY_CHANGE message to all the clients, since it requires moving from the
current record. If successful, it appends a blank record, then sends out a
DBN_NRECS message to tell the clients to update the number of records in the
file (if they care), and a DBN_NOTIFICATION to indicate the file position.
A record can be deleted by marking it as a "deleted record." Later, an
operation such as "pack" will physically remove all deleted records. (Some
database engines actually delete the record when the delete operation occurs,
or make it impossible to see a record marked as "deleted;" for these engines,
the notion of hitting a deleted record is irrelevant.) When a record is
selected, a DBN_NOTIFICATION message is sent out indicating whether the record
is deleted or not. In addition, the "delete" and "undelete" operations will
send out notifications of the record's status.
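The mark-then-pack scheme reduces to one flag per record. A minimal sketch with invented names:

```c
#define NRECS 5

typedef struct {
    int  deleted[NRECS];  /* 1 = marked "deleted" but still present */
    long data[NRECS];
    int  count;
} Table;

/* Delete/undelete only toggle the mark; the record stays in the file. */
void mark_deleted(Table *t, int i)   { t->deleted[i] = 1; }
void mark_undeleted(Table *t, int i) { t->deleted[i] = 0; }

/* "Pack": physically remove every record still marked deleted. */
void pack(Table *t)
{
    int i, j = 0;
    for (i = 0; i < t->count; i++)
        if (!t->deleted[i]) {
            t->data[j] = t->data[i];
            t->deleted[j] = 0;
            j++;
        }
    t->count = j;
}
```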



File Status


Clients need to be notified about overall file status; for example, the
database file may be closed at the time a client connects to the server, and
notification about its opening is important. Similarly, if the file is open,
notification about its closing is important. Finally, the transition from an
empty to a nonempty file or vice versa turned out (in my application) to be
important. All of these conditions generate DBN_NOTIFICATION messages.


Summary


Like any serious Windows application, this one required a lot of attention to
details. What started out as a trivial exercise to avoid the "more complex"
DDE protocol turned out to require more careful design than I had originally
thought. It certainly required a great deal more code.
However, it was quite successful; I ended up with an architecture of
centralized knowledge and distributed control, and the database server
implemented as an active object. Since the implementation, I've added four new
view windows to the application.
My next goal is to convert it to a DLL so I can use it with other
applications. I'll also probably extend it to support better file and record
locking so it can be used for shared database applications.
 Figure 1: Sample server window.
[LISTING ONE] (Text begins on page 37.)
typedef struct dbregistry {
 struct dbregistry * next;
 struct dbregistry * prev;
 HWND target;
 } dbregistry;
typedef struct dbfile {
 char dbname[_MAX_PATH];
 DATABASE_APPLICATION_THING dbref;
 /* ... other server-specific fields, mostly for debugging, go here... */
 LONG count; /* number of messages processed */
 /* ... end of server-specific fields */
 dbregistry * registry; /* notifications */
 } dbfile;

[LISTING TWO]

LONG FAR PASCAL IFS_WndProc(HWND hDBwnd, unsigned message, WPARAM
 wParam, LPARAM lParam)
 {
 dbfile * db = (dbfile *)GetWindowLong(hDBwnd, DWL_USER);
 switch(message)
 { /* message */
 case WM_INITDIALOG:
 /* ... Child control initialization goes here ... */
 db = calloc(1,sizeof(dbfile));
 SetWindowLong(hDBwnd, DWL_USER, (LONG) db);
 break;
 case WM_CLOSE:

 DestroyWindow(hDBwnd);
 break;
 case WM_DESTROY:
 SendMessage(hDBwnd, DB_CLOSE, 0, 0L);
 if(db->buffer != NULL)
 free(db->buffer);
 free(db);
 SetWindowLong(hDBwnd, DWL_USER, 0L);
 break;
 case WM_COMMAND:
 switch(wParam)
 { /* wParam */
 case IFS_HIDE:
 SendMessage(hDBwnd, DB_SHOW_STATUS, SW_HIDE, 0L);
 break;

 } /* wParam */
 break;
 case DB_GET_RECORD_NUMBER:
 return get_recno(db, (LONG *)lParam);
 case DB_FIELD_WIDTH:
 return (LONG) db->fieldwidth[wParam];
 case DB_GET_FIELD:
 CopyField((LPSTR)lParam, wParam);
 return lParam;

[LISTING THREE]

/****** RegisterHandle ************************************************
* Inputs: dbfile * db: server object -- HWND wnd: Window to register
* Result: BOOL -- TRUE if success, FALSE if error
* Effect: Adds the handle to the registry
************************************************************************/
static BOOL RegisterHandle(dbfile * db, HWND wnd)
 {
 dbregistry * r;
 for(r = db->registry; r != NULL; r = r->next)
 { /* scan for membership */

 if(r->target == wnd)
 return TRUE; /* already registered */
 } /* scan for membership */
 r = calloc(1, sizeof(dbregistry));
 if(r == NULL)
 return FALSE;
 if(db->registry == NULL)
 db->registry = r;
 else
 { /* link on front */
 r->next = db->registry;
 r->next->prev = r;
 db->registry = r;
 } /* link on front */
 return TRUE;
 }
/******** UnregisterHandle ************************************************
* Inputs: dbfile * db: server object -- HWND wnd: Window reference
* Result: LONG -- SUCCESS if ok (currently it always succeeds)
* Effect: Removes the registry entry for the given handle.
* Notes: If the handle does not exist, this is acceptable
****************************************************************************/
static LONG UnregisterHandle(dbfile * db, HWND wnd)
 {
 dbregistry * r;
 for(r = db->registry; r != NULL; r = r->next)
 { /* search */
 if(r->target == wnd)
 { /* found it */
 if(r->next != NULL)
 r->next->prev = r->prev;
 if(r->prev != NULL)
 r->prev->next = r->next;
 else
 db->registry = r->next;
 free(r);

 return SUCCESS;
 } /* found it */
 } /* search */
 /* Didn't find it. May want to return a code other than SUCCESS code */
 return SUCCESS;
 }

[LISTING FOUR]

case DB_REGISTER:
 return RegisterHandle(db,wParam);
case DB_UNREGISTER:
 return UnregisterHandle(db,wParam);

[LISTING FIVE]

case WM_INITDIALOG:
 SendMessage(server, DB_REGISTER, hDlg, 0L);
 ...
 break;
case WM_DESTROY:
 SendMessage(server, DB_UNREGISTER, hDlg, 0L);
 ...
 break;

[LISTING SIX]

message: DBN_NOTIFICATION or DBN_USER_NOTIFICATION
wParam: control id of database server in parent window
lParam: LOWORD: window handle of database server window
 HIWORD: DBN_NOTIFICATION: one of the DB_NOTIFY_ codes, below
 DBN_USER_NOTIFICATION: wParam of the DB_USER_NOTIFY
#define DB_NOTIFY_ERROR 0 /* error occurred */
#define DB_NOTIFY_EMPTY 1 /* file is empty */
#define DB_NOTIFY_BOF 2 /* file is at BOF */
#define DB_NOTIFY_EOF 3 /* file is at EOF */
#define DB_NOTIFY_MID 4 /* file is not empty, not at BOF, not at EOF */
#define DB_NOTIFY_DELETED 5 /* current record is deleted */
#define DB_NOTIFY_OPEN 6 /* file is opened */
#define DB_NOTIFY_CLOSE 7 /* file is closed */
#define DB_NOTIFY_POSITION 8 /* position has changed */
#define DB_NOTIFY_UNDELETED 9 /* current record is undeleted */

#define DB_NOTIFY_NONEMPTY 10 /* file is now nonempty */

[LISTING SEVEN]

/******** NotifyRegistry ************************************************
* Inputs: HWND hDBWnd: Our window -- WORD code: Code to include in message
* WORD msg: DBN_NOTIFICATION or DBN_USER_NOTIFICATION
* Result: void
* Effect: Sends a notification message to each of the registered windows
* Notes: message: DBN_NOTIFICATION or DBN_USER_NOTIFICATION
* wParam: control id of database server in parent window
* lParam: LOWORD: window handle of database server window
* HIWORD: DBN_NOTIFICATION: one of the DB_NOTIFY_ codes, below
* DBN_USER_NOTIFICATION: wParam of the DB_USER_NOTIFY

****************************************************************************/

static void NotifyRegistry(HWND hDBwnd, WORD msg, WORD code)
 {
 dbregistry * r;
 dbfile * db = (dbfile *)GetWindowLong(hDBwnd, DWL_USER);

 for(r = db->registry; r != NULL; r = r->next)
 { /* notify each */
 SendMessage(r->target, msg, GetWindowWord(hDBwnd, GWW_ID),
 MAKELONG(hDBwnd, code));
 } /* notify each */
 }
 /* ... */
 case DB_NOTIFY:
 NotifyRegistry

[LISTING EIGHT]

case DB_GET_SHOW_STATUS:
 if(IsWindowVisible(hDBwnd))
 { /* visible */
 if(IsIconic(hDBwnd))
 return SW_HIDE;
 else
 return SW_SHOW;
 } /* visible */
 else
 { /* invisible */
 return SW_HIDE;
 } /* invisible */

[LISTING NINE]

#define UWM_HIDE_EDITOR (WM_USER+217)
#define UWM_SHOW_EDITOR (WM_USER+218)

#define UDBN_HIDING_EDITOR 27
#define UDBN_SHOWING_EDITOR 28

/** Editor window: receives message to hide itself, and notifies all clients
 that it has hidden itself. **/
 switch(message)
 { /* decode message */
 /* ... other messages handled here */
 case UWM_HIDE_EDITOR:

 ShowWindow(hDlg, SW_HIDE);
 SendMessage(server, DB_USER_NOTIFY, UDBN_HIDING_EDITOR, 0L);
 break;
 case UWM_SHOW_EDITOR:
 ShowWindow(hDlg, SW_SHOW);
 SendMessage(server, DB_USER_NOTIFY, UDBN_SHOWING_EDITOR, 0L);
 break;
 } /* decode message */
/** Client window which has editor hide/show button: update button text to
 indicate what will take place when the button is clicked. **/
 switch(message)
 { /* decode message */
 /* ... other messages handled here */


 case DBN_USER_NOTIFICATION:
 switch(HIWORD(lParam))
 { /* What notification? */
 case UDBN_HIDING_EDITOR:
 SetDlgItemText(hDlg, ID_EDITOR, "Show");
 break;
 case UDBN_SHOWING_EDITOR:
 SetDlgItemText(hDlg, ID_EDITOR, "Hide");
 break;
 } /* What notification? */
 } /* decode message */

[LISTING TEN]

 case WM_INITMENUPOPUP:
 CheckMenuItem(GetMenu(hWnd), IDM_SHOW_DB,
 SendMessage(server, DB_GET_SHOW_STATUS, 0, 0L)==SW_SHOW
 ? MF_CHECKED
 : MF_UNCHECKED);
 /* ... other popup handling ... */
 return 0;

[LISTING ELEVEN]

 case WM_ACTIVATE:
 switch(wParam)
 { /* decode type */
 case WA_INACTIVE:
 SetWindowPos(hDBwnd, HWND_TOP, 0, 0, 0, 0,
 SWP_NOMOVE | SWP_NOSIZE | SWP_NOACTIVATE);
 break;
 default: /* WA_(CLICK)ACTIVE */
 SetWindowPos(hDBwnd, HWND_TOPMOST, 0, 0, 0, 0,
 SWP_NOMOVE | SWP_NOSIZE);
 break;
 } /* decode type */
 SetDlgMsgResult(hDBwnd, message, 0);
 return TRUE;

[LISTING TWELVE]

 case WM_INITMENUPOPUP:
 if(HIWORD(lParam))
 { /* system menu */
 HMENU sys;
 sys = GetSystemMenu(hDBwnd, FALSE);
 EnableMenuItem(sys, SC_SIZE, MF_GRAYED);
 EnableMenuItem(sys, SC_MAXIMIZE, MF_GRAYED);
 EnableMenuItem(sys, SC_CLOSE, MF_GRAYED);
 } /* system menu */

End Listings










Special Issue, 1993
Horizontally Scrollable Listboxes


Displaying strings of varying and unknown length




Ted Faison


Ted is a writer and developer, specializing in Windows and C++. He has
authored two books on C++. He is president of Faison Computing, a firm which
develops C++ class libraries for DOS and Windows. He can be contacted on
CompuServe at 76350,1013.


Windows listboxes are typically used to show lists of files, fonts, or other
variable-length lists of textual information. To add a listbox to a dialog
box, you generally edit a resource file using programs such as Microsoft's
Dialog Editor or Borland's Resource Workshop. Windows transparently handles
most of the listbox details. For example, if you add strings to a listbox,
Windows will automatically put a scroll bar on the control when the listbox
contains more strings than can be displayed in the client area. Windows also
handles scroll bar events without any need for application code.
Displaying a list of files in a listbox is generally easy because filenames
have a predefined maximum number of characters. But, if you plan to use a
listbox to display strings of varying and unknown length, such as the names of
people or the titles of your CD collection, creating a listbox wide enough to
accommodate the widest string you expect isn't always practical. An
alternative is to make the listbox wide enough to handle the average string,
using a horizontal scroll bar to scroll the client area when strings are too
long. One solution is to create a listbox using a tool such as Resource
Workshop, and to set the horizontal scroll bar property (or enable the
WS_HSCROLL style bit). Unfortunately, when you display the listbox in your
application, you'll find the horizontal scroll bar doesn't appear. The reason
is that Windows doesn't handle all aspects of horizontal scroll bars in
listboxes. It's up to the application to supply code to make horizontal scroll
bars work correctly.
In this article, I'll show how to use Borland C++ to create a listbox class
that manages all of the details necessary to scroll horizontally within a
listbox. In the process, I'll present a sorted container
class, derived from Borland's template-based BIDS library, that keeps track of
text extents. Finally, I'll present a sample application to take advantage of
the horizontal listbox class to copy long strings from one listbox control to
another.


Text Extents


When you use the WS_VSCROLL style with a listbox, Windows keeps track of the
number of items added to the control. Knowing the size of the listbox's font,
the number of items in the listbox, and the height of the listbox's client
area, Windows can determine when there are more items than will fit on the
control. When this condition is met, Windows adds the vertical scroll bar to
the listbox.
It's more complicated with horizontal scroll bars because the width of a
string is dependent on both the font and the length of the string. The width
of a string measured in pixels is called the "text extent." Windows doesn't
automatically keep track of the extents of strings inserted into listboxes.
That's where your application code comes into play. When you add a string to a
listbox, you must tell Windows the extent of the string. When you indicate an
extent that exceeds the width of the listbox's client area, Windows draws the
horizontal scroll bar. You tell Windows the extent of a string using
LB_SETHORIZONTALEXTENT. The wParam parameter indicates the extent of the
string.
When you add more than one string to a listbox, Windows only needs to know the
extent of the widest string. This means that your application has to keep
track of the extents of all the strings in a listbox. The management of text
extents takes a significant amount of effort using ordinary C code. The job is
considerably simpler using C++ containers and a Windows application framework
such as Borland's Object Windows Library (OWL).


Containers for Text Extents


In C++, containers can hold any kind of object including ints, chars,
pointers, and class objects. A simple container is not quite sufficient to
handle our text extents because we need to know the extent of the widest
string in the listbox. This requires a sorted container that stores the text
extents in ascending order. Each time a string is added to the listbox, its
extent is computed and added to the container. With a sorted container, the
greatest extent will always be the last item in the container.
The implementation of a sorted container restricts the type of objects that
can be put into the container. When a new item is added, the container needs
to compare the new item with those already in the container in order to locate
the correct insertion point. The comparison is achieved by calling the
operator< and operator== member functions for the item inserted. Because
member functions are invoked, only class objects can be used in sorted
containers.
Borland provides two container class libraries: the object container library,
which is easier to learn and use, but less efficient in terms of performance;
and a template-based BIDS (Borland International Data Structure) container
class library. In this article, I use BIDS. With BIDS containers, objects
inserted into a sorted container must have the member functions shown in
Figure 1. A small class called "Extent" to handle the text extents is shown
in Listing One, page 50. Having a class for the text extents, we can create a
template-based container to manage the extents. Using BIDS containers, the
name of the class we need is BI_SArrayAsVector. BI refers to Borland
International; S indicates a sorted array. To create a container called
MyContainer to hold Extent objects, use the notation
BI_SArrayAsVector<Extent> MyContainer;. Anytime you define a template-based
variable, the C++ compiler generates a completely new class from the template
and its argument type.
Before you can add an extent to the container, you need to determine the
extent for the string. TextExtent() (see Listing Two, page 50) computes the
extent. A WM_GETFONT message is sent to the listbox to retrieve the handle of
the font used in the control. If the system font is being used (which is often
the case), a NULL handle is returned. If a non-system font is being used, it
must be placed in the listbox's device context before calling the Windows API
GetTextExtent. The code in Figure 2 adds an extent to the container. The
textExtents variable is the extent container, and AString is a pointer to a
null-terminated array of characters. The extent object inserted into the
container is created with the global new operator. You have to be a little
careful with objects that are put into containers, because by default
containers own the items they contain. This means that when the container goes
out of scope, it will try to delete all the items that are still in it. If
you've put a stack-based or global object in the container, your application
will likely crash, since the delete operator may be called only for objects
created by the new operator. To place stack-based or global objects into a
container, you must tell the container that it doesn't "own" the elements in
it. You do so by calling the container's member function ownsElements(),
passing the value 0. Once ownership is disabled, the container will never try
to delete any of the items in it.


Custom Controls with OWL


One way to handle horizontal scroll bars is to use a standard listbox and make
the parent window do all the work of managing the text extent container,
sending LB_SETHORIZONTALEXTENT messages to the listbox, and so on. A better,
more object-oriented approach is to create a new control that does all the
housekeeping by itself, without disturbing the parent. With OWL, building a
custom control of this type is not difficult. OWL already defines the class
TListBox to act as an object-oriented stand-in for regular Windows listboxes.
All you have to do is derive a new class from TListBox and add the necessary
support for horizontal scroll bars. THorizontalListBox (see Listing One)
encapsulates all the details described.
Because class Extent is designed to be used exclusively inside
THorizontalListBox, it is declared as a nested class. Therefore, Extent is
only in scope inside THorizontalListBox. The template-based container is
typedefed to improve code readability. Using the type Extents is a lot easier
than having to use BI_SArrayAsVector<Extent>. The textExtents data member is
the actual container for the text extents. The base class functions AddString,
InsertString, DeleteString, and ClearList are overridden, because every time
a string is added or removed from the listbox, the extents container needs to
be updated and the listbox's horizontal extent kept up to date.
The overridden AddString first calls InsertExtent, which computes a string's
extent, adds it to the extent container, and updates the listbox's horizontal
extent; the base class TListBox::AddString is then called to actually add the
string to the listbox. The DeleteString
function (see Listing Two) is similar to AddString, except that it needs to
locate the extent of the string being removed from the listbox, delete the
extent object, then update the listbox's horizontal extent. A short loop
searches the extent container for the extent of the string being removed from
the listbox. The textExtents.detach() statement removes the extent object from
the container and deletes the object itself. UpdateHorizontalExtent() sets the
extent of the listbox using the greatest extent still stored in the extent
container. The base class function TListBox::DeleteString does the actual work
of removing a string from the listbox.


The Transfer Buffer


The only tricky feature of class THorizontalListBox is the way it handles data
that is inserted into the listbox directly from a transfer buffer. Using OWL,
child controls can be set up to operate with a special buffer called a
"transfer buffer.'' When the child control is created, it reads its initial
data from the transfer buffer. When the child control is destroyed, its data
can be saved into the transfer buffer. Using a transfer buffer drastically
simplifies the task of reading and writing data to and from child controls.
When strings are put into a THorizontalListBox object from a transfer buffer,
OWL calls the function THorizontalListBox::Transfer(), passing TF_SETDATA as
the second parameter (see Listing Two). THorizontalListBox needs to invoke the
base class to successfully copy the string data from the transfer buffer into
the control. Then the extent container must be updated for each string in the
control. The first parameter passed to Transfer() is a pointer to a pointer to
a TListBoxData object. This object contains an array data member called
Strings, which holds the strings to be copied into the listbox. After calling
the base class to copy the strings, Transfer() uses the forEach iterator
function to walk the Strings array, calling AddTextExtent for each string.
AddTextExtent computes the string's extent and inserts it into the extent
container.


A Short Example


I wrote a short OWL application, HORSCROL, to demonstrate the use of the
THorizontalListBox class. The source code is provided electronically; see
"Availability," page 3. HORSCROL initially displays a dialog box (see Figure
3). Using the >> and << buttons, you can move selected entries from one
listbox to the other and see the way the horizontal scroll bar is affected.
Windows displays the scroll bar only if a listbox contains strings that are
wider than the listbox.
To get Windows to display a horizontal scroll bar, a listbox must be created
with both the WS_VSCROLL and WS_HSCROLL styles. If you omit the WS_VSCROLL
style, Windows adds it automatically. The physical act of displaying a
horizontal scroll bar is independent from the display of a vertical scroll
bar. For example, adding a long string to the right listbox causes a
horizontal scroll bar, but no vertical one to appear.



Conclusion


Managing horizontal scroll bars in listboxes is not something application
programs should be required to do. Windows should bear the responsibility for
providing full support, just as it does with vertical scroll bars. Complete
support is likely to appear in the next release of Windows, but in the
meantime developers are stuck rolling their own scroll bars. Microsoft
provides a technical bulletin, available both on CompuServe in the
Developer's Network Forum, and on the Developer's Network CD, that describes
how to implement horizontal listboxes in C, and includes DLL code to support
horizontal listboxes for C applications.


References


Brockschmidt, Kraig and Kyle Marsch. "Considerations for Horizontal Scroll
Bars in Listboxes." Microsoft Developer's Network CD, March 1992.

Figure 1: Required member functions for objects inserted into a sorted
container.
Default constructor
Copy constructor
Operator ==
Operator <
isSortable()


Figure 2: Adding an extent to the Extent container.

int length = TextExtent(AString);
Extent extent = *new Extent(length);
textExtents.add(extent);


 Figure 3: Sample use of horizontally scrollable listboxes.
[LISTING ONE] (Text begins on page 47.)

#ifndef __HLBOX_HPP
#define __HLBOX_HPP

#include <arrays.h>
#include <owl.h>
#include <listbox.h>

class THorizontalListBox: public TListBox {

 class Extent {
 int value;
 public:
 Extent(int i) {value = i;}
 Extent() {value = 0;}
 Extent(Extent& i) {value = (int) i;}
 int operator==(Extent& i) const {return value == (int) i;}
 int operator<(Extent& i) const {return value < (int) i;}
 operator int() {return value;}
 virtual int isSortable() {return 1;}
 };
 typedef BI_SArrayAsVector<Extent> Extents;
 Extents textExtents;
 int TextExtent(LPSTR);
 void UpdateHorizontalExtent();
public:
 THorizontalListBox(PTWindowsObject, int, PTModule = NULL);

 int AddString(LPSTR);
 int InsertString(LPSTR, int);

 int DeleteString(int);
 void ClearList();
 void InsertExtent(LPSTR);
 virtual WORD Transfer(void*, WORD);
};
#endif

[LISTING TWO]
#include "hlbox.hpp"

THorizontalListBox::THorizontalListBox(TWindowsObject* AParent,
 int id, TModule* module):TListBox(AParent, id, module),textExtents(20, 0, 20)
{
}
static void AddTextExtent(Object& AString, void* P)
{
 THorizontalListBox* listBox = (THorizontalListBox*) P;
 LPSTR string = (char*)(const char*)(RString)AString;
 if (AString != NOOBJECT)
 listBox->InsertExtent(string);
}
WORD THorizontalListBox::Transfer(void* DataPtr, WORD TransferFlag)
{
 WORD value = TListBox::Transfer(DataPtr, TransferFlag);
 if (TransferFlag == TF_SETDATA) {
 // for each string in the transfer buffer,
 // add its extent to the extent container
 TListBoxData* ListBoxData = *(PTListBoxData*) DataPtr;
 ListBoxData->Strings->forEach(AddTextExtent, this);
 }
 return value;
}
int THorizontalListBox::AddString(LPSTR AString)
{
 InsertExtent(AString);
 return TListBox::AddString(AString);
}
int THorizontalListBox::InsertString(LPSTR AString, int Index)
{
 InsertExtent(AString);
 return TListBox::InsertString(AString, Index);
}
void THorizontalListBox::InsertExtent(LPSTR AString)
{
 // store the extent of each string in sorted order in a container
 int length = TextExtent(AString);
 Extent extent = *new Extent(length);
 textExtents.add(extent);
 // update the ListBox horizontal extent
 UpdateHorizontalExtent();
}
int THorizontalListBox::DeleteString(int Index)
{
 // find the text extent of the string to be deleted
 char string [256];
 GetString(string, Index);
 // remove the extent from the container
 Extent extent = Extent(TextExtent(string) );
 for (int i = 0; i < textExtents.getItemsInContainer(); i++) {

 if (extent == textExtents [i]) {
 textExtents.detach(extent, TShouldDelete::Delete);
 break;
 }
 }
 // update the ListBox horizontal extent
 UpdateHorizontalExtent();
 return TListBox::DeleteString(Index);
}
void THorizontalListBox::ClearList()
{
 // delete all the text extents in the container
 textExtents.flush();
 // update the ListBox horizontal extent
 UpdateHorizontalExtent();
 // Call DeleteString, to force Windows 3.0 to remove
 // the horizontal scrollbar. Windows 3.1 doesn't need this call
 DeleteString(0);
 // clear out the remaining strings in the ListBox
 TListBox::ClearList();
}
// find the extent of a ListBox string
int THorizontalListBox::TextExtent(LPSTR AString)
{
 int extent;
 // select the ListBox into the device context
 HDC hdc = GetDC(HWindow);
 HFONT hfont = (HFONT) SendMessage(HWindow, WM_GETFONT, 0, 0);
 if (hfont) {
 // non-system font being used: select it into the
 // ListBox's device context before calling GetTextExtent
 HGDIOBJ oldObject = SelectObject(hdc, hfont);
 // find the text extent of the string
 extent = GetTextExtent(hdc, AString, _fstrlen(AString) );
 // release resources used
 SelectObject(hdc, oldObject);
 ReleaseDC(HWindow, hdc);
 }
 else {
 // system font in use: no font selection necessary,
 // because GetTextExtent will use the system font by default
 extent = GetTextExtent(hdc, AString, _fstrlen(AString) );
 ReleaseDC(HWindow, hdc);
 }
 return extent;
}
void THorizontalListBox::UpdateHorizontalExtent()
{
 int greatestExtent;

 // find the extent of the longest string in the
 // ListBox, and set the horizontal extent accordingly
 int lastElement = textExtents.getItemsInContainer() - 1;
 if (lastElement < 0)
 // no more strings in the ListBox
 greatestExtent = 0;
 else
 greatestExtent = textExtents [lastElement];
 // add a small amount of space, so that when ListBox is completely scrolled

 // to the right, the last character is completely visible
 HDC hdc = GetDC(HWindow);
 greatestExtent += GetTextExtent(hdc, "X", 1);
 ReleaseDC(HWindow, hdc);
 // if the longest string fits completely in the ListBox, then scroll the box
 // completely to the left, so Windows will hide the scrollbar
 RECT rect;
 GetClientRect(HWindow, (LPRECT) &rect);
 int listWidth = rect.right - rect.left;
 if (listWidth >= greatestExtent)
 SendMessage(HWindow, WM_HSCROLL, SB_TOP, 0);
 // set the extent
 SendMessage(HWindow, LB_SETHORIZONTALEXTENT, greatestExtent, 0);
}
End Listings



Special Issue, 1993
Writing Portable Windows Applications


Guidelines for moving from Win16 to Win32




David Van Camp


David is a freelance software developer specializing in Windows and Windows
NT development. You can contact him on CompuServe at 70323,3510.


If you're a Windows programmer who's yet to start Windows NT development,
chances are you've at least considered it. If so, the two questions you've
most likely asked are: first, how difficult is it to port to NT? And second,
what upfront development work can I do to simplify moving my current Windows 3
apps to NT?
Obviously, the difficulty of porting any Windows application to NT depends on
the program's features, not to mention the techniques used to develop it.
Simple applications, such as Windows-supplied applets, are usually easy ports.
(Microsoft claims File Manager was ported, with the user interface running,
within a week.) With full-scale applications, however, the process is far more
complicated and time-consuming.


Why is Porting Difficult?


The 32-bit Windows API (Win32) supported by NT is similar to the 16-bit
Windows 3.1 SDK (Win16), although there are subtle differences between the
two platforms that are easy to overlook. The greatest difficulty you'll likely
encounter is when your application must be developed or maintained for both
platforms simultaneously.
However, one way you can get a jump on potential porting problems when writing
Win16 code is by following the guidelines summarized in Figure 1. For the most
part, these guidelines are based on my experiences in writing NT applications
which required simultaneous single-source compatibility with Windows 3 (for
instance, multiple tape-backup systems).
Let's face it, hard-to-find portability problems cause the biggest headaches
because it's impossible to fix a problem--no matter how simple--if you don't
know where it is or what's causing it. Easy-to-find problems are less
troublesome, even if a significant amount of work is required to correct them.
Consequently, one of your basic tactics should be to make
impossible-to-eliminate problems as easy as possible to find and fix (although
you still want to try to remove as many problems as possible). One way to do
this is to mark that code with an NTPORT macro (see Listing One, page 55).
In Listing One (a), the NTPORT pragma macro shows a simple way of marking
Win16 code you suspect may cause a problem when porting to NT. This macro
should be defined in a header file, which is included by all your C files. As
shown in Listing One (b), you should put the macro close to the suspected
problem and include a short description of it. Now the problem may be easily
found when porting to NT either by searching your C files for NTPORT or by
inspecting the compiler warning messages when compiling the code for NT.


Avoiding Portability Problems


While tagging potential problems makes them easier to find, you can save more
time by avoiding problems in the first place. With any multiplatform software
project, many problems are prevented if you adhere to good
structured-programming techniques, avoid hardcoded values, and ensure that
your code compiles without warnings or errors. Always compile your Windows
code using the -W4 (warning level 4) and -DSTRICT command-line options. When
you use -W4, the compiler notifies you of any practice it considers suspect.
The -DSTRICT option enables the strictest possible type checking and
disallows many nonportable operations using incompatible data types.
Code which presumes any specific size of an int or a pointer isn't portable.
The size of these data types depends on the platform or memory model used, and
differs across platforms. For Win32, all pointers and integers are widened to
32 bits, and the near keyword is ignored. Further, Win32 uses 32-bit
flat-addressing, which means segmented addressing is obsolete. Never presume
that allocated memory will be aligned on a 64K boundary and avoid any
operations which assume segment:offset encoded pointers. Do not compute array
offsets by combining a 16-bit offset with the high-order 16 bits of a
pointer--and never write code that assumes a pointer or integer will wrap
around to 0 when 1 is added to 0xffff.
Many Win16 system resources have changed for NT. Do not directly read or write
system files, including executables and resources, as the binary format of
these files has changed. Also, the format of many system objects has changed,
so you shouldn't access them directly. Never attempt to directly access a
device port or any system code from an application; do so from a device
driver. Always use the Windows API procedures to perform these types of
operations and stay away from undocumented calls and data formats.


Changes to the Windows API


Under NT, many Windows API procedures have been widened to use 32-bit values.
In most cases this means all pointers are 32 bits wide, and many WORD
parameters have been changed to UINT. In general, it's a good idea to avoid
using WORD data types except where they're required (for example, when a
procedure parameter requires a pointer to a WORD). Graphics coordinates should
be declared using UINT; INT should be used for general integers and array
indexes. Widened types should never be assigned to a WORD, or any other 16-bit
type, particularly pointers, ints, or any type of handle. And, don't mix
handle types--HANDLE, HWND, HINSTANCE, and HDC are separate and distinct types
and noninterchangeable.
A number of Win16 procedures have been replaced, modified, or eliminated
altogether. For replaced functions, it's simple to find and modify the calls
when porting to NT by using one of two techniques: Either the functions are
reimplemented by writing replacements which map the parameters and call the
new API procedures, or the calls can be changed to the Win32 replacement
procedures. Win16 versions of those procedures are then created to map the
parameters back to the original functions. Either way, this isn't much of a
problem. For procedures which have undergone functional modification, however,
the matter is often more serious. Finally, procedures which have been
completely eliminated should be avoided; see Figure 2.
Of the API procedures which have been modified, the most important are the
callbacks, particularly window and dialog procedures. Whenever you declare a
pointer to a callback procedure, use the appropriate type, such as WNDPROC for
window procedures, DLGPROC for dialog procedures, HOOKPROC for hook
procedures, and so on. Do not use FARPROC or NEARPROC. Also, when declaring
these procedures, use the proper function prototype. Always declare window and
dialog procedures like this: RESULT CALLBACK ProcName (HWND hwnd, UINT wMsg,
WPARAM wParam, LPARAM lParam), where RESULT is LRESULT for window procedures
or BOOL for dialog procedures. BOOL is widened to 32 bits for Win32, as are
wMsg and wParam (both formerly WORDs). See the Win16 or Win32 version of
WINDOWS.H for a complete list of the callback types. Also note the use of
CALLBACK instead of FAR PASCAL. This is a more portable modifier.


Modifications to Message Handling


The declaration prototype has changed for window procedures because the
parameter packing for a number of messages has changed. For most messages,
these changes were necessary because a widened value, usually a HWND, was
originally packed into the upper or lower 16 bits of the LPARAM parameter.
Since these values are widened, they have typically been moved to the WPARAM
parameter, which is also widened. Other parameters were moved as well.
Consequently, you can't write portable code which directly picks any
value from the parameters of a modified message. Microsoft has provided a
number of macros, collectively called "message crackers," which present
different solutions for this.
One pair of message crackers, first introduced with the Windows 3.1 SDK, is
called "handlers and forwarders." I recommend using these macros when writing
new code to process window messages, since these macros not only solve the
parameter packing changes introduced by Win32, but they also provide a
highly-structured method of message processing. These macros, defined in
WINDOWSX.H, allow an almost object-oriented approach to message handling that
mimics the solution used by the Microsoft Foundation Classes. Listing Two
(page 55) shows an example of WM_COMMAND message processing for Win16. Listing
Three (page 55) shows how this same code would be implemented using message
handlers and forwarders. These macros use the following naming convention for
message handlers: HANDLE_message (hwnd, wParam, lParam, function_name);. In
this convention, message is the window message ID and function_name is the
name of your handler function. This macro unpacks the parameters in lParam and
wParam and calls your handler function. Always declare the handler function
using the example in the comment above the message-handler macro definition in
WINDOWSX.H. For message forwarders, the FORWARD_message (paramlist,
message_proc) naming convention is used where paramlist is the list of
parameters required for the particular message and message_proc is a
message-passing procedure (SendMessage, PostMessage, CallWindowProc, and so
on). This macro packs the parameters into lParam and wParam and calls the
specified procedure. See the Microsoft Windows SDK documentation and
WINDOWSX.H for more information and a complete list of these macros.
The other pair of message crackers, simpler than handlers and forwarders,
extracts or packs the message parameters via portable macros. Listing Four
(page 55) shows how WM_COMMAND processing looks when extractors and packers
are used. The general naming convention used for the extractors is
GET_message_item (wParam, lParam); where message is the message ID and item is
the particular data item you wish to extract from the parameters. The return
value type depends on the type of data extracted. The message packers follow
this naming convention: GET_message_MPS (paramlist); where paramlist is the
list of parameters required for the particular message.
For Win32, Microsoft has only provided these macros for those messages whose
parameter packing has changed. Since Microsoft didn't provide any definitions
of these macros for Win16, I've done so; see Listing Five (page 55). Since
macros are only defined for changed messages, this code also serves as a quick
reference to the changed messages. These macros are best-suited for porting an
existing code base to Windows NT, since less work is typically required to
convert code using them than is needed for handlers and forwarders. Handlers
and forwarders are better for new development because they can be used with
all window messages and because of the highly structured solution they
provide.
In Listing Five, the WM_CTLCOLOR message macros are an exception to the
previously described naming convention. This Win16 message posed a problem
because it contained two parameters widened to 32 bits and one 16-bit
parameter. Consequently, there wasn't enough room, so the message had to be
split into a series of messages. When declaring your own messages, never pack
more than two widened types, or one widened type and one 16-bit value into
message parameters.
Win32 dynamic data exchange (DDE) messages have undergone such significant
changes that it's virtually impossible to write portable code which processes
them. Consequently, always use the high-level DDEML procedures to perform DDE
functions. Other aspects of message processing have changed as well. It isn't
possible in Win32 to subclass a window belonging to another process. Also,
global classes can only be registered in DLLs which are loaded during system
initialization, never from an application. Avoid using these non-portable
techniques whenever possible.


Win32 Features



Windows NT provides features not available under Win16--support for
memory-mapped files, multiple users, advanced file systems, preemptive
multitasking and multithreaded processes, C2-level security, protected
address spaces, and the Unicode character standard. Although you can't write
truly portable code which uses these features, carefully crafted Win16 code
simplifies the changes required to integrate these features when porting to
NT. The most significant of these features are memory-mapped files.
Globally-shared memory, implemented in Win16 using GlobalAlloc with the
GMEM_SHARE option or via named data segments, won't work for NT applications.
Instead, your application needs to be modified to use memory-mapped files. For
this reason, all code which uses shared memory should be isolated and marked
with the NTPORT macro. Avoid storing addresses in shared memory, as this will
often cause problems under NT. Additionally, due to preemptive multitasking,
access to global memory must be carefully synchronized to ensure that
different processes cannot attempt to modify and read information
simultaneously.
You should never assume that file names follow standard DOS naming
conventions. Both HPFS and NTFS, the advanced file systems supported by NT,
allow long file names (up to 256 characters) and new characters (such as
spaces and dots). C2 security, a government security classification, and
protected address spaces mean that certain "unsecured" operations may fail under NT
and should be avoided. These include shutting down the system, directly
accessing devices or system memory, changing scheduling priorities, and
modifying the system's CMOS or date and time. Also, if your application is
expected to be used internationally, you should employ transparent-character
techniques so that you can properly utilize NT's Unicode support.
Figure 1: Guidelines for writing portable Windows applications.
 1. Eliminate all compiler and linker warnings.
 2. Use the NTPORT macro to mark all potential portability problems.
 3. Use WORD type data only when necessary; use INT or UINT otherwise.
 4. Use SetClassLong or SetWindowLong to store widened types.
 5. Never assign a handle or pointer to a short (16 bit) data type.
 6. Use unique types for handles (HWND, HPEN, HBRUSH, and so on).
 7. Avoid using obsolete API procedures whenever possible.
 8. Portably declare all window and dialog procedures.
 9. Use message crackers to process window messages.
10. Do not pack widened types into lParam words.
11. Do not use model-specific or segmented addressing.
12. Isolate operations which use globally shared memory.
13. Do not assume that filenames follow the DOS naming conventions.
14. Do not read or write system files.
15. Do not presume data elements are aligned on a specific byte boundary.

Figure 2: Windows API procedures that have been eliminated: (a) These
procedures have been dropped from the 32-bit Windows API. No replacements are
available. Consequently, code which calls these procedures will not be
portable; (b) these sound procedures have been dropped; use the
multimedia-sound support API instead.
(a) AccessResource, AllocDSToCSAlias, AllocResource, AllocSelector,
ChangeSelector, GetCodeHandle, GetCodeInfo, GetCurrentPDB, GetEnvironment,
GetInstanceData, GetKBCodePage, GetTempDrive, GlobalDosAlloc, GlobalDosFree,
GlobalPageLock, GlobalPageUnlock, LimitEMSPages, LocalNotify, SetEnvironment,
SetResourceHandler, SwitchStackBack, SwitchStackTo, UngetCommChar,
ValidateCodeSegment, ValidateFreeSpaces.
(b) CloseSound, CountVoiceNotes, GetThresholdEvent, GetThresholdStatus,
OpenSound, SetSoundNoise, SetVoiceAccent, SetVoiceEnvelope, SetVoiceNote,
SetVoiceQueueSize, SetVoiceSound, SetVoiceThreshold, StartSound, StopSound,
SyncAllVoices, WaitSoundState.

[LISTING ONE] (Text begins on page 52.)
(a)

#ifdef _WINNT_ /* if compiling for Windows NT, generate a compiler warning */
#define NTPORT(msg) message(__FILE__ " NTPORT: " msg)
#else /* we are compiling for Win16, so do nothing */
#define NTPORT(msg)
#endif


(b)

#pragma NTPORT("warning, pointer stored in globally shared memory")
 gpszGlobalString = szLocalString;

[LISTING TWO]

#include <windows.h> /* normal include for all windows applications */

/* The following window procedure declaration is NOT portable to Windows NT!
*/
LONG FAR PASCAL MyWinProc (HWND hwnd, WORD wMsg, WORD wParam, LONG lParam)
{
 switch ( wMsg )
 {
 case WM_COMMAND:
 /* Nonportable reference to control ID in message params */
 switch ( wParam )
 {
 /* processing for WM_COMMAND based on control ID goes here...*/
 }
 case WM_SOMEMESSAGE:
 /* non-portable method to send WM_COMMAND message to parent */
 SendMessage ( GetParent (hwnd), WM_COMMAND,
 wMyID, MAKELONG (hwnd, wNotifyCode) );
 }
}


[LISTING THREE]

#include <windows.h> /* normal include for all windows applications */
#include <windowsx.h> /* include macro definitions for Win16 or NT */
/* Declare portable WM_COMMAND message handler function... */
void MyWinProc_OnCommand (HWND hwnd, int id, HWND hwndCtl, UINT codeNotify)
{
 /* Portable reference to control ID in message params */
 switch ( id )
 {
 /* processing for WM_COMMAND based on control ID goes here... */
 }
}
/* The following window procedure declaration IS portable to Windows NT! */
LRESULT CALLBACK MyWinProc (HWND hwnd, UINT wMsg, WPARAM wParam, LPARAM lParam)
{
 switch ( wMsg )
 {
 case WM_COMMAND:
 /* Portable WM_COMMAND processing using macro... */
 return HANDLE_WM_COMMAND (hwnd, wParam, lParam,
 MyWinProc_OnCommand );

 case WM_SOMEMESSAGE:
 /* portable method to send WM_COMMAND message to parent */
 FORWARD_WM_COMMAND ( GetParent (hwnd), wMyID, hwnd,
 wNotifyCode, SendMessage);
 }
}

[LISTING FOUR]

#include <windows.h> /* normal include for all windows applications */
#include <windowsx.h> /* include macro definitions for NT only */
#include <move2nt.h> /* include macro definitions for Win16 */

/* The following window procedure declaration IS portable to Windows NT! */
LRESULT CALLBACK MyWinProc (HWND hwnd, UINT wMsg, WPARAM wParam, LPARAM lParam)
{
 switch ( wMsg )
 {
 case WM_COMMAND:
 /* Portable reference to control ID in message params */
 switch ( GET_WM_COMMAND_ID (wParam, lParam) )
 {
 /* processing for WM_COMMAND based on control ID goes here*/
 /* using GET_WM_COMMAND_xxx macros for portability... */
 }
 case WM_SOMEMESSAGE:
 /* Portable method to send WM_COMMAND message to parent */
 SendMessage ( GetParent (hwnd), WM_COMMAND,
 GET_WM_COMMAND_MPS (wMyID, hwnd, wNotifyCode) );
 }
}
[LISTING FIVE]

/* File: MOVE2NT.H - Message extractor and packer macros for Win16
 * Author: David Van Camp, July 1993 */


#if !defined (MOVE2NT_INCL) && !defined (_WINNT_)
#define MOVE2NT_INCL

#define GET_EM_LINESCROLL_MPS(vert, horz) \
 (WPARAM)0, MAKELONG (vert, horz)
#define GET_EM_SETSEL_START(wp, lp) (INT)LOWORD(lp)
#define GET_EM_SETSEL_END(wp, lp) (INT)HIWORD(lp)
#define GET_EM_SETSEL_MPS(iStart, iEnd) \
 (WPARAM)0, MAKELONG(iStart, iEnd)
#define GET_WM_ACTIVATE_STATE(wp, lp) (wp)
#define GET_WM_ACTIVATE_FMINIMIZED(wp, lp) (BOOL)HIWORD(lp)
#define GET_WM_ACTIVATE_HWND(wp, lp) (HWND)LOWORD(lp)
#define GET_WM_ACTIVATE_MPS(s, fmin, hwnd) \
 (WPARAM)(s), MAKELONG((hwnd), (fmin))
#define GET_WM_CHANGECBCHAIN_HWNDNEXT(wp, lp) (HWND)LOWORD(lp)

#define GET_WM_CHARTOITEM_CHAR(wp, lp) (CHAR)(wp)
#define GET_WM_CHARTOITEM_POS(wp, lp) HIWORD(lp)
#define GET_WM_CHARTOITEM_HWND(wp, lp) (HWND)LOWORD(lp)
#define GET_WM_CHARTOITEM_MPS(ch, pos, hwnd) \
 (WPARAM)(ch), MAKELONG((hwnd), (pos))
#define GET_WM_COMMAND_ID(wp, lp) (wp)
#define GET_WM_COMMAND_HWND(wp, lp) (HWND)LOWORD(lp)
#define GET_WM_COMMAND_CMD(wp, lp) HIWORD(lp)
#define GET_WM_COMMAND_MPS(id, hwnd, cmd) \
 (WPARAM)(id), MAKELONG(hwnd, cmd)

/* The WM_CTLCOLOR message was split into multiple messages for NT, one
 * for each supported control type. For this reason, an extra macro is added,
 * GET_WM_CTLCOLOR_MSG, which must be used to determine the message ID
 * to use for a particular type. Use this macro in the following manner:
 * SendMessage (hwnd, GET_WM_CTLCOLOR_MSG(type),
 * GET_WM_CTLCOLOR_MPS(hdc,hwnd,type));
 * where type is any of the types used in the Win16 WM_CTLCOLOR message.
 * Also notice that the extractor macros require the message ID in addition
 * to the two message parameters. */
#define GET_WM_CTLCOLOR_HDC(wp, lp, msg) (HDC)(wp)
#define GET_WM_CTLCOLOR_HWND(wp, lp, msg) (HWND)LOWORD(lp)
#define GET_WM_CTLCOLOR_TYPE(wp, lp, msg) HIWORD(lp)
#define GET_WM_CTLCOLOR_MSG(type) (WORD)(WM_CTLCOLOR)
#define GET_WM_CTLCOLOR_MPS(hdc, hwnd, type) \
 (WPARAM)(hdc), MAKELONG(hwnd,type)
#define GET_WM_HSCROLL_CODE(wp, lp) (wp)
#define GET_WM_HSCROLL_POS(wp, lp) LOWORD(lp)
#define GET_WM_HSCROLL_HWND(wp, lp) (HWND)HIWORD(lp)
#define GET_WM_HSCROLL_MPS(code, pos, hwnd) \
 (WPARAM)(code), MAKELONG(pos, hwnd)
#define GET_WM_MENUSELECT_CMD(wp, lp) (wp)
#define GET_WM_MENUSELECT_FLAGS(wp, lp) (UINT)LOWORD(lp)
#define GET_WM_MENUSELECT_HMENU(wp, lp) (HMENU)HIWORD(lp)
#define GET_WM_MENUSELECT_MPS(cmd, f, hmenu) \
 (WPARAM)(cmd), MAKELONG(f, hmenu)
/* These extractors are for MDIclient to MDI child messages only. */
#define GET_WM_MDIACTIVATE_FACTIVATE(hwnd, wp, lp) (wp)
#define GET_WM_MDIACTIVATE_HWNDDEACT(wp, lp) (HWND)HIWORD(lp)
#define GET_WM_MDIACTIVATE_HWNDACTIVATE(wp, lp) (HWND)LOWORD(lp)
/* This packer is for sending to the MDI client window only. */
#define GET_WM_MDIACTIVATE_MPS(f, hwndD, hwndA) (WPARAM)(hwndA), 0L


#define GET_WM_MDISETMENU_MPS(hmenuF, hmenuW) \
 (WPARAM)!(hmenuF || hmenuW), MAKELONG(hmenuF, hmenuW)
#define GET_WM_MENUCHAR_CHAR(wp, lp) (CHAR)(wp)
#define GET_WM_MENUCHAR_HMENU(wp, lp) (HMENU)HIWORD(lp)
#define GET_WM_MENUCHAR_FMENU(wp, lp) (BOOL)LOWORD(lp)
#define GET_WM_MENUCHAR_MPS(ch, hmenu, f) \
 (WPARAM)(ch), MAKELONG(f, hmenu)
#define GET_WM_PARENTNOTIFY_MSG(wp, lp) (wp)
#define GET_WM_PARENTNOTIFY_ID(wp, lp) HIWORD(lp)
#define GET_WM_PARENTNOTIFY_HWNDCHILD(wp, lp) (HWND)LOWORD(lp)
#define GET_WM_PARENTNOTIFY_X(wp, lp) (int)(short)LOWORD(lp)
#define GET_WM_PARENTNOTIFY_Y(wp, lp) (int)(short)HIWORD(lp)
/* Use this packer for WM_CREATE or WM_DESTROY msg values only */
#define GET_WM_PARENTNOTIFY_MPS(msg, id, hwnd) \
 (WPARAM)(msg), MAKELONG(hwnd, id)
/* Use this packer for all other msg values */
#define GET_WM_PARENTNOTIFY2_MPS(msg, x, y) (WPARAM)(msg), MAKELONG(x, y)

#define GET_WM_VKEYTOITEM_CODE(wp, lp) (int)(wp)
#define GET_WM_VKEYTOITEM_ITEM(wp, lp) HIWORD(lp)
#define GET_WM_VKEYTOITEM_HWND(wp, lp) (HWND)LOWORD(lp)
#define GET_WM_VKEYTOITEM_MPS(code, item, hwnd) \
 (WPARAM)(code), MAKELONG(hwnd, item)
#define GET_WM_VSCROLL_CODE(wp, lp) (wp)
#define GET_WM_VSCROLL_POS(wp, lp) LOWORD(lp)
#define GET_WM_VSCROLL_HWND(wp, lp) (HWND)HIWORD(lp)
#define GET_WM_VSCROLL_MPS(code, pos, hwnd) \
 (WPARAM)(code), MAKELONG(pos, hwnd)
#endif /*MOVE2NT_INCL && _WINNT_*/

End Listings

Special Issue, 1993
Getting to Know TrueType


Meeting some interesting characters




Steven Reichenthal


Steve develops Windows CAD applications and teaches object-oriented
programming. He received an MS in computer science from Cal State Fullerton
with an emphasis in graphics. Steve can be reached at
xsreiche@aunix.fullerton.edu.


Although Windows 3.1 has been out for some time, a few of its features are
only now moving into many programmers' field of view. One such feature is the
built-in support for outline fonts, which uses the TrueType format defined by
Microsoft and Apple.
In this article, I'll describe the TrueType format and how fonts are rendered,
then show how a new function in the Windows API, GetGlyphOutline(), can be
used to create a simple font-viewing program. But first, some background on
digital fonts.


Flavors of Fonts


The early versions of Windows only supported two font formats: bitmap and
vector. As you know, bitmap fonts represent a character shape simply by
selectively coloring a grid of pixels. Bitmap fonts can be displayed quickly,
and, if properly designed, they look very good at the size at which they were
created. But when scaled to larger sizes, the resulting "jaggies" look
terrible and have given bitmap fonts a bad name. Nevertheless, for
high-quality results at low screen resolutions, nothing beats a hand-tuned
bitmap font in readability.
Vector fonts are more scalable than bitmaps, but don't look as good--either at
the original size or enlarged. Vector fonts represent character shapes via
straight-line segments. They have a "CAD look" to them, because the space
between the lines doesn't get filled in. When scaled large enough, the
characters look very thin. Also, since curved features are rendered via
straight lines, at a certain size these straight edges and corners are
glaringly visible.
Outline-based fonts such as TrueType combine the best features of both vector
and raster fonts without the disadvantages of either, by representing
characters with mathematical outlines instead of simple strokes or a raster
grid of pixels. Outline fonts are not new with TrueType; they have been used
in electronic publishing systems for over 20 years, initially in imaging or
typesetting systems, and more recently in interactive desktop systems. The
approach used by TrueType has much in common with these older systems, but has
also pushed the technology further in the areas of font rendering and hinting
(discussed later).
In TrueType, a character's outline is defined by combinations of lines and
curves. An outline can be scaled to fit a wide range of sizes, and then filled
in to result in a high-quality bitmap font at the desired size. The scaling
and rendering process happens at run time. Once in bitmap form, TrueType fonts
can be displayed as quickly as raster fonts. Conversion of an entire font at a
given size usually takes only a second or two, depending on the sizes of the
bitmaps and the speed of the computer. Once converted, Windows stores the
bitmaps in a font cache where they are used over and over.
TrueType brings with it the specialized terminology of typography and digital
fonts, as well as introducing its own terms. Character figures are called
"glyphs." The outlines that describe glyphs are collections of closed curves
called "contours." For example, the glyph outline for the lowercase "i"
consists of two contours: one for the dot and one for the stem. The lowercase
"b" also has two contours: an outer one and an inner one. Contours are defined
by ordered sequences of points, sometimes called "control points." Each point
is specified to be either on or off the contour. If two consecutive points
are on the contour, a straight line connects them; otherwise, a smooth curve is
tangent to them. The points may range from -16,384 to 16,383 in units known
as "font units" or "FUnits." Points specify locations relative to a grid
called the "EM square." The fonts supplied with Windows happen to use 2048
FUnits per EM.
TrueType font files have the TTF extension and are stored in the Windows
System directory. For example, the file ARIAL.TTF contains the normal (i.e.
not bold or italic) font for the Arial typeface. These files consist of a
series of tables. One table (glyf) contains the points and "hints" that
describe the outlines of the character figures. Another table (cmap) indexes
the characters in the glyf table. TTF files also contain a table called head
that provides scaling information. There are 16 other defined tables that may
appear--but these three tell us the most about the inner workings of the
rendering process.


The Three Rendering Stages


As shown in Figure 1, each glyph outline goes through three transformations
before emerging as a bitmap. The transformations are accomplished by three
TrueType modules known as the Scaler, Interpreter, and Rasterizer. First, the
Scaler shrinks (or stretches) the outline to the requested size. Then the
Interpreter grid fits the scaled outline by executing instructions ("hints")
attached to the glyph. The resulting outline goes to the Rasterizer to
generate a bitmap.
As an example, consider the rendering process applied to the letter "b" of the
Arial font, displayed at 14 points in EGA resolution. The Scaler converts
coordinates from font units into device units (pixels). Most EGA monitors
display 96 pixels per inch horizontally and 72 pixels per inch vertically. In
the TTF file, the glyph for the letter "b" is 921 FUnits wide and 1490 FUnits
high on a scale of 2048 FUnits per EM. The resulting scaled dimensions are
8.39 pixels wide and 10.19 pixels high on the EGA screen. Figure 2(a) shows
the scaled outline mapped to a pixel grid.
In scaling glyphs down to small sizes, it often becomes unclear whether a
given pixel belongs to the glyph or not. This decision is critical if a glyph
is scaled so small that a typographic feature occupies only a single pixel,
because if even one pixel is missing or out of place, the glyph may become
illegible. In a process called "grid fitting," the Interpreter uses the
"hints" associated with the glyph to distort the scaled outline so that it
improves the appearance of the bitmap.
One advance of TrueType over older outline-font technology is the
sophistication of its hinting mechanism. Hints are not passive data
structures, as their name implies, but active software programs that literally
take control of the Interpreter. The TrueType instruction set resembles
assembly language, complete with opcodes and mnemonics for If/Then constructs,
loops, subroutines, and a full complement of arithmetic and logical
operations. For example, the MD instruction measures the distance between two
outline points and pushes the result on the Interpreter's stack--to possibly
serve as part of a further calculation.
A similar instruction, MPS, makes it possible to measure the current point
size, perhaps as a basis for choosing an alternate path through the
instruction stream. Grid-fitting is also aided by the RTG instruction that
aligns points to the nearest grid line. There are over 120 different
instructions. Fortunately, they are generated automatically by font editors.
Figure 2(b) shows the outlines after hints are applied.
After the Interpreter makes the necessary adjustments, it sends the
grid-fitted outline to the Rasterizer to produce a bitmap. The Rasterizer
fills in the outline by following a simple rule: It turns on only those pixels
whose center lies either inside or exactly on the outline of the glyph. The
grids in Figures 2(a) and 2(b) indicate the center of a pixel with a dot. By
analyzing the direction traveled between any two points, the Rasterizer can
always determine where the inside of the glyph is. The points are ordered such
that as you follow the outline in the direction from one point to the
next, the inside is always to the right.


Using GetGlyphOutline


If you want to experiment with TrueType glyph outlines, the Windows 3.1 API
provides the GetGlyphOutline function and a few specialized data structures.
This function retrieves the same fully scaled and hinted outline that the
Rasterizer gets. The function takes a device context, a character in the
current font, and the address of a buffer where it will store the glyph data.
We usually need to call this function with a NULL buffer the first time so
that it can return the required buffer size. GetGlyphOutline returns
information about the dimensions of the glyph in a GLYPHMETRICS structure.
The points that describe a glyph outline use fixed-point numbers, which can
carry 16 bits of fractional precision. Fractional precision is not only
necessary for accurately scaling and rotating the points, but should be
maintained when computing the curves. Both popular techniques for rendering
polynomial curves--forward differencing and subdivision--require some degree
of fractional precision, although subdivision usually requires less. The
Windows header file defines FIXED as a structure with two 16-bit components:
an integer part and a fractional part. It also defines fixed-point coordinates
using a POINTFX structure that contains two FIXED structures.
It's easy to carry out any necessary math if we treat FIXED structures as
signed longs, noting that the nth bit (starting at 0) has a value of 2^(n-16). For
example, the long number 65,536 corresponds to the real number 1; the long
number 32,768 corresponds to the real number 0.5, and so on. When dealing with
POINTFX structures, it's convenient to define a structure called LONGPOINT and
use a type cast. The structure is:
struct LONGPOINT
 {
 long x, y;
 };

The glyph data from GetGlyphOutline is returned in a buffer containing each
contour of the outline. Parts of a contour can be a mixture of straight lines
or curves. Each contour begins with a 16-byte TTPOLYGONHEADER structure. The
first data member, cb, specifies the number of bytes in the contour--in other
words, the next contour, if any, starts exactly cb bytes from the beginning of
the current one. This structure also contains the pfxStart member which
denotes the first (and last) point on the contour.
One or more TTPOLYCURVE records immediately follow the header. Each contains a
variably sized array called apfx, which holds the actual points (POINTFX
structures) that define a curve or polyline on the contour. The wType member
indicates whether the points represent polylines (with the value TT_PRIM_LINE)
or quadratic B-splines (with the value TT_PRIM_QSPLINE). Naturally, since the
array can contain any number of points, this record has a member cpfx that
specifies how many.
Here's the rule for connecting sequences of curves: Every curve automatically
begins where the last point on the previous curve ends--unless it's the first
curve, which begins at pfxStart. This way each point is specified only once.
When the last point on the last curve is different from pfxStart, a straight
line should be drawn between the two points, closing the contour.
GetGlyphOutline also requires, as a parameter, a two-dimensional
transformation matrix of type MAT2. Since this structure contains four FIXED
numbers, we can play the same kind of trick that we used with POINTFX:
typecast the MAT2 variable and manipulate it as an array of longs. Be careful,
however, with the values you put in this matrix because GetGlyphOutline can
overflow--causing an unrecoverable application error (UAE). Although the
identity matrix works well, there may be times when you want a glyph rotated.
If you plan to rotate a glyph, be prepared to do a suitable translation
because all rotations are about the origin of the glyph's coordinate system.
Naturally, we can forego transformations within the required matrix and
provide our own transformations. The demonstration program for this article
shows how to add some special effects to font renderings.
To render a filled character from the outline data, we can either write our
own rasterizer (a lot of work) or use the PolyPolygon function in the Windows
API. Although PolyPolygon uses a different algorithm from the TrueType
Rasterizer to determine the interior of a figure, the results are usually
excellent for characters greater than 25 points or so. Smaller characters,
however, suffer in quality due to the rounding that occurs when converting
from FIXED values to integer POINT coordinates. One advantage of using this function is
that we can get textured renderings by using a pattern brush. Since
PolyPolygon requires a complete array of points, we must make sure to allocate
enough memory to store all of the points.



A Glyph-viewing Program


My TrueType Font Demo program decodes and displays the outline data from
GetGlyphOutline for any character entered at the keyboard. As indicated in the
title bar of the main window, the current font may be changed from the File
menu by choosing New. An Options menu also lets the user change the fill
style, display control points, and apply special effects.
The code is in Listing One (page 60). At startup, the program does the usual
Windows initialization: registering a window class, creating a main window,
and cycling through the message loop. At this point, it selects the default
font, Arial, and the character "a." The corresponding screen display is shown
in Figure 3. To select a different font, the program calls ChooseFont, which
brings up the Font dialog box. The selection of fonts here is restricted to
TrueType fonts only. If the user chooses OK, the function returns True and
stores the information about the font in a global LOGFONT structure.
When the user changes the font, chooses an option, or types a character, the
program stores that information and then forces a repaint of the window.
Repainting involves creating a font and a brush with the current settings, and
then calling draw_glyph_outline to render the current character.
The draw_glyph_outline function takes, as arguments, the device context, the
location of the upper corner of the character, and the desired character to be
displayed. It first calls GetGlyphOutline to fill a buffer with the outline
data for the character. It then calls compute_memory_requirement and allocates
enough memory to hold the array of contour points that eventually get passed
to PolyPolygon. Next, it walks through the data in a doubly nested loop. The
outer loop finds each closed outline and sends it to the inner loop, which in
turn steps through the individual pieces of the outline. The inner loop passes
the groups of control points to draw_polyline and draw_quadratic_bspline as
directed by the wType member of the TTPOLYCURVE structure. Before drawing an
individual curve, it first modifies the point array (apfx) by inserting the
last point of the previous curve at the beginning so that each group of
control points is independent from its predecessor. Then it calls the
transform function.
The compute_memory_requirement function steps through the outline data in the
same manner as draw_glyph_outline. When it encounters a polyline, it advances
the count variable by the number of vertices. When it encounters a B-spline,
it advances the count by the number of Bézier curve segments multiplied by the
maximum number of points in each segment. At the end, it returns the maximum
number of bytes required to hold the array of contour points.
The transform function produces special effects by moving the control points
of the outline. If the user has chosen Pinch from the options menu, then this
function will pull control points nearing the center of the character even
closer. This results in a cartoon-like effect. If you choose Punch from the
menu, then the points near the center are pushed farther away--resulting in a
bloating effect.
When draw_polyline receives the address and size of an array of POINTFX
structures representing vertices, it stores each point at the end of the array
of contour points. Likewise, draw_quadratic_bspline takes the address and size
of an array of POINTFX structures. If a curve has three control points, these
are sent unmodified to draw_Bezier_curve. Otherwise, the Bézier conversion
method is applied to each consecutive three-point grouping and the results
sent individually to draw_Bezier_curve.
Producing a smooth curve depends on draw_Bezier_curve. Since it uses recursive
subdivision, the quality of the shape depends on the depth of the recursion.
There is a trade-off here since the more points it stores, the slower the
curve will draw. The depth is set at 8 for this program, which means only
three bits of fractional precision are required for accuracy. POINTFX
structures keep 16 bits of fractional precision so there are no round-off
problems when computing the curve. With a depth of 8, a maximum of 257 points
along a curve can be stored. But this storage requirement grows exponentially
as the depth increases.
Conversion of a POINTFX value to a window coordinate takes place in
fixed_to_int, which selects the closest pixel by rounding the FIXED values to
the nearest whole number. Rounding points to the nearest pixel produces good
results when the glyphs are large, but achieving higher quality at small sizes
requires an algorithm based on the method used by the Rasterizer.


References


Foley, James, Andries van Dam, Steven K. Feiner, and John F. Hughes. Computer
Graphics: Principles and Practice, second edition. Reading, MA:
Addison-Wesley, 1990.
Rubinstein, Richard. Digital Typography. Reading, MA: Addison-Wesley, 1988.
TrueType Font Files. Microsoft Corporation, 1991.
 Figure 1: The TrueType rendering pipeline.
 Figure 2: (a) A scaled outline before grid fitting; (b) the result of grid
fitting.
 Figure 3: A glyph-viewing program.


The Mathematics of Quadratic B-splines


The implementors of TrueType chose splines as the means for representing
curves. You can think of splines as a series of simple polynomial curves
smoothly spliced together. For a quadratic spline, each piece (or segment) is
described by a quadratic polynomial function of the form y(x) = ax^2 + bx + c, where
a, b, and c are constant coefficients. Remember from algebra that this
function produces a parabola. It has only one point of inflection and
depending on the coefficient a, opens either upward or downward. But, for
free-form quadratic splines like the ones in TrueType, we need parabolas that
can open not only up and down, but in any direction. That's why curves are
typically represented parametrically. In two dimensions we plot the functions
x(t) = a_x t^2 + b_x t + c_x and y(t) = a_y t^2 + b_y t + c_y.
By restricting the variable t to the closed interval between 0 and 1, a curve
segment has definite starting and ending coordinates, and can open in any
direction. Each function requires three coefficients.
In the early '70s, Pierre Bézier, a mathematician working for the French
automaker Renault, devised a clever way of blending three points, called
"control points," to obtain the coefficients such that a curve segment
connects to the two endpoints and pulls toward the other point. The function
for a quadratic Bézier curve is: Q(t) = t^2(B0 - 2B1 + B2) + t(-2B0 + 2B1) + B0, where
each Bi is a control point. Figure 4 shows what a quadratic Bézier curve looks
like. It's easy to verify mathematically that it connects to the endpoints by
evaluating the function for t equal to 0, and then 1. For simplicity, we can
also describe the curve as function of its control points. For instance,
Bezier (B0, B1, B2) equals the drawing in Figure 4.
What makes Bézier curves attractive is that your program can render them
without fully evaluating the polynomials, using a fast algorithm known as the
de Casteljau algorithm. Given the three control points for a Bézier curve, the
algorithm uses a series of recursive subdivision operations to render the
curve. Each subdivision splits a Bézier curve into two smaller Bézier curves
and generates a point on the original curve. Figure 4 shows one subdivision.
If the points are integer coordinates, then only a few shifts and adds are
required at each subdivision step. Once the curve has been divided enough
times, you can connect the resulting points on the curve with straight lines.
Quadratic B-splines, which can have three or more control points, can be
converted into quadratic Bézier curves by choosing each consecutive point and
the two points that follow. For example, a spline with seven points converts
to five Bézier curve segments by choosing:
{(P0,P1,P2),(P1,P2,P3),(P2,P3,P4), (P3,P4,P5),(P4,P5,P6)}
You can then use the following rules to find the Bézier curve segments:
1. Bezier (P0,P1,P2), when the spline has exactly one curve segment (3
points).
2. Bezier (P0,P1,(P1+P2)/2), for the first segment.
3. Bezier ((P(i)+P(i+1))/2, P(i+1), (P(i+1)+P(i+2))/2), for the interior segments.
4. Bezier ((P(n-3)+P(n-2))/2, P(n-2), P(n-1)), for the last segment.
Rule #3 is the uniform conversion applied to the inner curve segments. Rules
#1, #2, and #4, are called "end conditions," because uniform B-splines don't
ordinarily connect to their endpoints--as we need them to.
--S.R.
 Figure 4: Quadratic Bézier curve showing one level of subdivision.
[LISTING ONE] (Text begins on page 56.)
/*****************************************************************************
 * Glyph viewing program by Steven Reichenthal, 1993. Although this code
 * is basically C, it uses certain C++ constructs such as in-place
 * declaration of variables that require use of a C++ compiler.
 ****************************************************************************/

#include <windows.h>
#include <stdlib.h>
#include <string.h>
#include <commdlg.h>
#include <math.h>

#define IDM_NEW 101 // File Menu ID's
#define IDM_EXIT 108


#define MARKERSIZE 4

#define IDM_SHOWCONTROLPOINTS 402
#define IDM_FILL 403
#define IDM_OUTLINE 404
#define IDM_NORMAL 405
#define IDM_PINCH 406
#define IDM_PUNCH 407

long FAR PASCAL WndProc (HWND hwnd, unsigned iMessage, WORD wParam,
LONG lParam);

HANDLE hInst;
HWND hwnd;
HDC hdc;
PAINTSTRUCT ps;
HWND hwndFrame;
WORD maxClient;
WORD cxClient,cyClient;
LOGFONT lf;
CHOOSEFONT cf;
int character;
int nEffect;
BOOL bShowControlPoints = TRUE;
BOOL bFill;
BOOL bOutline = TRUE;
float pi;

static char szAppName [] = "TrueType Font Demo";

/*----------------------------------------------------------------------*/
// set the caption for the frame window
void set_frame_caption ()
{
 char sz [80];
 wsprintf (sz, "%s - %s", (LPSTR) szAppName, (LPSTR) lf.lfFaceName);
 SetWindowText (hwndFrame, sz);
}
/*----------------------------------------------------------------------*/
#pragma argsused
int PASCAL WinMain (HANDLE hInstance, HANDLE hPrevInstance,
 LPSTR lpszCmdLine, int nCmdShow)
{
 HWND hwnd;
 MSG msg;
 hInst = hInstance;

 if (!hPrevInstance)
 { WNDCLASS wc;
 wc.style = CS_HREDRAW | CS_VREDRAW | CS_BYTEALIGNWINDOW;
 wc.lpfnWndProc = (WNDPROC) WndProc;
 wc.cbClsExtra = 0;
 wc.cbWndExtra = 0;
 wc.hInstance = hInstance;
 wc.hIcon = 0;
 wc.hCursor = LoadCursor (NULL, IDC_ARROW);
 wc.hbrBackground = GetStockObject (WHITE_BRUSH);
 wc.lpszMenuName = "MENU_1";
 wc.lpszClassName = szAppName;


 if (!RegisterClass (&wc)) return FALSE;
 }
 pi = atan2 (0, -1);
 lf.lfHeight = -200;
 lf.lfWeight = 400;
 strcpy (lf.lfFaceName, "Arial");
 character = 'a';

 hwnd = CreateWindow (szAppName, szAppName,
 WS_OVERLAPPEDWINDOW | WS_CLIPCHILDREN | WS_CLIPSIBLINGS,
 CW_USEDEFAULT, CW_USEDEFAULT, CW_USEDEFAULT, CW_USEDEFAULT,
 NULL, NULL, hInstance, NULL);

 if(!hwnd) return FALSE;

 hwndFrame = hwnd;

 set_frame_caption ();
 ShowWindow (hwnd, nCmdShow);
 UpdateWindow (hwnd);

 while (GetMessage (&msg, NULL, NULL, NULL))
 {
 TranslateMessage (&msg);
 DispatchMessage (&msg);
 }
 return msg.wParam;
}
/*----------------------------------------------------------------------*/
int xOffset,yOffset;
struct LONGPOINT
{
 long x, y;
};
/*----------------------------------------------------------------------*/
inline int fixed_to_int (long value) // convert a fixed value to an integer
{
 return (value + 32767) >> 16;
}

// draw the array of control points
void draw_control_points (POINTFX pt [], int nPoints)
{ int i;
 LONGPOINT *p = (LONGPOINT *) pt;

 if (!bShowControlPoints)
 return;
 for (i = 0; i < nPoints; i++)
 {
 POINT pt;
 pt.x = xOffset + fixed_to_int (p [i].x);
 pt.y = yOffset - fixed_to_int (p [i].y);
 PatBlt (hdc,
 pt.x - MARKERSIZE / 2, pt.y - MARKERSIZE / 2,
 MARKERSIZE, MARKERSIZE, BLACKNESS);
 }
}
/*----------------------------------------------------------------------*/

#define DEPTH 8 // Bezier recursion depth
#define MAX_BEZIER_POINTS ((1 << DEPTH) + 1)

static POINT huge * Pts; // array of contour points
static POINT huge * pPts; // current point
static POINT huge * pPtsStart; // first contour point
static int * polyCounts; // array of contour point counts
static int nContours; // number of contours
static GLYPHMETRICS gm;

/*----------------------------------------------------------------------*/
// transform the array of control points
void transform (POINTFX pt [], int nPoints)
{
 LONGPOINT *p = (LONGPOINT *) pt;

 switch (nEffect)
 {
 case IDM_NORMAL: return;

 case IDM_PINCH:
 while (nPoints--)
 {
 float xx = 2.0 * pi * float ((p->x >> 16) - gm.gmptGlyphOrigin.x)
 / float (gm.gmBlackBoxX);
 float yy = 2.0 * pi * float ((p->y >> 16) - gm.gmptGlyphOrigin.y)
 / float (gm.gmBlackBoxY);
 p->x = p->x + float (gm.gmBlackBoxX) / 10.0 * 65536.0 * sin (xx);
 p->y = p->y + float (gm.gmBlackBoxY) / 10.0 * 65536.0 * sin (yy);
 p++;
 }
 break;
 case IDM_PUNCH:
 while (nPoints--)
 {
 float xx = 2 * pi * float ((p->x >> 16) - gm.gmptGlyphOrigin.x)
 / float (gm.gmBlackBoxX);
 float yy = 2 * pi * float ((p->y >> 16) - gm.gmptGlyphOrigin.y)
 / float (gm.gmBlackBoxY);
 p->x = p->x + float (gm.gmBlackBoxX) / 5.0 * 32768.0
 * (1 + cos (yy) * sin (xx));
 p->y = p->y + float (gm.gmBlackBoxY) / 5.0 * 32768.0
 * (1 + cos (xx) * sin (yy));
 p++;
 }
 break;
 }
}
/*----------------------------------------------------------------------*/
// store a point in the array of contour points
void store (LONGPOINT pt)
{
 pPts->x = xOffset + fixed_to_int (pt.x);
 pPts->y = yOffset - fixed_to_int (pt.y);
 pPts++;
}

// sub-divide the quadratic Bezier curve
void near pascal sub_divide (LONGPOINT p [])

{
 static int depth = DEPTH;
 LONGPOINT q [8];

 int x = xOffset + fixed_to_int (p [2].x);
 int y = yOffset - fixed_to_int (p [2].y);

 if (x == pPts [-1].x && y == pPts [-1].y) return;

 if (!depth)
 {
 store (p [2]);
 return;
 }
 q [0] = p [0];
 q [4] = p [2];

 q [1].x = (p [0].x + p [1].x) >> 1;
 q [3].x = (p [1].x + p [2].x) >> 1;
 q [2].x = (q [1].x + q [3].x) >> 1;

 q [1].y = (p [0].y + p [1].y) >> 1;
 q [3].y = (p [1].y + p [2].y) >> 1;
 q [2].y = (q [1].y + q [3].y) >> 1;

 depth--;
 sub_divide (q);
 sub_divide (q + 2);
 depth++;
}
/*----------------------------------------------------------------------*/
// draw the quadratic Bezier curve
void draw_bezier_curve (LONGPOINT p [])
{
 store (p [0]);
 sub_divide (p);
}
/*----------------------------------------------------------------------*/
// draw the quadratic B-spline from the array of points
void draw_quadratic_bspline (POINTFX pt [], int nPoints)
{
 LONGPOINT b [3];
 LONGPOINT *p = (LONGPOINT *) pt;

 if (nPoints == 3)
 {
 draw_bezier_curve (p);
 return;
 }
 b [0] = p [0];
 b [1] = p [1];
 b [2].x = (p [1].x + p [2].x) >> 1;
 b [2].y = (p [1].y + p [2].y) >> 1;

 draw_bezier_curve (b);

 for (int i = 1; i < nPoints - 3; i++)
 {
 b [0].x = (p [i].x + p [i + 1].x) >> 1;

 b [0].y = (p [i].y + p [i + 1].y) >> 1;
 b [1] = p [i + 1];
 b [2].x = (p [i + 1].x + p [i + 2].x) >> 1;
 b [2].y = (p [i + 1].y + p [i + 2].y) >> 1;

 draw_bezier_curve (b);
 }
 b [0].x = (p [i].x + p [i + 1].x) >> 1;
 b [0].y = (p [i].y + p [i + 1].y) >> 1;
 b [1] = p [i + 1];
 b [2] = p [i + 2];

 draw_bezier_curve (b);
}
/*----------------------------------------------------------------------*/
// draw the polyline from the array of points
void draw_polyline (POINTFX points [], int nPoints)
{
 LONGPOINT *p = (LONGPOINT *) points;

 for (int i = 0; i < nPoints; i++)
 store (p [i]);
}
/*----------------------------------------------------------------------*/
// calculate the number of bytes needed for the array of contour points
DWORD compute_memory_requirement (TTPOLYGONHEADER *header, DWORD cbBuffer)
{
 DWORD count = 1;
 do
 {
 TTPOLYGONHEADER *nextHeader =
 (TTPOLYGONHEADER *) (header->cb + (char *) header);
 TTPOLYCURVE *curve = (TTPOLYCURVE *) (header + 1);
 POINTFX pfxStart = header->pfxStart;

 while (1)
 {
 UINT cpfx = curve->cpfx + 1;
 POINTFX *ppfx = curve->apfx - 1;
 POINTFX pfxEnd = ppfx [cpfx - 1];

 if (curve->wType == TT_PRIM_LINE)
 count += cpfx;
 else
 count += (cpfx - 2) * MAX_BEZIER_POINTS;

 curve = (TTPOLYCURVE *) (ppfx + cpfx);
 if (nextHeader <= (TTPOLYGONHEADER *) curve)
 {
 if (memcmp (&pfxEnd, &pfxStart, sizeof (pfxEnd)))
 count += 2;
 break;
 }
 }
 count++;
 cbBuffer -= header->cb;
 header = nextHeader;
 }
 while (cbBuffer);


 return count * (DWORD) sizeof (POINT);
}
/*----------------------------------------------------------------------*/
// draw the glyph outline of the selected character in the current font
int draw_glyph_outline (HDC hdc, int x, int y, int ch)
{
 TEXTMETRIC tm;
 MAT2 mat2;

 GetTextMetrics (hdc, &tm);
 xOffset = x;
 yOffset = (y + tm.tmAscent);

 memset (&mat2, 0, sizeof (mat2));
 mat2.eM11.value = 1;
 mat2.eM22.value = 1;

 DWORD cbBuffer = GetGlyphOutline (hdc,
 ch, GGO_NATIVE, &gm, 0, NULL, &mat2);
 if (long (cbBuffer) <= 0 || cbBuffer > 32767)
 return 0;

 void *buffer = malloc (int (cbBuffer));
 if (!buffer)
 return 0;

 GetGlyphOutline (hdc, ch, GGO_NATIVE, &gm, cbBuffer, buffer, &mat2);
 TTPOLYGONHEADER *header = (TTPOLYGONHEADER *) buffer;

 DWORD cbPolygons = compute_memory_requirement (header, cbBuffer);
 HANDLE hPolygons = GlobalAlloc (GMEM_FIXED, cbPolygons);
 if (!hPolygons)
 {
 free (buffer);
 return 0;
 }
 Pts = (POINT huge *) GlobalLock (hPolygons);
 pPts = Pts;
 nContours = 0;
 polyCounts = ((int *) header)+2; //use the area beyond cb for the counts
 do
 {
 TTPOLYGONHEADER *nextHeader =
 (TTPOLYGONHEADER *) (header->cb + (char *) header);
 TTPOLYCURVE *curve = (TTPOLYCURVE *) (header + 1);
 POINTFX pfxEnd = header->pfxStart;
 POINTFX pfxStart = pfxEnd;
 pPtsStart = pPts;

 while (1)
 {
 UINT wType = curve->wType;
 UINT cpfx = curve->cpfx + 1;
 POINTFX *ppfx = curve->apfx - 1;
 ppfx [0] = pfxEnd; // this overwrites 8 bytes before apfx,
 // but we are done with them at this point.
 pfxEnd = ppfx [cpfx - 1];


 transform (ppfx, cpfx);
 draw_control_points (ppfx, cpfx);

 if (wType == TT_PRIM_LINE)
 draw_polyline (ppfx, cpfx);
 else
 draw_quadratic_bspline (ppfx, cpfx);

 curve = (TTPOLYCURVE *) (ppfx + cpfx);
 if (nextHeader <= (TTPOLYGONHEADER *) curve)
 {
 if (memcmp (&pfxEnd, &pfxStart, sizeof (pfxEnd)))
 {
 ppfx [0] = pfxEnd;
 ppfx [1] = pfxStart;
 transform (ppfx, 2);
 draw_polyline (ppfx, 2);
 }
 break;
 }
 }
 *pPts++ = *pPtsStart;
 polyCounts [nContours++] = pPts - pPtsStart;
 cbBuffer -= header->cb;
 header = nextHeader;
 }
 while (cbBuffer);

 PolyPolygon (hdc, (LPPOINT) Pts, polyCounts, nContours);

 GlobalUnlock (hPolygons);
 GlobalFree (hPolygons);
 free (buffer);
 return 1;
}
/*----------------------------------------------------------------------*/
long FAR PASCAL WndProc (HWND hwnd, unsigned iMessage, WORD wParam,
LONG lParam)
{
 static const WORD MF [2] = { MF_UNCHECKED, MF_CHECKED };

 switch (iMessage)
 {
 case WM_COMMAND:
 switch (wParam)
 {
 case IDM_NEW:
 cf.lStructSize = sizeof (cf);
 cf.hwndOwner = hwnd;
 cf.lpLogFont = &lf;
 cf.Flags = CF_SCREENFONTS | CF_TTONLY |
 CF_FORCEFONTEXIST | CF_INITTOLOGFONTSTRUCT;
 cf.nFontType = SCREEN_FONTTYPE;
 if (ChooseFont (&cf))
 {
 set_frame_caption ();
 InvalidateRect (hwnd, NULL, TRUE);
 }
 break;

 case IDM_EXIT:
 SendMessage (hwnd, WM_CLOSE, 0, 0L);
 break;
 case IDM_SHOWCONTROLPOINTS:
 bShowControlPoints ^= TRUE;
 CheckMenuItem (GetMenu (hwnd), IDM_SHOWCONTROLPOINTS,
 MF [bShowControlPoints]);
 InvalidateRect (hwnd, NULL, TRUE);
 break;
 case IDM_FILL:
 bFill ^= TRUE;
 CheckMenuItem (GetMenu (hwnd), IDM_FILL, MF [bFill]);
 InvalidateRect (hwnd, NULL, TRUE);
 break;

 case IDM_OUTLINE:
 bOutline ^= TRUE;
 CheckMenuItem (GetMenu (hwnd), IDM_OUTLINE, MF [bOutline]);
 InvalidateRect (hwnd, NULL, TRUE);
 break;
 case IDM_NORMAL:
 case IDM_PINCH:
 case IDM_PUNCH:
 nEffect = wParam;
 CheckMenuItem (GetMenu (hwnd),
 IDM_NORMAL, MF [wParam == IDM_NORMAL]);
 CheckMenuItem (GetMenu (hwnd),
 IDM_PINCH, MF [wParam == IDM_PINCH]);
 CheckMenuItem (GetMenu (hwnd),
 IDM_PUNCH, MF [wParam == IDM_PUNCH]);
 InvalidateRect (hwnd, NULL, TRUE);
 break;
 }
 break;
 case WM_SIZE:
 cxClient = LOWORD (lParam);
 cyClient = HIWORD (lParam);
 break;
 case WM_CHAR:
 character = wParam;
 InvalidateRect (hwnd, NULL, TRUE);
 break;
 case WM_PAINT:
 hdc = BeginPaint (hwnd, &ps);
 HFONT hFont = SelectObject (hdc, CreateFontIndirect (&lf));
 int PolyFillMode = SetPolyFillMode (hdc, ALTERNATE);

 static WORD brushBits [] =
 { 0xf8, 0x74, 0x22, 0x47, 0x8f, 0x17, 0x22, 0x71
 };
 HANDLE hBitmap = CreateBitmap (8, 8, 1, 1, brushBits);
 HBRUSH hBrush = SelectObject (hdc,
 bFill ? CreatePatternBrush (hBitmap)
 : GetStockObject (NULL_BRUSH));
 HPEN hPen = SelectObject (hdc, bOutline ? GetStockObject (BLACK_PEN)
 : GetStockObject (NULL_PEN));

 draw_glyph_outline (hdc, 100, 0, character);


 SelectObject (hdc, hPen);
 hBrush = SelectObject (hdc, hBrush);
 if (bFill)
 {
 DeleteObject (hBrush);
 DeleteObject (hBitmap);
 }
 SetPolyFillMode (hdc, PolyFillMode);
 DeleteObject (SelectObject (hdc, hFont));
 EndPaint (hwnd, &ps);
 break;
 case WM_QUERYENDSESSION:
 case WM_CLOSE:
 case WM_DESTROY:
 PostQuitMessage (0);
 return 1;
 default:
 return DefWindowProc (hwnd, iMessage, wParam, lParam);
 }
 return 0;
}
/*----------------------------------------------------------------------*/
End Listing



Special Issue, 1993
Writing Windows Custom Controls


Shortening custom-control development by using standard controls as a base




Dan Brindle


Dan has been a consultant in the Bay Area for the last 20 years specializing
in Windows, Windows NT, OS/2, and mainframe interfaces.


With the overwhelming popularity of Microsoft Windows, many applications have
taken on a consistent and familiar look. This consistency is one of the
benefits of a graphical interface, but it can become tiresome after a while.
One of the best ways to spiff up your Windows apps is through custom controls.
The standard controls provided by Windows--the list box, pushbutton, edit
control, and so on--can easily become base classes for developing interesting
and powerful custom controls.
Since a control is really a child window with additional parent-notification
support, developing a custom control from scratch involves a considerable
amount of work. Using a standard control as a base shortens development by
allowing you to inherit most of the existing functionality and replace or add
only those features that expand functionality. For example, you can modify a
basic edit control so that it allows only numeric input. You can replace a
scroll bar's paint routine and draw a circular dial instead.
In this article, I'll use the standard Windows radio button as the basis for a
VCR-style button. In addition to providing a VCR button that can be used with
regular C applications (see Listing One, page 66), I also take advantage of an
interface that allows the control to work with Visual C++'s App Studio.
Generally, a set of VCR buttons only has a single On state for the set; they
exhibit the same behavior as a group of radio buttons but they're in a
visually different package.


Superclassing


The technique used to inherit the radio-button behavior is usually called
"superclassing." It involves registering a new window class that uses the
underlying base window procedure as the default window procedure. Instead of
passing messages to DefWindowProc, your code uses the CallWindowProc function to
call the window procedure of the base class.
The first step in this process is handled by VCRButtonInit(), called when
the module is loaded. Because the VCR button class is implemented as a
standard Windows DLL, it is available for use by any number of applications.
Several important bits of information are set up when registering the new
class. The GetClassInfo function is used to retrieve a Windows window-class
structure containing the class information for a button control. The window
proc address is saved off in a global and replaced with the address of a new
window proc. The class name is changed, the instance is adjusted to that of
the DLL, and additional data bytes are added to the "window extra bytes."
These extra bytes are added to the control to save instance state information
for each VCR button. It is important to add the extra bytes since those
allocated by the original class will still be used by the base code. The
starting offset to the VCR state information is stored in a global variable.
The balance of the module is the new window proc and some associated service
routines.
Since the goal is primarily to change the look of the radio button, you might
assume the only message that would need to be modified is the paint routine.
This isn't the case, however, with radio buttons and check boxes. To improve
performance on older platforms, Microsoft wrote the radio button so that
certain portions of the button are painted when the state of the button
changes. To further complicate matters, these states are stored as part of the
extra bytes. The standard radio button has three extra bytes attached. This is
rather unusual because the Windows API provides only GetWindowWord() and
GetWindowLong() to retrieve word-size or double-word-size extra bytes. These
functions address the extra bytes starting from a base of 0 (that is, the end
of the original struct). The word at offset 1 holds the handle to the font for
the control. The byte at offset 0 contains the control's state flags, such as
focus, checked, and active. The offset to these bits is documented as part of
the BM_SETSTATE message.


Style Flags


One of the parameters to CreateWindow() is a long word of style information.
The high word is reserved for general window styles and the low word is used
for individual control styles. If a control was written from scratch, all of
the low-word style bits would be available. Since we're superclassing the
button style, the reserved style bits defined by Microsoft must be preserved.
Microsoft has not yet defined every bit, so some are available for use. The
only danger here is compatibility with future versions of Windows. Visual C++
has improved on style bits and extra words for a control by providing a
property list. Since the button flags have not been redefined since Version
1.03 of Windows, the risk of incompatibility is pretty low. Should some new
flag bits be reserved (as happened with the edit control in version 3.1), you
can use a different combination.
For this example, I define a new flag, BSS_VCR. When this style bit is
defined, the custom button control will look like a VCR button. The conversion
between property lists and style bits is discussed in the Visual C++ interface
documentation.
Since we're superclassing the button class, all of the control's underlying
behavior remains. By using a special BSS_VCR style, all features of the new
code can be ignored if this style is not defined.


Message Handling


The following messages are superclassed by the custom control and are either
handled completely or preprocessed before handing off to the base control
code:
Border width. A custom message BM_SETBORDER has been defined to allow an
application to set the border width of the control externally. The border
width is stored in the extra reserved word of the VCR button and is used
during the paint procedure to draw the three-dimensional border. The 3-D
border gives the control the pushed-in or pushed-out look. For larger buttons,
a wider border gives a better look. In Visual C++, the border width is a
property. The Visual C++ translation layer converts the property value and
assigns it to the internally stored window word. This allows the control to
work transparently with C, Visual C++, and Visual Basic.
Control painting. The interior painting is done by the paint routine. The
background is filled with either the button face or the button shadow color.
This provides a clear indication of whether the button is selected or not. The
3-D border is reversed when the button is checked. A check is also made for a
disabled button. Disabled buttons are painted with grayed-out text. Here we
use system button colors to paint the button, as configured by the
control-panel settings. This provides a consistent interface for both
pushbuttons and VCR buttons.
State painting. State painting is done primarily by the BM_SETSTATE message.
Here we paint both a border showing the control selection and manage the state
bits. Picking up and managing this message relieves the custom control of a
great deal of work. Otherwise the mouse and keyboard would have to be managed.
Since the base control handles capturing the mouse and keyboard messages only,
the state bits have to be handled at this level. Normally, we would even allow
the base control to handle this message by first calling forward and then
doing any extra work. This is not possible here since the base radio button
paints the button portion in this message. Handling the state means setting or
clearing the highlight bit in the first window extra byte. Since a word is the
smallest unit that can be retrieved, the high-order byte must be preserved. Changing
the high byte can destroy the font handle for the control and cause a crash.
For this reason the first window word is always read before modifying and
writing back to the control.
Setting state. The current checked state of the button is also maintained in
the first window extra byte. Here again, the base control handles almost all
of the work, but the BM_SETCHECK message must be handled since the radio
button paints the On/Off portion of the radio control on this message. The
least-significant bit in the low word is toggled to indicate state. The
control is then invalidated, forcing the paint routine to handle the visual
change. One other operation is carried out here. Since radio buttons can be
defined as auto radio buttons, the VCR button should emulate the same
behavior. This keeps the VCR button functionally compatible with the radio
button. We can't let the radio button handle the BM_SETCHECK message
without messing up the screen, so to support the auto feature, other
buttons in the group are managed by button_walk(), which is called twice: once, to
find all VCR buttons before the current button, and a second time, to find all
those after. The button-walk function simply turns off any buttons that are
part of the same group.
The rest of the story. All other messages are handled by the base radio-button
code, including notifying the parent and handling such special-purpose
messages as BM_GETCHECK and BM_GETSTATE. I've provided a test program in both
C and C++ to exercise the buttons. In addition, I've coded an interface layer
(in C) as a separate DLL that allows the control to be used by both Visual C++
and Visual Basic.


Using App Studio


The protocol defined by the App Studio in Visual C++ allows old-style controls
to be used, but doesn't allow test drawing or setting of styles. App Studio
defines a new protocol that increases the role of custom controls by allowing
custom and standard properties to be set for a control both at design time and
at run time. Most of this interface is documented in the Custom Control Pack
that comes as part of Professional Visual Basic 2.0; you can also obtain it as
an add-on SDK to Visual C++. Most controls are written in C, and a special
control procedure is coded to handle the additional Visual C++/Basic messages.
By providing some bridge code, you can add a Visual C++/Basic front end to an
old-style custom control. First you must add to the custom control the extra
functionality expected by the Visual C++/Basic protocol, and your code must
translate property requests to and from the underlying control. Done this way,
a control can be used in all environments. To add flexibility, you could
develop an additional DLL that allows the control to be edited by both the
normal Dialog Editor as well as Borland's Resource Workshop. The only two
custom properties implemented in the Visual VCR button are the current value
and the bevel width. The Visual interface does not allow custom-style bits to
be set directly, and the Standard-C style of custom control does not support
properties. To provide the bridge, two types of translation are required:
style-bit and property.


Style-bit Translation



The Visual control provides a control procedure that is called by the Visual
products before a call is made to the normal window procedure. This procedure
is similar to the superclassing used to add custom behavior to the VCR button.
When the control procedure is called at creation time, any associated
properties can be read and translated to style bits. This happens in the
control procedure in VCR.C (along with other files, available electronically;
see "Availability," page 3) on the WM_NCCREATE message. The custom
BSS_VCRBUTTON style is ORed into the Windows-style long word. This style data
is then automatically passed down to the VCR-button window procedure when the
control is created and the underlying control behaves as a VCR button.


Property Translation


Since not all information contained in a control's property can be passed via
style bits, an additional mechanism is needed. This is implemented by
translating property information into data held by the control as "extra
window words." For instance, the bevel-size property is an example of this
approach. When a bevel-size set-property command is sent to the Visual control
procedure, it gets translated to a SendMessage() call (of a BM_SETBORDER
message) and is sent to the underlying control. The VCR button sets the bevel
width to the passed size. Any number of custom properties can be added to a
custom control and handled in this fashion. Standard properties such as fonts
are also handled in this manner automatically and require no additional code
if the underlying control supports the WM_SETFONT message.
A Value property has been implemented to interface the control with the Visual
Basic environment. When a Visual Basic code segment is called, the Value can
be set to True or False. This causes a set property command to be sent to the
interface layer, which in turn sends a BM_SETSTATE message to the underlying
control. The accompanying code is well documented and contains further details
on my implementation.


Conclusion


You can see the power and flexibility of custom controls in Windows,
especially if you use a general framework for their implementation. By pushing
application functionality down to the level of custom controls, Windows
development becomes simpler and more consistent. Additional enhancements might
include combining bitmaps with button faces, adding data-entry validation to
edit controls (via "picture" masks), or creating pop-up helper windows for
tasks such as selecting Zip codes from a database.
[LISTING ONE] (Text begins on page 64.)

/* ----------------------------------------------------------------
 * VCRBUT.C -- This module implements a VCR button as a superclass of the
 * standard Windows radio button. The support provided by this file is C code
 * and provides a global class as a DLL. This code accompanies the article
 * "Writing Windows Custom Controls" by Dan Brindle and is provided courtesy
 * of Scopus Technology Inc. (Emeryville, CA). Thanks also to Bill Breedlove,
 * Evergreen Productions, for assistance with code revision and preparation.
 * The interface routines are: VCRButtonInit, button_walk, focus_border,
 * chk_btn_style, and WndProcButton.
 * ----------------------------------------------------------------*/

#define STRICT
#define _WINDLL
#define NOMINMAX
#define NOCOMM
#define NOICONS
#define NOKEYSTATES
#define NOSYSCOMMANDS
#define NOOEMRESOURCE
#define NOATOM
#define NOCLIPBOARD
#define NOSOUND
#define NOWH
#define NOKANJI
#define NOHELP
#define NOPROFILER
#define NODEFERWINDOWPOS

#include <windows.h>
#include <string.h>

#include "vcrbut.h" // Public definitions
#include "winctl.h" // Local definitions

/*------------------global variables--------------------------------*/
HANDLE hInst; // current instance of DLL
WNDPROC OldButtonProc; // Address to underlying button procedure
int extra; // Offset to custom defined window extra bytes

/* -----------------------------------------------------------------

 * VCRButtonInit( hInst ) -- registers class for VCR button control. It's
 * called from LibMain. Allows custom windows classes to be registered when
 * the module is loaded, insuring class is always available to application.
 * Entry: Handle to DLL instance
 * Returns: Result of the RegisterClass()
 * LibMain will fail if class is not registered. This means that application
 * will also fail to load. Unfortunately, it isn't possible to bring up a
 * message box while the DLL is being loaded. Generally class registration
 * will not fail. If error tracking is a requirement, then error messages can be
 * written to a file at load time.
 * ----------------------------------------------------------------*/
BOOL VCRButtonInit( HINSTANCE hInst )
{
 WNDCLASS WndClass;
 /* Get the underlying class to use. Since the radio button is a
 ** style of button the class information is requested from Windows. */
 GetClassInfo( NULL, "button", &WndClass );

 /* Save address of button window procedure. To be used by superclassed
 ** VCR button to pass messages that don't need additional processing.*/
 OldButtonProc = WndClass.lpfnWndProc;

 /* Register the new class. The class is made global so that
 ** all applications can create VCR buttons */
 WndClass.style = CS_GLOBALCLASS;
 WndClass.lpfnWndProc = (WNDPROC) WndProcButton;
 extra = WndClass.cbWndExtra;
 WndClass.cbWndExtra += BTN_EXTRA;
 WndClass.hInstance = hInst;
 WndClass.hbrBackground = (HBRUSH) GetStockObject(BLACK_BRUSH);
 WndClass.lpszMenuName = NULL;
 WndClass.lpszClassName = (LPSTR) "VCRButton";

 return( RegisterClass( &WndClass ) );
}
/* ------------------------------------------------------------------
 * three_D_border() -- highlights a control and creates a 3-D effect. Done
 * within client area of control. This could be modified to use non-client area
 * if a lot of drawing was done on client area. Entry params: Handle to window
 * to highlight. Mode variable is true if pushed in, false if popped out.
 * ----------------------------------------------------------------*/
void three_D_border( HWND hWnd, int mode )
{
 RECT rect;
 HPEN hPen, hPenOld;
 HBRUSH hBrush, hBrushOld;
 COLORREF tmp_clr;
 int border_width;
 POINT pt[8];

 HDC hDC = GetDC( hWnd );
 GetClientRect( hWnd, &rect );

 // Get the internal border width for this instance of the control.
 border_width = GetWindowWord( hWnd, extra + BTN_BORDER ) & 0x00ff;

 if ( mode & 0x0001 ) tmp_clr = GetSysColor( COLOR_BTNFACE );
 else tmp_clr = GetSysColor( COLOR_BTNHIGHLIGHT );


 hPen = CreatePen( PS_SOLID, 1, tmp_clr );
 hBrush = CreateSolidBrush( tmp_clr );
 hPenOld = SelectObject( hDC, hPen );
 hBrushOld = SelectObject( hDC, hBrush );

 /* Draw the top border. Here we use the polygon so that we get
 ** diagonal corners. This is not important on a thin border
 ** but makes a difference when the border is wider than 3 pixels. */
 pt[0].x = pt[6].x = rect.left;
 pt[0].y = pt[6].y = rect.top;
 pt[1].x = border_width - 1;
 pt[1].y = pt[2].y = border_width - 1;
 pt[2].x = pt[3].x = rect.right - border_width;
 pt[3].y = rect.bottom - border_width;
 pt[4].x = pt[5].x = rect.right - 1;
 pt[4].y = rect.bottom - 1;
 pt[5].y = rect.top;

 Polygon( hDC, pt, 7 );

 SelectObject( hDC, hPenOld );
 DeleteObject( hPen );
 SelectObject( hDC, hBrushOld );
 DeleteObject( hBrush );

 if ( mode & 0x0001 ) tmp_clr = GetSysColor( COLOR_BTNHIGHLIGHT );
 else tmp_clr = GetSysColor( COLOR_BTNSHADOW );

 hPen = CreatePen( PS_SOLID, 1, tmp_clr );
 hBrush = CreateSolidBrush( tmp_clr );
 hPenOld = SelectObject( hDC, hPen );
 hBrushOld = SelectObject( hDC, hBrush );

 /*------- Draw the underline -----*/
 pt[0].x = pt[1].x = pt[6].x = rect.left;
 pt[0].y = pt[6].y = rect.top;
 pt[1].y = pt[2].y = rect.bottom - 1;
 pt[2].x = rect.right - 2;
 pt[3].x = rect.right - border_width - 1;
 pt[3].y = rect.bottom - border_width;
 pt[4].x = pt[5].x = border_width - 1;
 pt[4].y = rect.bottom - border_width;
 pt[5].y = rect.top + border_width - 1;

 Polygon( hDC, pt, 7 );

 /*-------- Cleanup up before exit ----*/
 SelectObject( hDC, hPenOld );
 DeleteObject( hPen );
 SelectObject( hDC, hBrushOld );
 DeleteObject( hBrush );

 if ( mode & 0x0004 )
 {
 hPen = CreatePen( PS_SOLID, border_width / 3 + 1, 0 );
 hPenOld = SelectObject( hDC, hPen );
 hBrushOld = SelectObject( hDC, GetStockObject( NULL_BRUSH ));
 Rectangle( hDC, rect.left, rect.top, rect.right, rect.bottom );
 SelectObject( hDC, hPenOld );

 SelectObject( hDC, hBrushOld );
 DeleteObject( hPen );
 }
 ReleaseDC( hWnd, hDC );
 return;
}
/* -----------------------------------------------------------------
 * button_walk() -- used by auto radio button style. All VCR buttons in a group
 * are checked and state set to off. Allows automatic checking of VCR buttons.
 * Params: Starting window handle, Direction to walk the window list.
 * ----------------------------------------------------------------*/
static void button_walk( HWND hWnd, WORD wDirection )
{
 HWND hTmp;
 static char class_name[ 33 ];
 LONG lStyle;
 int tmp_chk;

 hTmp = hWnd;
 while ( hTmp = GetWindow( hTmp, wDirection ) )
 {
 /* Get the class name and check for
 ** the button class of some sort. This
 ** is a bit dangerous as some class names may
 ** contain button and not conform to the
 ** style bits. To limit this to VCRBUTTON
 ** then use a strcmp(). */
 GetClassName( hTmp, class_name, 32 );
 _fstrupr( class_name );
 if ( !_fstrstr( class_name, "BUTTON" ) )
 return;
 /* Get the style and look for the AutoRadio style bit.
 ** The BSS_AUTOVCRBUTTON flag automatically sets this bit. */
 lStyle = GetWindowLong( hTmp, GWL_STYLE );
 if ( lStyle & ( BS_AUTORADIOBUTTON ) )
 {
 // stop if we reach next group
 if ( lStyle & WS_GROUP )
 break;
 /* A valid radio button found, make sure it is unchecked.
 ** Do this internally so that we don't generate
 ** additional auto walk. */
 if (( tmp_chk = GetWindowWord( hTmp, extra )) & 0x0001 )
 {
 three_D_border( hTmp, tmp_chk );
 SetWindowWord( hTmp, extra, tmp_chk &= ~0x0001 );
 }
 }
 else
 break;
 }
}
/* -----------------------------------------------------------------
 * focus_border() -- displays a focus border for the VCR button.
 * The focus border consists of a dashed rectangle around the text.
 * Entry: Handle to a control. Mode var controls adding or removing border.
 * ----------------------------------------------------------------*/
static void focus_border( HWND hWnd, int Mode )
{

 RECT rect;
 HDC hDC;
 int border_width;

 // Get the current border width
 border_width = GetWindowWord( hWnd, extra + BTN_BORDER ) & 0x00ff;

 // Always draw this on the client rectangle
 hDC = GetDC( hWnd );
 GetClientRect( hWnd, &rect );
 InflateRect( &rect,
 (GetSystemMetrics( SM_CXBORDER ) * -2) - border_width,
 (GetSystemMetrics( SM_CYBORDER ) * -2) - border_width );
 DrawFocusRect( hDC, &rect ); // Put up or take down the focus rectangle
 ReleaseDC( hWnd, hDC );
}
/* -----------------------------------------------------------------
 * chk_btn_style() -- a helper function that helps separate style bits
 * and returns the result. Entry: Handle to a control
 * ----------------------------------------------------------------*/
static int chk_btn_style( HWND hWnd )
{
 LONG style;
 style = GetWindowLong( hWnd, GWL_STYLE );
 if ( (style & ( BS_AUTORADIOBUTTON )) == BS_AUTORADIOBUTTON )
 return( BS_AUTORADIOBUTTON );
 if ( (style & ( BS_RADIOBUTTON )) == BS_RADIOBUTTON )
 return( BS_RADIOBUTTON );
 return( 0 );
}
/* -----------------------------------------------------------------
 * WndProcButton() -- main window procedure for VCR button. Any messages
 * that need special handling are processed here. Rest of messages to support
 * regular button processing are passed to the regular button window procedure
 * via the address saved when the VCR button class is registered.
 * ----------------------------------------------------------------*/
LRESULT CALLBACK
WndProcButton( HWND hWnd, UINT wMsg, WPARAM wParam, LPARAM lParam)
{
 LONG lStyle;
 RECT rect;
 HFONT hFont;
 WORD tmp_chk;
 switch ( wMsg )
 {
 /* Do some extra processing if we are in the custom mode.
 ** In this case we set a default border width for the VCR button. */
 case WM_CREATE:

 if ( GetWindowLong( hWnd, GWL_STYLE ) & BSS_VCR )
 if (( chk_btn_style( hWnd ) == BS_AUTORADIOBUTTON ) ||
 (chk_btn_style( hWnd ) == BS_RADIOBUTTON ))
 SendMessage( hWnd,
 BM_SETBORDER, DEFAULT_BTN_BORDER, 0L );
 break;
 /* Handle the calculation of the non-client area. Normal radio button
 ** reserves no area for a border. Here we pass WM_NCCALCSIZE to default
 ** window procedure which will calc an area based on type of
 ** of border the VCR Button has. */

 case WM_NCCALCSIZE:
 if (( lStyle = GetWindowLong( hWnd, GWL_STYLE )) & BSS_VCR )
 if (( chk_btn_style( hWnd ) == BS_AUTORADIOBUTTON ) ||
 (chk_btn_style( hWnd ) == BS_RADIOBUTTON ))
 return( DefWindowProc( hWnd, wMsg, wParam, lParam ));
 break;
 /* WM_USER + 1 -- Set or clear a check from radio button. In the
 ** case of VCR code, we invert--push in or pull out button. This
 ** message must be handled since underlying windows control code
 ** paints the state of the button on this message. */
 case BM_SETCHECK:
 if ( GetWindowLong( hWnd, GWL_STYLE ) & BSS_VCR )
 if (( chk_btn_style( hWnd ) == BS_AUTORADIOBUTTON ) ||
 (chk_btn_style( hWnd ) == BS_RADIOBUTTON ))
 {
 /* If auto style is set for the VCR button then
 ** walk list of other VCR buttons. This is done for
 ** buttons both before and after current button. */
 if ( chk_btn_style( hWnd ) == BS_AUTORADIOBUTTON )
 {
 button_walk( hWnd, GW_HWNDPREV );
 button_walk( hWnd, GW_HWNDNEXT );
 }
 /* Get previous set state from first byte of first
 ** window word. On-off state is stored in LSB.
 ** Toggle this bit but preserve rest of bits. */
 tmp_chk = GetWindowWord( hWnd, 0 );
 if ( wParam ) tmp_chk |= 0x0001;
 else tmp_chk &= ~0x0001;

 SetWindowWord( hWnd, 0, tmp_chk );
 /* Since we are generally running on a fast
 ** system and the toggling of the visual state
 ** involves the entire button, it is easier
 ** to issue a re-paint for the entire button. */
 InvalidateRect( hWnd, NULL, FALSE );

 /* Block any further processing. This must be done
 ** to prevent old radio button from being drawn. */
 return( 0L );
 }
 /* If we get here then this button style does not require
 ** special processing and default routines are called. This
 ** allows VCR button class to support other button styles. */
 break;
 /* WM_USER + 3 -- This message is sent to the control when it is first
 ** selected or de-selected. Default action is to highlight control
 ** button. Again this message must be handled since it does
 ** direct screen drawing. */
 case BM_SETSTATE:
 /* Check for special button processing */
 if ( GetWindowLong( hWnd, GWL_STYLE ) & BSS_VCR )
 if (( chk_btn_style( hWnd ) == BS_AUTORADIOBUTTON ) ||
 ( chk_btn_style( hWnd ) == BS_RADIOBUTTON ))
 {
 /* Get current selection state without destroying
 ** other state bits. */
 tmp_chk = GetWindowWord( hWnd, 0 );


 /* Put in or take out selected state bit based on
 ** passed state. State bit is documented in SDK */
 if ( wParam ) tmp_chk |= 0x0004;
 else tmp_chk &= ~0x0004;

 SetWindowWord( hWnd, 0, tmp_chk );
 three_D_border( hWnd, tmp_chk );

 /* Add call to highlight button here,
 ** to draw a black border around the control */
 /*-- add code here --*/
 /* Block default processing */
 return( 0L );
 }
 break;
 /* Set border width for control. In visual tools, this is a property.
 ** Since underlying code must respond to a different size border, we
 ** keep it here as an extra byte. This opens this capability up to
 ** a normal C program */
 case BM_SETBORDER:
 tmp_chk = GetWindowWord( hWnd, extra + BTN_BORDER );

 /* Put in size of border in low byte. High byte is reserved
 ** for some style flags. Make sure border width is at least
 ** 2 wide and less than 64. */
 tmp_chk &= 0xFF00;
 if ( wParam < 2 )
 wParam = 2;
 tmp_chk |= wParam & 0x003F;
 SetWindowWord( hWnd, extra + BTN_BORDER, tmp_chk );
 return( 0L );
 /* Since we are handling the painting of VCR button, we also have to
 ** handle situation where control is disabled. Windows will handle all
 ** other stuff related to control being disabled. */
 case WM_ENABLE:
 if ( GetWindowLong( hWnd, GWL_STYLE ) & BSS_VCR )
 if (( chk_btn_style( hWnd ) == BS_AUTORADIOBUTTON ) ||
 ( chk_btn_style( hWnd ) == BS_RADIOBUTTON ))
 {
 /* Just let the paint routine
 ** display the disabled state */
 InvalidateRect( hWnd, NULL, FALSE );
 return( 0L );
 }
 break;
 /* On a control we also draw a focus rectangle when control has focus.
 ** This is typically done drawing a dashed rectangle around button
 ** text. In the case of the radio button, the set and kill focus
 ** message draws and removes this rectangle. */
 case WM_SETFOCUS:
 case WM_KILLFOCUS:
 if ( GetWindowLong( hWnd, GWL_STYLE ) & BSS_VCR )
 if (( chk_btn_style( hWnd ) == BS_AUTORADIOBUTTON ) ||
 ( chk_btn_style( hWnd ) == BS_RADIOBUTTON ))
 {
 WORD tmp_chk;
 /* Toggle the focus bit */

 tmp_chk = GetWindowWord( hWnd, 0 );


 if ( wMsg == WM_SETFOCUS )
 tmp_chk |= 0x0008;
 else
 tmp_chk &= ~0x0008;

 SetWindowWord( hWnd, 0, tmp_chk );

 /* Draw the focus rectangle */
 focus_border( hWnd, ( GetFocus() == hWnd ) );
 /* Block drawing the normal radio button
 ** focus rectangle. */
 return( 0L );
 }
 break;
 /* Handle any non-client Painting. This is generally the
 ** border. Here we use the default windows processing. */
 case WM_NCPAINT:
 /* Do radio button processing */
 if ( GetWindowLong( hWnd, GWL_STYLE ) & BSS_VCR )
 if (( chk_btn_style( hWnd ) == BS_AUTORADIOBUTTON ) ||
 ( chk_btn_style( hWnd ) == BS_RADIOBUTTON ))
 {
 /* Nothing to do yet */
 break;
 }

 break;
 /* This is what makes the VCR button special. All the
 ** display level is handled here. */
 case WM_PAINT:
 {
 char szText[ MAX_TEXT_SIZE + 1 ];
 TEXTMETRIC tm;
 int iCenter;
 int iTop;
 PAINTSTRUCT ps;
 HBRUSH hBrush;
 COLORREF tmp_clr;
 /* Check for VCR radio button special processing */

 if ( GetWindowLong( hWnd, GWL_STYLE ) & BSS_VCR )
 if (( chk_btn_style( hWnd ) == BS_AUTORADIOBUTTON ) ||
 ( chk_btn_style( hWnd ) == BS_RADIOBUTTON ))
 {

 BeginPaint( hWnd, &ps );

 /* Find out window size to draw the underline */
 GetClientRect( hWnd, &rect );

 /* Draw button text in button centered, then
 ** a rectangle border */

 SetTextAlign( ps.hdc, TA_CENTER );

 /* Get any control assigned font to use from
 ** underlying font table in control and select it
 ** for drawing. Could get a font handle from extra

 ** words; would be quicker. Sending a message is
 ** more portable to future versions of Windows. */

 if ( hFont = (HFONT) LOWORD( SendMessage( hWnd,
 WM_GETFONT, 0, 0L )) )
 hFont = SelectObject( ps.hdc, hFont );
 /* Here we take the Microsoft easy way out. Generally
 ** we would send a WM_CTLCOLOR message here to allow
 ** parent to set color of button. Instead we follow
 ** Microsoft's lead and use system defined button
 ** colors. These can be changed for all buttons in the
 ** WIN.INI file. This allows VCR button to match all
 ** other push buttons in system. A nice enhancement
 ** would be to merge brush color with Windows System
 ** defined colors to create special-effect buttons.
 ** Another might be to paint button with a bit map.
 ** To make the pushed in or selected state of
 ** the button stand out we use the BTNSHADOW
 ** as the background instead of the BTNFACE. */
 if ( GetWindowWord( hWnd, 0 ) & 0x0001 )
 {
 tmp_clr = GetSysColor( COLOR_BTNSHADOW );
 SetBkColor( ps.hdc, tmp_clr );
 SetTextColor( ps.hdc,
 GetSysColor( COLOR_BTNHIGHLIGHT ) );
 }
 else
 {
 tmp_clr = GetSysColor( COLOR_BTNFACE );
 SetBkColor( ps.hdc, tmp_clr );
 SetTextColor( ps.hdc,
 GetSysColor( COLOR_BTNTEXT ) );
 }
 /* Fill the button with the current solid color. */
 hBrush = CreateSolidBrush( tmp_clr );
 FillRect( ps.hdc, &rect, hBrush );
 DeleteObject( hBrush );
 /* Now handle the text for the button. Here we impose
 ** a limit on text that can be put in a push button.
 ** This can be increased by modifying the local header
 ** for the control. Text is centered by default. Could
 ** be modified to allow left and right text. */
 GetWindowText( hWnd, szText, MAX_TEXT_SIZE );
 GetTextMetrics( ps.hdc, &tm );
 iCenter = ( rect.right - rect.left ) / 2;
 iTop = (rect.bottom - tm.tmHeight) / 2;

 /* Check to see if we need to gray the string */
 if ( IsWindowEnabled( hWnd ) == FALSE )
 {
 if ( tmp_clr == GetSysColor( COLOR_BTNSHADOW ) )
 tmp_clr = GetSysColor( COLOR_GRAYTEXT );
 else
 tmp_clr = GetSysColor( COLOR_BTNSHADOW );
 SetTextColor( ps.hdc, tmp_clr );
 }
 TextOut( ps.hdc, iCenter, iTop, szText,
 lstrlen( szText ) );
 /* If we are using a special font then deselect it */

 if ( hFont )
 hFont = SelectObject( ps.hdc, hFont );
 /* Redraw the focus rectangle if we have focus. */
 three_D_border( hWnd, GetWindowWord( hWnd, 0 ));

 if ( GetFocus() == hWnd )
 focus_border( hWnd, TRUE );
 EndPaint( hWnd, &ps ); /* clean up before exit */
 return( 0L ); /* Block all other painting*/
 }
 break;
 }
 default: break;
 }
 /* Forward all unmodified messages to the old button wndproc */
 return( CallWindowProc( OldButtonProc, hWnd, wMsg, wParam, lParam ));
}
/* -----------------------------------------------------------------
 * WEP -- Standard function to cleanup tasks when the DLL is unloaded. WEP()
 * is called automatically by Windows when DLL is unloaded (no remaining tasks
 * still have DLL loaded). Microsoft strongly recommends DLLs have a WEP(),
 * even if it does nothing but returns success (1), as in this example.
 * ----------------------------------------------------------------*/
int FAR PASCAL WEP ( int bSystemExit )
{
 bSystemExit; // to quiet the compiler about unused parameter warning
 return(1);
}
/* -----------------------------------------------------------------
 * LibMain -- the main entry point for the Image Window Library
 * Returns: FALSE if unable to initialize
 * ----------------------------------------------------------------*/
int FAR PASCAL LibMain( HANDLE hModule, WORD wDataSeg,
 WORD cbHeapSize, LPSTR lpszCmdLine )
{
 wDataSeg;
 cbHeapSize;
 lpszCmdLine;
 hInst = hModule; // Save module handle to use as instance later
 if ( ! VCRButtonInit( hInst ) ) //Register the enhanced button class
 return( FALSE );
 return( TRUE );
}
End Listing


















Special Issue, 1993
A Generic SQL Class Library


Multiple database support for Windows




Ken North


Ken has been developing software--including DBMS projects for mainframe, mini,
PC, and client-server systems--for 25 years. Contact him at Resource Group
Inc., 2604B El Camino Real, #351, Carlsbad, CA 92008 or on CompuServe at
71301,1306.


If you're a database programmer, acronyms such as "ODBC" and "IDAPI" are
likely already part of your vocabulary. Microsoft's Open Database Connectivity
(ODBC) API and the Integrated Database API (IDAPI) from Borland, Novell, IBM,
and WordPerfect are emerging multi-database technologies that use loadable
drivers to provide access to multiple DBMSs and database formats. A third
tool, the Q+E Database Library (Q+E Lib) from Pioneer Software, also provides
support for multiple DBMSs via a single API.
In a perfect world, programmers using one of these middleware solutions would
be able to support different database platforms from a single set of source
code. But when it comes to DBMSs, we don't live in an ideal world. This
article explores tools which attempt to fill the gap between the real and the
ideal. In particular, I'll examine Pioneer Software's Q+E Lib and Microsoft's
ODBC 1.0, an extension to Windows. (IDAPI reference materials were not
available in time to include in this article.) In the process, I'll present a
minimal SQL class library written in Borland C++ that works with APIs for
multi-DBMS programming. I'll also provide a Windows utility that uses the
class library to identify the structure of tables in different DBMS formats.


Programming for Q+E Lib


Q+E Lib is a library consisting of Windows and OS/2 DLLs that implements
gateways or database drivers. The DLLs include drivers that deliver SQL access
to DB2, Ingres, Oracle, SQL Server, Sybase, Netware SQL, OS/2 DBM, XDB,
SQLBase, Paradox, Btrieve, dBase, Excel, and text data. Although some features
are driver dependent, for many purposes you need only one set of source code.
The ODBC architecture is similar to Q+E Lib, so it is no coincidence that
Pioneer will supply many of the early drivers when Microsoft releases ODBC.
Both Q+E Lib and ODBC provide a standard call interface for application
programs because the driver layer addresses the differences in SQL
implementations.
Q+E Lib includes dozens of functions for data conversion and functions that
return the data type of a column as a Q+E data type or the DBMS's internal
data type. It supports transaction processing (where supported by the driver)
with functions to begin, commit, and roll back transactions. It includes a
qeSetDB function to change the default database (where the DBMS supports
multiple databases) and fetch options that fetch the next record
(qeFetchNext), values for data types (qeValxxxx), multicolumn data, and data
bound to program variables (qeBindCol).
For debugging, it provides functions to log calls to connection and execution
functions (qeTraceOn and qeTraceOff). Q+E Lib supports more than one level of
error reporting. It is good programming practice to check the status after
each Q+E Lib function call by calling qeErr. If it returns an error, call
qeErrMsg for error messages or qeDBErr for the error code from the database
system.
To link successfully with the Q+E Lib import library, it is important to avoid
C++ name mangling by defining the function prototypes inside an extern "C"
wrapper. The QEAPI.H header illustrates this technique.
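The wrapper is small but easy to get wrong. Here is a minimal sketch of the technique; the function shown is a made-up stand-in, not a real Q+E Lib entry point:

```cpp
/* Sketch of the extern "C" wrapper technique. qe_stub_connect is a
** hypothetical stand-in for a Q+E Lib prototype; QEAPI.H applies the
** same guard to the genuine entry points. */
#ifdef __cplusplus
extern "C" {
#endif

/* Prototypes placed here get C linkage, so the C++ compiler will not
** mangle their names and the linker can match the import library. */
int qe_stub_connect(const char *connect_string);

#ifdef __cplusplus
}
#endif

/* A dummy C-linkage definition so the sketch stands alone. */
extern "C" int qe_stub_connect(const char *connect_string)
{
    /* Return a fake positive "handle" for a non-empty string. */
    return (connect_string && connect_string[0]) ? 1 : -1;
}
```

Without the guard, a C++ translation unit would emit a mangled name for each prototype and the link against the import library would fail with unresolved externals.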


An SQL Class Library


The first step in designing a class library is determining the ideal
complexity and depth of the class hierarchy. Because SQL entities map easily
into C++ classes, some developers prefer to construct a fairly complex
hierarchy by creating a C++ object for each SQL object (tables, views,
cursors, rows, columns, values, and individual data types). Others prefer less
complexity and a higher level of abstraction. As Figure 1 illustrates, I took
the minimalist approach to implement the SQL class hierarchy.
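The minimalist hierarchy described below can be sketched as a simple chain of derivations. The bodies here are empty stubs for illustration; the real members appear in the listings.

```cpp
// Illustrative skeleton of the minimalist SQL class hierarchy.
// Bodies are stubs only; the actual members are in the listings.
class dbObject {                           // generic object behavior
public:
    virtual ~dbObject() {}
};
class dbDatabase : public dbObject {};     // database status/environment
class dbConnection : public dbDatabase {}; // connection to a data source
class dbRequest : public dbConnection {};  // one SQL command or query
class dbColumn : public dbRequest {};      // column (field) information
```

Each layer adds state for one concern, so a dbColumn carries its connection, request, and database context by inheritance rather than by explicit handle passing.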
dbObject (see Listings One and Two, page 74) is the base class that represents
generic object behavior. To assist debugging and memory management, dbObject
maintains a count of the number of users of an object. In the near future,
operating systems will provide multithreaded, networked, object-oriented,
parallel, or multiprocessing operations. We will optimize applications to work
with object-memory managers (demand-paged, virtual-memory management on an
object basis) and distributed object managers, so reference counts will become
increasingly important.
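The use-count bookkeeping takes only a few lines. A hedged sketch, with invented member names rather than the dbObject interface from the listings:

```cpp
// Sketch of use-count bookkeeping in the spirit of dbObject.
// AddUser/DropUser are illustrative names, not the article's API.
class RefCounted {
    int users;                  // current number of users of the object
public:
    RefCounted() : users(1) {}  // the creator counts as the first user
    void AddUser()  { ++users; }
    int  DropUser() { return --users; } // owner may delete at zero
    int  Users() const { return users; }
};
```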
dbDatabase (Listing Three, page 74) is the parent class for database objects
such as tables, columns, and related objects. Although some DBMS products
define a single table as a database, the generally accepted definition is a
set of tables accessed by a DBMS. For example, an engineering application that
uses an Oracle database for stress data and a Sybase database for a parts
breakdown will have two instances of the dbDatabase class. dbDatabase
maintains database status and environment information.
dbConnection (Listings Four and Five, page 74 and 75) encapsulates a
connection to a data source. A data source is an object for which there is a
driver and path, and optional attributes such as server names, user IDs, and
passwords. Connection to a dBase or Paradox data source with Q+E Lib drivers
requires only the driver name, whereas the driver for other SQLs may require
user IDs and passwords. dbConnection is a child of dbDatabase. Its Connect
member creates a connection string (pointed to by a far pointer) and passes it
to qeConnect. If the connection is established, qeConnect returns a valid
connection handle. The Disconnect member function passes the connection handle
when it calls qeDisconnect to terminate the connection.
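The connection string is a keyword=value list. A hedged sketch of the kind of string Connect might assemble; the keyword names here are assumptions for illustration, not Q+E Lib's documented set:

```cpp
#include <string>

// Hypothetical sketch: building a keyword=value connection string of
// the sort dbConnection::Connect hands to qeConnect. Keyword names
// are invented for this example.
std::string build_connect_string(const std::string &driver,
                                 const std::string &user,
                                 const std::string &password)
{
    std::string s = "DRIVERNAME=" + driver;   // always required
    if (!user.empty())
        s += ";USERNAME=" + user;             // needed by server DBMSs
    if (!password.empty())
        s += ";PASSWORD=" + password;         // likewise
    return s;
}
```

For a dBase or Paradox source only the driver element would be present; for a server DBMS the user and password elements are appended as well.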
dbRequest (Listings Six and Seven, page 77 and 78) encapsulates the SQL
request. The request represents an individual SQL command or query, including
preparation and execution of the statement. It may also include the
implementation of cursors (vehicles for tracking positions), although the
example doesn't use cursors for Paradox or dBase files. dbRequest is derived
from dbConnection. When dbRequest is instantiated, it receives the command,
the connection handle, and database option settings. It uses Build Request to
process the command and supply a far pointer to ExecuteStatement, which calls
qeExecSQL to execute the SQL request. qeExecSQL returns a valid statement
handle if the command executes. FindNumberofColumns takes a statement handle
and returns a count of the number of columns associated with that statement.
TerminateStatement takes a statement handle and calls qeEndSql to terminate
the request. AllocateStatement and GetCursorforRequest are used with SQL
implementations in which an application preallocates statement and cursor
handles. dbOptions and RequestOptions are options for each database and
request. A program that accesses several data sources using a variety of
drivers must maintain a context because capabilities and options may vary from
driver to driver and request to request.
dbColumn (Listings Eight and Nine, page 79) maintains information on columns
or fields in relational database tables. In this example, it includes calls to
Q+E Lib's functions that manipulate columns. dbColumn is a child of dbRequest;
arguments passed to its constructor are the connection handle, database
options, statement handle, SQL string, and an integer-column position.
Describe calls qeColName to obtain the column name, qeColType for the data
type, qeColWidth for the width of the longest value that the column can store,
and where appropriate, qeColScale for the column scale. DecodeQEDataType maps
the column's integer data type to an equivalent type description.
Finally, dbBlob (Listings Ten and Eleven, page 81) supports binary large
objects such as images or sound. I'll examine each of these in the context of
Q+E Lib and ODBC.


The Dictionary Display


SQLSTRUC.CPP (Listing Twelve, page 82) is a utility to demonstrate the SQL
library. The ODBC API includes function calls that determine the features and
capabilities supported by a given database driver. However, Q+E Lib version
1.x does not include a comparable function set, so the options are handled in
the SQLSTRUC application. For simplicity, SQLSTRUC doesn't include Windows GUI
classes. Rather, it is designed with a command-line interface that can be
linked with Microsoft's QuickWin or Borland's EasyWin library. These libraries
are useful for quick porting of command-line utilities that use standard C
I/O.
SQLSTRUC sets data-source values and instantiates the data source and options
for this example. It prompts for arguments (table, path, and driver) and
instantiates a connection to the driver. If there is no error, it creates the
query string, instantiates the request, defines request options, builds the
request, executes the statement, determines the number of columns, and
performs a loop to produce the description of each column. When it finishes
the loop, it terminates the statement and connection.
The complete project, including the SQL class library and example program, is
available electronically; see "Availability," page 3. To build this project,
you need the Q+E Database Library. The program identifies the choice of DBMS
by prompting for the name of the database driver. Driver names are a character
string such as QEDBF (dBase), QEPDX (Paradox), and QEORA (Oracle); refer to
the Q+E documentation for driver dependencies. For example, the Paradox driver
requires SHARE and the Paradox Engine DLL. When using make or project files
for Windows EXEs that bind to DLLs, you must include an import lib that
identifies the entry points in the DLL. If you don't have an import lib, you
may create one using EXEHDR or IMPLIB. To run the example, you must supply
three arguments: the name of the table whose structure you wish to list, the
DOS pathname that identifies the location of database files, and the database
driver. Some products such as dBase and Paradox store the structure
information in the same directory as the tables, whereas products like Netware
SQL often use a separate directory for data-dictionary files.


Enhancements


To implement a fully featured SQL class library or application framework,
additional items may be necessary. SQL applications call for a robust
error-handling class that processes SQL engine errors (local and remote),
driver errors, and internal debugging errors. Your applications may benefit
from a custom memory manager that overloads the new and delete operators and
uses the error handler to recover when there is insufficient memory to
instantiate objects. Multimedia applications call for server classes for
images, video, and sound, and many SQL implementations require classes that
support block, scrollable, or named cursors as a vehicle for tracking position
in a view.
Additional features are associated with security and data integrity issues.
Most SQLs implement some form of transaction processing and security, so a
class design should consider group privileges, concurrency, and
commit/rollback support. Some SQL servers provide logic such as stored
procedures or triggers that execute at the server. They are not standard SQL,
but powerful extensions desirable for robust library implementations. The
final format of the classlibs is another implementation decision. One solution
is to supply the libraries as object libraries, but some classes are probably
candidates for shared-class DLLs. This method of implementation may require a
bit more analysis, but DLLs have definite benefits.


ODBC Architecture



Like other layered products, ODBC consists of several components. The
application interacts with the ODBC Driver Manager (ODBC.DLL), which sits at a
layer above one or more single- or multiple-tier drivers. Single-tier drivers
process both ODBC calls and SQL statements; they sit directly above the data
source. Multiple-tier drivers process the ODBC calls but pass the SQL
statements to a server for processing. The Driver Manager processes some ODBC
calls without driver involvement.
The ODBC Software Development Kit (SDK) includes several components needed to
develop ODBC applications in C++. The required files that Microsoft ships
include the Driver Manager DLL (ODBC.DLL), its import library (ODBC.LIB), and
the headers for core-level functions (SQL.H) and extended functions
(SQLEXT.H). The SDK also includes a framework for a sample driver, a driver
test program (GATOR), and a Visual Basic sample application. The SDK documents
installation and setup functions available to Windows-based
driver-installation programs. When you install an ODBC-compliant engine, the
DBMS vendor will ship most of the files necessary to develop and administer
ODBC-enabled applications. Microsoft does not currently license the header
files for redistribution, so you will need the ODBC SDK.


Developing for ODBC


Most SQL engines that include an ODBC driver provide a Windows installation
program that requires little conscious decision making, with the exception of
reconciling previously installed software (ODBC executables and DLLs). The SDK
documents installation and setup functions. Once ODBC is installed, the ODBC
Administrator remains on your Windows desktop. The Administrator is your
vehicle for providing path and ID information about your databases. When you
run the Administrator, select an installed driver, and click on the Configure
button, the program will display driver-dependent setup dialog boxes. To
configure the data source, some drivers require minimal information such as a
name and description. Other drivers require more extensive information.
The ODBC SDK includes headers for core ODBC functions (SQL.H) and extended
ODBC functions (SQLEXT.H). These headers do not follow the C++ compatibility
convention used in the headers for Microsoft's I/O library. To avoid the C++
name-mangling problem, you will have to revise SQL.H and SQLEXT.H by including
a linkage specifier or wrapper around the ODBC function prototypes. QEAPI.H
illustrates the use of the extern "C" linkage specifier.
Programmers who have used other SQL products will find familiar territory in
most of the concepts embodied in the ODBC API--fetches, commits, rollbacks,
and cursors, for example. Functions unique to ODBC are those that relate to
matching a programming interface to a variety of DBMS engines. One of the
principal differences between developing for ODBC and Q+E Lib (or other SQL
libraries) is the informational functions included in ODBC to support run-time
programmatic decisions. A developer can use these calls to query the Driver
Manager to identify features of the driver and DBMS. These include items such
as: conformance to ODBC, SAG, and SQL grammars; supported ODBC functions and
data types; whether the driver and DBMS support stored procedures and
asynchronous processing; and so on.
One of the advantages to the client-server architecture is the ability to
execute code (procedures and triggers) at the server. Microsoft's SQL Server,
which is a port of Sybase's SQL Server, shows some of that heritage because it
includes function calls that support stored procedures. It will also work with
Oracle's triggers, but no function calls make triggers accessible by client
applications. ODBC adds support for scrollable and named cursors and a
function (SQLSpecialColumns) that permits applications to use custom
scrollable cursors. Some SQLs include support for row IDs (the best set of
columns that uniquely identify a row). The SQLSpecialColumns column type can
be used to retrieve this type of information for Oracle, Ingres, SQLBase, and
Sybase.


Determining Conformance


The ODBC specification defines several levels of conformance, with a Core
level that corresponds to the SQL Access Group's Call Level Interface (CLI),
which consists of 23 functions. There are three connection functions, five
preparation functions, two request submission functions, eight retrieval
functions, and six termination functions. Level 1 includes core functions plus
15 additional functions. Level 2 includes core and level 1, plus 16 additional
functions.
An application can determine driver functionality at run time by using several
informational functions. SQLGetTypeInfo, for example, returns information
about data types supported by the SQL engine; SQLGetFunctions returns
information about ODBC functions implemented by a driver; and SQLGetInfo
returns a variety of information that profiles a driver and data source.


Connecting to a Data Source


The functions that link to data sources provide an illustration of the
hierarchy of calls and conformance levels. An application may use a core,
level 1, or level 2 function to connect to a data source. The minimal implementation
(SQLConnect) that a driver must support includes a provision for user and
password information. More sophisticated (level 1) drivers that require
additional information such as schemas or procedure catalogs will include a
SQLDriverConnect function that instructs the driver to display a dialog box to
prompt for DBMS-specific information. Level 2 ODBC drivers also support a
function call that provides an iterative, browsing method (SQLBrowseConnect)
of connecting.


Class Library Revisited


One approach to supporting multiple DBMS APIs is to implement a single class
structure with superclasses for generic DBMS objects. The objects specific to
an API such as ODBC are derived from the superclasses. The ODBC and Q+E
classes that accompany this article reside in separate libraries. The APIs for
IDAPI and Q+E 2.0 are not available at the present time, so the final design
of a practical class library that will support all three APIs remains on the
agenda with the notation, "real soon now."
One of the objectives that a developer must consider when designing a class
library for ODBC is whether to provide a thin wrapper around the ODBC C
functions or to organize classes at a higher level of abstraction. One of my
objectives was to work at a level that lent itself to development for the
Windows GUI. For example, some member functions create lists in a format
suitable for use with list-box controls, although the actual code for the
Windows dialog is not in the class library.
The database class includes multiple instances of a data source--an entity
that implies a path to the data and a loadable DBMS driver. It may also
include a user ID, password, and related information. The mapping from a data
source to an SQL server may be a many-to-one or a one-to-one relationship.
ODBC supports multiple connections to multiple data sources, and some drivers
are capable of asynchronous operations. Therefore, a multi-DBMS class library
must accommodate applications that connect to several data sources at one
time, using multiple drivers. Each connection may have a one-to-one or
one-to-many relationship with requests or SQL statements. Requests may have a
one-to-one or one-to-many relationship with views, tables, rows, and columns.
There are also singular relationships to manage. There is one ODBC environment
handle per application, so it fits nicely into an application or environment
class. Some data-source variables such as driver versions and DBMS names are
static across one or several connections and requests.


ODBC Classes


Implementing for ODBC requires changes to the minimalist SQL class hierarchy
presented earlier. Besides runtime profiling, ODBC differs from Q+E Lib in the
allocation of connection handles and an environment handle.
The Q+E Lib connect function (qeConnect) returns a connection handle if the
function call is successful. ODBC provides a separate function
(SQLAllocConnect) to allocate the handle. It requires the handle prior
to the application's call to one of the connection functions (SQLConnect,
SQLBrowseConnect, and SQLDriverConnect). ODBC uses an application environment
handle, a concept that has no counterpart in Q+E Lib. The revised SQL class
structure is shown in Table 1.


Data-source Profile


Programmers making the transition from Q+E Lib to ODBC will note several
obvious differences. ODBC provides function calls useful in making run-time
decisions about the application's DBMS platform. The ODBC connection class
includes a member function (ProfileDataSource) that makes a series of calls to
SQLGetInfo in order to create a data source and driver profile. The profile
includes information such as version numbers, commit and rollback behaviors,
whether the driver supports stored procedures, whether all tables and
procedures are accessible, and whether it is compliant with ODBC, CLI, and so
on.
DSRCINFO (available electronically) is a command-line utility that prompts for
a data-source name and then calls ProfileDataSource and SQLGetInfo to create a
data-source profile; see "Availability," page 3. SQLGetInfo is also used to
identify the types of scalar data (numeric, string, timedate) and conversion
functions that the driver supports. DSRCINFO also obtains through SQLGetInfo
the ODBC SQL conformance level, returning a 0 for minimum, 1 for core, or 2
for extended grammar. Using DSRCINFO, I found that although White Cross's SQL
is one of the few SQLs certified without nonconformities, the driver returns a 1.
Microrim touts R:base as ANSI level 2 (incorporating IBM DB2 enhancements),
but the R:base driver returns a 0. It is important to remember that the SQL
level is a measure of conformance with the features of ODBC SQL defined in the
SDK documentation. Appendix C of the programmer's manual includes the ODBC SQL
Grammar matrix.


Conclusion


Testing for this article included a mix of drivers, in part to demonstrate the
scalability of the technology. My tests used the SDK test driver and the
released ODBC drivers. Developers of drivers for large-scale servers (Oracle,
rdb, Teradata, and White Cross) conducted tests for me. The information from
SQLGetInfo is subject to change between the time of this writing and the
release of these drivers. Profiles for the dBase driver, Quadbase-SQL, R:base
SQL, Watcom SQL, NCR's Teradata, White Cross 9000, Oracle, and DEC's rdb are
all available electronically. Finally, the complete ODBC SDK is available free
of charge from Microsoft.


SQL Development Tools



One of the most significant trends in DBMS architecture is the emergence of
client/server systems that separate the database engine (back end) and the
user interface (front end) software. The development of Microsoft's ODBC
emphasizes that separation because developers will soon be writing Windows
front ends that will work with a variety of back-end DBMS products. One of the
benefits of the ODBC architecture is scalability. To demonstrate scalability
for this article, I ran test programs with PC-based SQL engines and "super"
servers (Teradata and White Cross).
In recent months, several companies have released SQL database engines that
include drivers for ODBC. By installing one of these engines, it is possible
to begin development of desktop applications that may eventually connect to
mainframe and server-based SQL products that will be shipping drivers in 1993.
One of the advantages of these products is that they permit a developer to
write software for laptops that will work with large mainframe databases. In
addition to support for the ODBC API, they include other features that
differentiate the products. These unique features include BLOB support,
Windows and DOS ODBC libraries, read-only schemas for CD-ROM, PenPoint
support, dynamic data exchange (DDE) support, and statistical feedback for
tuning and optimizing queries.
Microrim's R:Base Engine supports level 2 SQL for Windows and DOS development.
Developers may write DOS SQL applications using the same API they will use for
ODBC and Windows applications, although the DOS support is in the form of
object libraries that are compatible only with Microsoft C. Microrim ships the
product with sample code for C and Visual Basic.
Quadbase-SQL 2.0 includes support for a call-level interface, ANSI Embedded
SQL (ESQL), a language-independent embedded SQL, and BLOBs of up to 2
gigabytes. It includes DLLs for both Quadbase's native API and ODBC, plus
custom controls for Visual Basic. It also includes classes for Actor; sample
code for Toolbook, Pascal, and C; and DOS and Windows query utilities.
Watcom is shipping SQL Engines for DOS, QNX, Windows, and PenPoint. The
Windows product includes 16- and 32-bit engines, embedded SQL, and a level 2
implementation of ODBC. Watcom's utilities include statistical feedback that
assists in tuning the query optimizer and libraries that implement a Windows
DDE server.
Developers looking for high-performance servers will find that ODBC is
available at that end of the performance spectrum. DEC's 64-bit Alpha chip is
one of the new RISC chips that has gained favor with DBMS server companies.
Oracle, Ingres, DEC's own rdb, and other powerful servers will be available on
Alpha systems that will deliver 150 MIPS or more. NCR's Teradata is a
high-performance, fault-tolerant parallel-processing system capable of
managing terabytes of information. A Teradata box can communicate with MVS,
VM, UNIX, VMS, and other systems.
White Cross Systems of England markets a new transputer-based server built
with the latest in RAID and fiber-optic technologies. The 120 MIPS entry-level
system will reputedly query 2.4 million rows per second. It provides ODBC and
X/Open call interfaces to Windows and UNIX clients, respectively.
Tools for developing the client side of client/server applications include
query products, database products with client/server capability, and tools
that enhance the capability of programming languages. At the time of this
article, the query toolmakers are not shipping ODBC-enabled products, but that
is likely to change as demand increases.
Microsoft's Visual Basic has become legendary for its ease of prototyping. The
Professional Edition (3.0) includes support for ODBC and MAPI, Microsoft's
messaging interface. The ODBC SDK includes a Visual Basic 1.0 demo program.
Crystal Services has expanded the database options available to users of its
Crystal Reports product. Version 2.0 includes connectivity to various SQL
databases in addition to Paradox, Btrieve, and FoxPro databases. The
report-engine DLL works with C++, C, VB, TPW, and ObjectVision. (Crystal
Reports is also bundled with Visual Basic 3.0.)
Although Visual Basic receives a lot of press as a premier prototyping tool,
there are also very powerful tools for C and C++ developers. Borland's
Resource Workshop is a high-end editor of Windows resource files that includes
Motif-like custom controls. ProtoView Development Corporation's suite includes
ProtoGen (application generator), ProtoView (screen manager), and DataTable (a
spreadsheet custom control).
Microsoft's Access acts as a DBMS in its own right or as a client in a
client/server application using ODBC. Access 1.0 shipped with an ODBC driver
for SQL Server, and Microsoft noted that it conducted no tests using Access
with other drivers. In my tests of SQL engines for this article, the Access
links to other drivers were tenuous. Simple import operations on simple data
types were more likely to succeed than export and update operations. (The
Access engine is also shipped with Visual Basic 3.0.)
--K.N.
Figure 1: SQL class hierarchy.

Class          Description
dbObject       Base class that represents generic object behavior.
dbEnvironment  Single-instance, application class.
dbDatabase     Parent class for database objects such as tables, columns, and
               related objects.
dbDataSource   Encapsulates the path and identifier information.
dbConnection   Encapsulates a connection to a data source.
dbRequest      Encapsulates the SQL request.
dbColumn       Maintains information on columns in relational database tables.
dbBlob         Binary large object (BLOB) such as an image or sound.
Table 1: Revised SQL class hierarchy.
[LISTING ONE] (Text begins on page 69.)

///////////////////////////////////////////////////
// FILE NAME: dbObject.h TITLE: database class
// AUTHOR: Ken North Resource Group, Inc.
// 2604B El Camino Real, #351
// copyright(c)1992 Carlsbad, CA 92008
///////////////////////////////////////////////////

#ifndef __DBOBJECT_H
#define __DBOBJECT_H

class dbObject
{
 int ReferenceTotal;

protected:

public:

 dbObject();
 virtual ~dbObject();

 virtual void IncrementRefs();
 virtual int DecrementRefs();
};

#endif

[LISTING TWO]

///////////////////////////////////////////////////
// FILE NAME: dbObject.cpp TITLE: base class
// AUTHOR: Ken North Resource Group, Inc.
// 2604B El Camino Real, #351
// copyright(c)1992 Carlsbad, CA 92008
///////////////////////////////////////////////////
// SYNOPSIS:
// implementation of dbObject class
///////////////////////////////////////////////////

#undef DEBUG
#undef DEBUG1

#include <windows.h>
#include <string.h>
#include "sql.h"
#include "sqlext.h"
#include "sqldefs.h"
#include "dbobject.h"

////////////////////////////////////////////////////
// FUNCTION NAME: dbObject
// SYNOPSIS:
// define dbObject, SQL base class constructor
////////////////////////////////////////////////////

dbObject :: dbObject()
{
 ReferenceTotal = 1; // referenced by self
}

///////////////////////////////////////////////////
// FUNCTION NAME: ~dbObject
// define dbObject, SQL data base class destructor
///////////////////////////////////////////////////

dbObject :: ~dbObject()
{
// this is a point where error handler should be
// invoked if the reference total > 1 or < 0
// systemError();
}

////////////////////////////////////////////////////////////////////////
// FUNCTION NAME: IncrementRefs -- increment object reference total
////////////////////////////////////////////////////////////////////////
void dbObject :: IncrementRefs()
{
 ReferenceTotal++;
}

//////////////////////////////////////////////////////////////////////////
// FUNCTION NAME: DecrementRefs -- decrement object reference total
//////////////////////////////////////////////////////////////////////////
int dbObject :: DecrementRefs()
{
// if the reference total < 0
// {
// systemError();
// }
 return(--ReferenceTotal);
}
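
The constructor/destructor pair above implements a simple intrusive reference count. A standalone sketch of the same idiom follows; the class and member names mirror the listing but are illustrative, and the article's version adds virtual dispatch and an error check in the destructor:

```cpp
#include <cassert>

// Minimal sketch of the dbObject reference-counting idiom.
class RefCounted
{
    int ReferenceTotal;
public:
    RefCounted() : ReferenceTotal(1) {}       // object references itself
    ~RefCounted() {}                          // real code flags total > 1 or < 0
    void IncrementRefs() { ++ReferenceTotal; }
    int  DecrementRefs() { return --ReferenceTotal; }
};
```

Each holder calls IncrementRefs() when it stores a pointer to the object and releases the object when DecrementRefs() returns 0.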


[LISTING THREE]

///////////////////////////////////////////////////
// FILE NAME: database.cpp TITLE: data base class
// AUTHOR: Ken North Resource Group, Inc.
// 2604B El Camino Real, #351
// copyright(c)1992 Carlsbad, CA 92008
// SYNOPSIS: implementation of database class
///////////////////////////////////////////////////

#undef DEBUG
#undef DEBUG1

#include <windows.h>
#include <string.h>
#include "sql.h"
#include "sqlext.h"
#include "gendefs.h"
#include "sqldefs.h"
#include "dbobject.h"
#include "sqldb.h"

#ifndef __RGSTRING_H
#include "rgstring.h"
#endif

HDBC dbDataBase::ODBCLinkHandle[]
 = {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0};
int dbDataBase::nDefinedSources = 0;
int dbDataBase::ActiveLinks = 0;
char dbDataBase::DataSourceList[] = "None";
UCHAR dbDataBase::ODBCLinkList[] = "None";

 /* standard return code for calls */
extern char ErrStat;
 /* large / huge model */
extern char MemModel;

////////////////////////////////////////////////////
// FUNCTION NAME: dbDataBase
// SYNOPSIS: dbDataBase class constructor
////////////////////////////////////////////////////
dbDataBase::dbDataBase(HENV AppEnv)
 : dbObject()
{
 // save application environment handle
 henv = AppEnv;

 // initialize status and error info
 StatusReturned = SQL_SUCCESS;
 InitODBCErrorInfo();
}

///////////////////////////////////////////////////////////
// FUNCTION NAME: ~dbDataBase -- dbDataBase class destructor
//////////////////////////////////////////////////////////

dbDataBase :: ~dbDataBase()

{
}

//////////////////////////////////////////////////////////////////////////
// FUNCTION NAME: InitODBCErrorInfo -- initialize ODBC error information
//////////////////////////////////////////////////////////////////////////
void dbDataBase::InitODBCErrorInfo()
{
 err.ErrStatus = 0;
 err.ErrorMsgLength = 0;
 err.ErrorMsgMax = 0;
 err.NativeError = 0;
 memset(err.szSQLState,'\x0',
 sizeof(err.szSQLState));
 memset(err.ErrorMsg,'\x0',
 sizeof(err.ErrorMsg));
}
/***************************************************
* FUNCTION NAME: MatchLinkName
* SYNOPSIS: scan the active link list (ODBCLinkList) to find
* the link that matches the selected link name
* (from disconnect dialog box)
*****************************************************/
int dbDataBase::MatchLinkName(UCHAR *LinkToDrop)
{
int tdx; /* link index */
int toff; /* link offset */
UCHAR NameToTest[LINK_NAME_LEN+1];

 /* string position to begin search */
unsigned short StartPos=0;

 /* position found: result of the search */
unsigned short SearchResult;


 for (tdx=0;tdx < ActiveLinks;tdx++)
 {
 toff = tdx * LINK_NAME_LEN;
 memset(NameToTest,'\x0',LINK_NAME_LEN+1);
 substr_n((char *)ODBCLinkList,
 (char *)NameToTest,
 toff, LINK_NAME_LEN ) ;
 SearchResult =
 left_srch((char *)NameToTest,
 (char *)LinkToDrop, StartPos);
 if (ErrStat == SUCCESS)
 {
 return tdx;
 }
 }
 return OOPS;
}

[LISTING FOUR]

////////////////////////////////////////////////////
// FILE NAME: connect.h TITLE: SQL connection
// AUTHOR: Ken North Resource Group, Inc.
// 2604B El Camino Real, #351
// copyright(c)1992 Carlsbad, CA 92008
////////////////////////////////////////////////////

#ifndef __CONNECT_H
#define __CONNECT_H

#ifndef __SQLDB_H
#include "sqldb.h"
#endif

#ifndef __DATASOUR_H
#include "datasour.h"
#endif

#ifndef __DBPROFIL_H
#include "dbprofil.h"
#endif

#ifndef __ODERROR_H
#include "oderror.h"
#endif

#ifndef __DSINFO_H
#include "DSInfo.h" /* data source information */
#endif
 /* Connection is a child of data source */
class dbConnection : public dbDataSource
{

protected:

 HENV henv;
 POINTER cstr;

public:

 RETCODE Status;
 RETCODE ErrStatus;
 HDBC hdbc;
 UCHAR far *conec;
 static UCHAR ConnectionString[CONECLEN];
 static UCHAR szConnStrOut[CONECLEN];
 /* the next two variables are used for situations */
 /* where a connection's values are subsets of a */
 /* data source's values */
 SDWORD ConnTableCount; /* number of tables for this connection */
 char ConnTableList[DS_TABLE_LIST_LEN];

 dbConnection(HENV);
 virtual ~dbConnection();

 HDBC AllocateConnection(void);
 POINTER BuildBrowseConnectString(char *);
 virtual POINTER BuildConnectString(char *);
 RETCODE BrowseConnect();
 virtual RETCODE Connect(UCHAR *, UCHAR *);
 virtual int Disconnect(UCHAR *);
 RETCODE DriverConnect(UCHAR *);

 void InitConnData(void);
 void GetErrorInfo(void);
 RETCODE PASCAL ProfileDataSource(DataSourceInfo *);
};
#endif

[LISTING FIVE]

///////////////////////////////////////////////////////////////////
// FILE NAME: connect.cpp TITLE: database connection
// AUTHOR: Ken North Resource Group, Inc.
// 2604B El Camino Real, #351
// copyright(c)1992 Carlsbad, CA 92008
///////////////////////////////////////////////////////////////////
// SYNOPSIS: implementation of SQL connection class
///////////////////////////////////////////////////////////////////

#include <stdio.h>
#include <string.h>
#include <windows.h>
#include <dos.h>
#include <malloc.h>
#include "sql.h"
#include "sqlext.h"
#include "sqldefs.h"
#include "gendefs.h"
#include "sqldb.h"
#include "dboption.h"
#include "oderror.h"

#ifndef __DATASOUR_H
#include "datasour.h"
#endif

#ifndef __DSINFO_H
#include "DSInfo.h" /* data source information */
#endif

#include "connect.h"

UCHAR dbConnection::ConnectionString[]="None";
UCHAR dbConnection::szConnStrOut[]="";
 /* constructor for dbConnection class */
dbConnection::dbConnection(HENV AppEnv)
 : dbDataSource(AppEnv)
{
 henv = AppEnv; /* save environment handle */
 InitODBCErrorInfo();
 Status = SQL_SUCCESS;
}
///////////////////////////////////////////////////////////////////
// FUNCTION NAME: ~dbConnection
// SYNOPSIS: destructor for dbConnection
///////////////////////////////////////////////////////////////////
dbConnection::~dbConnection()
{
}
///////////////////////////////////////////////////////////////////
// FUNCTION NAME: AllocateConnection
// SYNOPSIS: allocates a connection handle
///////////////////////////////////////////////////////////////////
HDBC dbConnection::AllocateConnection()
{
 /* get connection handle */
 Status = SQLAllocConnect(henv, &hdbc);
 if (Status != SQL_SUCCESS)
 {
 return NULL;
 };
 return hdbc;
}
///////////////////////////////////////////////////////////////////
// FUNCTION NAME: Connect
// SYNOPSIS: connect to SQL data source
///////////////////////////////////////////////////////////////////
RETCODE dbConnection::Connect(UCHAR *Userid, UCHAR *Password)
{
SWORD cbConnStrOut;
 memset(ConnectionString,'\x0',sizeof(ConnectionString));
 memset(szConnStrOut,'\x0',sizeof(szConnStrOut));
 strcpy((char *)ConnectionString,"DSN=");
 strncat((char *)ConnectionString,(char *)TrimmedDSName,
 strlen((char *)TrimmedDSName));
 /* ODBC driver connect */
 Status = SQLDriverConnect (hdbc,
 NULL,
 ConnectionString,
 SQL_NTS,
 szConnStrOut,
 CONECLEN,
 &cbConnStrOut,
 SQL_DRIVER_COMPLETE);
 if (Status != SQL_SUCCESS)
 {
 /* ODBC connect */
 Status = SQLConnect (hdbc,
 DSName,
 SQL_NTS,
 Userid,
 SQL_NTS,
 Password,
 SQL_NTS);
 if (Status != SQL_SUCCESS)
 {
 err.ErrorMsgMax = SQL_MAX_MESSAGE_LENGTH - 1;
 ErrStatus = SQLError(henv,
 hdbc,
 SQL_NULL_HSTMT,
 &err.szSQLState[0],
 &err.NativeError,
 &err.ErrorMsg[0],
 err.ErrorMsgMax,
 &err.ErrorMsgLength);
 }
 }
 ODBCLinkHandle[ActiveLinks] = hdbc; /* add handle to active list */
 if (ActiveLinks < 1)
 {

 strncpy((char *)ODBCLinkList,(char *)DSName,
 SQL_MAX_DSN_LENGTH);
 }
 else
 {
 strncat((char *)ODBCLinkList,(char *)DSName,sizeof(DSName));
 }
 ActiveLinks++;
 return Status;
}
///////////////////////////////////////////////////////////////////
// FUNCTION NAME: BuildConnectString
// SYNOPSIS: build connection string for ODBC driver
// return far pointer to connection string
///////////////////////////////////////////////////////////////////
POINTER dbConnection::BuildConnectString(char *str)
{
 UCHAR far *conec;

 conec = (UCHAR far *) malloc(CONECLEN);
 movedata(FP_SEG(str), FP_OFF(str),
 FP_SEG(conec), FP_OFF(conec),
 1+strlen(str));
 return conec;
}
///////////////////////////////////////////////////////////////////
// FUNCTION NAME: Disconnect
// SYNOPSIS: terminate the SQL connection
///////////////////////////////////////////////////////////////////
int dbConnection::Disconnect(UCHAR *LinkToDrop)
{

int i;
int j;
int nLink; /* index to hdbc to drop */
int offset;

HDBC ConnHandle;

 nLink = MatchLinkName(LinkToDrop);
 if (nLink < 0)
 {
 return OOPS;
 }
 ConnHandle = ODBCLinkHandle[nLink];
 /* ODBC disconnect */
 Status = SQLDisconnect(ConnHandle);
 if (Status)
 return Status;

 /* compress the list of active links and active handles */
 for (i=nLink; i < ActiveLinks;i++)
 {
 ODBCLinkHandle[i] = ODBCLinkHandle[i+1];
 }
 offset = nLink * LINK_NAME_LEN;
 j = offset;
 while ( ODBCLinkList[j+LINK_NAME_LEN] != '\0')
 {

 ODBCLinkList[j] = ODBCLinkList[j+LINK_NAME_LEN];
 j++;
 }
 ODBCLinkList[j] = '\0';
 --ActiveLinks;
 /* free conn handle */
 Status = SQLFreeConnect(ConnHandle);
 if (Status)
 return Status;

 return SQL_SUCCESS;
}
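
The shift loop inside Disconnect() compresses the packed list of link names: each name occupies LINK_NAME_LEN bytes, and dropping entry n moves every later byte left by one slot. A standalone sketch of that step follows; the constant value and the function name are assumptions, since the article's version operates on the static ODBCLinkList using the real LINK_NAME_LEN from sqldefs.h:

```cpp
#include <cstring>

const int LINK_NAME_LEN = 8;  // illustrative slot width

// Remove entry n from a packed, NUL-terminated list of fixed-width names
// by shifting every byte after the dropped slot left by LINK_NAME_LEN.
void DropLinkName(char *list, int n)
{
    int j = n * LINK_NAME_LEN;
    while (list[j + LINK_NAME_LEN] != '\0') {
        list[j] = list[j + LINK_NAME_LEN];
        j++;
    }
    list[j] = '\0';
}
```

The parallel ODBCLinkHandle array is compressed the same way in the listing, with an element-wise copy instead of a byte shift.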
///////////////////////////////////////////////////////////////////
// FUNCTION NAME: ProfileDataSource
// SYNOPSIS: information about driver and data source
///////////////////////////////////////////////////////////////////
RETCODE PASCAL dbConnection::ProfileDataSource( DataSourceInfo *inf )
{
 /* implementation of error handling is left to the user, since */
 /* the user interface may vary. Using QuickWin or EasyWin, you can */
 /* use printf statements. If you are writing a typical Windows app, */
 /* you can use MessageBox or BWCCMessageBox to display errors */

 Status = SQL_SUCCESS;
 Status = SQLGetInfo(hdbc,
 SQL_DRIVER_NAME,
 &inf->DriverName,
 sizeof(inf->DriverName),
 NULL);
 if (Status != SQL_SUCCESS)
 {
 GetErrorInfo();
 }
 Status = SQL_SUCCESS;
 Status = SQLGetInfo(hdbc,
 SQL_ACTIVE_STATEMENTS,
 (PTR)&inf->ActiveStatements,
 sizeof(inf->ActiveStatements),
 NULL);
 if (Status != SQL_SUCCESS)
 {
 GetErrorInfo();
 }
 Status = SQL_SUCCESS;
 Status = SQLGetInfo(hdbc,
 SQL_ACTIVE_CONNECTIONS,
 (PTR)&inf->ActiveConnections,
 sizeof(inf->ActiveConnections),
 NULL);
 if (Status != SQL_SUCCESS)
 {
 GetErrorInfo();
 }
 Status = SQL_SUCCESS;
 Status = SQLGetInfo(hdbc,
 SQL_DRIVER_VER,
 &inf->DriverVersion,
 sizeof(inf->DriverVersion),
 NULL);

 if (Status != SQL_SUCCESS)
 {
 GetErrorInfo();
 }
 Status = SQL_SUCCESS;
 Status = SQLGetInfo(hdbc,
 SQL_SERVER_NAME,
 &inf->ServerName,
 sizeof(inf->ServerName),
 NULL);
 if (Status != SQL_SUCCESS)
 {
 GetErrorInfo();
 }
 Status = SQL_SUCCESS;
 Status = SQLGetInfo(hdbc,
 SQL_USER_NAME,
 &inf->UserName,
 sizeof(inf->UserName),
 NULL);
 if (Status != SQL_SUCCESS)
 {
 GetErrorInfo();
 }
 Status = SQL_SUCCESS;
 Status = SQLGetInfo(hdbc,
 SQL_ODBC_API_CONFORMANCE,
 (PTR)&inf->ODBC_API_Level,
 sizeof(inf->ODBC_API_Level),
 NULL);
 if (Status != SQL_SUCCESS)
 {
 GetErrorInfo();
 }

 Status = SQL_SUCCESS;
 Status = SQLGetInfo(hdbc,
 SQL_ODBC_SAG_CLI_CONFORMANCE,
 (PTR)&inf->ODBC_SAG_Level,
 sizeof(inf->ODBC_SAG_Level),
 NULL);
 if (Status != SQL_SUCCESS)
 {
 GetErrorInfo();
 }
 Status = SQL_SUCCESS;
 Status = SQLGetInfo(hdbc,
 SQL_ODBC_SQL_CONFORMANCE,
 (PTR)&inf->ODBC_SQL_Level,
 sizeof(inf->ODBC_SQL_Level),
 NULL);
 if (Status != SQL_SUCCESS)
 {
 GetErrorInfo();
 }
 Status = SQL_SUCCESS;
 Status = SQLGetInfo(hdbc,
 SQL_DATABASE_NAME,
 &inf->DatabaseName,

 sizeof(inf->DatabaseName),
 NULL);
 if (Status != SQL_SUCCESS)
 {
 GetErrorInfo();
 }
 Status = SQL_SUCCESS;
 Status = SQLGetInfo(hdbc,
 SQL_DBMS_NAME,
 &inf->DBMSName,
 sizeof(inf->DBMSName),
 NULL);
 if (Status != SQL_SUCCESS)
 {
 GetErrorInfo();
 }
 Status = SQL_SUCCESS;
 Status = SQLGetInfo(hdbc,
 SQL_DBMS_VER,
 &inf->DBMSVersion,
 sizeof(inf->DBMSVersion),
 NULL);
 if (Status != SQL_SUCCESS)
 {
 GetErrorInfo();
 }
 /* IEF ? */
 Status = SQL_SUCCESS;
 Status = SQLGetInfo(hdbc,
 SQL_ODBC_SQL_OPT_IEF,
 &inf->IEFSupport,
 sizeof(inf->IEFSupport),
 NULL);
 if (Status != SQL_SUCCESS)
 {
 GetErrorInfo();
 }
 /* Support Procedures ? */
 Status = SQL_SUCCESS;
 Status = SQLGetInfo(hdbc,
 SQL_PROCEDURES,
 &inf->Procedures,
 sizeof(inf->Procedures),
 NULL);
 if (Status != SQL_SUCCESS)
 {
 GetErrorInfo();
 }
 /* detect changes in rows between fetches ? */
 Status = SQL_SUCCESS;
 Status = SQLGetInfo(hdbc,
 SQL_ROW_UPDATES,
 &inf->RowUpdates,
 sizeof(inf->RowUpdates),
 NULL);
 if (Status != SQL_SUCCESS)
 {
 GetErrorInfo();
 }

 /* all tables accessible */
 Status = SQL_SUCCESS;
 Status = SQLGetInfo(hdbc,
 SQL_ACCESSIBLE_TABLES,
 &inf->AccessibleTables,
 sizeof(inf->AccessibleTables),
 NULL);
 if (Status != SQL_SUCCESS)
 {
 GetErrorInfo();
 }
 if(inf->AccessibleProcedures[0] == 'Y')
 {
 /* all procedures accessible */
 Status = SQL_SUCCESS;
 Status = SQLGetInfo(hdbc,
 SQL_ACCESSIBLE_PROCEDURES,
 &inf->AccessibleProcedures,
 sizeof(inf->AccessibleProcedures),
 NULL);
 if (Status != SQL_SUCCESS)
 {
 GetErrorInfo();
 }
 }
 Status = SQL_SUCCESS;
 Status = SQLGetInfo(hdbc,
 SQL_CONCAT_NULL_BEHAVIOR,
 (PTR)&inf->ConcatNullBehavior,
 sizeof(inf->ConcatNullBehavior),
 NULL);
 if (Status != SQL_SUCCESS)
 {
 GetErrorInfo();
 }
 Status = SQL_SUCCESS;
 Status = SQLGetInfo(hdbc,
 SQL_CURSOR_COMMIT_BEHAVIOR,
 (PTR)&inf->CursorCommitBehavior,
 sizeof(inf->CursorCommitBehavior),
 NULL);
 if (Status != SQL_SUCCESS)
 {
 GetErrorInfo();
 }
 Status = SQL_SUCCESS;
 Status = SQLGetInfo(hdbc,
 SQL_CURSOR_ROLLBACK_BEHAVIOR,
 (PTR)&inf->CursorRollbackBehavior,
 sizeof(inf->CursorRollbackBehavior),
 NULL);
 if (Status != SQL_SUCCESS)
 {
 GetErrorInfo();
 }
 Status = SQL_SUCCESS;
 Status = SQLGetInfo(hdbc,
 SQL_DATA_SOURCE_READ_ONLY,
 &inf->DSReadOnly,

 sizeof(inf->DSReadOnly),
 NULL);
 if (Status != SQL_SUCCESS)
 {
 GetErrorInfo();
 }
 Status = SQL_SUCCESS;
 Status = SQLGetInfo(hdbc,
 SQL_MAX_COLUMN_NAME_LEN,
 (PTR)&inf->MaxColNameLen,
 sizeof(inf->MaxColNameLen),
 NULL);
 if (Status != SQL_SUCCESS)
 {
 GetErrorInfo();
 }
 Status = SQL_SUCCESS;
 Status = SQLGetInfo(hdbc,
 SQL_MAX_CURSOR_NAME_LEN,
 (PTR)&inf->MaxCursorNameLen,
 sizeof(inf->MaxCursorNameLen),
 NULL);
 if (Status != SQL_SUCCESS)
 {
 GetErrorInfo();
 }
 Status = SQL_SUCCESS;
 Status = SQLGetInfo(hdbc,
 SQL_MAX_OWNER_NAME_LEN,
 (PTR)&inf->MaxOwnerNameLen,
 sizeof(inf->MaxOwnerNameLen),
 NULL);
 if (Status != SQL_SUCCESS)
 {
 GetErrorInfo();
 }

 Status = SQL_SUCCESS;
 Status = SQLGetInfo(hdbc,
 SQL_MAX_PROCEDURE_NAME_LEN,
 (PTR)&inf->MaxProcedureNameLen,
 sizeof(inf->MaxProcedureNameLen),
 NULL);
 if (Status != SQL_SUCCESS)
 {
 GetErrorInfo();
 }
 Status = SQL_SUCCESS;
 Status = SQLGetInfo(hdbc,
 SQL_MAX_QUALIFIER_NAME_LEN,
 (PTR)&inf->MaxQualifierNameLen,
 sizeof(inf->MaxQualifierNameLen),
 NULL);
 if (Status != SQL_SUCCESS)
 {
 GetErrorInfo();
 }

 Status = SQL_SUCCESS;

 Status = SQLGetInfo(hdbc,
 SQL_MAX_TABLE_NAME_LEN,
 (PTR)&inf->MaxTableNameLen,
 sizeof(inf->MaxTableNameLen),
 NULL);
 if (Status != SQL_SUCCESS)
 {
 GetErrorInfo();
 }
 Status = SQL_SUCCESS;
 Status = SQLGetInfo(hdbc,
 SQL_MULT_RESULT_SETS,
 &inf->MultipleResultSets,
 sizeof(inf->MultipleResultSets),
 NULL);
 if (Status != SQL_SUCCESS)
 {
 GetErrorInfo();
 }
 Status = SQL_SUCCESS;
 Status = SQLGetInfo(hdbc,
 SQL_MULTIPLE_ACTIVE_TXN,
 &inf->MultipleActiveTransactions,
 sizeof(inf->MultipleActiveTransactions),
 NULL);
 if (Status != SQL_SUCCESS)
 {
 GetErrorInfo();
 }
 Status = SQL_SUCCESS;
 Status = SQLGetInfo(hdbc,
 SQL_OUTER_JOINS,
 &inf->OuterJoins,
 sizeof(inf->OuterJoins),
 NULL);
 if (Status != SQL_SUCCESS)
 {
 GetErrorInfo();
 }
 Status = SQL_SUCCESS;
 Status = SQLGetInfo(hdbc,
 SQL_OWNER_TERM,
 &inf->OwnerTerm,
 sizeof(inf->OwnerTerm),
 NULL);
 if (Status != SQL_SUCCESS)
 {
 GetErrorInfo();
 }
 Status = SQL_SUCCESS;
 Status = SQLGetInfo(hdbc,
 SQL_PROCEDURE_TERM,
 &inf->ProcTerm,
 sizeof(inf->ProcTerm),
 NULL);
 if (Status != SQL_SUCCESS)
 {
 GetErrorInfo();
 }

 Status = SQL_SUCCESS;
 Status = SQLGetInfo(hdbc,
 SQL_QUALIFIER_TERM,
 &inf->QualifierTerm,
 sizeof(inf->QualifierTerm),
 NULL);
 if (Status != SQL_SUCCESS)
 {
 GetErrorInfo();
 }
 Status = SQL_SUCCESS;
 Status = SQLGetInfo(hdbc,
 SQL_EXPRESSIONS_IN_ORDERBY,
 &inf->OrderBy,
 sizeof(inf->OrderBy),
 NULL);
 if (Status != SQL_SUCCESS)
 {
 GetErrorInfo();
 }
 Status = SQL_SUCCESS;
 Status = SQLGetInfo(hdbc,
 SQL_TXN_CAPABLE,
 (PTR)&inf->TransCapable,
 sizeof(inf->TransCapable),
 NULL);
 if (Status != SQL_SUCCESS)
 {
 GetErrorInfo();
 }

 if (Status != SQL_SUCCESS)
 return Status;
 return(SQL_SUCCESS);
}
///////////////////////////////////////////////////////////////////
// FUNCTION NAME: GetErrorInfo
// SYNOPSIS: check SQLError info
///////////////////////////////////////////////////////////////////
void dbConnection::GetErrorInfo()
{
 /* implementation of error handling is left to the user, since */
 /* the user interface may vary. Using QuickWin or EasyWin, you can */
 /* use printf statements. If you are writing a typical Windows app, */
 /* you can use MessageBox or BWCCMessageBox to display errors */
 err.ErrorMsgMax = SQL_MAX_MESSAGE_LENGTH - 1;
 ErrStatus = SQLError(henv,
 hdbc,
 SQL_NULL_HSTMT,
 &err.szSQLState[0],
 &err.NativeError,
 &err.ErrorMsg[0],
 err.ErrorMsgMax,
 &err.ErrorMsgLength);
}
///////////////////////////////////////////////////
// FUNCTION NAME: InitConnData
// SYNOPSIS: initialize conn data
///////////////////////////////////////////////////

void dbConnection::InitConnData()
{
 Status = 0;
 ErrStatus = 0;
 cstr = NULL;
 ConnTableCount = 0;
 memset(ConnectionString,'\x0',sizeof(ConnectionString));
 memset(ConnTableList,'\x0',sizeof(ConnTableList));
}

[LISTING SIX]

///////////////////////////////////////////////////
// FILE NAME: request.h TITLE: SQL request
// AUTHOR: Ken North Resource Group, Inc.
// 2604B El Camino Real, #351
// copyright(c)1992 Carlsbad, CA 92008
///////////////////////////////////////////////////

#ifndef __REQUEST_H
#define __REQUEST_H

#ifndef __CONNECT_H
#include "connect.h"
#endif

#ifndef __DBOPTION_H
#include "dboption.h"
#endif

#ifndef __RQOPTION_H
#include "rqoption.h"
#endif

#ifndef __TABLESET_H
#include "tableset.h"
#endif
 // dbRequest is a child of dbConnection
class dbRequest : public dbConnection {
protected:
 UCHAR Statement[MAXSQL];
 UCHAR *stmt;
 RETCODE Status;
 // exceeds max statement length for the database driver
 BOOL ExceedsMax;

 HENV henv;
 HSTMT hAStmt;

 CURNAME ACursor; /* for named cursors */
 UCHAR CursorName[CURSOR_NAME_LEN];
public:
 UCHAR far *stptr;
 HSTMT hstmt;

 DataBaseOptions *dbopt; // options for this database

 RequestOptions OptionInfo; // options for this request


 dbRequest(HENV, HDBC, DataBaseOptions *, UCHAR *);
 virtual ~dbRequest();

 virtual HSTMT AllocateStatement(HDBC);
 virtual POINTER BuildRequest(BOOL);
 virtual RETCODE ExecuteStatement(POINTER);
 virtual SWORD FindNumberOfColumns(HSTMT);
 virtual CURNAME GetCursorforRequest(HDBC);
 SDWORD GetTableList(char *);
 void InitTableResultSet(TableResultSet *);
 virtual int TerminateStatement(HSTMT);
};
#endif

[LISTING SEVEN]

/////////////////////////////////////////////////////////////////////
// FILE NAME: request.cpp TITLE: SQL request
// AUTHOR: Ken North Resource Group, Inc.
// 2604B El Camino Real, #351
// copyright(c)1992 Carlsbad, CA 92008
/////////////////////////////////////////////////////////////////////
// SYNOPSIS: SQL request class
/////////////////////////////////////////////////////////////////////

#include <windows.h>
#include <stdio.h>
#include <dos.h>
#include <stdlib.h>
#include <string.h>
#include <malloc.h>
#include "sql.h"
#include "sqlext.h"
#include "gendefs.h"
#include "sqldefs.h"
#include "sqldb.h"
#include "connect.h"
#include "dboption.h" /* database options */

#ifndef __ODERROR_H
#include "oderror.h"
#endif

#ifndef __RGSTRING_H
#include "rgstring.h"
#endif

#include "request.h"

extern char ErrStat; /* return code for string calls */
extern char MemModel; /* large / huge model */
 /* Constructor for the 'Request' class */
dbRequest :: dbRequest( HENV envhandle,
 HDBC connhandle,
 DataBaseOptions *opt,
 UCHAR *SQLstring )
 : dbConnection(envhandle)
{
 // save SQL string and options

 dbopt = opt;
 strcpy((char *)Statement,(char *)SQLstring);
 stmt = &Statement[0];
 // set request environment handle
 henv = envhandle;
 // set request connection handle
 hdbc = connhandle;

}
/////////////////////////////////////////////////////////////////////
// FUNCTION NAME: ~dbRequest
// SYNOPSIS: destructor
/////////////////////////////////////////////////////////////////////
dbRequest::~dbRequest()
{
 free(stptr);
}
/////////////////////////////////////////////////////////////////////
// FUNCTION NAME: BuildRequest
// SYNOPSIS: prepare an SQL request
/////////////////////////////////////////////////////////////////////
POINTER dbRequest::BuildRequest( BOOL ExceedsMax )
{
 if (dbopt->AllocStmtHandle)
 {
 hAStmt = AllocateStatement(hdbc);
 }
 if (OptionInfo.UseCursor)
 {
 ACursor = GetCursorforRequest(hdbc);
 }
 stptr = (UCHAR far *) malloc(MAXSQL);
 movedata(FP_SEG(stmt), FP_OFF(stmt),
 FP_SEG(stptr), FP_OFF(stptr),
 1+strlen((char *)stmt));
 return stptr;
}
/////////////////////////////////////////////////////////////////////
// FUNCTION NAME: ExecuteStatement
// SYNOPSIS: executes a direct or prepared request
/////////////////////////////////////////////////////////////////////
RETCODE dbRequest::ExecuteStatement( POINTER stptr )
{
 Status = SQLExecDirect (hAStmt,
 stptr,
 SQL_NTS );
 if (Status != SQL_SUCCESS)
 {
 return Status;
 };
 return SQL_SUCCESS;
}
/////////////////////////////////////////////////////////////////////
// FUNCTION NAME: FindNumberOfColumns
// SYNOPSIS: get the number of columns in the request
/////////////////////////////////////////////////////////////////////
SWORD dbRequest::FindNumberOfColumns( HSTMT hstmt )
{
SWORD n;


 Status = SQLNumResultCols (hAStmt, &n);
 if (Status != SQL_SUCCESS)
 {
 return Status;
 };
 return n;
}
/////////////////////////////////////////////////////////////////////
// FUNCTION NAME: GetTableList
// SYNOPSIS: get the tables for this SQL data source
/////////////////////////////////////////////////////////////////////
SDWORD dbRequest::GetTableList(char *TableList)
{
SDWORD nTables=0;
SDWORD QualifierLength;
SDWORD OwnerLength;
SDWORD NameLength;
SDWORD TypeLength;
SDWORD RemarksLength;

UCHAR FAR *szTableQualifier=NULL;
SWORD cbTableQualifier=0;
UCHAR FAR *szTableOwner=NULL;
SWORD cbTableOwner=0;
UCHAR FAR *szTableName=NULL;
SWORD cbTableName=0;
UCHAR FAR *szTableType=NULL;
SWORD cbTableType=0;

TableResultSet *rset;
char NameString[TABLE_NAME_LEN+1];

 rset = new(TableResultSet);
 if(!rset)
 return NO_MEMORY;
 InitTableResultSet(rset);
 Status = SQL_SUCCESS;
 ErrStatus = 0;
 Status = SQLTables( hstmt,
 szTableQualifier,
 cbTableQualifier,
 szTableOwner,
 cbTableOwner,
 szTableName,
 cbTableName,
 szTableType,
 cbTableType);
 if(Status == SQL_ERROR)
 {
 err.ErrorMsgMax = SQL_MAX_MESSAGE_LENGTH - 1;
 ErrStatus = SQLError(henv,
 hdbc,
 SQL_NULL_HSTMT,
 &err.szSQLState[0],
 &err.NativeError,
 &err.ErrorMsg[0],
 err.ErrorMsgMax,
 &err.ErrorMsgLength);

 if (ErrStatus < 1)
 return Status;
 }
 Status = SQL_SUCCESS;
 Status=SQLBindCol(hstmt, 1, SQL_C_CHAR,
 &rset->TABLE_QUALIFIER,
 sizeof(rset->TABLE_QUALIFIER), &QualifierLength);
 if(Status == SQL_ERROR)
 {
 err.ErrorMsgMax = SQL_MAX_MESSAGE_LENGTH - 1;
 ErrStatus = SQLError(henv,
 hdbc,
 SQL_NULL_HSTMT,
 &err.szSQLState[0],
 &err.NativeError,
 &err.ErrorMsg[0],
 err.ErrorMsgMax,
 &err.ErrorMsgLength);
 if (ErrStatus < 1)
 return Status;
 }
 Status = SQL_SUCCESS;
 ErrStatus = 0;
 Status=SQLBindCol(hstmt, 2, SQL_C_CHAR,
 &rset->TABLE_OWNER,
 sizeof(rset->TABLE_OWNER), &OwnerLength);
 if(Status == SQL_ERROR)
 {
 err.ErrorMsgMax = SQL_MAX_MESSAGE_LENGTH - 1;
 ErrStatus = SQLError(henv,
 hdbc,
 SQL_NULL_HSTMT,
 &err.szSQLState[0],
 &err.NativeError,
 &err.ErrorMsg[0],
 err.ErrorMsgMax,
 &err.ErrorMsgLength);
 if (ErrStatus < 1)
 return Status;
 }
 Status = SQL_SUCCESS;
 ErrStatus = 0;
 Status=SQLBindCol(hstmt, 3, SQL_C_CHAR,
 &rset->TABLE_NAME,
 sizeof(rset->TABLE_NAME), &NameLength);
 if(Status == SQL_ERROR)
 {
 err.ErrorMsgMax = SQL_MAX_MESSAGE_LENGTH - 1;
 ErrStatus = SQLError(henv,
 hdbc,
 SQL_NULL_HSTMT,
 &err.szSQLState[0],
 &err.NativeError,
 &err.ErrorMsg[0],
 err.ErrorMsgMax,
 &err.ErrorMsgLength);
 if (ErrStatus < 1)
 return Status;
 }

 Status = SQL_SUCCESS;
 ErrStatus = 0;
 Status=SQLBindCol(hstmt, 4, SQL_C_CHAR,
 &rset->TABLE_TYPE,
 sizeof(rset->TABLE_TYPE), &TypeLength);
 if(Status == SQL_ERROR)
 {
 err.ErrorMsgMax = SQL_MAX_MESSAGE_LENGTH - 1;
 ErrStatus = SQLError(henv,
 hdbc,
 SQL_NULL_HSTMT,
 &err.szSQLState[0],
 &err.NativeError,
 &err.ErrorMsg[0],
 err.ErrorMsgMax,
 &err.ErrorMsgLength);
 if (ErrStatus < 1)
 return Status;
 }

 Status = SQL_SUCCESS;
 ErrStatus = 0;
 Status=SQLBindCol(hstmt, 5, SQL_C_CHAR,
 &rset->REMARKS,
 sizeof(rset->REMARKS), &RemarksLength);
 if(Status == SQL_ERROR)
 {
 err.ErrorMsgMax = SQL_MAX_MESSAGE_LENGTH - 1;
 ErrStatus = SQLError(henv,
 hdbc,
 SQL_NULL_HSTMT,
 &err.szSQLState[0],
 &err.NativeError,
 &err.ErrorMsg[0],
 err.ErrorMsgMax,
 &err.ErrorMsgLength);
 if (ErrStatus < 1)
 return Status;
 }
    /* Loop to fetch data for SQLTables */
    while (Status != SQL_NO_DATA_FOUND)
    {
        InitTableResultSet(rset);
        Status = SQL_SUCCESS;
        ErrStatus = 0;
        Status = SQLFetch(hstmt);
        if (Status == SQL_NO_DATA_FOUND)    /* end of result set; don't count an empty row */
            break;
        if (Status == SQL_ERROR)
        {
            err.ErrorMsgMax = SQL_MAX_MESSAGE_LENGTH - 1;
            ErrStatus = SQLError(henv, hdbc, SQL_NULL_HSTMT,
                                 &err.szSQLState[0], &err.NativeError,
                                 &err.ErrorMsg[0], err.ErrorMsgMax,
                                 &err.ErrorMsgLength);
            if (ErrStatus < 1)
                return Status;
        }
        strcpy(NameString, (char *)rset->TABLE_NAME);
        right_fill(NameString, SPACE, TABLE_NAME_LEN);
        strncat(TableList, NameString, TABLE_NAME_LEN);
        nTables++;
    }
    return nTables;
}
/////////////////////////////////////////////////////////////////////
// FUNCTION NAME: InitTableResultSet
// SYNOPSIS: initialize result set for GetTables
/////////////////////////////////////////////////////////////////////
void dbRequest::InitTableResultSet(TableResultSet *rs)
{
    memset(rs->TABLE_QUALIFIER, '\x0', sizeof(rs->TABLE_QUALIFIER));
    memset(rs->TABLE_OWNER, '\x0', sizeof(rs->TABLE_OWNER));
    memset(rs->TABLE_NAME, '\x0', sizeof(rs->TABLE_NAME));
    memset(rs->TABLE_TYPE, '\x0', sizeof(rs->TABLE_TYPE));
    memset(rs->REMARKS, '\x0', sizeof(rs->REMARKS));
}
/////////////////////////////////////////////////////////////////////
// FUNCTION NAME: TerminateStatement
// SYNOPSIS: terminate the statement
/////////////////////////////////////////////////////////////////////
RETCODE dbRequest::TerminateStatement(HSTMT hstmt)
{
    Status = SQLFreeStmt(hstmt, SQL_DROP);
    if (Status)
        return Status;
    else
        return SQL_SUCCESS;
}
/////////////////////////////////////////////////////////////////////
// FUNCTION NAME: AllocateStatement
// SYNOPSIS: allocates a statement handle
/////////////////////////////////////////////////////////////////////
HSTMT dbRequest::AllocateStatement(HDBC hdbc)
{
    Status = SQLAllocStmt(hdbc, &hstmt);
    if (Status != SQL_SUCCESS)
        return NULL;
    return hstmt;
}
/////////////////////////////////////////////////////////////////////
// FUNCTION NAME: GetCursorforRequest
// SYNOPSIS: gets a cursor for this request
/////////////////////////////////////////////////////////////////////
CURNAME dbRequest::GetCursorforRequest(HDBC hdbc)
{
    return NULL;
}

[LISTING EIGHT]

/////////////////////////////////////////////////////////////////////
// FILE NAME: column.h  TITLE: SQL column class
// AUTHOR: Ken North, Resource Group, Inc.
// 2604B El Camino Real, #351, Carlsbad, CA 92008
// copyright (c) 1992
/////////////////////////////////////////////////////////////////////

#ifndef __COLUMN_H
#define __COLUMN_H

#ifndef __DBOPTION_H
#include "dboption.h"
#endif

#ifndef __REQUEST_H
#include "request.h"
#endif

#ifndef __COLDESC_H
#include "coldesc.h"
#endif
/* 'dbColumn' is a child of 'dbRequest' */
class dbColumn : public dbRequest {
protected:
 HSTMT hstmt;
public:
 RETCODE Status;
 /* column info for this request */
 ColumnDesc ColumnInfo[MAXFIELD];

 dbColumn(HENV, HDBC, DataBaseOptions *, HSTMT, UCHAR *, UWORD);
 virtual ~dbColumn();

 virtual int Describe(UWORD, ColumnDesc *);
 virtual char * DecodeODBCDataType(SWORD);
};
#endif

[LISTING NINE]

/////////////////////////////////////////////////////////////////////
// FILE NAME: column.cpp  TITLE: database column
// AUTHOR: Ken North, Resource Group, Inc.
// 2604B El Camino Real, #351, Carlsbad, CA 92008
// copyright (c) 1992
/////////////////////////////////////////////////////////////////////
// SYNOPSIS: implementation of SQL column class
/////////////////////////////////////////////////////////////////////

#include <windows.h>
#include <stdio.h>
#include <dos.h>
#include <stdlib.h>
#include <string.h>
#include <malloc.h>
#include "sq