Pure Software Untangles DCE Threads
By
Steven Sonnenberg
Vice President of Technology, IntelliSoft Corp.
Pure Software has
developed a suite of software development tools
that use a technique called object code insertion
(OCI) to obtain detailed information about the
behavior of applications. Purify, the company's
flagship product, concerns itself with access of
memory -- array bounds violations, memory leaks
-- as well as a host of other areas. Another tool
in the quiver, Quantify, provides graphical
profiling, while another, PureCoverage, performs
code coverage analysis. Using OCI, Pure
Software's tools offer invaluable
insight into an application's behavior. This
review focuses on Purify version 3.0.1 beta, as
tested against SunOS 1.1.1b and Solaris 2.4
running DCE 1.0.3 and DCE 1.0.2 respectively.
The testing sought
to discover the benefits of a world-class memory
error-detection product in a multithreaded
environment. One new feature of Pure's tools is
support for DCE threads. Thread support
includes recording the identity of the thread
accessing memory and the number of threads
created; the product itself is also
"thread-safe." As the tests progressed,
we learned that memory access and proper thread
programming are more closely related than may be
widely believed.
Syntax Checking
Multithreaded
programming introduces several new challenges to
the software development process. The first is
the issue of data synchronization. With multiple
threads executing the same body of code and all
having access to the same data, additional steps
are necessary to access this global data
coherently. The DCE runtime supports the draft 4
specification of the POSIX 1003.4a real-time
extensions, which are more commonly known as
Pthreads. The Pthreads interface defines a set of
operations for using mutual-exclusion variables,
or mutexes, which can be used to protect data
access.
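A minimal sketch of mutex-protected data access, assuming the modern POSIX Pthreads API rather than the DCE draft 4 interface the review tested (the draft calls differ in detail). The names counter, counter_lock, increment and run_counter_demo are hypothetical:

```c
#include <assert.h>
#include <pthread.h>

#define NTHREADS   4
#define INCREMENTS 10000

/* Shared data and the mutex that guards it (hypothetical names). */
static long counter = 0;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

/* Each thread adds INCREMENTS to the shared counter, one locked step at a time. */
static void *increment(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++) {
        pthread_mutex_lock(&counter_lock);
        counter++;                      /* only one thread is ever in here */
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Runs NTHREADS incrementing threads and returns the final count. */
long run_counter_demo(void)
{
    pthread_t tid[NTHREADS];

    counter = 0;
    for (int i = 0; i < NTHREADS; i++) {
        int rc = pthread_create(&tid[i], NULL, increment, NULL);
        assert(rc == 0);
    }
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(tid[i], NULL);
    return counter;
}
```

Without the lock and unlock calls, the increments from different threads could interleave and some updates would be lost.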
Another set of
issues involves event synchronization, whereby the
programmer tries to get threads to cooperate on a
set of tasks. This is accomplished using
condition variables, which let one
or more threads synchronize their execution
based on some condition. If a novice programmer
omits the second argument to pthread_join(),
Purify will announce that an invalid pointer was
used to store the status of the exiting thread.
Another common mistake is forgetting to
initialize a mutex. An uninitialized mutex
can cause a core dump; worse, it can cause a
hard-to-find error with
dangerous side effects. Using Purify, the
error is caught and flagged in the application
rather than in the library, as shown here.

Line 01 classifies
this error as "zero page
read"--attempting to read from a zero
address. This occurred within the DCE library
(where it would be impossible to debug without
sources), but looking up the stack frames, we
notice on line 04 that this was called from our
dummy() function. After we expanded the reference
to dummy(), the viewer displayed the relevant source
code, indicating that the error
happened while trying to lock a mutex. This error
turns out to be caused by not initializing the
mutex, which left a zero in an internal pointer.
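To illustrate the initialization step whose absence triggers this kind of error, here is a minimal sketch assuming the modern POSIX Pthreads API (not the DCE draft 4 calls used in the test program). The struct account type and its functions are hypothetical:

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical record whose mutex lives in heap memory. */
struct account {
    pthread_mutex_t lock;
    long balance;
};

/* Allocates an account and initializes its mutex before first use. */
struct account *account_create(long initial)
{
    struct account *a = malloc(sizeof *a);
    if (a == NULL)
        return NULL;
    /* Skipping this call leaves the mutex holding whatever malloc returned. */
    if (pthread_mutex_init(&a->lock, NULL) != 0) {
        free(a);
        return NULL;
    }
    a->balance = initial;
    return a;
}

/* Adds amount under the lock and returns the new balance. */
long account_deposit(struct account *a, long amount)
{
    long result;

    pthread_mutex_lock(&a->lock);
    a->balance += amount;
    result = a->balance;
    pthread_mutex_unlock(&a->lock);
    return result;
}

/* Destroys the mutex before releasing the memory that contains it. */
void account_destroy(struct account *a)
{
    pthread_mutex_destroy(&a->lock);
    free(a);
}
```

A mutex with static storage can instead be set up with PTHREAD_MUTEX_INITIALIZER; heap-allocated mutexes such as this one need the explicit call.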
Stack Overflow - Bounds
Checking
Another issue
facing multithreaded application developers
concerns stack usage. In a multithreaded
application, each thread requires its own stack,
and unlike a Unix process, whose stack can grow
almost without bound, each thread typically has a
fixed stack size of two to eight pages. If a
thread exceeds its stack, it will likely corrupt
something of value such as another thread's stack
or part of the heap. The DCE runtime attempts to
notify applications of this errant behavior by
creating a red zone (an unreadable page at the
bottom of each thread's stack) to guard against
this overflow. It isn't difficult to imagine a
thread whose automatic variables could exceed
this guard page and continue corrupting other
memory regions. Consider the recursive factorial
shown here:

When this program
runs, depending on the value of
"factor," it either runs successfully
to completion or it spins and burns (a
description used when the program dies in such a
way as to leave no useful trace). The next figure
shows the output from Purify: Line 01 indicates
the nature of the error -- in this case, writing
beyond the stack. Lines 03, 04 and 12 through 19
show the stack backtrace. In this case the thread
had called factorial recursively and during the
fourth invocation attempted to write beyond its
stack. Line 22 indicates that the variable "i"
was being initialized when
the error was detected. Lines 05 through 11 are
the result of expanding line 04. The factorial
function had called itself three times
successfully, but on the fourth invocation, the
automatic variables had exceeded the stack.
Fortunately, most thread packages allow the
stack size to be specified as a thread
attribute, so that it can be enlarged when
deep recursion is required.
Tracking Race Conditions
Consider this
simple threaded programming example: create N
worker threads and wait until they all complete.
To do this, we can call pthread_create() to spawn
"N" threads and pthread_join() on each
of them sequentially to wait until they have
terminated. (See code below.)
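A minimal sketch of this create-and-join pattern, assuming the modern POSIX Pthreads API rather than the DCE draft 4 interface. The names worker and run_workers are hypothetical, and the worker body is a placeholder. Note that pthread_join() is given the address of a real variable for the exit status:

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define N 4

/* Hypothetical worker: returns the square of its index as its exit status. */
static void *worker(void *arg)
{
    intptr_t id = (intptr_t)arg;
    return (void *)(id * id);
}

/* Spawns N workers, joins them in order, and returns the sum of their
   statuses; returns -1 if a thread cannot be created or joined. */
long run_workers(void)
{
    pthread_t tid[N];
    long sum = 0;
    int created = 0;

    for (intptr_t i = 0; i < N; i++) {
        if (pthread_create(&tid[i], NULL, worker, (void *)i) != 0)
            break;
        created++;
    }
    for (int i = 0; i < created; i++) {
        void *status = NULL;            /* valid storage for the exit status */
        if (pthread_join(tid[i], &status) != 0)
            return -1;
        sum += (long)(intptr_t)status;
    }
    return created == N ? sum : -1;
}
```

Joining sequentially is safe even if the workers finish in a different order: each pthread_join() simply returns at once when its thread has already terminated.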

Using threads
looks simple enough -- a few new APIs. What's the
big deal? Now let's add the requirement that
after each worker completes its work, it must
wait for all other workers to finish before
performing some cleanup operation prior to
termination. The work is synchronized by the
condition variable CV1 and the cleanup is
synchronized by CV2. A potential solution is to
use the master thread as the synchronizing
thread, as shown below.
Although this
solution appears to be correct, Purify points out
that it is not. Why? Examine the output when run
under Purify.

Line 01 reports
a free memory read (FMR) error:
thread 3 (line 02) was reading from a
piece of memory that had already been returned to
the heap. Lines 03 through 09 contain the last
six levels of the stack backtrace of which the
top three are part of the DCE threads support
routines. At line 06, the function dummy(), which
was part of our test program, was identified.
Using the Purify graphical display tool, we could
expand the output at many levels to control the
amount of detail, from a one-line summary to editing the
source at the location itself. Lines 12 through
16 detail the history of this memory while lines
17 through 22 show where it was freed.
What happened is
that the programmer falsely assumed that after
the call to pthread_cond_broadcast() [Master:11],
all threads had run to completion or at
least had begun their cleanup. As the Purify
output shows, thread 3 was still using the
condition variable when the Master thread had
already destroyed it (line 20). A remedy for this
situation is the following code.
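One way to express such a remedy, sketched under the assumption of the modern POSIX Pthreads API (the DCE draft 4 calls differ in detail), is to have the master join every worker before destroying the condition variables. All names here (sync_worker, run_synchronized, cv1, cv2 and the counters) are hypothetical, and the roles given to cv1 and cv2 are one plausible reading of the design described above, not necessarily the author's code:

```c
#include <assert.h>
#include <pthread.h>

#define NWORKERS 3

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cv1;   /* workers -> master: "my work is done" */
static pthread_cond_t cv2;   /* master -> workers: "begin cleanup"   */
static int work_done;        /* workers that have finished their work */
static int cleanup_ok;       /* set once every worker has finished    */
static int cleaned;          /* workers that have cleaned up          */

static void *sync_worker(void *arg)
{
    (void)arg;
    /* ... the real work would happen here ... */
    pthread_mutex_lock(&m);
    work_done++;
    pthread_cond_signal(&cv1);
    while (!cleanup_ok)                 /* loop guards against spurious wakeups */
        pthread_cond_wait(&cv2, &m);
    cleaned++;                          /* stand-in for the cleanup step */
    pthread_mutex_unlock(&m);
    return NULL;
}

/* Runs the workers and returns how many completed their cleanup. */
int run_synchronized(void)
{
    pthread_t tid[NWORKERS];
    int result;

    work_done = cleanup_ok = cleaned = 0;
    pthread_cond_init(&cv1, NULL);
    pthread_cond_init(&cv2, NULL);
    for (int i = 0; i < NWORKERS; i++) {
        int rc = pthread_create(&tid[i], NULL, sync_worker, NULL);
        assert(rc == 0);
    }

    pthread_mutex_lock(&m);
    while (work_done < NWORKERS)        /* wait until all work is finished */
        pthread_cond_wait(&cv1, &m);
    cleanup_ok = 1;
    pthread_cond_broadcast(&cv2);       /* wakes waiters; they have not run yet */
    pthread_mutex_unlock(&m);

    /* Join first: only then is no thread still touching cv1 or cv2. */
    for (int i = 0; i < NWORKERS; i++)
        pthread_join(tid[i], NULL);
    result = cleaned;
    pthread_cond_destroy(&cv1);
    pthread_cond_destroy(&cv2);
    return result;
}
```

The broadcast only makes the waiting threads runnable. Moving the destroy calls after the joins removes the window in which a worker could still be inside pthread_cond_wait() on a destroyed condition variable.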
Timing
One last benefit
that Purify adds pertains to the tendency of
multithreaded applications to change behavior
when moved between platforms of different
architectures. For example, the number of
processors and the load of the system can affect
the scheduling characteristics of the threads.
Purify's instrumentation affects the behavior of
applications by adding instructions to
the code path, which can have a significant
impact on the scheduling of threads. This
variation is helpful in its own right: it changes
the run-time characteristics of threaded
applications and can bring timing-dependent
problems to the surface.
Summary
The DCE runtime
contains several other facilities, most notably
the RPC runtime, which entails new challenges
both for Purify and for programmers. It makes
extensive use of internal memory management
techniques to efficiently manage communications.
DCE, like its cousin Motif, is loaded with memory
allocation functions, and these are bound to
cause grief in commercial applications without
tools to perform proper memory allocation
analysis. Purify is such a tool. Purify can play
a critical role in the development and debugging
of threaded applications, especially given
how many threading issues can be uncovered
by memory access analysis.
Steven
Sonnenberg was a featured speaker at the
OSF DCE Users & Developers Conference.