DCI's
Publication Date: March 28, 1997
Client/Server Process Partitioning:
Do It Now or Do It Later
By Lou Russell
Founder and President, Russell Martin & Associates
A key issue in
client/server design is how to partition the
processes of an application to maximize
performance and minimize maintenance expense.
Partitioning means physically locating the data,
process and presentation components of an
application across a client/server installation.
For example, should you split the components
between servers or place copies on multiple
servers (a distributed application)?
There has been
much less attention paid to splitting the
processing of an application between clients and
servers, in spite of the confusion and
inefficiencies which arise from poor process
design and partitioning. This article outlines
the key steps to follow in process partitioning.
Consider this
common scenario: A company has purchased a visual
development tool to help them implement their
first client/server application. Initially it's a
struggle, but the programmers eventually get
comfortable prototyping powerful GUI screens. The
database designers eventually adapt to the new
platforms, taking their relational theories
easily along with them. What happens to the
actual process, the parts of the system that
transform data? If your people are
ex-mainframers, their first response is to put it
all on the server. This causes a tremendous
bottleneck on the network as clients compete for
access. If your people are ex-PCers, their first
response is to put it all on the client. This
creates havoc when the next release of the
software comes out and only three-quarters of the
clients get the update. What is needed is a
rigorous set of criteria that supports more than an
all-or-nothing choice.
In this article,
the following questions will be addressed:
- What is the physical layout?
- Where will the manual processes interface with the
  automated processes?
- What are the response time requirements?
- What are the data currency requirements?
- What are the process timings and frequencies?
- Which are the volatile processes?
- Where are there opportunities for "reuse"?
- Are the process, presentation and data
  partitioning synchronized?
These questions
cannot and should not be attacked linearly; as
requirements and constraints become clearer, many
of these questions will have to be revisited.
Before these
questions can be answered, a clear understanding
of the business requirements is needed. Many
people implementing client/server mistakenly
believe that analysis is no longer necessary. The
result is often an "endless loop" of
prototyping--lack of analysis up front creates
days of turmoil troubleshooting performance
problems.
A business
requirement should contain the following:
- The business opportunity being addressed (why are we
  building this system?)
- The scope of the project
- The process requirements (using process models)
- The data requirements (using data models)
- The business events
The type of model
you use is not as important as the effort put
into making the analysis as complete as it can be
(analysis is never complete). If you are
interested in applying object-oriented concepts
to your client/server development, object
analysis can easily be substituted for the
process and data requirements. Business events
are a critical component regardless.
1. Physical Layouts
Process
partitioning is interrelated with network design.
Choices about where the processes will reside
will be partially driven by the number of clients
and servers and the topology of the final network
and vice versa. Equally important to the network
design is the nature of the processes. The
network designer should work with the process
designer to determine what the network needs to
look like, based on the physical needs. At this
point the following should be clearly documented:
- The number of users (equals log-on IDs)
- The physical location of each user
- The physical locations of existing equipment
  (including mainframes, workstations, printers)
- A table of business events by user
As the network
design evolves, this documentation of the
physical layout must be continually updated.
2. Manual Interfaces
The question is:
Which processes will be done by the system and
which processes will be done by people? Everyone
has a story of a process that was mistakenly
mechanized, such as mechanizing a neighborhood
newsletter with a sophisticated desktop
publishing package when the results would have
been faster and better looking had they used word
processing, scissors and glue. If it is true that
20 percent of the processes do 80 percent of the
work, many of the remaining processes aren't
worth the effort to automate.
Consider the
example of a system designed to process orders at
a quick oil change company (like Jiffy Lube).
There would be a process (previously documented
in the business requirements described above)
called "Enter customer
order" to receive the details about the
customer, car and the services that were
requested when the customer first entered the
shop. Is this a manual process or a mechanized
process? Actually, it contains both: a manual
process, "Enter customer order," which
is the actual data entry done by the Jiffy Lube
attendant, and a mechanized process, "Edit
customer order," which is the GUI screen and
edit processing that populates the order file. If
you are using a data flow diagram for your
process model, these types of processes are
generally receiving input from or sending output
to an external entity (usually a person or
external system).
Why do you care?
There are two reasons: First, it is not correct
to show a manual process updating a mechanized
file. Second, splitting the manual process out
into its true components helps you identify GUI
prototyping opportunities.
Think about it--it
is impossible for a data entry person to update a
mechanized file directly. The only way to do it
is through a mechanized process. If you are not
showing this mechanized process, you have missed
part of the requirements and one of the processes
that you must partition.
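The rule that a manual process never updates a mechanized file directly can be checked mechanically once the process model is written down. The following is a minimal sketch in Python, not part of any particular modeling tool; the `Process` record and the `manual_processes_updating_files` helper are hypothetical names invented for this illustration, and "order file" stands in for whatever file the model actually names.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Process:
    """One process from the process model (hypothetical structure)."""
    name: str
    manual: bool                                       # True if a person does it
    updates: list[str] = field(default_factory=list)   # mechanized files written


def manual_processes_updating_files(processes):
    """Return names of manual processes shown writing to a mechanized file."""
    return [p.name for p in processes if p.manual and p.updates]


# First draft of the model: one process does everything.
draft = [Process("Enter customer order", manual=True, updates=["order file"])]

# After splitting out the mechanized edit process.
split = [
    Process("Enter customer order", manual=True),
    Process("Edit customer order", manual=False, updates=["order file"]),
]

print(manual_processes_updating_files(draft))   # the draft violates the rule
print(manual_processes_updating_files(split))   # the split model does not
```

Any name the check returns marks a place where a mechanized process is missing from the requirements.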
The "Edit
customer order" process is a perfect example
of a place where you would want a GUI interface.
This points out a need for prototyping and also
starts to push you toward allocating this process
to the client. You could also make the case that
there are really two processes here: the GUI
presentation process and the edit process. There
may be times due to volume or complexity that you
would want that edit piece on a server and the
GUI piece on a client. The processes should be
split down to the level at which choices can be made.
This is a good
place to talk a little about GUI design. Screen
design should be driven by thorough user
analysis. An important criterion is the
discretionary vs. non-discretionary user. The
design will be different for people who use the
system every day as compared to a person who uses
a screen less frequently. In our Jiffy Lube
example, the attendant will use the system all
the time, every day, so would be able to handle a
more complex interface with less prompting.
3. Response Time Requirements
Which processes
should get the best performance? Client/server
partitioning is a series of choices balancing
process performance against system performance.
Each process contends with all the other
processes for network and data resources.
There are two
options for judging performance needs: You can
guess or you can carefully prioritize your
processes based on business need. Each process
can be listed and prioritized in terms of
response time, preferably by the end user. If the
processes are too low level to make the choice
clear, the business events can be prioritized and
then the processes that are contained in these
business events will be prioritized
automatically. In the Jiffy Lube example, the
business events "Customer places order"
and "Customer pays for work" clearly
deserve better response time than "End of
day sales totals are calculated." It follows
then that the processes contained in
"Customer places order" will have a
higher priority than those in "End of day
sales totals are calculated."
It's easy to see
that the "Edit customer order" process
is contained in the "Customer places
order" business event. What is less clear is how
to map more internal processes (processes that
do not interface directly with a system input or
output) to business events. Basically, you need
to backtrack through the processes that provide
input into "Edit customer order." An
example might be "Verify credit card
number." Any process that provides input to
a different process must have the same response
time priority. This makes perfect sense--it would
be impossible to have a fast response time on a
screen process while the processes that read the
data to populate the fields on that screen take
much longer.
You are done when
all the processes are prioritized. If you find a
process with different response time needs, it is
a process that is doing more than one thing and
should be split into two.
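The backtracking described in this step can be sketched as a small graph walk. The Python below is an illustration under stated assumptions: the priority numbers (1 is the most urgent) are invented, and the processes "Close order", "Total daily sales" and "Look up price" are hypothetical additions to the Jiffy Lube example. Each process inherits the priority of every business event it serves, directly or by providing input, and a process that ends up with more than one priority is a candidate for splitting.

```python
from collections import defaultdict

# Lower number = faster response needed (illustrative values only).
event_priority = {
    "Customer places order": 1,
    "Customer pays for work": 1,
    "End of day sales totals are calculated": 3,
}

# Processes directly contained in each business event (partly hypothetical).
event_processes = {
    "Customer places order": ["Edit customer order"],
    "Customer pays for work": ["Close order"],
    "End of day sales totals are calculated": ["Total daily sales"],
}

# inputs_to[p] lists the processes that provide input to p.
inputs_to = {
    "Edit customer order": ["Verify credit card number", "Look up price"],
    "Total daily sales": ["Look up price"],
}


def collect_priorities(event_priority, event_processes, inputs_to):
    """Map each process to the set of priorities it inherits from events."""
    found = defaultdict(set)
    for event, procs in event_processes.items():
        prio = event_priority[event]
        stack, seen = list(procs), set()
        while stack:                       # backtrack through input providers
            proc = stack.pop()
            if proc in seen:
                continue
            seen.add(proc)
            found[proc].add(prio)
            stack.extend(inputs_to.get(proc, []))
    return dict(found)


priorities = collect_priorities(event_priority, event_processes, inputs_to)
needs_split = sorted(p for p, s in priorities.items() if len(s) > 1)
print(needs_split)
```

In this made-up data, "Look up price" serves both a high priority event and a low priority one, so it is the process flagged for a closer look.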
4. Data Currency Requirements
Data currency
means the timeliness, or freshness, of the data.
Let's suppose that the client/server system at
Jiffy Lube was implemented in a distributed
fashion, and the clients (and processes) that
took the orders were verifying pricing from a
different copy of the pricing file than the
clients (and processes) that took the money and
closed the order. The business problem would be
that the car owner might be billed more when they
left than they originally approved. In this step,
we want to make sure that dependent processes
have the same currency.
You can document
currency in the same fashion we documented
response time in the last step. The currency of
each process should be prioritized. Similarly,
the contributing processes (where did the data
originally come from?) should be grouped so that
they all have the same currency needs. In this
case, you are concerned not only with the internal
processes that contribute but also with the processes
that populate any data used. Data coming
out can only be as current as the data that was
put in. Again, processes that have different
currency needs should be split.
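The idea that data coming out can only be as current as the data put in lends itself to a simple calculation. The sketch below, in Python, assumes each copy of a file has a known maximum age in hours; the figures, the two pricing file copies and the "Close order" process name are all hypothetical. The worst age found anywhere upstream becomes the effective currency of a process, and dependent processes whose effective currency differs are reported.

```python
def effective_age(name, age_hours, inputs):
    """Worst-case age (hours) of the data behind a process or file.

    Assumes the input graph has no cycles.
    """
    own = age_hours.get(name, 0)
    upstream = [effective_age(src, age_hours, inputs)
                for src in inputs.get(name, [])]
    return max([own] + upstream)


def currency_mismatches(dependent_pairs, age_hours, inputs):
    """Return dependent process pairs whose effective data ages differ."""
    mismatches = []
    for first, second in dependent_pairs:
        age_first = effective_age(first, age_hours, inputs)
        age_second = effective_age(second, age_hours, inputs)
        if age_first != age_second:
            mismatches.append((first, age_first, second, age_second))
    return mismatches


# Hypothetical refresh intervals for two copies of the pricing file.
age_hours = {
    "Pricing file (order desk copy)": 24,
    "Pricing file (cashier copy)": 1,
}
inputs = {
    "Edit customer order": ["Pricing file (order desk copy)"],
    "Close order": ["Pricing file (cashier copy)"],
}
dependent_pairs = [("Edit customer order", "Close order")]

mismatches = currency_mismatches(dependent_pairs, age_hours, inputs)
print(mismatches)
```

A reported pair is the situation in the example: the price quoted at the order desk and the price charged at the register come from copies of different freshness.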
Although we are
talking about using currency to tie processes
together, this step also drives data
partitioning. Our example illustrates the kinds
of decisions that must be made when balancing
performance against redundant data. Certainly,
each process is faster if it is accessing its own
database. The problem comes up when the data gets
out of synch.
5. Process Timings and Frequencies
If we can identify
which processes are run most often, we will have
more information to make performance choices.
Looked at from the other direction, we will also
be able to offload infrequent processes to make
more resources available for the frequent ones.
Using the same
approach we have used in the two previous steps,
we now look at:
- What triggers each process? (generally a business event
  or time)
- What is the timing of each process? (daily, weekly,
  monthly, yearly, etc.)
- What is the frequency of each process? (example, 400
  times per day)
If you find a
process with more than one frequency or trigger,
split it since it is really more than one
process.
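One way to keep this documentation honest is to record every trigger, timing and frequency observed for a process and let a small script point out the processes that have more than one profile. The Python below is a sketch with invented data; only the "400 times per day" figure echoes the example above, and the "Total sales" process and its triggers are hypothetical.

```python
from collections import defaultdict

# (process, trigger, timing, frequency per timing period); invented entries.
observations = [
    ("Edit customer order", "Customer places order", "daily", 400),
    ("Total sales", "End of day", "daily", 1),
    ("Total sales", "End of month", "monthly", 1),
]


def processes_to_split(observations):
    """Return processes recorded with more than one trigger/timing/frequency."""
    profiles = defaultdict(set)
    for process, trigger, timing, frequency in observations:
        profiles[process].add((trigger, timing, frequency))
    return sorted(p for p, seen in profiles.items() if len(seen) > 1)


to_split = processes_to_split(observations)
print(to_split)
```

Here "Total sales" is flagged because it runs on two different triggers, which suggests it is really a daily process and a monthly process sharing one name.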
Looking back at
what we have documented up to this point we now
know:
- Which processes must have the best performance
- Which processes must have data currency
- Process triggers, timing and frequency
- Physical schematic of network
- Which business events are the highest priority
- Where the high priority processes will be done
  (process --> business event --> physical location)
- The GUI screens needed
- The extracts needed
- The reports needed
6. Volatile Processes
In this step, you
are trying to identify places where you will need
ad hoc capabilities. Generally, this would be at
the output processes (reports/screens).
Ironically, these processes are usually
overlooked until the last minute, although ad hoc
queries are often cited as the main resource hogs
in client/server implementations. Tool choices
need to be made, as well as decisions about
"ad hoc supplementing" process needs.
You may decide to create "macros" of
common queries or create query edit routines if
it looks like performance is going to be
impacted.
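A "macro" of a common query can be as simple as a named, parameterized SQL statement that users run instead of writing their own. The sketch below uses Python's standard `sqlite3` module with an in-memory database purely for illustration; the `orders` table, its columns, the macro names and the `run_macro` helper are assumptions made for this example, not features of any particular query tool.

```python
import sqlite3

# Named, parameterized queries standing in for "macros" of common queries.
MACROS = {
    "sales_total_for_day":
        "SELECT COALESCE(SUM(amount_cents), 0) FROM orders WHERE order_date = ?",
    "orders_for_customer":
        "SELECT id, order_date, amount_cents FROM orders "
        "WHERE customer = ? ORDER BY id",
}


def run_macro(conn, name, *params):
    """Run a pre-approved query by name; unknown names are rejected."""
    if name not in MACROS:
        raise KeyError(f"no such macro: {name}")
    return conn.execute(MACROS[name], params).fetchall()


# Hypothetical order data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, "
    "order_date TEXT, amount_cents INTEGER)"
)
conn.executemany(
    "INSERT INTO orders (customer, order_date, amount_cents) VALUES (?, ?, ?)",
    [
        ("Smith", "1997-03-28", 2995),
        ("Jones", "1997-03-28", 3495),
        ("Smith", "1997-03-29", 1995),
    ],
)

daily_total = run_macro(conn, "sales_total_for_day", "1997-03-28")[0][0]
smith_orders = run_macro(conn, "orders_for_customer", "Smith")
print(daily_total, len(smith_orders))
```

Because the statements are fixed in advance, they can be tuned once and their cost is known, which is the point of offering them in place of unrestricted ad hoc queries.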
7. Opportunities for "Reuse"
By identifying
processes or pieces of processes that can be
reused, we shorten development time. In addition,
the fewer different pieces in client/server, the
narrower the troubleshooting choices. Many
developers by their nature dislike reusing
someone else's programs. However, client/server
offers many opportunities for reusability
including:
- Screen templates
- Utility programs
- RPCs (Remote Procedure Calls)
- Standard queries (SQL)
In addition, a
CRUD matrix (mapping how processes Create, Read,
Update and Delete the data) can help identify
process reusability opportunities: If you see two
different processes that create a database
occurrence, there is usually some code that can
be reused.
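A CRUD matrix is easy to hold as a nested mapping and scan for this pattern. The following Python sketch uses made-up rows; the processes "Import fleet orders" and "Close order" and the entity names are hypothetical. It lists every entity that more than one process creates, which is where shared code is most likely to be found.

```python
from collections import defaultdict

# Rows are processes, columns are entities, cells hold C/R/U/D letters.
crud = {
    "Edit customer order": {"Customer": "CR", "Order": "C", "Price": "R"},
    "Import fleet orders": {"Customer": "R", "Order": "C"},
    "Close order": {"Order": "RU", "Price": "R"},
}


def shared_creators(crud):
    """Return entities created by more than one process, with the creators."""
    creators = defaultdict(list)
    for process, row in crud.items():
        for entity, operations in row.items():
            if "C" in operations:
                creators[entity].append(process)
    return {e: sorted(ps) for e, ps in creators.items() if len(ps) > 1}


reuse_candidates = shared_creators(crud)
print(reuse_candidates)
```

The same scan works for the other letters, for example to find every process that updates a given entity before its structure changes.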
8. Synchronized Process, Presentation and Data Partitioning
As mentioned
earlier, process partitioning should not be done
in a vacuum. Consider process partitioning, data
partitioning and network design as the three legs
of the client/server solution stool. All three
need to work together.
As the database
design and distribution needs clarify, revisit
the process partitioning to see what the impact
will be. Denormalized and distributed data will
significantly change your process; not only will
the access change but the process must deal with
the redundancy and complexity of the data
structure.
The performance
needs documented in process and data partitioning
may require additional hardware be added to the
network design. This could be anything from
additional memory on the client machines to
additional servers.
Conclusion
Many designers of
client/server applications--even some
client/server packages--have by implication
treated process partitioning as an optional step.
Instead, the developers play around with
expensive hardware and software and implement
hurdles; this is sometimes called prototyping.
However, the inefficiencies and peculiar results
give both prototyping and client/server a bad
name.
The kind of
careful analysis and partitioning of processes
described in this article can go a long way
toward building client/server applications that
are more robust, more efficient and at the same
time better meet the user needs. "Do it now
or do it later" was never so true as when
applied to process partitioning.
Lou Russell, founder and president of Russell Martin &
Associates, is a frequent speaker for DCI. This spring, she
is leading DCI's seminar on Data Analysis & Modeling for
Building the Data Warehouse.