Middleware: Client/Server Glue
by George Schussel
Ted Lewis, chairman of computer science at the Naval Postgraduate
School in Monterey, defined middleware in a November 7, 1994
Computerworld column as follows: "any ill-defined layer of
faceless software that connects your PC to your data
warehouse". That's an excellent description of the current
state of middleware. This article attempts to define more precisely
the features and functions that we should expect from our
middleware, the glue that connects clients and servers.
The approach below defines three levels of functions for
middleware: basic, intermediate and advanced. Just because
software can connect a client & server and provide some
communication function, however, doesn't mean that it qualifies
as middleware. Read on for a further discussion.
BASIC SERVICES
Basic services are a minimum level of function that you should
expect from a middleware architecture. A key characteristic of
these services is that they should be provided transparently; in
other words, their operation is invisible to the user.
COMMUNICATIONS SERVICES
- Different protocols - a number of communication protocols
are becoming standards. At a lower level these
include technologies like IPX/SPX and TCP/IP. Middleware
should provide support for enough common protocols to
cover your current and likely future shop standards.
- Differences in TCP/IPs - Just saying you support TCP/IP
isn't enough because there are at least 15 different
varieties/vendors of this "standard". Windows 95
will introduce a 16th. Your middleware needs to be able
to operate over any or all of these different
implementations.
- Protocol translations - when part of the enterprise
network is operating on one protocol and other parts on
others, single messages will have to traverse multiple
protocols.
DATA ACCESS & CONNECTIVITY SERVICES
- Connectivity - from client side tools to server DBMS -
This is usually the key point of middleware in the
client/server architecture. There are any number of
proprietary or "standard" APIs that can be
used to establish connectivity. These APIs can be
general purpose or SQL oriented. And, in fact, object
based standards such as OLE and DSOM and their messaging
processes can also be used to establish connectivity. A
middleware product should support common standards in
this area such as ODBC, DBLib, OLI, DRDA, SQL/API and
X/Open.
- Query optimization - for access to distributed DBMS. When
a JOIN is required between data that is located at
distributed sites, the middleware should provide
intelligence for navigation to the completion of the
query. In addition to the distributed navigation, the
existence of different file structures and indexing
schemes at various sites requires an intelligent approach
to avoid unreasonable overall query costs. Obviously the
middleware logic must work on relational, non-relational,
object and flat file structures.
- Query rationalization - between different implementations
of SQL permits a more open approach to tool selection,
obviating the need to use the RDBMS vendor's own
tool set. Pass-through functionality can also enable
support for vendors' differentiating features for
those applications where they are necessary.
- Remote procedure calls (RPC) - Different DBMS engines
support different forms of remote procedures. In addition
there are other forms of remote procedures, such as OSF
DCE, that middleware must pass on and properly support.
In like fashion, different types of object messaging
should be possible.
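The query optimization point above can be illustrated with a minimal sketch. It assumes the only cost that matters is the number of bytes shipped across the network, which is a deliberate simplification; a real optimizer would also weigh indexes, file structures and site load. The RemoteTable class, the plan_join function and the site names are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class RemoteTable:
    """Hypothetical description of a table held at one site."""
    site: str
    rows: int
    row_bytes: int

    @property
    def size_bytes(self) -> int:
        return self.rows * self.row_bytes


def plan_join(left: RemoteTable, right: RemoteTable) -> dict:
    """Decide where to run a two-table JOIN, using bytes moved as the only cost."""
    if left.site == right.site:
        # Both tables are local to one site: nothing needs to move.
        return {"ship_from": None, "join_at": left.site, "bytes_moved": 0}
    # Ship the smaller table to the site that holds the larger one.
    small, large = sorted((left, right), key=lambda t: t.size_bytes)
    return {
        "ship_from": small.site,
        "join_at": large.site,
        "bytes_moved": small.size_bytes,
    }


# Assumed sizes, chosen only to make the trade-off visible.
orders = RemoteTable(site="boston", rows=2_000_000, row_bytes=120)
customers = RemoteTable(site="denver", rows=50_000, row_bytes=200)
plan = plan_join(orders, customers)
print(plan)
```

Under these assumed sizes the customer table is the cheaper one to move, so the join runs where the orders live.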
SCHEDULING SERVICES
- Thread management - provides a capability to exploit
cross-process communications and security facilities of
transaction based system environments, such as CICS or
IMS/DC. These permit the management of multiple processes
simultaneously. Since different environments handle these
functions differently, middleware can mask these
differences, making it easier to design applications that
can continue to run well as the client/server environment
evolves.
- Queuing may be required when multiple users want to
access the same system resources simultaneously. Again,
application programmers should not need to build this
function in.
- Load Balancing facilities may or may not be supported by
operating environments (as is often the case in parallel
systems). Middleware can provide this function.
- Priority setting allows tasks that need higher
performance (improved response time) to be given the
additional resources to achieve that goal. Middleware
should permit the setting up of "private," or
nonshared, tasks to facilitate this.
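As a rough illustration of queuing combined with priority setting, the sketch below keeps waiting tasks in a heap so that the highest priority task is always dispatched first, and tasks of equal priority leave in arrival order. The TaskQueue class and the task names are hypothetical; an actual middleware scheduler would also deal with threads, resource limits and preemption.

```python
import heapq
import itertools


class TaskQueue:
    """Hypothetical queue: lower priority number means more urgent."""

    def __init__(self):
        self._heap = []
        # The counter breaks ties so equal-priority tasks stay first in, first out.
        self._arrival = itertools.count()

    def submit(self, name, priority):
        heapq.heappush(self._heap, (priority, next(self._arrival), name))

    def next_task(self):
        """Return the most urgent waiting task, or None if the queue is empty."""
        if not self._heap:
            return None
        _priority, _arrival, name = heapq.heappop(self._heap)
        return name


queue = TaskQueue()
queue.submit("monthly report", priority=5)
queue.submit("order entry", priority=1)
queue.submit("ad hoc query", priority=5)
dispatch_order = [queue.next_task() for _ in range(3)]
print(dispatch_order)
```

The application code only calls submit; the ordering logic lives in one place, which is the point of not asking application programmers to build it in.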
INTERMEDIATE SERVICES
To borrow a skiing analogy (beginner, intermediate and
advanced), the following intermediate services would be
provided by a middleware architecture. Measured against today's
products, however, the intermediate category would represent an
advanced level of services.
SECURITY SERVICES
- Multiple, different security mechanisms may exist; each
operating environment often has its own login controls,
separate security products such as RACF or Top Secret may
be in place, and DBMS administrators may also attach
restrictions to particular databases. From a single
point, administrators should be able to manage these
multiple security environments, simplifying the interface
to heterogeneous environments for users.
- Use of "trusted sources" allows the mapping of
authenticated IDs across systems, permitting
simplification of the environment. A valid IBM ID, for
example, may map to one on the Digital system,
eliminating the requirement for the user to supply
separate passwords for each subsystem in a
multi-environment join.
- Connection with legacy systems is greatly enhanced with
this capability; in a multi-tier environment, supplying
passwords to, say, IMS based on a logon to the Unix
server makes data access far more transparent to the
users.
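The "trusted sources" idea can be sketched as a lookup table that maps an already authenticated ID on one system to the matching ID on another. Everything named here (the system names, the user IDs, the map_identity function) is invented for illustration, and the sketch assumes authentication on the source system has already happened.

```python
# Systems whose authentication the middleware is configured to trust (assumed).
TRUSTED_SOURCES = {"ibm_mainframe"}

# Hypothetical mapping from (source system, user ID) to IDs on other systems.
ID_MAP = {
    ("ibm_mainframe", "JSMITH"): {
        "digital_vax": "SMITH_J",
        "unix_server": "jsmith",
    },
}


def map_identity(source_system, user_id, target_system):
    """Translate an authenticated ID from a trusted system to a target system."""
    if source_system not in TRUSTED_SOURCES:
        raise PermissionError(f"{source_system} is not a trusted source")
    targets = ID_MAP.get((source_system, user_id), {})
    if target_system not in targets:
        raise LookupError(f"no mapping for {user_id} on {target_system}")
    return targets[target_system]


vax_id = map_identity("ibm_mainframe", "JSMITH", "digital_vax")
print(vax_id)
```

A user who has logged on once to the trusted system never sees the second ID; an ID arriving from an untrusted system is refused rather than mapped.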
TRANSACTION MANAGEMENT
- A general purpose distributed architecture needs to be
able to support transaction processing when the data
stores are physically distributed and connected by a
network. Key functions required for distributed
transaction processing include:
- Message queuing - the ability for a client to
drop a message at a server and then continue
further processing. The queuing implies that the
server will accept the message and store it for
later processing even if current facilities are
not accessible for immediate processing.
- Recovery - including the creation and management
of necessary logs.
- Thread management - the ability for one block of
code to support multiple transaction messages
from multiple users during one period of time.
- Commit processing - discussed just below.
- Commit processing - means that a transaction that spans
multiple data sets can be managed so that it all goes or
nothing goes. This results in a data set that maintains
integrity before and after the transaction. It is better
for an accounting record, for example, to be out of date
than to be internally inconsistent.
Physical separation of the locations of data that are
being updated greatly increases the probability that a
network link is broken or that one or another computer
involved isn't available. The 2-phase commit
technology needed to manage updates across multiple physical
sites will provide the necessary log management, backup
and recovery capabilities for distributed transaction
processing.
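The all-or-nothing behavior of 2-phase commit can be sketched in a few lines. This is a toy, in-memory model under strong assumptions: there are no real network calls, timeouts or durable logs, and the Participant class and two_phase_commit function are hypothetical names. It shows only the voting logic: every site must vote yes in phase one before any site commits in phase two.

```python
class Participant:
    """Hypothetical site taking part in a distributed transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # stands in for "site is up and update is valid"
        self.state = "idle"
        self.log = []  # in-memory stand-in for a recovery log

    def prepare(self):
        """Phase 1: vote yes (prepared) or no (aborted)."""
        self.state = "prepared" if self.can_commit else "aborted"
        self.log.append(self.state)
        return self.can_commit

    def commit(self):
        self.state = "committed"
        self.log.append(self.state)

    def rollback(self):
        if self.state != "aborted":
            self.state = "aborted"
            self.log.append(self.state)


def two_phase_commit(participants):
    """Return True if every site committed, False if every site rolled back."""
    # Phase 1: collect a vote from every site.
    votes = [p.prepare() for p in participants]
    # Phase 2: commit only on a unanimous yes, otherwise undo everywhere.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False


healthy = [Participant("boston"), Participant("denver")]
one_down = [Participant("boston"), Participant("denver", can_commit=False)]
print(two_phase_commit(healthy), [p.state for p in healthy])
print(two_phase_commit(one_down), [p.state for p in one_down])
```

When one site cannot commit, the other site's prepared work is rolled back too, so the data sets stay consistent with one another at the price of the transaction not happening.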
CATALOG/REPOSITORY SERVICES
- Catalog of catalogs - The basic idea of "information
at your fingertips" or data warehouses requires that
client-side users have a reasonable way to determine what
information is available and a simple method for
accessing that information. Each DBMS or file manager
that is part of the corporate network will have a catalog
or some software function that describes (most likely in
technical form, but possibly in business terms) the data
contained. Accessing and understanding many different
catalogs will be difficult for users. A middleware
repository should be able to assemble this group of
metadata and present it in a rationalized, normalized
form as definitions of business objects. It should be
accessible from any location on the network. It should
provide location transparency, the property of allowing
users to access data logically without requiring physical
navigation to the data. An advanced repository would
permit the definition of business rules to be stored and
enforced through a trigger generating mechanism.
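A catalog of catalogs with location transparency might look roughly like the following sketch. The site names, table names and business names are all made up for the example, and the per-site catalogs are plain dictionaries rather than real DBMS system tables.

```python
# Hypothetical per-site catalogs: technical table names and their columns.
SITE_CATALOGS = {
    "oracle_ny": {"CUST_MSTR": ["CUST_NO", "CUST_NM"]},
    "db2_chicago": {"ORD_HDR": ["ORD_NO", "CUST_NO", "ORD_DT"]},
}

# Hypothetical mapping from technical names to business object names.
BUSINESS_NAMES = {"CUST_MSTR": "Customer", "ORD_HDR": "Order"}


def build_global_catalog(site_catalogs, business_names):
    """Merge the site catalogs into one view keyed by business object name."""
    catalog = {}
    for site, tables in site_catalogs.items():
        for table, columns in tables.items():
            name = business_names.get(table, table)
            catalog[name] = {"site": site, "table": table, "columns": list(columns)}
    return catalog


def locate(catalog, business_object):
    """Resolve a business object to its physical site and table."""
    entry = catalog[business_object]
    return entry["site"], entry["table"]


catalog = build_global_catalog(SITE_CATALOGS, BUSINESS_NAMES)
print(sorted(catalog))
print(locate(catalog, "Order"))
```

The user asks for "Order"; only the repository knows that this means a particular table at a particular site.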
SYSTEMS MANAGEMENT SERVICES
- Query governing is a must, since increasing numbers of
users will submit queries that may not have been
optimized by simple SQL generation tools. Runaway queries
must be stopped before they consume resources and
affect the performance of other processes.
- Performance monitoring must be done on an ongoing basis,
since the hardware environment and the user population
will change constantly. The "data metering"
function provides invaluable information that may be used
in determining priorities for data migration, such as in
data warehouse initiatives. It can also be invaluable for
database reorganization, indexing, etc.
- Chargeback subsystems in many shops make use of private
exits in installed software packages. Similar facilities
should be provided in middleware to allow administrators
to understand and perhaps spread the costs of the
middleware usage across the user population.
- An audit trail is an additional benefit of these
facilities. It documents usage in a way that can be
valuable for obtaining the support of decision makers who
will be needed to support additional projects.
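Query governing can be approximated by wrapping the result stream and stopping it once a budget is spent. The sketch below uses a row count as the budget because it is easy to show; a production governor would more likely limit elapsed time, CPU or estimated cost. The governed_fetch function and the QueryLimitExceeded exception are hypothetical names.

```python
import itertools


class QueryLimitExceeded(Exception):
    """Raised when a query uses up its row budget."""


def governed_fetch(rows, max_rows):
    """Pass rows through until max_rows have been delivered, then stop the query."""
    for count, row in enumerate(rows, start=1):
        if count > max_rows:
            raise QueryLimitExceeded(f"query stopped after {max_rows} rows")
        yield row


# itertools.count() never ends, standing in for a runaway query.
runaway_query = itertools.count()

fetched = []
stopped_reason = None
try:
    for row in governed_fetch(runaway_query, max_rows=1000):
        fetched.append(row)
except QueryLimitExceeded as exc:
    stopped_reason = str(exc)

print(len(fetched), stopped_reason)
```

The same wrapper is a natural place to count rows per user, which is the raw material for the data metering and chargeback functions described above.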
ADVANCED SERVICES
Advanced services are only partially available in products
today. These functions are likely to become increasingly
available in leading middleware architectures over the 1995 -
1997 time frame.
REPLICATION
- Replication is system managed copying of data so that
data can serve multiple, possibly conflicting, uses and
users. Replication technology can be set up to support
transaction processing. In this case an application would
update and commit data at one site. Subsequently, the
middleware, working in conjunction with the various DBMSs, would
ensure that secondary locations received the same updated
information, ensuring integrity in the transaction
transmission. Such a replication based copying scheme
can't ensure that different physical locations of the
same data element will be identical at all times (in fact
there will always be some differences due to different
times that updates are posted). However, each location
can have its data internally consistent or in balance.
- Replication can also be set up for decision support types
of applications. In this case the goal is to use data
copies for analyzing trends or completed transactions for
defined periods of time, such as sales or accounting data
for the 3rd quarter. While replication for transaction
processing will use copying of individual transaction
records on an ASAP basis, replication for decision
support is likely to be scheduled on a business period of
time basis (such as monthly).
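The transaction-oriented style of replication described above can be sketched as follows: a transaction commits at the primary site, is queued, and is later applied as a whole unit at each secondary. The Site and Replicator classes and the account names are hypothetical, and the sketch ignores failures, conflicts and ordering across multiple primaries.

```python
from collections import deque


class Site:
    """Hypothetical site holding one copy of the data."""

    def __init__(self, name, data):
        self.name = name
        self.data = dict(data)


class Replicator:
    """Commit at the primary now, copy whole transactions to secondaries later."""

    def __init__(self, primary, secondaries):
        self.primary = primary
        self.secondaries = secondaries
        self.pending = deque()  # committed transactions not yet propagated

    def commit(self, changes):
        self.primary.data.update(changes)
        self.pending.append(dict(changes))

    def propagate(self):
        # Each queued transaction is applied as a unit, in commit order.
        while self.pending:
            changes = self.pending.popleft()
            for site in self.secondaries:
                site.data.update(changes)


start = {"acct_A": 100, "acct_B": 100}
primary = Site("boston", start)
secondary = Site("denver", start)
replicator = Replicator(primary, [secondary])

# Move 10 from account A to account B in one transaction.
replicator.commit({"acct_A": 90, "acct_B": 110})
stale_copy = dict(secondary.data)  # out of date, but still in balance
replicator.propagate()
print(stale_copy, secondary.data)
```

Between the commit and the propagate call the two sites disagree, yet each copy still balances on its own, which is the property the text describes.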
COPY MANAGEMENT
- In the process of providing replication services it's
possible to add value to the data being copied. For
example, data can be aggregated and have computed fields
added. In addition, different logical
views of data can make it significantly more usable. This
function of added logical representations of the same
physical data is sometimes referred to as "multiple
schema support".
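As a small illustration of adding value during copying, the sketch below aggregates detail rows by region and adds a computed average price field to each summary row. The field names and sample figures are invented, and prices are held as whole cents to keep the arithmetic exact.

```python
from collections import defaultdict


def summarize_sales(rows):
    """Aggregate detail rows by region and add a computed average price."""
    totals = defaultdict(lambda: {"units": 0, "revenue_cents": 0})
    for row in rows:
        region_total = totals[row["region"]]
        region_total["units"] += row["units"]
        region_total["revenue_cents"] += row["units"] * row["price_cents"]

    summary = {}
    for region, t in totals.items():
        # Computed field: not stored in the source data, derived during the copy.
        avg = t["revenue_cents"] / t["units"] if t["units"] else None
        summary[region] = {**t, "avg_price_cents": avg}
    return summary


# Hypothetical detail rows as they might arrive from an operational system.
detail_rows = [
    {"region": "East", "units": 10, "price_cents": 500},
    {"region": "East", "units": 30, "price_cents": 300},
    {"region": "West", "units": 5, "price_cents": 200},
]
summary = summarize_sales(detail_rows)
print(summary)
```

The summary is a second logical view of the same physical data, shaped for analysis rather than for transaction entry.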
George Schussel
George Schussel has been a CIO, consultant, industry analyst,
writer and lecturer on computer topics for 30 years. His lectures
are held before more than 20,000 professionals a year. He is the
founder and Chairman of Digital Consulting, Inc. (DCI) in
Andover, Massachusetts and Chairman of the Database &
Client/Server World trade show. He has published over 50
technical and analytical articles and his latest book,
Rightsizing Information Systems, co-authored with Steve
Guengerich, was published by the SAMS Publishing Division of
MacMillan. Reach him at 74407.2472@compuserve.com
©Copyright 1997 by Digital Consulting, Inc.