web hit counter DCI: Anita Freed - Building Success Into Your Data Warehouse Project
 
 

Publication Date: July 29, 1996

Building Success Into Your Data Warehouse Project

By Anita J. Freed

A data warehouse can be a powerful business tool, putting mountains of corporate information at workers’ fingertips. But fail to build your data warehouse on the foundation of sound management principles and it will fail to live up to its promise, says Larry English, president and principal of Information Impact International, Inc.

"This is not just another project. It is actually taking the corporate information and pulling it together in such a way that it can be analyzed and used as an enterprise knowledge resource. We have to apply good organizational principals – in planning; organizing and staffing; directing; and controlling," says English, an internationally recognized speaker and consultant on information management topics.

What that means is that upper management’s support and involvement is crucial to any data warehouse project. Other critical success factors, as identified by English, include the following:

  • The data warehouse must solve a real business problem.
  • It must be an enterprise solution.
  • The data must be complete, accurate, and consistently defined.
  • The project must use the minimal set of technology components.
  • Data must be managed at the detail level.
  • An information directory must be available to all knowledge workers.
  • Definitions established for the data warehouse must carry over into operational data.

These factors can be summarized in three categories: Make It Useful, Keep It Clean and Keep It Simple.

Make It Useful

The data warehouse must be designed as an enterprise solution – not a departmental solution – to a clearly defined business problem. Such problems tend to fall into broad categories: customer relations; product or service quality; productivity issues; or time-to-market issues. The goal, English says, is to provide information "horizontally, across departments. In a vertical structure, you exacerbate the problem of being unable to integrate the information resources company-wide." Plus it’s costly – in time and monetary terms – to later integrate separately developed departmental systems.

This is not to say a data warehouse cannot be used to solve the problems of a single business unit; rather, that the data architecture must take into account the broader uses of the information, so the warehouse will have broader value. For example, suppose you are a retailer whose company also has wholesale and direct marketing divisions. When the company develops the "customer subject area" of its data warehouse, English says, it needs to identify what is in common among customers in the three divisions. When you use the warehouse to solve a particular problem – say, how to increase the lifetime value of the retail customer – you may extract only retail data, but it won't change the architecture of the database.

This underscores the need for the database to have consistent, consensus definitions of all areas of interest for all parties who may someday want that information. It also points to the need to identify the authoritative sources for the warehouse data: A company with multiple customer groups, for example, is likely to have a certain redundancy, and inconsistency, among its records. Choosing an authoritative information source, and clearly identifying which attributes define the data, eliminates those inconsistencies. "The data warehouse will become the primary source of information about a given topic – customer, product, order history – if it's done right," English says.

Keep It Clean

"The majority of the effort that goes in a data warehousing project is loading clean data," English says. And no wonder. Ultimately, the success of a data warehouse rests on the accuracy and completeness of the information in it. Bad data generates two results: Workers will discover that the information they extract is wrong, and they will stop using the warehouse; or they will extract the data, believe that it is correct, and make bad decisions because of it. In either case, the warehouse is a failure.

A number of tools exist for transforming data to warehouse requirements, but some data just can’t be cleaned: Records are lost; information may be irretrievable. In those cases, it’s important to identify the accuracy level for that part of the warehouse, so workers know ahead of time the quality of the data they are extracting.

Furthermore, English says, "not a single data cleansing product can guarantee the accuracy of the data." A tool to check gender codes, for example, will recognize that a "G" is not an "M" or "F," but it won’t be able to tell you whether "Pat" is male or female. Therefore, management must foster what English calls "information stewardship." "You must make people believe: 'I'm accountable for what I do because other people count on it.'" Promoting that attitude could entail everything from better employee training programs to rewarding data entry personnel for accuracy rather than just speed. "It's a cultural change and it's another reason for senior management's involvement – because the data warehousing team can't do it," English says.

Keep It Simple

English advocates using a "minimalist set of technology components," or making the warehouse architecture as simple and intuitive as possible. Among his suggestions: Use a single directory for metadata; standardize the database management system; and match terms used in the data warehouse to standard business terms, so it’s easy to relate different stores of information. The easier it is to use the system, the more likely it will continue to be used and supported.

He also recommends maintaining the base data from which summary calculations are derived, as part of a backup system. (Not doing so can spell disaster if your summary data is lost and you need to regenerate the calculations.) Also, be sure to make an informational directory available to all workers so they know the meaning and source of data in the warehouse.

There are two schools of thought on whether to return cleaned-up data to a company’s operational database. While some believe the operational data should not be changed, "I disagree with that," English says. "I believe it is a critical success factor that once you clean data, that you send that data back into the operational system if that data is still being used," as a way of minimizing possible discrepancies between the two systems.

In the same vein, English recommends applying the consensus definitions used for the data warehouse to the operational data. "When any application comes up for re-employment, we have already defined these attributes. [Over time] this eliminates the need for a transform tool. It’s one of those side benefits," says English. "Capture once, use many times. It’s the principle of value."

Anita J. Freed is an Internet project manager at DCI.


Larry English is a featured speaker at DCI's Data Warehouse World.


 
[Home] [Events] [Find It] [Sign Up] [IT News] [Support] [What's New] [Brochures]
©Copyright 1997 by Digital Consulting, Inc. (508) 470-3880
All event names are trademarks of DCI or its clients.
Comments?
webmaster@dciexpo.com