Data Warehouse Project Warnings

It’s amazing how similar the various “warnings” are when it comes to data warehouse projects. An article by Dr. Paul Dorsey lists several cautions on this same issue. Dorsey specifically states that “most organizations that undertake warehouse projects [are] dismally disappointed with the results…[because] data warehouse projects are much harder to complete successfully than traditional systems development projects.” Why is this? Here are some fundamental reasons:

1. Designing a data warehouse is fundamentally different from designing an OLTP structure.
2. The data warehouse tool environment is several orders of magnitude more complex than the traditional tool environment. Not only are there many tools available but many categories of tools to select from.
3. Data warehouse projects are not core business systems. Therefore, they are much more sensitive to the political environment within an organization. Without complete user support, data warehouse projects are doomed to failure.
4. The analysis process, including requirements analysis, is fundamentally different from that of a traditional project.
5. The problem of keeping the warehouse in synch with the production system is one that very few traditional developers have ever encountered.

Rushing into a data warehouse project is one of the worst mistakes a company can make, as was the case with Close Call. The CEO of that company made the first blunder by being tempted by a software vendor. He acted on his own malformed “instinct” that he had to “act fast” in order to keep up with growth, bought into the vendor’s sales pitch, and set two unrealistic goals: to develop the data warehouse in three to four months and to budget the project at only $250,000. Another initial mistake was not starting out with an experimental pilot project (sometimes it’s best to start with a mini version, such as a data mart, rather than a full-blown project). As it turned out, the project team ended up extending the initial build time to five months and persuading the CEO to begin with a pilot project.

But even with the pilot in place, the Close Call project was doomed from the start due to one primary fault: the lack of a clearly defined business objective. During the requirements phase, the functional requirements model revealed a highly complex set of business requirements; this was further complicated by “an inconsistent group of data ‘facts’ that would populate the warehouse.” Obviously, no one had given serious thought to data preprocessing stages such as cleaning, integration, and transformation. Not only was the data “dirty”, but not much of it was useful (or even available) in its current state due to legacy and antiquated-technology issues. Data migration is paramount. According to Dorsey, “data migration can sink the project. Not only are migration scripts large and complex, they must be maintainable because they have to keep the warehouse in synch with the production system when the structure of either changes. This is not like a legacy migration script that is used once and discarded. Because it must be run periodically, the script must be tuned to run efficiently and maintained easily.”
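To make Dorsey's point about periodically run migration scripts concrete, here is a minimal sketch of a repeatable load with a cleaning step. Everything in it is an assumption made for illustration: the `calls` and `fact_calls` tables, their columns, and the cleaning rules are hypothetical and do not describe Close Call's actual systems. It uses SQLite only because it ships with Python.

```python
import sqlite3


def clean_row(row):
    """Normalize one source row; return None if it cannot be repaired."""
    call_id, customer, minutes = row
    customer = (customer or "").strip().upper()
    if not customer:
        return None  # missing key field: reject rather than guess
    try:
        minutes = float(minutes)
    except (TypeError, ValueError):
        return None  # measure is not numeric
    if minutes < 0:
        return None  # negative duration makes no sense
    return call_id, customer, minutes


def sync(source, warehouse):
    """Copy cleaned rows newer than the warehouse's highest loaded call_id.

    Returns (rows_loaded, rows_rejected). Safe to run repeatedly.
    """
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS fact_calls ("
        "call_id INTEGER PRIMARY KEY, customer TEXT, minutes REAL)"
    )
    last_id = warehouse.execute(
        "SELECT COALESCE(MAX(call_id), 0) FROM fact_calls"
    ).fetchone()[0]
    rows = source.execute(
        "SELECT call_id, customer, minutes FROM calls "
        "WHERE call_id > ? ORDER BY call_id",
        (last_id,),
    ).fetchall()
    cleaned = [r for r in map(clean_row, rows) if r is not None]
    warehouse.executemany(
        "INSERT OR REPLACE INTO fact_calls VALUES (?, ?, ?)", cleaned
    )
    warehouse.commit()
    return len(cleaned), len(rows) - len(cleaned)


# Hypothetical "dirty" production data.
source = sqlite3.connect(":memory:")
source.execute(
    "CREATE TABLE calls (call_id INTEGER PRIMARY KEY, customer TEXT, minutes TEXT)"
)
source.executemany(
    "INSERT INTO calls VALUES (?, ?, ?)",
    [(1, " acme ", "12.5"), (2, None, "3"), (3, "Globex", "n/a"), (4, "globex", "7")],
)
warehouse = sqlite3.connect(":memory:")

first_run = sync(source, warehouse)   # loads the two usable rows
second_run = sync(source, warehouse)  # nothing new, so nothing is loaded
print(first_run, second_run)
```

The design choice worth noting is that the script tracks what it has already loaded, so a second run does no duplicate work. A real migration would also have to handle updates, deletes, and schema changes on either side, which is exactly the maintenance burden Dorsey warns about.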

The Close Call project overran its deadline and failed due to these mistakes. According to the article, the “Red Flags” of this project were:
• No pre-launch objectives or metrics
• Many major systems projects underway simultaneously
• The CEO set budgets and deadlines before project team was on board
• No insider presence on data warehouse project team
• An overburdened project manager
• Source data availability unconfirmed at the outset
• No user demand for sophisticated data analysis
• No routine meetings of executive sponsors and project manager
• No initial involvement of business managers

The author of Ten Mistakes to Avoid offers some sound advice. In addition to the mistakes listed above, which of the mistakes listed in that article did Close Call make?
• Starting with the wrong sponsorship chain. “The right sponsorship chain includes two key individuals above the data warehousing manager. At the top is an executive sponsor with a great deal of money to invest in effective use of information. A good sponsor, however, is not the only person required in the reporting chain above the warehousing manager.” Close Call had only one person above the data warehouse manager: the CEO.
• Loading the warehouse with information “just because it was available.”

From Close Call’s own account: “Panicked at the thought of breaking that news to the executive sponsors, the team jury-rigged a way to populate the pilot by parsing the DOS-based Reflex reports and manipulating the report data into a relational database format. But the handwriting was on the wall: without replacing the proprietary switching systems, there would be no data warehouse.”

The best move that Close Call made was to abandon the project. It took a heavy toll, costing the company a considerable amount of money and half its IT staff. But had the company insisted on continuing, it would have cost much more. Unbelievably, this fiasco started at a friendly golf game.

Some tips for approaching a data warehouse project:

A. The project leader must be experienced. The project leader should have completed other successful data warehouse projects and be aware of the different types of end-user tools including flexible reporting, ad hoc query and OLAP alternatives.
B. Careful collection and analysis of user requirements, including legacy reports, is crucial. It is important to verify which reports are actually being used. Simply gathering the names of reports is not enough. All ad hoc reports gathered over time should be examined. Of particular interest are reports given to users as ASCII files, which users then load into Excel or SAS to generate their own reports. It is essential to know specifically what users are doing with their reports.
C. Do a pilot project. Choose the most enthusiastic users to do serious requirements analysis. Try several different tools for migration, back-end and front-end implementations. It is worth spending lots of money on the pilot project rather than plunging into spending even more money on a large project that fails. Expect that the first pilot project will fail. If this is the case, do another one.
D. Make the users happy. Warehouse projects are far more politically sensitive than any other type of project. The resulting system must be easy to use and must produce accurate results. All it takes is one user making one bad decision based upon the system for all users to lose confidence.
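Tip B, verifying which reports are actually used, can be partly automated when some record of report access exists. The sketch below is only an illustration under that assumption: the log format (report name, user, date), the report names, and the 90-day window are all hypothetical, and many legacy systems will have no such log at all.

```python
from collections import Counter
from datetime import date, timedelta


def audit_reports(catalog, access_log, today, window_days=90):
    """Split a report catalog into recently used and unused reports.

    access_log is an iterable of (report_name, user, access_date) tuples.
    Returns (used, unused): a dict of access counts and a sorted list of names.
    """
    cutoff = today - timedelta(days=window_days)
    recent = Counter(name for name, _user, day in access_log if day >= cutoff)
    used = {name: recent[name] for name in catalog if recent[name] > 0}
    unused = sorted(name for name in catalog if recent[name] == 0)
    return used, unused


# Hypothetical catalog and access log.
catalog = ["daily_call_volume", "monthly_churn", "agent_overtime"]
access_log = [
    ("daily_call_volume", "pat", date(2024, 5, 30)),
    ("daily_call_volume", "lee", date(2024, 5, 31)),
    ("monthly_churn", "pat", date(2023, 11, 1)),  # too old to count
]

used, unused = audit_reports(catalog, access_log, today=date(2024, 6, 1))
print(used)
print(unused)
```

A count like this only shows that a report was opened. It does not show what users then do with the data, so it supplements the interviews described in tip B rather than replacing them.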
