What are some reasons why data warehouse projects
are more likely to fail then traditional projects?
Reason 1: Designing a data warehouse -- the Online
Analytical Processing (OLAP) variety -- is fundamentally different from
an OLTP (On-Line
Processing) structure. Specifically:
- OLTP (On-Line Transaction Processing) is *customer-oriented* and
uses for transaction/query processing; OLAP (Online Analytical Processing)
is market oriented.
- Looking at the schemas of OLTP (On-Line Transaction Processing) and
OLAP (Online Analytical Processing), differences are immediately
apparent: OLTP uses ER-based, application-oriented constructs; OLAP
Processing) uses star/snowflake, subject-oriented constructs.
- OLTP (On-Line Transaction Processing) manages current data that is
often to detailed for decision making; OLAP (Online Analytical Processing)
uses historical data and allows for summarization and aggregation
stores/manages this data at various granular levels.
- OLTP (On-Line Transaction Processing) access patterns are short, “atomic” transactions;
OLAP are mostly read-only, some could be complex queries.
Traditional (relational) databases, aka OLTP systems, are designed to
perform queries and perform transactions. In other words, they perform
the everyday activities of an organization: purchasing, inventory, accounting,
payroll, etc. Data warehouse systems (aka OLAPs) are used for data analysis
and decision-making. They are used by ‘specialized’ users
or ‘knowledge workers’ and the data they present is used
mostly by upper management.
Here is a succinct list on how OLTP and OLAP differ:
Users and orientation:
OPTP: customer-oriented; used by clerks, DBA, IT pros
OLAP: market-oriented; used by knowledge workers, managers, analysts
But, of course, it takes IT pros and DBAs to *set up* data warehouses
OLTP: manages current, highly-detailed data (not useful for decision
OLAP: manages large amounts of historical data; can summarize and aggregate;
stores/manages info at various granular levels. Good for decision making
OLTP: ER model, application-oriented database design
OLAP: star or snowflake model, subject-oriented database design
OLTP: current data within enterprise or dept – not historical data
nor data in different organizations
OLAP: “spans multiple versions of a database schema” (based
on evolution of an organization). May incorporate info from other organizations
OLTP: short, atomic transactions. Requires use of consistency controls” and “recovery
OLAP: most access is read-only; some complex queries
Reason 2: The data warehouse tool environment
is several orders of magnitude more complex than the traditional tool
Not only are there many tools available but many categories of tools
to select from. Specifically:
- Due to the complexity and (especially) the wide variety of tool TYPES,
the warehouse project manager needs to conduct a thorough analysis
to select the right tools for the job. Some experts warn that choosing
front-end tool based on cost considerations only is the wrong approach.
It is much better to spend (invest) in good tools to start with so
that the data warehouse project (most likely, a very expensive endeavor)
not be compromised.
- There are some important issues regarding ad hoc query tools and
why they fail. Put simply, users cannot take the same approach as in
(On-Line Transaction Processing) systems due to the complexity of
OLAP (Online Analytical Processing). They must become familiar with
logic and SQL/relational theory as well as schema design and iterative
querying. They must also understand the tool itself; the repetitive,
button-pushing practices of OLTP do not apply to warehouses.
Reason 3: The analysis process, including requirements analysis is fundamentally
different from a traditional project.
This may be the most important (most common) reason that causes a data
warehouse project to fail. Companies or project managers may get excited
the prospects of creating the data warehouse — jumping into the
project without a careful requirements analysis. They should
first ask themselves, “what
are the USER requirements for the warehouse?” The analysis should
also examine all user reports (such as legacy reports). Prompt
users input is critical for these decisions.
More Data warehousing and data mining information:
OLAP vs. OLTP
Multiple Dimensional View of Database:
ROLAP, MOLAP, HOLAP
Data Warehouse Project
Mining Primitives, Hierarchies, Architecture and Coupling
Data Preprocessing for Data Warehouses
data quality, tuples with missing values, data smoothing and data
Characterization, Discrimination, Association, Classification, Prediction,
Clustering, and Evolution Analysis: Differences and Similarities
Project vs Any Other Large Database Implementation
Mining and Data Warehousing in Biology, Medicine and Health Care
Other Information Technology pages:
Training — FAQ part 1
part 4: Cost-Time Graph and Shortening Critical Path
Back to Info-Source home page