How similar or different do you think a data warehouse project might be compared to any other large database implementation?
First, it must be realized that the purpose of warehouses are different than regular databases; the former is used in OLAP and the latter in OLTP. Due to the decision-support and "business needs" requirements of warehouses, the scope is, usually, much larger than large (regular) database implementations such as a RDBMS. The warehouse project will need a lot more input/involvement form the upper management and other key decision makers. Data warehouses often require the integration of heterogeneous databases (and/or data sources). To integrate such data, and provide easy and efficient access to it, often proves to be a challenge. This is quite a different scenario than regular database projects -- no matter what their size.
The build time for an average DW is, normally, longer than traditional databases projects, large or small. In addition, the cost of a typical DW project is also quite a bit higher than traditional databases projects, large or small. Some reasons for this are:
Preprocessing of data: You are going to spend a lot of time extracting, cleaning, and loading data. The usual figure quoted is that 80% of the time building a data warehouse will be spent on this type of work.
Despite best efforts at project management, data warehousing project scope will increase. Data warehousing projects start with data and end with requirements. Once warehouse users see what they can do with 2001's technology, they will want much more!!
You are going to find problems with systems feeding the data warehouse. Problems that have gone undetected for years will pop up. You are going to have to make a decision on whether to fix the problem in what you thought was the 'read-only' data warehouse or fix the transaction processing system.
You will need to validate data not being validated by transaction processing systems. Typically once data are in warehouse many inconsistencies are found with fields containing 'descriptive' information.
Implementation of a DW means building a HIGH-maintenance system. Reorganizations, product introductions, new pricing schemes, new customers, changes in production systems, etc. are going to affect the warehouse. If the warehouse is going to stay 'current' (and being current will be a big selling point of the warehouse), changes to the warehouse have to be made fast.
Security issues: With typical OLTP project, database administration and management can easily establish who must have access and who does not. With data warehouses (OLAP) and mining practices, especially if the technology is being implemented for Web use, poses risks in security. The more accessible you make your data warehouse (by 'accessible', what is meant is architecting it in a way that people want to use it), the greater security risk you are exposing yourself too. Restricting people to "need to know" is not a good option in the organization on the 2000sHowever, exposing information to theft from anyplace in the globe is not too great for job security either.
Warehouse users will develop conflicting business rules
Data warehousing can have a learning curve that may be too big (long) for impatient
DW's may require the purchase of special OLAP tools (See http://www.olapreport.com/How_not_to_buy.htm)
Similarities between DB's and DW's:
- Some of the "core" concepts
are similar. That is, schema architecture, query language theory, syntax
and fundamentals, etc. are similar between traditional databases and
What makes a data warehouse project unique and/or similar to other IT/IS projects?
I think DW projects are similar to other IT projects if they are created using the traditional software engineering approach. That is, using a lifecycle process model which outlines the phases of the project: Requirements, Design, Testing, Implementation, Updating/Maintenance. I have never worked on a DW project so I don't have first-hand experience with how exactly this is done. Differences between typical IT projects and DW projects would be similar to some of those listed above. Namely, the size, cost and scope, and build-time of the project being greater than your run-of-the-mill software engineering project.
OLAP vs. OLTP