What is Data Warehouse ? Definition & Overview
Data Warehouse is a Relational Database Management System (RDBMS) or Operational database(OLTP), specially designed for Query, analysis and reporting the business marketing decision to achieve business goals.
Data Warehouse Definitions.
According to Bill Inmon, Data warehouse is a Subject-Oriented, Integrated, time-variant and non-volatile collection of data.
According to Ralph kimball, Data Warehouse is a transaction data specifically structured for query and analysis.
Why Data Warehousing is required?
An enterprise or organisation is a business unit where every department is integrated with various units. For example in Banking, we have different business units like Accounts, Loans, Mutual funds etc. Every business transaction is an operation and these online transaction are stored in Operational database or OLTP (Online Transactional Processing Database).
Data Warehouse (DW) centralises data from multiple Operational Databases (OLTP’s) because data is scattered in various places and it becomes difficult in gathering data.
Relational Database types.
- Low Range Database :- Low range Database can manage and organise mega bytes of data. Ex :- Ms-Access, D-Base, Fox PRO, Chipper, Paradox etc.
- Mid Range Database :- Mid Range Database can manage and organise Giga bytes of data. Ex : – Oracle, Microsoft SQL Server, IBM DB2 etc.
- High Range Database :- High Range Databases can organize and manage Tera-Bytes of data. Ex:- TD, Netezza, Green Plum etc.
How to choose Database?
To choose right Database that required for your organisation, it depends upon the size of the organisation, Transaction frequency per day and No of year’s of transaction history to be provided.
How to build Data Warehouse?
To build Data Warehouse, the database must support the following features. Check the following requirements before building DWH.
- A Relational Database is required.
- Database that support enormous storage capacity like billions of Rows and Tera-Bytes of Data.
- DB that supports parallel processing, Distributed file storage (DFS) pattern.
- Database that supports high availability, where data is available to the users even the software and hardware compounds are down.
- We must select Database that have “mature optimisers” and low TCO (Total Cost of Owner ship).
Characteristics of Data Warehouse.
- Time Variant : A Data Warehouse is a time variant data base, which supports the business management in analysing the business and comparing the business with different time periods like Year, Quarter, Month, Week and Date.
- Non Volatile : It is non volatile Database. Once the data entered into the database, it does not reflects to the change. The data is statics in DWH.
- A DWH is a integrated database which allows you to collect the data and integrate the data with multiple database sources.
- Data warehouse is a subject oriented database, which supports the business need of individual department specific user.