Data Warehouse Architecture will have different structures like some may have an Operational Data Store, Some may have multiple data store, some may have a small no of data sources, while some may have a dozens of data sources.
Data Warehouse Architecture
In general, all Data Warehouse Architecture will have the following layers.
- Data source layer.
- Data Extraction layer.
- Staging Area.
- ETL layer.
- Data Storage layer.
- Data logic layer.
- Data presentation layer.
- Metadata layer.
- System operations layer.
Types of Data Warehouses
There are three types of Data Warehouses. They are
- Centralised data Warehouse : A centralises Data Warehouse is one in which data is stored in a single, large primary database. This Database can be queried directly or used to feed data marts.
- Functional Data Warehouse : A functional Data Warehouse is dedicated to a subset of the business, such as a Marketing or finance business function.
- Federated Data Warehouse : A federated Data Warehouse is an active union and cooperation across separate Data Warehouses.
Data Warehouse and their architecture vary depending upon the specifics of an organisation’s situation. Three common Data Warehouse Architectures are
- Data Warehouse Architecture (Basic).
- Data Warehouse Architecture (with a Staging Area).
- Data Warehouse Architecture (with a Staging Area and Data Marts).
Data Warehouse Architecture (Basic)
End users directly access data derived from several source systems through the Data Warehouse.
The metadata and Raw data of a traditional OLAP system is present in above shown diagram. Summary data is in Data Warehouse pre compute long operations in advance.
Data Warehouse Architecture with a Staging Area
One needs to clean and process your operational data before putting it into the warehouse. Once can do this through programatically, although most data warehouses use a staging area instead. A Staging area simplifies building summaries and general Warehouse management.
Data Warehouse Architecture with a Staging Area and Data Marts
One may want to customise our architecture for different groups within our organisation. One can do this by adding data marts, which are systems designed for a particular line of business.
Difference between Data Warehouse and Data Mart ?
|Data Warehouse||Data Mart|
|Corporate / Enterprise wide||Departmental|
|Union of all Data marts.||A single business process.|
|Data received from staging area.||Star-join (facts & dimensions).|
|Queries on presentation resource.||Technology optimal for data access and analysis.|
|Structure for corporate view of data.||Structure to suit for departmental view of data.|
|Organises on E-R model.|
Data Warehouse Approches
To build effective and efficient data warehouse, different analysis and business needs to be understand. There are two approaches available to build Data Warehouse. They are
- Top-Down approach and
- Bottom-Up approach.
|Bottom-Up approach||Top-Down approach|
|This is Ralph Kimball approach.||This is Immon Approach.|
|Data is not normalised.||Data is normalised.|
|Data mart is loaded directly from source and enterprise DWH is loaded from these Data Marts.||Data flows from source to enterprise DWH and then to Data Mart.|
|Each data mart has its own narrow view of data.||Enterprise view of data.|
|Faster and easier implementation||Takes longer time to build even with an iterative method.|
|Less exposure / risk to failure||High exposure / risk to failure.|
|Favourable return on investment and proof of concept.||High outlay without proof of concept.|
|No need of high level of cross-functional skills.||Needs high level of cross-functional skills.|
|Enterprise goals may be compromised.||Enterprise goals not compromised.|
In this Data Warehouse tutorial, we learned about different data warehouses, different DWH architectures and about different Data Warehouse approaches. In our next tutorial, will learn about different Data Warehouse Components like source data component, data staging component, Data storage / target data component, Information delivery component, Metadata component and Management and control component.