C Language
DSA
Software Engineering
Software Architecture
Operating System
Big Data
Data Mining and Warehousing
TOC
Ada
C++

Topics

  • 1. What is Data Warehouse ?
  • 2. Key Characteristics of a Data Warehouse
  • 3. Need, Importance & Components of Data Warehousing
  • 4. Data Warehouse vs OLTP
  • 5. Architecture, Advantages & Applications of a Data Warehouse
  • 6. Data Warehouse (DW) Delivery Process
1. What is Data Warehouse ?

A Data Warehouse is a centralized repository that stores integrated data from multiple heterogeneous sources (databases, CRM systems, ERP, logs, spreadsheets, etc.). It supports business intelligence (BI), reporting, and decision-making.

Formal Definition (Bill Inmon – Father of Data Warehousing):A data warehouse is a subject-oriented, integrated, non-volatile, and time-variant collection of data to support management’s decision-making process.

Ralph Kimball’s Definition (Dimensional Model Advocate):A data warehouse is a centralized repository providing users access to data organized for analysis, typically using facts and dimensions.

2. Key Characteristics of a Data Warehouse
Characteristic Description
Subject-Oriented Organized around key business subjects such as Sales, Customers, Products, Finance, Inventory. Focuses on analysis, not transactional processes.
Integrated Combines data from multiple sources: cleaned, transformed, standardized. Example: consistent customer names, date formats, currencies, product codes.
Non-Volatile Data rarely changes once stored. Mostly read-only. Refreshed periodically via ETL processes.
Time-Variant Stores historical data over long periods (months/years) for trend analysis, forecasting, and comparisons. Example: Sales by region over 5 years, seasonal trends.

 

3. Need, Importance & Components of Data Warehousing
  • Businesses generate huge amounts of daily data.

  • Operational systems (OLTP) focus on transaction processing, speed, and routine operations, but are not ideal for analysis.

Benefits of Data Warehousing:
  a) Better decision-making
  b) Data consistency and accuracy
  c) Fast query performance
  d) Historical trend analysis
  e) Supports BI and data mining 

Components:

  • Data Sources – OLTP databases, CRM, ERP, spreadsheets, logs.

  • ETL (Extract, Transform, Load) – Extracts data, cleans it, standardizes formats, and loads into DW.

  • Data Storage – Can be:

    • Relational databases (RDBMS)

    • Multidimensional databases

    • Cloud warehouses (Snowflake, BigQuery, Redshift)

  • Metadata Repository – Stores information about data: definitions, source mapping, ETL rules, table descriptions.

  • Data Marts – Focused subsets of DW for specific business areas (e.g., Marketing Mart, Sales Mart, Finance Mart).

  • OLAP (Online Analytical Processing) – Enables complex, multi-dimensional analysis (slice, dice, drill-down, roll-up).

4. Data Warehouse vs OLTP
Feature Data Warehouse (OLAP) OLTP (Operational DB)
Purpose Analysis & reporting Day-to-day transactions
Data Type Historical Current
Updates Periodic (batch) Frequent (real-time)
Queries Complex Simple
Orientation Subject-oriented Process-oriented
Design De-normalized Highly normalized

 

5. Architecture, Advantages & Applications of a Data Warehouse
  • Single-tier Architecture – Minimizes data redundancy; rarely used.

  • Two-tier Architecture – Client → Server; suitable for small systems.

  • Three-tier Architecture (Most Common):

    • Bottom Tier: Data sources + ETL

    • Middle Tier: OLAP server / Data Warehouse

    • Top Tier: Front-end tools (reports, dashboards, BI tools)

    Advantages

a) Improved data quality & consistency
b) Faster decision-making
c) Historical analysis & future forecasting
d) Better business intelligence
e) Supports data mining & analytics

Application

  • Retail: Sales forecasting, customer behavior analysis

  • Banking: Fraud detection, risk analysis

  • Healthcare: Treatment patterns, insurance claims

  • Telecom: Customer churn analysis

  • Government: Tax analysis, social data

  • Manufacturing: Supply chain optimization