DEV Community

Cover image for Data warehouse explained
Barbara
Barbara

Posted on • Edited on

Data warehouse explained

WHY

Different business processes need different data structures due to different needs.
The aim of a data warehouse is to

  • make the data simple to understand
  • performant
  • quality assured
  • ready for new questions
  • secure

It should deliver the data to the user

  • in an understandable and performant dimensional model
  • with conformed dimensions or separate data marts
  • to report and visualize -- by interacting directly with the model -- or through an BI application

OLTP - online transactional processing

For running a business it is important to track different things, like for example inventory, financial transactions or different types of movements.
A single relational database might be sufficient for this part, as there is no redundancy and high integrity.

OLAP - online analytical processing

To understand what is going on in a business it is important to do analysis and monitoring. Like examine market segmentation over a period of time or split users into market segmentations.
As single relational database is not sufficient, because it might become too slow for analytics and quite hard to understand.

WHAT

A data warehouse is a system that enables the support of an analytical process. It gathers data from multiple sources. It is a copy of transaction data specifically structured for queries and analysis.

The data from OLTP will run through an ETL process to provide data for the OLAP.

ETL - Extract Transform Load

Extract

  • get data from a source
  • update old states

Transform

  • integrate many sources together
  • clean data
  • produce metadata

Load

  • structure the data
  • load into a dimensional data model

sketch ETL

Dimensional modeling

The dimensional model will help to understand the data easily and perform fast analytical queries.

sketch starscheme

Fact table - numeric and additive

  • business events, like a phone call or a review of a book
  • context of the business events in quantifiable metrics. Like duration of a call, the rating of a book

Dimension table

  • context of business events: who, what, where, why
  • attributes, like: date, time, physical locations, human roles, good sold

Oldest comments (0)