Implementing Slowly Changing Dimensions (SCDs) in Data Warehouses (2024)

Introduction

Slowly Changing Dimensions in Data Warehouse is an important concept that is used to enable the historic aspect of data in an analytical system. As you know, the data warehouse is used to analyze historical data, it is essential to store the different states of data.

In data warehousing, we have fact and dimension tables to store the data. Dimensional tables are used to analyze the measures in the fact tables. In a data environment, data is initiated at operational databases and data will be extracted-transformed-loaded (ETL) to the data warehouse to suit the analytical environment.

Customer, Product are examples for Dimensional tables. These dimension attributes are modified over time and in the data warehouse, we need to maintain the history. In operational systems, we may overwrite the modified attributes as we may not need the historical aspects of data. Since our primary target in data warehousing is to analyze data with the perspective of history, we may not be able to simply overwrite the data and we need to implement special techniques to maintain the history considering analytical and volume aspects of the data warehouse. This implementation is done using Slowly Changing Dimensions in Data Warehouse.

What are Slowly Changing Dimensions

Before discussing the details of Slowly Changing Dimensions (SCDs), let us list the different slowly changing dimensions as shown in the below table.

SCD Type

Summary

Type 0

Ignore any changes and audit the changes.

Type 1

Overwrite the changes

Type 2

History will be added as a new row.

Type 3

History will be added as a new column.

Type 4

A new dimension will be added

Type 6

Combination of Type 2 and Type 3

Now let us look at each type of slowly changing dimension.

SCD Type 0

There are situations where you ignore any changes. For example, when an employee joined an organization, there are joined related attributes such as joined Designation and JoinedDate, etc. that should not change over time.

The following is the example for Type 0 of Slowly Changing Dimensions in Data Warehouse.

Implementing Slowly Changing Dimensions (SCDs) in Data Warehouses (1)

In the above Customer Dimension, FirstDesignation, JoinedDate and DateFirstPurchase are the attributes that will not be updated which is Type 0 SCD.

SCD Type 1

In the Type 1 SCD, you simply overwrite data in dimensions. There can be situations where you don’t have the entire data when the record is initiated in the dimension. For example, when the customer record is initiated, you may not get all attributes. Therefore, when the customer record is initiated at the operational database, there will be empty or null records in the customer records. Once the ETL is executed, those empty records will be created in the data warehouse. Once these attributes are filled in the operational databases, that has to be updated in the data warehouse.

Type 1 SCDs are identifying if the existing attributes are null and you are receiving a value from the operational table.

Implementing Slowly Changing Dimensions (SCDs) in Data Warehouses (2)

In the above Customer Dimension table, the AnnualIncome of customers CustomerKey 11015 and 11019 are NULL. When these records are updated in the operational database, those values should be updated in the data warehouse without considering those are historical values.

SCD Type 2

Type 2 Slowly Changing Dimensions in Data warehouse is the most popular dimension that is used in the data warehouse. As we discussed data warehouse is used for data analysis. If you need to analyze data, you need to accommodate historical aspects of data. Let us see how we can implement SCD Type 2.

For the SCD Type 2, we need to include three more attributes such as StartDate, EndDate and IsCurrent as shown below.

Implementing Slowly Changing Dimensions (SCDs) in Data Warehouses (3)

In the above customer dimension, there are two records and let us say that customer whose CustomerCode is AW00011012, has been promoted to Senior Management. However, if you simply update the record with the new value, you will not see the previous records. Therefore, a new record will be created with a new CustomerKey and a new Designation. However, other attributes will be remaining the same.

Implementing Slowly Changing Dimensions (SCDs) in Data Warehouses (4)

As you can see in the above figure, CustomerCode AW00011012 has a new record with 11013. All the new transactions will be related to CustomerKey 11013 while previous transactional are related to CustomerKey 11012. This mechanism helps to preserve the historic aspect of the customer as shown in the below query.

1

2

3

4

SELECTC.Designation,SUM(SalesAmount) SalesAmount,SUM(TotalProductCost)TotalProductCost

FROM FactInternetSales F

INNER JOIN Dim_Customer C ON F.CustomerKey = C.CustomerKey

GROUP BY C.Designation

Once the query is executed, the following results will be observed.

Implementing Slowly Changing Dimensions (SCDs) in Data Warehouses (5)

As you can see Management designation can be seen in the above result which means that it has covered the historical aspects. Type 2 SCD is one of the implementations where you cannot avoid surrogate keys in dimensional tables in the data warehouse.

SCD Type 3

Type 3 Slowly Changing Dimension in Data warehouse is a simple implementation where history will be kept in the additional column. If we relate the same scenario that we discussed under Type 2 SCD to Type 3 SCD, the customer dimension would look like below.

Implementing Slowly Changing Dimensions (SCDs) in Data Warehouses (6)

As you can see, historical aspects of the data are preserved as a different column. However, this method will not be scalable if you want to preserve history. Further, this technique will allow only to keep the last version of the history, unlike Type 2 SCD.

Typically, this would be better suited to implement name changes of an employee. In some cases, female employees will change their names after their marriage. In such situations, you can use Type 3 SCD since these types of changes will not occur rapidly.

SCD Type 4

As we discussed in SCD type 2, we maintain the history by adding a different version of the row to the dimension. However, if the changes are rapid in nature Type 2 SCD will not be scalable.

For example, let us assume we want to keep the customer risk type depending on his previous payment. Since this is an attribute related to the customer, it should be stored in a customer dimension. This means every month there will be a new version of the customer record. If you have 1000 customers, you are looking at 12,000 records per month. As you can imagine this Slowly Changing Dimensions in Data Warehouse is not scalable.

Following is the relationship between the Fact and the Customer Dimension table.

SCD Type 4 is introduced in order to fix this issue. In this technique, a rapidly changing column is moved out of the dimension and is moved to a new dimension table. This new dimension is linked to the fact table as shown in the below diagram.

With the above implementation of Type 4 Slowly Changing Dimensions in Data Warehouse, you are eliminating the unnecessary volume in the main dimension. However, still you have the capabilities of performing the required analysis.

SCD Type 6

Type 6 Slowly Changing Dimensions in Data Warehouse is a combination of Type 2 and Type 3 SCDs. This means that Type 6 SCD has both columns are rows in its implementation.

With this implementation, you can further improve the analytical capabilities in the data warehouse. If you want to find out an analysis between current and historical occupation, you can use the following query.

1

2

3

4

5

SELECTC.Occupation,C.CurrentOccupation,

SUM(SalesAmount) SalesAmount,SUM(TotalProductCost)TotalProductCost

FROM FactInternetSales F

INNER JOIN Dim_Customer C ON F.CustomerKey = C.CustomerKey

GROUP BY C.Occupation,C.CurrentOccupation

Above query will provide the following result:

Implementing Slowly Changing Dimensions (SCDs) in Data Warehouses (10)

Without Type 6, Slowly Changing Dimensions in Data Warehouse, complex queries have to be used.

In the Type 6 SCD, not only the current occupation, you can use the first occupation as well in order to provide more analysis.

Conclusion

Slowly Changing Dimensions in Data Warehouse are used to perform different analyses. This article provides details of how to implement Different types of Slowly Changing Dimensions such as Type 0, Type 1, Type 2, Type 3, Type 4 and Type 6. Type 2 and Type 6 are the most commonly used dimension in a data warehouse.

  • Author
  • Recent Posts

Dinesh Asanka

Dinesh Asanka is MVP for SQL Server Category for last 8 years. He has been working with SQL Server for more than 15 years, written articles and coauthored books. He is a presenter at various user groups and universities. He is always available to learn and share his knowledge.

View all posts by Dinesh Asanka

Latest posts by Dinesh Asanka (see all)

  • Testing Type 2 Slowly Changing Dimensions in a Data Warehouse - May 30, 2022
  • Incremental Data Extraction for ETL using Database Snapshots - January 10, 2022
  • Use Replication to improve the ETL process in SQL Server - November 4, 2021

Related posts:

  1. Managing A Slowly Changing Dimension in SQL Server Integration Services
  2. Data understanding and preparation – basic work with datasets
  3. Two methods for restoring a data warehouse/data mart environment
  4. Analysis Services (SSAS) Cubes – Dimension Attributes and Hierarchies
  5. Analysis Services (SSAS) Multidimensional Design Tips – Creating Dimensions
Implementing Slowly Changing Dimensions (SCDs) in Data Warehouses (2024)
Top Articles
Latest Posts
Article information

Author: Foster Heidenreich CPA

Last Updated:

Views: 6508

Rating: 4.6 / 5 (56 voted)

Reviews: 87% of readers found this page helpful

Author information

Name: Foster Heidenreich CPA

Birthday: 1995-01-14

Address: 55021 Usha Garden, North Larisa, DE 19209

Phone: +6812240846623

Job: Corporate Healthcare Strategist

Hobby: Singing, Listening to music, Rafting, LARPing, Gardening, Quilting, Rappelling

Introduction: My name is Foster Heidenreich CPA, I am a delightful, quaint, glorious, quaint, faithful, enchanting, fine person who loves writing and wants to share my knowledge and understanding with you.