Creating Surrogate Keys in Slowly Changing Dimensions
By Tom Nonmacher
Slowly Changing Dimensions (SCDs) are often a critical part of data warehousing and business intelligence applications. They provide a way to track changes in your data over time. One of the keys to efficiently managing SCDs is the use of surrogate keys. In this blog post, we will look at how to create surrogate keys in SCDs using various database technologies such as SQL Server 2016, SQL Server 2017, MySQL 5.7, DB2 11.1, and Azure SQL.
A surrogate key is an artificial or substitute key that is used for object identification within a database. It is unique and it provides a way to identify each record in a table uniquely. The surrogate key has no business meaning and is only for the internal use of the database. They are particularly useful in SCDs because they allow you to maintain a history of data changes without affecting the primary key of the table.
In SQL Server, the Identity property can be used to generate surrogate keys. When a new row is created, SQL Server automatically increments the column value by one, starting with a seed value. Here is a simple example of how this can be done:
CREATE TABLE Customer
(
SurrogateKey INT IDENTITY(1,1) PRIMARY KEY,
CustomerID INT,
FirstName VARCHAR(50),
LastName VARCHAR(50),
-- other columns
)
In MySQL 5.7, the AUTO_INCREMENT attribute can be used to generate surrogate keys. This attribute automatically generates a unique number for each row. Here is an example:
CREATE TABLE Customer
(
SurrogateKey INT AUTO_INCREMENT PRIMARY KEY,
CustomerID INT,
FirstName VARCHAR(50),
LastName VARCHAR(50),
-- other columns
)
DB2 has a similar feature to SQL Server’s Identity property. The GENERATED ALWAYS AS IDENTITY constraint can be used to create surrogate keys. Here is an example:
CREATE TABLE Customer
(
SurrogateKey INT NOT NULL GENERATED ALWAYS AS IDENTITY (START WITH 1, INCREMENT BY 1) PRIMARY KEY,
CustomerID INT,
FirstName VARCHAR(50),
LastName VARCHAR(50),
-- other columns
)
Azure SQL, being Microsoft’s cloud database service based on SQL Server technology, also uses the Identity property to generate surrogate keys. The syntax is identical to that of SQL Server:
CREATE TABLE Customer
(
SurrogateKey INT IDENTITY(1,1) PRIMARY KEY,
CustomerID INT,
FirstName VARCHAR(50),
LastName VARCHAR(50),
-- other columns
)
Surrogate keys in SCDs are a powerful tool that allows for better data management and tracking of changes over time. Each of the mentioned database technologies has its own way of generating surrogate keys. It’s important to become familiar with the method used by the database technology in your environment to fully leverage the power of SCDs.