Thursday, July 4, 2013

SURROGATE KEYS and NATURAL KEYS

Hope my last few posts have been helpful in getting a deeper understanding of Data warehouse and especially Facts and Dimensions.

In this article we will look into another concept of DWH which are:

SURROGATE KEYS

“A surrogate key is an artificial or synthetic key that is used as a substitute for a natural key”.

The surrogate key is an automatically generated sequence number which will act as primary key for a dimension and also used in referencing from fact tables.

In this example Org_key is Surrogate Key and Org_ID is natural key.

 
Org_key
Org_ID
Org_Name
HeadQuarters
1
101
XYZ Ltd
Sydney
2
102
PQR Ltd
Cape Town
3
103
ABC Ltd
Beijing

 
The question!!!! –What is use of generating a special key(surrogate) when we already have a natural key coming in from source???.

Let us find the answer to this through the following scenario:

Scenario - 1 :

System1 and System2 have a products table which we integrate in our Dim_Products table in the DWH.

Both the systems have their own unique identifiers for the Products which is Product_ID column.

However, when we try to consolidate the data, we see a conflict in the Natural keys;

For e.g. Product_ID=1 represents “Rice” in the System1 , but there is also a Product_ID=1 in System2 (which represents a different product).

Here is pictorial representation of the scenario(with and without surrogate keys)

                
*The use of Surrogate key, eliminates this confusion and gives a us a clear-cut attribute, that can be used in referencing the product table.

Scenario - 2 :

There are also other scenarios e.g.”98xxxxx256” is a phone number associated with person A. Later person A has changed his number and this number is allotted to person B and then person C; the surrogate key comes into the picture helping us to maintain the history of that original mapping.

 
Some recommended characteristics of a Surrogate key in DWH:

·         System generated (not composed of several values from different domains).

·         Unique system-wide, hence never reused

·         Should not be not editable by the user or application

1 comment:

  1. Thank you very much! I totally understand the benefit of using surrogate keys now.

    ReplyDelete