Sunday, July 21, 2013

The Fuzzy Lookup Transformation

A lookup becomes fuzzy when it can match records that are similar, but not identical, to the lookup key.
The Lookup transformation returns either an exact match or nothing from the reference table, whereas the Fuzzy Lookup transformation uses fuzzy matching to return one or more close matches from the reference table.
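
To make the difference concrete, here is a small Python sketch that contrasts the two behaviours. It is only an illustration: SSIS uses its own matching algorithm, and Python's difflib module is a stand-in for it here. The employee names and the function names exact_lookup and close_matches are made up for the example.

```python
from difflib import get_close_matches

# Hypothetical reference table holding the correct employee names.
reference = ["Jonathan Smith", "Maria Garcia", "Wei Chen"]

def exact_lookup(key, reference):
    """Mimics an exact lookup: return the key if present, otherwise nothing."""
    return key if key in reference else None

def close_matches(key, reference, max_matches=2):
    """Mimics a fuzzy lookup: return up to max_matches similar entries."""
    # cutoff is difflib's own similarity floor, not an SSIS setting.
    return get_close_matches(key, reference, n=max_matches, cutoff=0.6)

# "Jonathon" is misspelt, so the exact lookup finds nothing,
# while the fuzzy lookup still returns the closest reference entry.
exact_result = exact_lookup("Jonathon Smith", reference)
fuzzy_result = close_matches("Jonathon Smith", reference)
print(exact_result)
print(fuzzy_result)
```

The exact lookup rejects the misspelt name outright, while the fuzzy version returns the reference entry that resembles it most.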

Fuzzy Lookup can be used to cleanse data or to check its quality before loading it into the destination.
For example, consider a scenario where employee data coming from a source file contains employee names that are misspelt or include bad characters. The business requirement is that the source data be cleansed before it is loaded into the destination. The source data is matched against a reference data set (which holds the correct entries), and the close or perfect matches are loaded into the destination table.

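Records in the primary (source) data set are matched against the reference data set according to the “Similarity Threshold” setting in the Fuzzy Lookup editor, which ranges from 0 to 1. The closer the threshold is to 1, the more closely a source value must resemble a reference value to count as a match.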
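
The effect of the threshold can be sketched in Python as follows. This is not how SSIS computes similarity internally; the score here comes from difflib's SequenceMatcher, which is an assumption chosen only because it also produces a value between 0 and 1. The names similarity and fuzzy_lookup and the sample data are hypothetical.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Return a score between 0 and 1 (1 means identical, ignoring case)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def fuzzy_lookup(key, reference, threshold):
    """Return (reference_value, score) pairs scoring at or above threshold,
    best match first."""
    scored = [(candidate, similarity(key, candidate)) for candidate in reference]
    matches = [(c, s) for c, s in scored if s >= threshold]
    return sorted(matches, key=lambda pair: pair[1], reverse=True)

# Hypothetical reference data and a misspelt source value.
reference = ["Jonathan Smith", "John Smith", "Maria Garcia"]

# A low threshold lets weaker matches through; a high one keeps only near-exact ones.
loose = fuzzy_lookup("Jonathon Smith", reference, threshold=0.5)
strict = fuzzy_lookup("Jonathon Smith", reference, threshold=0.9)

print([name for name, _ in loose])
print([name for name, _ in strict])
```

With the lower threshold both "Jonathan Smith" and "John Smith" qualify, whereas the higher threshold keeps only "Jonathan Smith". Raising the threshold therefore trades recall for precision, which is the same trade-off to keep in mind when moving the slider in the editor.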


