Sunday, December 2, 2007

SSIS: Fuzzy Lookup

Sample.txt

1.00010001;AED;Afghanl;0.99960016
1.00010001;ADD;Algerian Dinarl;1.00100100
1.00020004;ARR;Argentine Pesol;0.99990001

Referenced Table:dbo.DimCurrency
The data in Sample.txt, AED, ADD, ARR can not be used for transformation because the referenced table does not have the exact match. Fuzzy lookup control can solve the problem. Based on the CurrencyAlternateKey and CurrencyName, the fuzzy lookup control can find the CurrencyKey.


Flat File Source:
Column 0 --AverageRate --numeric [DT_NUMERIC], Data Precision:18 and DataScale:8
Column 1--CurrencyAlternateKey--Unicode text stream [DT_WSTR], OutputWidth:3
Column 2--CurrencyName--Unicode text stream [DT_WSTR], OutputWidth:50

Column 3 --CurrentRate--numeric [DT_NUMERIC], Data Precision:18 and DataScale:8
Fuzzy Lookup:



Based on CurrencyAlternateKey and CurrencyName, the CurrencyKey will be closely matched.


Flat File Destination:


Execute the package and open the output file:


1.00010001,AED,Afghanl,.99960016,1
1.00010001,ADD,Algerian Dinarl,1.00100100,2
1.00020004,ARR,Argentine Pesol,.99990001,3

Adjust the Similarity threshold to 0.50




Re-execute the package, the output file will have:

1.00010001,AED,Afghanl,.99960016,
1.00010001,ADD,Algerian Dinarl,1.00100100,2
1.00020004,ARR,Argentine Pesol,.99990001,3


The CurrencyKey of the first row cannot be located. If you move the Similarity threshold to 0.8, the CurrencyKey of the first row and second row cannot be located.