Sunday, December 2, 2007

SSIS: Lookup

Lookup:Exact Match

Database: AdventureWorksDW

Reference Tables: dbo.DimTime and dbo.DimCurrency

SELECT TimeKey,FullDateALternateKey FROM DimTime


SELECT CurrencyKey,CurrencyAlternateKey FROM DimCurrency

Flat File: sample.txt
1.00050025;AED;9/7/2001 0:00;0.99990001
1.00050024;AED;9/8/2001 0:00;1.001001001
1.00030025;AED;9/9/2001 0:00;1
1.00010001;AED;9/10/2001 0:00;1.00040016
1.00020004;AED;9/11/2001 0:00;0.99980801
1.00050001;AED;9/12/2001 0:00;1.001101211






Data in sample.txt file will be transformed. The transformation references two tables:DimTime and DimCurrency.



BIDS--Create an SSIS project


Drag "Data Flow Task" onto Control Flow pane;


Double Click "Data Flow Task" to navigate to Data Flow pane;





"Flat File Source"

File Name=C:\sample.txt and create the advanced mapping as:



  • Column 0 --AverageRate --numeric [DT_NUMERIC], Data Precision:18 and DataScale:8

  • Column 1--CurrencyAlternateKey--Unicode text stream [DT_WSTR], OutputWidth:3

  • Column 2--FullDateAlternateKey--database timestamp [DT_DBTIMESTAMP]

  • Column 3 --EndOfDayRate--numeric [DT_NUMERIC], Data Precision:18 and DataScale:8

"Lookup" transformation item
Connection to dbo.DimCurrency of AdventureWorksDW database;

Make sure you tick the CurrencyKey in the Available Lookup Columns. OK. Save it.



"Lookup 1 control"



Connection to dbo.DimTime of AdventureWorksDW database;


Make sure you tick the TimeKey column.


"Flat File Destination"

New Flat File Connection Manager: File Name C:\outputFile.txt

Save the package and execute it;

The c:\outputFile.txt will have:


1.00050025,.9999000100,35,69
1.00050024,1.0010010010,35,70
1.00030025,1.0000000000,35,71
1.00010001,1.0004001600,35,72
1.00020004,.9998080100,35,73
1.00050001,1.0011012110,35,74

====

data cleansing

Flat File:sample1.txt
1.00010001;AED;7/3/2001 1:00;0.99960016
1.00010001;AEC;7/4/2001 0:00;1.00100100
1.00020004;AFB;7/5/2001 2:00;0.99990001
1.00020002;ALL;7/6/2001 0:00;1.00040015
1.00050025;ALL;7/7/2001 4:00;0.99990001
1.00050024;ALH;7/8/2001 2:00;1.001001001
1.00030025;AND;7/9/2001 0:00;1
1.00010001;AND;7/10/2001 5:00;1.00040016
1.00020004;AMD;7/11/2001 1:00;0.99980801
1.00050001;AMD;7/12/2001 0:00;1.00110121

AED and AEC are not in the dbo.DimCurrency. Similarly, the time (in red) is not in dbo.DimTime table.

If you apply the sample1.txt to the above package, it fails with the error message:


Information: 0x402090DD at Data Flow Task, Flat File Source [1]: The processing of file "C:\sample1.txt" has ended.
Information: 0x402090DD at Data Flow Task, Flat File Destination [745]: The processing of file "D:\AOL\outputFile.txt" has ended.
Information: 0x40043009 at Data Flow Task, DTS.Pipeline: Cleanup phase is beginning.
Information: 0x4004300B at Data Flow Task, DTS.Pipeline: "component "Flat File Destination" (745)" wrote 0 rows.