Monday, December 3, 2007

SSIS: Fuzzy Grouping

Source file: sample.txt

Mr.,Gustavo,Achong,gustavo0@adventure-works.com
Ms.,Catherine,Abel,catherine0@adventure-works.com
Ms.,Kim,Abercrombie,kim2@adventure-works.com
Sr.,Humberto,Acevedo,humberto0@adventure-works.com
Sr.,Pilar,Ackerman,pilar1@adventure-works.com
Mr.,Gustavo1,Achong,gustavo1@adventure-works.com
Mr,Gustavo2,Achong,gustavo2@adventure-works.com
Ms.,Catherine1,Abel,catherine1@adventure-works.com
Ms,Kim,Abercrombie,kim2@adventure-works.com
Ms.,Kim,Abercrombie,kimAbercrombie@adventure-works.com


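In BIDS (Business Intelligence Development Studio), create a new Integration Services project, drag a Data Flow Task onto the Control Flow pane, and double-click it to open the Data Flow tab.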

Flat File Source:
Connection Manager: Flat
File Name: C:\Sample.txt

Fuzzy Grouping:
Select the columns to match on: Title, FirstName, LastName, and EmailAddress. Similarity Threshold: 0.8
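To make the role of the similarity threshold concrete, here is a minimal Python sketch of threshold-based grouping. It is not the algorithm SSIS uses: it swaps in difflib.SequenceMatcher from the Python standard library as the similarity measure, and the helper names similarity and fuzzy_group are hypothetical. Its scores will therefore not match the _score values that Fuzzy Grouping produces.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Similarity in [0, 1] between two records, compared as joined lowercase strings.

    Assumption: a plain character-based ratio, not SSIS's token-based matching.
    """
    return SequenceMatcher(None, " ".join(a).lower(), " ".join(b).lower()).ratio()


def fuzzy_group(rows, threshold=0.8):
    """Greedy grouping: a row joins the first earlier canonical row it resembles.

    Returns (key_in, key_out, score) tuples, where key_out is the key_in of the
    canonical row chosen for the group.
    """
    canonicals = []  # (key_in, row) pairs that started a group
    result = []
    for key_in, row in enumerate(rows, start=1):
        for canon_key, canon_row in canonicals:
            score = similarity(row, canon_row)
            if score >= threshold:
                result.append((key_in, canon_key, score))
                break
        else:
            # No existing group is similar enough, so this row starts a new one.
            canonicals.append((key_in, row))
            result.append((key_in, key_in, 1.0))
    return result


if __name__ == "__main__":
    sample = [
        ("Mr.", "Gustavo", "Achong", "gustavo0@adventure-works.com"),
        ("Mr.", "Gustavo1", "Achong", "gustavo1@adventure-works.com"),
        ("Sr.", "Pilar", "Ackerman", "pilar1@adventure-works.com"),
    ]
    for key_in, key_out, score in fuzzy_group(sample, threshold=0.8):
        print(key_in, key_out, round(score, 4))
```

Raising the threshold makes groups smaller and stricter, and lowering it merges more rows. In this sketch the shared e-mail domain inflates every score, which is one reason a character-level ratio is only a rough stand-in.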

Flat File Destination:
Flat File Connection Manager: Out
File Name: c:\output.txt

Map the input columns to the output columns, then run the package. Contents of c:\output.txt:

1,1,1,Mr.,Gustavo,Achong,gustavo0@adventure-works.com
6,1,0.92130822,Mr.,Gustavo1,Achong,gustavo1@adventure-works.com
7,1,0.92130822,Mr,Gustavo2,Achong,gustavo2@adventure-works.com
2,2,1,Ms.,Catherine,Abel,catherine0@adventure-works.com
8,2,0.93621671,Ms.,Catherine1,Abel,catherine1@adventure-works.com

4,4,1,Sr.,Humberto,Acevedo,humberto0@adventure-works.com
5,5,1,Sr.,Pilar,Ackerman,pilar1@adventure-works.com
9,9,1,Ms,Kim,Abercrombie,kim2@adventure-works.com
3,9,0.98750001,Ms.,Kim,Abercrombie,kim2@adventure-works.com

10,10,1,Ms.,Kim,Abercrombie,kimAbercrombie@adventure-works.com

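The first three columns are added by the Fuzzy Grouping transformation. _key_in is a unique ID assigned to each input row. _key_out identifies the group: it holds the _key_in of the group's canonical row. _score is the row's similarity to that canonical row, and is 1 for the canonical row itself.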
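As a quick check on the result, the output rows can be regrouped by their second column in a few lines of Python. This is a sketch that assumes the column order _key_in, _key_out, _score, followed by the four original fields, as inferred from the listing above; the function name group_by_key_out is hypothetical.

```python
import csv
from collections import defaultdict
from io import StringIO

# Rows copied from the output listing above.
OUTPUT = """\
1,1,1,Mr.,Gustavo,Achong,gustavo0@adventure-works.com
6,1,0.92130822,Mr.,Gustavo1,Achong,gustavo1@adventure-works.com
7,1,0.92130822,Mr,Gustavo2,Achong,gustavo2@adventure-works.com
2,2,1,Ms.,Catherine,Abel,catherine0@adventure-works.com
8,2,0.93621671,Ms.,Catherine1,Abel,catherine1@adventure-works.com
4,4,1,Sr.,Humberto,Acevedo,humberto0@adventure-works.com
5,5,1,Sr.,Pilar,Ackerman,pilar1@adventure-works.com
9,9,1,Ms,Kim,Abercrombie,kim2@adventure-works.com
3,9,0.98750001,Ms.,Kim,Abercrombie,kim2@adventure-works.com
10,10,1,Ms.,Kim,Abercrombie,kimAbercrombie@adventure-works.com
"""


def group_by_key_out(text):
    """Map each _key_out value to a list of (_key_in, _score, fields) tuples."""
    groups = defaultdict(list)
    for rec in csv.reader(StringIO(text)):
        if not rec:  # skip blank lines
            continue
        key_in, key_out, score = int(rec[0]), int(rec[1]), float(rec[2])
        groups[key_out].append((key_in, score, rec[3:]))
    return dict(groups)


groups = group_by_key_out(OUTPUT)
for key_out, members in groups.items():
    print(key_out, [key_in for key_in, _score, _fields in members])
```

The ten input rows collapse into six groups. Keeping only the rows where _key_in equals _key_out gives one canonical record per group.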