Microsoft SSIS BIDS SCD Type 2 Development
2 posters
Page 1 of 1
Microsoft SSIS BIDS SCD Type 2 Development
Are there thoughts on which path would be best to take in terms of which tool to use for Type 2 SCDs?
So far I see:
1) Microsoft's canned transformation.
2) Hand building transformations and T-SQL.
3) Dimension Merge SCD transformation from codeplex.
I am having a little trouble with the Dimension Merge transformation in terms of specifying Expiry Timestamps and it runs successfully once with row expiration and such, but then it fails the next time and just inserts all the rows as new again; thereby errantly ballooning the dimension table.
Thank you for your insights.
So far I see:
1) Microsoft's canned transformation.
2) Hand building transformations and T-SQL.
3) Dimension Merge SCD transformation from codeplex.
I am having a little trouble with the Dimension Merge transformation in terms of specifying Expiry Timestamps and it runs successfully once with row expiration and such, but then it fails the next time and just inserts all the rows as new again; thereby errantly ballooning the dimension table.
Thank you for your insights.
blynch- Posts : 18
Join date : 2011-10-16
Re: Microsoft SSIS BIDS SCD Type 2 Development
Hi blynch,
I've built several reliable and reasonably performant solutions on "Microsoft's canned transformation". Here are the caveats:
- doesn't properly handle multiple Business Keys in one execution (only detects one change) - need to filter out duplicates or loop through them
- need to manually tweak the SqlCommand property after running the wizard to avoid considering non-Current rows (e.g. WHERE Row_Is_Current = 'Y')
- tweak the Insert Destination to Fast Load if supported (but don't lock the table)
- can typically handle up to 500,000 dimension change candidates in an hour or so on a commodity server
- Business Key and Row_Is_Current columns in target dimension need an index
- ensure column names in data flow match target table for automatic matching - usually easiest to implement in your Data Flow Sources / Lookup SQL e.g. PRODCOD1 AS Product_Code
On the plus side, the SCD component is very quick to develop, test and maintain - zero code. Considering the volume and complexity of code required for alternatives, that's a huge plus in my book.
Good luck!
Mike
I've built several reliable and reasonably performant solutions on "Microsoft's canned transformation". Here are the caveats:
- doesn't properly handle multiple Business Keys in one execution (only detects one change) - need to filter out duplicates or loop through them
- need to manually tweak the SqlCommand property after running the wizard to avoid considering non-Current rows (e.g. WHERE Row_Is_Current = 'Y')
- tweak the Insert Destination to Fast Load if supported (but don't lock the table)
- can typically handle up to 500,000 dimension change candidates in an hour or so on a commodity server
- Business Key and Row_Is_Current columns in target dimension need an index
- ensure column names in data flow match target table for automatic matching - usually easiest to implement in your Data Flow Sources / Lookup SQL e.g. PRODCOD1 AS Product_Code
On the plus side, the SCD component is very quick to develop, test and maintain - zero code. Considering the volume and complexity of code required for alternatives, that's a huge plus in my book.
Good luck!
Mike
Re: Microsoft SSIS BIDS SCD Type 2 Development
I want to thank you for your very helpful reply. I am still keeping the Microsoft transform in mind. The tone at the Kimball classes (not an endorsement) was that this transform from codeplex had a lot of pros over the Microsoft transform; though what exactly in practice I am still learning.
I do find that this codeplex transform works best if you let it manage all of the fields and not insert any Derived Columns after it. That seems to make it upset.
Also, I find that I am not able to test it during the same day. So I have to load, wait a day, load and test the SCD flows. Makes it a bit slower to develop.
I'm curious how the Microsoft SCD transform performs in this regard.
I do find that this codeplex transform works best if you let it manage all of the fields and not insert any Derived Columns after it. That seems to make it upset.
Also, I find that I am not able to test it during the same day. So I have to load, wait a day, load and test the SCD flows. Makes it a bit slower to develop.
I'm curious how the Microsoft SCD transform performs in this regard.
blynch- Posts : 18
Join date : 2011-10-16
Re: Microsoft SSIS BIDS SCD Type 2 Development
Hi blynch,
I'm quite sure what you are asking - can you clarify?
If you mean how the Start & End Dates work in the Microsoft SCD, they get loaded with a full date-time stamp. The wizard generates a Derived Column transformation that references the system variable @[System::StartTime]. So you can run multiple executions in one day.
Mike
I'm quite sure what you are asking - can you clarify?
If you mean how the Start & End Dates work in the Microsoft SCD, they get loaded with a full date-time stamp. The wizard generates a Derived Column transformation that references the system variable @[System::StartTime]. So you can run multiple executions in one day.
Mike
Re: Microsoft SSIS BIDS SCD Type 2 Development
I've seen posts (on codeplex I think) where the developer says that the Dimension Merge component (from Codeplex) is not designed to be ran more than once on the same day.
This seems to make development a little slow if I am understanding it correctly.
If you do run it more than once on the same day, it erroneously creates a bunch of "New" records that should be either "Unchanged" or "SCD 1 Update Rows".
I have since confirmed the transformation works better if I only run it once per day. I'm a little confused as to why.
It sounds like you would recommend the Microsoft SCD transformation, correct?
Thank you for your insights.
Brad
This seems to make development a little slow if I am understanding it correctly.
If you do run it more than once on the same day, it erroneously creates a bunch of "New" records that should be either "Unchanged" or "SCD 1 Update Rows".
I have since confirmed the transformation works better if I only run it once per day. I'm a little confused as to why.
It sounds like you would recommend the Microsoft SCD transformation, correct?
Thank you for your insights.
Brad
blynch- Posts : 18
Join date : 2011-10-16
Re: Microsoft SSIS BIDS SCD Type 2 Development
Hi Brad,
I've never tried the Dimension Merge component - that limitation put me off and I generally try to stick to the out-of-the-box components if I can. The Microsoft SCD component does seem to work fairly reliably and predictably (within the limitations I outlined previously in this thread).
On the plus side, it handles most SCD scenarios with no coding required.
Good luck
I've never tried the Dimension Merge component - that limitation put me off and I generally try to stick to the out-of-the-box components if I can. The Microsoft SCD component does seem to work fairly reliably and predictably (within the limitations I outlined previously in this thread).
On the plus side, it handles most SCD scenarios with no coding required.
Good luck
Last edited by Mike Honey on Thu Apr 12, 2012 1:59 am; edited 1 time in total (Reason for editing : changed "below" to "previously in this thread")
Re: Microsoft SSIS BIDS SCD Type 2 Development
I took one of my dimension transform-load packages and pulled out the custom Dimension Merge component from codeplex and re-developed it using the Microsoft Slowly Changing Dimension transformation and have to report that once I got the hang of it, it is much easier (and faster) to develop and use. Maybe this will change in the future, but for now I will be going with the Microsoft transformation for the rest of my dimensions.
Thank you Mike.
Brad
Thank you Mike.
Brad
blynch- Posts : 18
Join date : 2011-10-16
Re: Microsoft SSIS BIDS SCD Type 2 Development
Hi Brad, that's good news and thanks for the update. Mike.
Similar topics
» How to make Quality Assurance and unit tests for a DW/BI project?
» Incremental Load Testing -Type 1 & Type 2
» Type 2 SCD handling using the SSIS SCD Wizard
» Business Rule Development
» Modeling tool with support for hierarchies
» Incremental Load Testing -Type 1 & Type 2
» Type 2 SCD handling using the SSIS SCD Wizard
» Business Rule Development
» Modeling tool with support for hierarchies
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum