Where to implement SCD type 2 logic?
3 posters
Page 1 of 1
Where to implement SCD type 2 logic?
Hello friends,
Recently I joined a program, where the first thing i noticed is that the SCD type 2 logic has been implemented inside the MDM software (in this case, it is Tibco MDM).
Tibco MDM architecture has support for record versioning, among many other features. However, it does not have out-of-the-box support for type 2 logic as such (which is normal and obvious). So, in the past, the previous system integrator did some tweak; they wrote a custom logic for handling type 2 entities in Java and plugged that into the MDM architecture.
Life was smooth for a while until some day when we realized that the custom logic has flaws and has been giving rise to DQ issues.
If you want to know more - like, what exactly the custom logic does, the type of DQ issues, etc, please let me know. I am not mentioning these yet because I am going to ask a fundamental question:
What is the best practice? Should the type 2 logic be implemented in the MDM tool or during dimension load (i.e. in the ETL) in the EDW?
I am not sure if this topic has been already discussed in this forum or any other forum. If it has been, then please provide me the pointer to the same. Thanks in advance.
- Kajal K. Das
Recently I joined a program, where the first thing i noticed is that the SCD type 2 logic has been implemented inside the MDM software (in this case, it is Tibco MDM).
Tibco MDM architecture has support for record versioning, among many other features. However, it does not have out-of-the-box support for type 2 logic as such (which is normal and obvious). So, in the past, the previous system integrator did some tweak; they wrote a custom logic for handling type 2 entities in Java and plugged that into the MDM architecture.
Life was smooth for a while until some day when we realized that the custom logic has flaws and has been giving rise to DQ issues.
If you want to know more - like, what exactly the custom logic does, the type of DQ issues, etc, please let me know. I am not mentioning these yet because I am going to ask a fundamental question:
What is the best practice? Should the type 2 logic be implemented in the MDM tool or during dimension load (i.e. in the ETL) in the EDW?
I am not sure if this topic has been already discussed in this forum or any other forum. If it has been, then please provide me the pointer to the same. Thanks in advance.
- Kajal K. Das
kajaldas007- Posts : 15
Join date : 2012-01-05
Re: Where to implement SCD type 2 logic?
I don't think there is a correct answer; it depends on your requirements, the capabilities of your toolset, etc.
As you already have an MDM system then my starting point would be to try to implement your SCD logic there - but only if your MDM system supports what you are trying to do. I wouldn't try and force your SCD logic into your MDM system just because you have an MDM system.
If your ETL tool is a better place to implement SCD logic, and the downsides of taking it out of your MDM system are minimal, then do it in your ETL system.
Basically, weigh up the pros and cons of each (any) approach and then pick the best solution for your particular needs
As you already have an MDM system then my starting point would be to try to implement your SCD logic there - but only if your MDM system supports what you are trying to do. I wouldn't try and force your SCD logic into your MDM system just because you have an MDM system.
If your ETL tool is a better place to implement SCD logic, and the downsides of taking it out of your MDM system are minimal, then do it in your ETL system.
Basically, weigh up the pros and cons of each (any) approach and then pick the best solution for your particular needs
nick_white- Posts : 364
Join date : 2014-01-06
Location : London
Re: Where to implement SCD type 2 logic?
Thanks for your reply.
I agree with you.
As I already said, Tibco MDM does not have inbuilt support for type 2 handling. Significant custom coding had to be done to address the problem. even after that all scenarios were not properly covered, e.g. late arriving records, correction records, inferred rows, etc. And these give rise to various DQ issues.
In ETL, definitely, we would have more flexibility.
BTW, is there any MDM tool that can handle type 2 entities?
I agree with you.
As I already said, Tibco MDM does not have inbuilt support for type 2 handling. Significant custom coding had to be done to address the problem. even after that all scenarios were not properly covered, e.g. late arriving records, correction records, inferred rows, etc. And these give rise to various DQ issues.
In ETL, definitely, we would have more flexibility.
BTW, is there any MDM tool that can handle type 2 entities?
kajaldas007- Posts : 15
Join date : 2012-01-05
Re: Where to implement SCD type 2 logic?
A MDM system is not a suitable mechanism to maintain type 2 dimensions in a data warehouse.
1. A MDM system is not a data warehousing system
2. A MDM system does a lot of stuff that data warehouses do not do and are complex enough as it is.
3. A MDM system has no reason to know what a type 2 dimension is, let alone be able to maintain one.
There are very simple and basic code patterns to read a source and maintain a type 2 dimension. Why complicate things? It's like asking an auto mechanic to do a heart transplant. He could probably do it eventually, but why not save a lot of time and use a heart surgeon?
1. A MDM system is not a data warehousing system
2. A MDM system does a lot of stuff that data warehouses do not do and are complex enough as it is.
3. A MDM system has no reason to know what a type 2 dimension is, let alone be able to maintain one.
There are very simple and basic code patterns to read a source and maintain a type 2 dimension. Why complicate things? It's like asking an auto mechanic to do a heart transplant. He could probably do it eventually, but why not save a lot of time and use a heart surgeon?
Re: Where to implement SCD type 2 logic?
I agree. Having read through these post again I would like to backtrack on my earlier response. The MDM system should provide all the information you require for the source of your SCD dataload (what changed and when) but all SCD logic for loading data should reside in your ETL system
nick_white- Posts : 364
Join date : 2014-01-06
Location : London
Re: Where to implement SCD type 2 logic?
Thanks ngalemmo.
I couldn't agree more with you. This is the kind of perspective I was looking for.
Thanks once again.
Thanks nick_white for your concurrence.
I couldn't agree more with you. This is the kind of perspective I was looking for.
Thanks once again.
Thanks nick_white for your concurrence.
kajaldas007- Posts : 15
Join date : 2012-01-05
Similar topics
» Why do I need type 3 and 6 SCDs when I can implement type 7?
» Logic behind Top Down and Bottom Up approach
» Hot Swappable Dimension -> best practice for implementation of swapping logic?
» Business Logic: DWH vs. Source system
» Type-2 Dates as Date Data Type ?
» Logic behind Top Down and Bottom Up approach
» Hot Swappable Dimension -> best practice for implementation of swapping logic?
» Business Logic: DWH vs. Source system
» Type-2 Dates as Date Data Type ?
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum