Did anybody read this article ???
+2
John Simon
Vishy
6 posters
Page 1 of 1
Did anybody read this article ???
http://sqlblog.com/blogs/jamie_thomson/archive/2009/11/28/debunking-kimball-effective-dates.aspx
Vishy- Posts : 75
Join date : 2012-02-28
Location : India
Re: Did anybody read this article ???
I like Jamie's posts, but this is much ado about nothing.
He doesn't really present a case as to why removing End Dates is beneficial - other than to fix poor ETL. He could simply run an update statement each night to fix the end dates if required.
I don't like having an End Date be the same as the succeeding Start Date - I think it's easier to "end" the End Date one time period before the succeeding Start Date - generally that would be a day for a warehouse loaded on a daily basis.
He doesn't really present a case as to why removing End Dates is beneficial - other than to fix poor ETL. He could simply run an update statement each night to fix the end dates if required.
I don't like having an End Date be the same as the succeeding Start Date - I think it's easier to "end" the End Date one time period before the succeeding Start Date - generally that would be a day for a warehouse loaded on a daily basis.
Can we logically delete dimension records by setting SCD dates
I agree with John about focusing on a robust ETL process instead of trying to minimise the data redundancy at price of less clarity on physical model, although I found the debates around the topic are quite interesting.
However some of the posts did remind me of a relevant issue with using SCD dates to cater for deletes. On the surface, I thought expiring (end date) a dimension record can achieve the effect of soft deletion, but reality is never that simple. How can we effectively achieve the logical (soft) deletes in SCD dimensions so that we can report (query) on historical data without influence of the deleted dimension records.
However some of the posts did remind me of a relevant issue with using SCD dates to cater for deletes. On the surface, I thought expiring (end date) a dimension record can achieve the effect of soft deletion, but reality is never that simple. How can we effectively achieve the logical (soft) deletes in SCD dimensions so that we can report (query) on historical data without influence of the deleted dimension records.
hang- Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia
Re: Did anybody read this article ???
Not to keen on the idea, and you wind up with much more complex SQL. It's unnecessary. Besides, since when was data redundancy a problem in a dimensional model?
Re: Did anybody read this article ???
I've seen this article a few times - always gives me a chuckle.
The problem he appears to want to address is inaccurate data in the effective-end-date column. He even labels it a "data integrity issue".
I always ask myself, since the column is generated by the ETL process - if the data is inaccurate, why not fix the ETL code? That would also result in the desired data integrity, without the overhead/complication of creating views to calculate effective-end-date and latest-flags on the fly.
LAndrews- Posts : 132
Join date : 2010-05-13
Location : British Columbia, Canada
Re: Did anybody read this article ???
I agree with all you guys comments, when I was going through this article and comments posted there, nothing was matching with what I was thinking so I thought "am I missing something here ".
First thing that put me off was talking about "redundancy !!!" , how can you talk about redundancy in DWH and talk about writing complex queries and believe me I even tried the sql posted there but that updated all the previous rows enddate and not only the immediate previous row.
We have to always decide whether our decision is going to improve query performance or ETL performance. I always feel if any of our decisoin is going improve ETL but badly affect query performance then that decision is against the DWH spirit.
customer should always be put first as his experience with our solutoin is going to decide whether we have a delivered a good solution or bad solution.
First thing that put me off was talking about "redundancy !!!" , how can you talk about redundancy in DWH and talk about writing complex queries and believe me I even tried the sql posted there but that updated all the previous rows enddate and not only the immediate previous row.
We have to always decide whether our decision is going to improve query performance or ETL performance. I always feel if any of our decisoin is going improve ETL but badly affect query performance then that decision is against the DWH spirit.
customer should always be put first as his experience with our solutoin is going to decide whether we have a delivered a good solution or bad solution.
Vishy- Posts : 75
Join date : 2012-02-28
Location : India
Re: Did anybody read this article ???
Hello,
Does this design even matter for performance if you are using it for your Historical schema? Data Mart schema being produced by ETL workflow.
Your queries are hitting Data Mart which can already have EndDate produces by ETL if needed.
Historical layer keeps data redundancy - useful when doing a lot in-memory analytics.
The approach taken by Jamie seems reasonable to me if used in historical layer.
Looking forward for your comments on that matter.
Does this design even matter for performance if you are using it for your Historical schema? Data Mart schema being produced by ETL workflow.
Your queries are hitting Data Mart which can already have EndDate produces by ETL if needed.
Historical layer keeps data redundancy - useful when doing a lot in-memory analytics.
The approach taken by Jamie seems reasonable to me if used in historical layer.
Looking forward for your comments on that matter.
jangorecki- Posts : 2
Join date : 2016-01-14
Similar topics
» How to model article dimension for unpredictable changes with multiple versions and multiple article categories
» "A Trio of Interesting Snowflakes" article
» Historical and Incremental Loading of Dimension and Fact tables
» dimension attribute denormalisation in fact table
» Interesting article listing 9 women writers who have made major impacts on the Business Intelligence & Big Data world
» "A Trio of Interesting Snowflakes" article
» Historical and Incremental Loading of Dimension and Fact tables
» dimension attribute denormalisation in fact table
» Interesting article listing 9 women writers who have made major impacts on the Business Intelligence & Big Data world
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum