Kimball Forum
Would you like to react to this message? Create an account in a few clicks or log in to continue.

Did anybody read this article ???

+2
John Simon
Vishy
6 posters

Go down

Did anybody read this article ??? Empty Did anybody read this article ???

Post  Vishy Fri Mar 09, 2012 4:43 am


http://sqlblog.com/blogs/jamie_thomson/archive/2009/11/28/debunking-kimball-effective-dates.aspx

Vishy

Posts : 75
Join date : 2012-02-28
Location : India

Back to top Go down

Did anybody read this article ??? Empty Re: Did anybody read this article ???

Post  John Simon Fri Mar 09, 2012 5:10 am

I like Jamie's posts, but this is much ado about nothing.
He doesn't really present a case as to why removing End Dates is beneficial - other than to fix poor ETL. He could simply run an update statement each night to fix the end dates if required.

I don't like having an End Date be the same as the succeeding Start Date - I think it's easier to "end" the End Date one time period before the succeeding Start Date - generally that would be a day for a warehouse loaded on a daily basis.

John Simon

Posts : 186
Join date : 2009-05-29
Location : Melbourne, Australia

http://jsimonbi.wordpress.com

Back to top Go down

Did anybody read this article ??? Empty Can we logically delete dimension records by setting SCD dates

Post  hang Fri Mar 09, 2012 8:13 am

I agree with John about focusing on a robust ETL process instead of trying to minimise the data redundancy at price of less clarity on physical model, although I found the debates around the topic are quite interesting.

However some of the posts did remind me of a relevant issue with using SCD dates to cater for deletes. On the surface, I thought expiring (end date) a dimension record can achieve the effect of soft deletion, but reality is never that simple. How can we effectively achieve the logical (soft) deletes in SCD dimensions so that we can report (query) on historical data without influence of the deleted dimension records.

hang

Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia

Back to top Go down

Did anybody read this article ??? Empty Re: Did anybody read this article ???

Post  ngalemmo Fri Mar 09, 2012 1:49 pm

Not to keen on the idea, and you wind up with much more complex SQL. It's unnecessary. Besides, since when was data redundancy a problem in a dimensional model?
ngalemmo
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

http://aginity.com

Back to top Go down

Did anybody read this article ??? Empty Re: Did anybody read this article ???

Post  LAndrews Fri Mar 09, 2012 5:47 pm


I've seen this article a few times - always gives me a chuckle.

The problem he appears to want to address is inaccurate data in the effective-end-date column. He even labels it a "data integrity issue".
I always ask myself, since the column is generated by the ETL process - if the data is inaccurate, why not fix the ETL code? That would also result in the desired data integrity, without the overhead/complication of creating views to calculate effective-end-date and latest-flags on the fly.


LAndrews

Posts : 132
Join date : 2010-05-13
Location : British Columbia, Canada

Back to top Go down

Did anybody read this article ??? Empty Re: Did anybody read this article ???

Post  Vishy Sat Mar 10, 2012 3:22 am

I agree with all you guys comments, when I was going through this article and comments posted there, nothing was matching with what I was thinking so I thought "am I missing something here ".

First thing that put me off was talking about "redundancy !!!" , how can you talk about redundancy in DWH and talk about writing complex queries and believe me I even tried the sql posted there but that updated all the previous rows enddate and not only the immediate previous row.

We have to always decide whether our decision is going to improve query performance or ETL performance. I always feel if any of our decisoin is going improve ETL but badly affect query performance then that decision is against the DWH spirit.

customer should always be put first as his experience with our solutoin is going to decide whether we have a delivered a good solution or bad solution.

Vishy

Posts : 75
Join date : 2012-02-28
Location : India

Back to top Go down

Did anybody read this article ??? Empty Re: Did anybody read this article ???

Post  jangorecki Thu Jan 14, 2016 12:14 pm

Hello,
Does this design even matter for performance if you are using it for your Historical schema? Data Mart schema being produced by ETL workflow.
Your queries are hitting Data Mart which can already have EndDate produces by ETL if needed.
Historical layer keeps data redundancy - useful when doing a lot in-memory analytics.
The approach taken by Jamie seems reasonable to me if used in historical layer.
Looking forward for your comments on that matter.

jangorecki

Posts : 2
Join date : 2016-01-14

Back to top Go down

Did anybody read this article ??? Empty Re: Did anybody read this article ???

Post  Sponsored content


Sponsored content


Back to top Go down

Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum