Dimensions with effective and end dates - best design and best way to join to fact tables?

ccr Wed Apr 30, 2014 10:30 am

We have many different organization hierarchies in my corporation. They are volatile and have minor changes during the quarter and major changes at the end of the quarter. These organization changes and also attributes about entities have an Effective date and an End Date.
The users expect to be able to choose which hierarchy to use for a report ( Corporate, Operational, Food Service) and possibly also filter or include attributes for reporting as of the last day in the selection time frame. (Snapshot)

The thing is Users wil build ad-hoc queris as well as scheduled reports for the business. So I'm thinking that joining on something like an Organization_Key and DAY_DATE between effective date and end date from the fact table will not work. or would only work if users pull one day at a time. I'm hoping this is more simple than I think it is.
Any help is appreciated.

hkandpal Wed Apr 30, 2014 12:18 pm

Hi,

in a hierarchial dmiension one way you can do is store the diension as flat, that means if the hirearchy is 18 level then have the level1 to level18 stored and if any thing changes in between then store compelte hierarchy. This will help in generating the query for old historical data.

thanks

Himanshu

nick_white Wed Apr 30, 2014 12:42 pm

ccr wrote:We have many different organization hierarchies in my corporation. They are volatile and have minor changes during the quarter and major changes at the end of the quarter. These organization changes and also attributes about entities have an Effective date and an End Date.
The users expect to be able to choose which hierarchy to use for a report ( Corporate, Operational, Food Service) and possibly also filter or include attributes for reporting as of the last day in the selection time frame. (Snapshot)

The thing is Users wil build ad-hoc queris as well as scheduled reports for the business. So I'm thinking that joining on something like an Organization_Key and DAY_DATE between effective date and end date from the fact table will not work. or would only work if users pull one day at a time. I'm hoping this is more simple than I think it is.
Any help is appreciated.

Q1. For each hierarchy, do you need to report using the state of the hierarchy at the time of the fact, the state of the hierarchy at the time of running the report, or both, or something different?

Q2. When you say "include attributes for reporting as of the last day in the selection time frame" do you mean something like "show me all transactions in January where the customer's value was 'gold' on the 31st of Jan, even if the customer's value wasn't 'gold' when the transaction actually occurred"?

ccr Wed Apr 30, 2014 1:02 pm

Q1:The business expects to receive the hierarchy they choose that was effective on the last day of whatever timeframe they have selected. Because of how our Date Dimension is setup they can choose from several date types without choosing a start and enddate. That is why I am wondering if there is a standard way to design a variable or somehow to choose the max(day_date from the fact table.

Q2: Exactly. Say they want to run a report of carwash activity, then they would only want to include stores that had the car-wash attribute effective for the ending day of their report selection.

Note: there are over 1500 entities, 20 attributes with history. 25% of the entities may have average of 200-250 organization history versions, The rest may have 50-150.

So lets say the users pulled a sales report by corporate hierarchy for first quarter 2014. They would pull 201401 = the calendar_dim.. fy_quarterString.
I am having a problem seeing how I can correctly join the hierarchies and attributes to a fact table (There will be many) to bring back the one to one relationship Users expect of a snapshot.

nick_white Wed Apr 30, 2014 1:33 pm

OK, next question: are the time periods that someone can query by fixed (e.g. a week/month/quarter year) or random (e.g. between 12th Jan 2014 and 5th April 2014)?

If the dates can be random then I don't believe (but am happy to be proved wrong) that there is a dimensional model design that will support this fully. The only solution I can think of is to write (probably quite complex) SQL - possibly hidden from you end-users to some extent by using DB View(s) e.g. Run a sub-query to find all stores that at date period end had the carwash attribute, plug the results as a filter into the main query for carwash activities. The downside to this is that you are limiting your users to queries that you have pre-built for them and are losing a lot of the flexibility a dimensional model normally gives to them.

If the dates are fixed then potentially you can create aggregates for each possible date period and ensure that each fact record uses the surrogate keys of the dimension records effective at the end of that date period. Obviously this requires that all your relevant dimensions to be SCDs. Similarly you would create/use date period versions of each hierarchy and fix their association to these aggregate facts.

To be honest, I am facing a very similar design issue and have yet to come up with a solution that doesn't feel like a compromise/inelegant. If someone else can provide a solution I'd be very happy

Sponsored content

Dimensions with effective and end dates - best design and best way to join to fact tables?

Dimensions with effective and end dates - best design and best way to join to fact tables?

Re:Dimensions with effective and end dates - best design and best way to join to fact tables?

Re: Dimensions with effective and end dates - best design and best way to join to fact tables?

Thanks for your questions and replies

Re: Dimensions with effective and end dates - best design and best way to join to fact tables?

Re: Dimensions with effective and end dates - best design and best way to join to fact tables?