Kimball Forum
Would you like to react to this message? Create an account in a few clicks or log in to continue.

Date Dimension: Representing partial dates/Imputing date values

4 posters

Go down

Date Dimension: Representing partial dates/Imputing date values Empty Date Dimension: Representing partial dates/Imputing date values

Post  tjmaglio Thu Apr 10, 2014 12:18 pm

Hi,
 
Suppose your use case is to allow for both complete and partial dates, where dates are intentionally obfuscated for privacy reasons.
What is considered best-practice for handling this scenario?
Should a separate dimension be created to handle partial dates where the grain is limited to month or year
or should one date dimension be kept, with partial dates being handled by imputing them to obtain the granularity of a day(for instance APR-2014 would get translated into APR-01-2014)?

datekeystring_receivedimputed_datequarter
20140101'2014-01-01'2014-01-01Q1
20140100'2014-01'2014-01-01Q1

Any advice/suggestions are appreciated. Thank you

tjmaglio

Posts : 2
Join date : 2014-04-10

Back to top Go down

Date Dimension: Representing partial dates/Imputing date values Empty Re: Date Dimension: Representing partial dates/Imputing date values

Post  ngalemmo Thu Apr 10, 2014 12:45 pm

If the obfuscated date is supposed to represent a date then a month dimension isn't going to help, particularly if the same fact date has both proper and obfuscated values.

Create a row in the date dimension and populate the date related columns based on how such dates are to be presented to the business.

ngalemmo
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

http://aginity.com

Back to top Go down

Date Dimension: Representing partial dates/Imputing date values Empty Re: Date Dimension: Representing partial dates/Imputing date values

Post  BoxesAndLines Thu Apr 10, 2014 4:02 pm

If your facts are at the monthly grain, then I would create a month dimension.
BoxesAndLines
BoxesAndLines

Posts : 1212
Join date : 2009-02-03
Location : USA

Back to top Go down

Date Dimension: Representing partial dates/Imputing date values Empty Re: Date Dimension: Representing partial dates/Imputing date values

Post  tjmaglio Fri Apr 11, 2014 12:36 pm

Thanks for your help.

To be more explicit-the heart of the problem is that there's a mixture of grain unless a partial date is imputed to be at the grain of a day.
There is a real business need to compare data where the date is known with data where only a piece of a date is known.
Ideally, the fact that the date has been imputed and is not what was received should be represented.

The date dimension I have in mind would look like so:



       
       
       
       
DateKeyFullDateKeyImputedfulldatekeyImputedMonthKeyImputedDayKeyCalendarYearCalendarQuarter
201404114/11/2014201404114112014Q2
20140011 201401111112014Q2
20140400 20140401412014Q2

The fact table would be linked to the date most suitable - whether it is the actual date from source data or it is an imputed date.
For practical purposes, there would have to be imputed dates handling the different combinations of unknown date parts up to all three being unknown in which case you're going to be pointing to the unknown date or invalid date key.

Thanks again for any other input- much appreciated.

tjmaglio

Posts : 2
Join date : 2014-04-10

Back to top Go down

Date Dimension: Representing partial dates/Imputing date values Empty Re: Date Dimension: Representing partial dates/Imputing date values

Post  nick_white Mon Apr 14, 2014 3:04 am

It is almost Rule 1 of Dimensional modelling that you define the grain of your fact table and then ensure that all measures in that fact have that grain. If you break this rule you will almost certainly get into a bit of a mess.
Am I right in thinking that by "partial date" you do know the month but not the actual day? If this is the case then I can think of two approaches:
1. If you can assign a day to these partial date records that does make business sense and won't skew other reports then do that - e.g. assign them to the first of the month
2. If you can't follow the first approach then create a fact table which holds all your day facts and then create an aggregate table that rolls this up the month level. For your partial date records either include them in this aggregate fact record or put them in their own "monthly aggregate" table and then put a View across both tables to combine them for querying purposes

nick_white

Posts : 364
Join date : 2014-01-06
Location : London

Back to top Go down

Date Dimension: Representing partial dates/Imputing date values Empty Re: Date Dimension: Representing partial dates/Imputing date values

Post  Sponsored content


Sponsored content


Back to top Go down

Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum