How best to allow for future custom facts and dimensions while defining a new star schema
4 posters
Page 1 of 1
How best to allow for future custom facts and dimensions while defining a new star schema
Hi - Looking for best practice guidelines regarding setting up of a new star schema that has to potentially allow for custom additions of new facts and dimensions with no requirement of backfilling history for the newly added facts. I did search for similar topics - any pointers to other articles with this answer also appreciated.
Last edited by kmdata on Mon Feb 20, 2012 10:08 pm; edited 1 time in total (Reason for editing : incorrect usage of tags around)
kmdata- Posts : 4
Join date : 2012-02-20
Re: How best to allow for future custom facts and dimensions while defining a new star schema
There's no magic here other than keeping the historical data around. Ideally, you keep it in the same format as your regular ETL processes so that you don't have to write special historical load mappings.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: How best to allow for future custom facts and dimensions while defining a new star schema
Thanks for you response. What if there is NO backfilling requirement to support for the new facts and dimensions that come up in the future - how should the star design be prepared to handle addition of new dimensions and facts? Will we be able to use conformed dimensions to plug in the new schema elements as a new star into something like a virtual cube to facilitate unified analysis? Are there other approaches to consider here? Thanks in advance.
kmdata- Posts : 4
Join date : 2012-02-20
Re: How best to allow for future custom facts and dimensions while defining a new star schema
Hi,
This requriment can be managed by making sure facts contain lowest granulirity of data. If by seeing the current requirment we take the granularity which is not lowest it will be difficult in future when a requirment comes which requires lower granulaity then what you have.
One thing we should also keep in mind for example Day data can be rolled up to Weekly as well as monthly and yearly also. But weekly data can not be rolled up to monthly or yearly.
So these small issues should be taken care of in the beginning.
This requriment can be managed by making sure facts contain lowest granulirity of data. If by seeing the current requirment we take the granularity which is not lowest it will be difficult in future when a requirment comes which requires lower granulaity then what you have.
One thing we should also keep in mind for example Day data can be rolled up to Weekly as well as monthly and yearly also. But weekly data can not be rolled up to monthly or yearly.
So these small issues should be taken care of in the beginning.
Vishy- Posts : 75
Join date : 2012-02-28
Location : India
Re: How best to allow for future custom facts and dimensions while defining a new star schema
If there isn't a backfill requirements, it simple to add new measures, dimensional attributes or new dimensions. For measures and attributes just add the columns to the table. Set the columns to a default value or leave them null if you prefer. For new dimensions, add the FK to the fact(s) and set them all to reference a 'not applicable' row in the new dimension table.
The work is not in the schema, but the ETL processes and testing necessary to move the new data. The question is, is this data that could have been loaded from day 1?
The work is not in the schema, but the ETL processes and testing necessary to move the new data. The question is, is this data that could have been loaded from day 1?
Re: How best to allow for future custom facts and dimensions while defining a new star schema
Thanks ngalemmo. Understand the answer regarding the schema changes. Not sure I follow the heads-up regarding "moving the new data" in the ETL process. Wouldn't it be as simple as including the new facts and dimensions in the ETL process while pulling from the source and updating the star? The dimensions and facts that we see getting added in the future are going to be independent of the pre-existing set... So the impact on ETLs and testing is only going to come via increased volume and not complexity as we see it. Are you alluding to a case where the newly added facts/dims are related to preexisting dims in non-trivial ways?
kmdata- Posts : 4
Join date : 2012-02-20
Re: How best to allow for future custom facts and dimensions while defining a new star schema
I assumed you meant adding new measures or attributes to existing facts and dimensions.
New fact tables and related dimensions is dead simple. Just add the tables, each star is independent of the others (unless it is an aggregate). Integration with existing facts depends on the new facts having some dimensions (or at least dimensional attributes) in common. Always use existing dimensions (with new facts) when you can,
New fact tables and related dimensions is dead simple. Just add the tables, each star is independent of the others (unless it is an aggregate). Integration with existing facts depends on the new facts having some dimensions (or at least dimensional attributes) in common. Always use existing dimensions (with new facts) when you can,
Similar topics
» Help defining Facts and Dimensions for a General Ledger Transaction Table
» Only way to pull data from star/snowflake schema is by using facts?
» How to handle multiple many-to-many dimensions in a single star schema
» How to populate custom fact table with core facts
» Loading Future Dated Rows to Dimensions
» Only way to pull data from star/snowflake schema is by using facts?
» How to handle multiple many-to-many dimensions in a single star schema
» How to populate custom fact table with core facts
» Loading Future Dated Rows to Dimensions
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum