Reporting on Dimensions
3 posters
Page 1 of 1
Reporting on Dimensions
Hi,
This relates in part to another topic (http://forum.kimballgroup.com/t749-fact-table-as-a-dimension) which has definitely made it clear to me that simply because the focus of a report is a dimension, that doesn't make it a fact!
The product I work on has a basic data warehouse that currently has a few transactional facts with conformed dimensions. Our standard offering on top of this is a BO universe and some standard template reports.
We now have a further requirement to be able to give users the ability to report against reference data. What I mean by reference data in this context is data that would usually only appear in dimensions. This is not a problem for dimensions that are already in our "ETL'D" schema. Where it becomes strange for me is where the requirement is to report against reference data that is not needed for the current transactional data. The way I see it there are 2 options:
I would be interested in hearing people's views on which way to go or what factors should be used to decide which way to go.
Thanks in advance.
This relates in part to another topic (http://forum.kimballgroup.com/t749-fact-table-as-a-dimension) which has definitely made it clear to me that simply because the focus of a report is a dimension, that doesn't make it a fact!
The product I work on has a basic data warehouse that currently has a few transactional facts with conformed dimensions. Our standard offering on top of this is a BO universe and some standard template reports.
We now have a further requirement to be able to give users the ability to report against reference data. What I mean by reference data in this context is data that would usually only appear in dimensions. This is not a problem for dimensions that are already in our "ETL'D" schema. Where it becomes strange for me is where the requirement is to report against reference data that is not needed for the current transactional data. The way I see it there are 2 options:
- Add ETL code for every reference data entity that we need to report against even if the entity is not required for the transactional reports. We would effectively be creating a pretty large snowflake schema where many of the added snowflake entities are only required for the reference data reports not for reports written against the fact tables.
The argument in favour of this is that we will have a single schema that contains tables for all reports whether they are transactional reports or reference data reports. This is more consistent and there is no confusion about whether any given report is against live data or ETL’d data. The argument against this is that it is extra work to do this that is not necessary for performance reasons because unlike the transactional tables, the reference data tables being reported on will never contain many rows. - Allow access to the OLTP system (possibly via views that perform simple joins) so that users can report directly against the base reference data tables. This access would still be via a reporting schema but would be based on views or synonyms onto the OLTP schema rather than ETL’d tables on the reporting schema. Arguments for and against this are the reverse of the above.
I would be interested in hearing people's views on which way to go or what factors should be used to decide which way to go.
Thanks in advance.
allonb- Posts : 3
Join date : 2012-09-03
Re: Reporting on Dimensions
"Centralized" is a common verb used in defining data warehouses. "Federated" gets a lot of publicity, but in practice, I have rarely seen it used. I would consider option 1A. Bring the data into the warehouse and keep as stand alone tables. There is no reason to snowflake existing dimensions to account for this data if it will never be used in conjunction with the associated facts or dimensions.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Reporting on Dimensions
Hi B&L(nice username),
Thanks for the reply. Some of the reference data entities that need to be reported on have foreign keys to dimensions that in turn are referenced by the transactional facts. So if I go for the first option I don't see how I can avoid a large snow-flaked model.
Not sure what you meant by 1A - there were only 2 options, although my BBCode skills might have make it look otherwise Option 1: ETL the extra reference data entities across even though they are not needed as dimensions for existing transactional facts. Option 2: Don't ETL the entities and report directly off the OLTP database.
Thanks for the reply. Some of the reference data entities that need to be reported on have foreign keys to dimensions that in turn are referenced by the transactional facts. So if I go for the first option I don't see how I can avoid a large snow-flaked model.
Not sure what you meant by 1A - there were only 2 options, although my BBCode skills might have make it look otherwise Option 1: ETL the extra reference data entities across even though they are not needed as dimensions for existing transactional facts. Option 2: Don't ETL the entities and report directly off the OLTP database.
allonb- Posts : 3
Join date : 2012-09-03
Re: Reporting on Dimensions
It really is a toss up. There is no reason why reference data reporting could not be done against the OLTP system provided there is some controls over how much and how often. Generally you would provide a series of 'canned' reports with parameters, and discourage ad-hoc reporting. If that's all you need then bringing the data into the data warehouse is not necessary.
But if you do need to support a robust ad-hoc environment, with a lot of users and a lot of queries, then moving it to the data warehouse is probably a better long term solution. But I would not resort to snowflaking. I would treat them as dimensions and follow the dimensional pattern. You never know when they may come in handy to support a future requirement. As far as relating dimensions goes, that is what fact tables are for. Use a factless fact instead of foreign keys directly off the dimension tables.
But if you do need to support a robust ad-hoc environment, with a lot of users and a lot of queries, then moving it to the data warehouse is probably a better long term solution. But I would not resort to snowflaking. I would treat them as dimensions and follow the dimensional pattern. You never know when they may come in handy to support a future requirement. As far as relating dimensions goes, that is what fact tables are for. Use a factless fact instead of foreign keys directly off the dimension tables.
Re: Reporting on Dimensions
Thanks for the very useful responses. Good to know that it is not just me that thinks the answer is not obvious.
allonb- Posts : 3
Join date : 2012-09-03
Similar topics
» Schema Roadblock: Dimension vs Fact [Dimensions w/FKs to other Dimensions?]
» Question about Dimensions with SKeys to join with other dimensions.
» Correlated - Separate Dimensions OR Single Dimensions ?
» Deleting Dimensions and Bridge Dimensions
» Design all dimensions as conformed dimensions
» Question about Dimensions with SKeys to join with other dimensions.
» Correlated - Separate Dimensions OR Single Dimensions ?
» Deleting Dimensions and Bridge Dimensions
» Design all dimensions as conformed dimensions
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum