Snowflake to support different grains with BObj universe?
5 posters
Page 1 of 1
Snowflake to support different grains with BObj universe?
In my DW work to date I have dutifully followed the Kimball recommendation to avoid snowflakes. However, I am considering the use of a snowflake to solve some problems with ad hoc reporting comparing facts at different grains. Here’s what I’ve got:
• Modest (11GB) DW on SQL Server 2008, 5 fact tables, 6 conformed dimensions, star schema
• SSRS reports running directly off relational star schemas
• SSAS Cube supporting slice and dice in Excel (not yet in production)
• Business Objects XI 3.1 universe on relational DW supporting Web Intelligence (WebI) users
The problem is budget-to-actual reporting in WebI. For example, actual sales are recorded by customer, but budget is by territory (customers roll up into territories.) I can create a mini-dimension for Territory. Now I have no problem comparing budget at the Territory level with actual sales at the Customer level if I am writing the SQL, but I have not been able to configure the BObj universe to handle this correctly.
After extensive searching and rereading lots of old posts both here and on the BusinessObjects Board, I’ve concluded this is a limitation with the BObj universe… there is no easy way to tell it that two conformed dimension attributes represent the same thing (in this case DimCustomer.Territory is the same as DimTerritory.Territory) so it won’t generate the correct SQL to compare budget to actual at the territory level.
It seems to me that one easy solution is to snowflake my Customer dimension to break out Territory into its own table. Budget facts would then have a FK to DimTerritory. Sales facts would continue to point to DimCustomer which would then point to DimTerritory. With two contexts the universe should then be happy and generate the correct SQL to compare budget with actual.
The main reasons cited for avoiding snowflake schemas are (1.) query performance and (2.) keeping the design simple for end users. With my modest DW size I’m not worried about performance, and since end users will access the DW either through the universe or (eventually) through SSAS, they wouldn’t see the DW schema directly.
Should I snowflake? It seems so wrong! Any better solutions to ad hoc reporting across facts at differing grains using BObj?
• Modest (11GB) DW on SQL Server 2008, 5 fact tables, 6 conformed dimensions, star schema
• SSRS reports running directly off relational star schemas
• SSAS Cube supporting slice and dice in Excel (not yet in production)
• Business Objects XI 3.1 universe on relational DW supporting Web Intelligence (WebI) users
The problem is budget-to-actual reporting in WebI. For example, actual sales are recorded by customer, but budget is by territory (customers roll up into territories.) I can create a mini-dimension for Territory. Now I have no problem comparing budget at the Territory level with actual sales at the Customer level if I am writing the SQL, but I have not been able to configure the BObj universe to handle this correctly.
After extensive searching and rereading lots of old posts both here and on the BusinessObjects Board, I’ve concluded this is a limitation with the BObj universe… there is no easy way to tell it that two conformed dimension attributes represent the same thing (in this case DimCustomer.Territory is the same as DimTerritory.Territory) so it won’t generate the correct SQL to compare budget to actual at the territory level.
It seems to me that one easy solution is to snowflake my Customer dimension to break out Territory into its own table. Budget facts would then have a FK to DimTerritory. Sales facts would continue to point to DimCustomer which would then point to DimTerritory. With two contexts the universe should then be happy and generate the correct SQL to compare budget with actual.
The main reasons cited for avoiding snowflake schemas are (1.) query performance and (2.) keeping the design simple for end users. With my modest DW size I’m not worried about performance, and since end users will access the DW either through the universe or (eventually) through SSAS, they wouldn’t see the DW schema directly.
Should I snowflake? It seems so wrong! Any better solutions to ad hoc reporting across facts at differing grains using BObj?
Last edited by VHF on Wed Aug 03, 2011 12:33 pm; edited 1 time in total (Reason for editing : typo)
VHF- Posts : 236
Join date : 2009-04-28
Location : Wisconsin, US
Re: Snowflake to support different grains with BObj universe?
Why not put a FK to the Territory dimension on to the Sales Fact table? That should resolve the issues in the universe without affecting performance.
Dave Jermy- Posts : 33
Join date : 2011-03-24
Location : London, UK
Re: Snowflake to support different grains with BObj universe?
How can you compare Customer Level Sales with Territory Level Budget? If you are aggregating Customer Level Sales data to the Territory Level, then you could create an aggregate table at the Territory level that has the budget and sales.
Jeff Smith- Posts : 471
Join date : 2009-02-03
Re: Snowflake to support different grains with BObj universe?
This would require joining two SQL subquries, one of which would SUM Customer level sales up to the Territory level (which is an attribute of Customer.) I can do it easily if I am writing my own SQL!Jeff Smith wrote:How can you compare Customer Level Sales with Territory Level Budget?
I am considering this approach. I don't need an aggreagate table for performance, but it would certainly simplify things, especially if I had a single fact table with both budget and (aggregate) sales.Jeff Smith wrote:If you are aggregating Customer Level Sales data to the Territory Level, then you could create an aggregate table at the Territory level that has the budget and sales.
I considered this, but Territory is a SCD Type 1 attribute of Customer. When a customer is assigned to a new territory, the users want all historical reporting to occur as if the customer had always been in that territory. If I added a FK directly to Territory from the Sales Fact table, I would need to update existing fact records whenever a customer is assigned to a different territory. If Territory had been an SCD Type 2 attribute of Customer this soultion would have been ideal.Dave Jermy wrote:Why not put a FK to the Territory dimension on to the Sales Fact table? That should resolve the issues in the universe without affecting performance.
VHF- Posts : 236
Join date : 2009-04-28
Location : Wisconsin, US
Re: Snowflake to support different grains with BObj universe?
You can do it in BOBJ...
Define a relationship between the customer territory and the territory dimension territory and include it in the context for the sales fact. This is basically your snowflake without actually having to change the database.
When you do the combined query (budget/actuals) you must use the territory objects in the territory dimension, not the territory code on the customer. If you are just doing customer reporting you can continue to use the territory object on the customer dim and the query will not use the territory dim.
The only real problem is training. When doing a budget report, the user needs to be aware of which territory to use. One way to enforce it is to have a different universe for that application and only expose the territory dim attributes.
Define a relationship between the customer territory and the territory dimension territory and include it in the context for the sales fact. This is basically your snowflake without actually having to change the database.
When you do the combined query (budget/actuals) you must use the territory objects in the territory dimension, not the territory code on the customer. If you are just doing customer reporting you can continue to use the territory object on the customer dim and the query will not use the territory dim.
The only real problem is training. When doing a budget report, the user needs to be aware of which territory to use. One way to enforce it is to have a different universe for that application and only expose the territory dim attributes.
Re: Snowflake to support different grains with BObj universe?
Thanks... I had already prototyped pretty much what you described, but the solution seemed a little dirty to me. On my first attempt I had the join in the budget context, but of course that caused a fan trap (and grossly exaggerated budget numbers!) After further thought it was clear that it had to be in the sales fact context as you said.
I like the idea of a separate universe to keep the users from using the wrong territory object for budget reporting. Otherwise I can see them starting with actual sales using customer territory and then trying to add budget to their query and wondering why it doesn't work!
Too bad the universe doesn't have built-in support for shrunken dimensions!
I like the idea of a separate universe to keep the users from using the wrong territory object for budget reporting. Otherwise I can see them starting with actual sales using customer territory and then trying to add budget to their query and wondering why it doesn't work!
Too bad the universe doesn't have built-in support for shrunken dimensions!
VHF- Posts : 236
Join date : 2009-04-28
Location : Wisconsin, US
Re: Snowflake to support different grains with BObj universe?
Kimball methodology seems to recommend the use of a shrunken dimension for handling different fact tables at different grains (such as my scenario with budget at the territory level vs. actual at the customer level.) In fact, this month's Design Tip #137 discusses it! I have no problem with writing queries to compare facts at different grains when coding my own SQL, but I am bumping into issues when trying to enable cross-grain ad hoc reporting either through a BObj universe (directly against the relational star-schema DW) or with SSAS.
SSAS has good support for multiple grains within a cube--in fact, on the Dimension Usage configuration screen there is a Granularity setting for each dimension for each fact table (SSAS "measure group")--almost a Bus Matrix! However, I have only figured out two scenarios that work: (1.) a single dimension table (with special rows to represent higher levels) or (2.) a snowflake. It is certainly possible to add a shrunken dimension in SSAS, but then SSAS doesn't recognize it as being conformed with the original dimension--the same problem as with the BObj universe. Has anyone figured out how to make drill-across work in SSAS when using shrunken dimensions for higher grained facts?
The struggles I've had with both BObj and SSAS lead me to ask this question:
What BI tools (if any) directly support the Kimball-recommend approach of a star schema with conformed shrunken dimensions for higher grained facts?
SSAS has good support for multiple grains within a cube--in fact, on the Dimension Usage configuration screen there is a Granularity setting for each dimension for each fact table (SSAS "measure group")--almost a Bus Matrix! However, I have only figured out two scenarios that work: (1.) a single dimension table (with special rows to represent higher levels) or (2.) a snowflake. It is certainly possible to add a shrunken dimension in SSAS, but then SSAS doesn't recognize it as being conformed with the original dimension--the same problem as with the BObj universe. Has anyone figured out how to make drill-across work in SSAS when using shrunken dimensions for higher grained facts?
The struggles I've had with both BObj and SSAS lead me to ask this question:
What BI tools (if any) directly support the Kimball-recommend approach of a star schema with conformed shrunken dimensions for higher grained facts?
Last edited by VHF on Tue Aug 09, 2011 9:47 am; edited 1 time in total (Reason for editing : additional details)
VHF- Posts : 236
Join date : 2009-04-28
Location : Wisconsin, US
Re: Snowflake to support different grains with BObj universe?
Maybe it needs to be viewed as a logical/physical issue. From a logical standpoint, the shrunken dimension makes sense, but, in the case of certain BI environments, the original and shrunken dimensions must coexist in the same physical table in order to support fact integration. It may also explain why, over the years of working with BOBJ, I intuitively prefer a single date dimension rather than additional month or quarter dimensions.
Re: Snowflake to support different grains with BObj universe?
For the date dimension I agree that it works out pretty cleanly to designate rows that also represent a month, quarter, year, etc. But that pattern just doesn’t seem to translate well to other dimensions such as product or customer.
In my case, I could designate an arbitrary customer record to also represent a territory, but it just doesn’t feel right--and maintaining this with an active SCD would be a little trickier than with the static date dimension. ETL would need to make sure there is a customer row designated to represent territory for each distinct territory.
I also considered adding special records to the customer dimension to represent a territory. This is a little better, but still seems messy, and results in a bunch of dimension rows that would have a lot of null/empty/NA attributes. I suppose that would work OK if the user tried to use an inappropriate attribute (for example, if they tried to use customer name while reporting on budget at the territory level.)
Our DW will eventually have several fact tables at differing grains (budget, forecast) for which I would like to support ad hoc drill-across, so I want to come up with a “best practices” pattern.
In my case, I could designate an arbitrary customer record to also represent a territory, but it just doesn’t feel right--and maintaining this with an active SCD would be a little trickier than with the static date dimension. ETL would need to make sure there is a customer row designated to represent territory for each distinct territory.
I also considered adding special records to the customer dimension to represent a territory. This is a little better, but still seems messy, and results in a bunch of dimension rows that would have a lot of null/empty/NA attributes. I suppose that would work OK if the user tried to use an inappropriate attribute (for example, if they tried to use customer name while reporting on budget at the territory level.)
Our DW will eventually have several fact tables at differing grains (budget, forecast) for which I would like to support ad hoc drill-across, so I want to come up with a “best practices” pattern.
VHF- Posts : 236
Join date : 2009-04-28
Location : Wisconsin, US
Ralph says I can do it...
From my Dimensional Modeling in Depth Kimball University class handbook:
I'll also do some before and after performance meansurements to see if query time is affected.
My objective is to factor out a few common sub-dimensions to facilitate drill-across; I'm not trying to normalize each dimension into 3NF. So I’m not really snowflaking, just creating a few outriggers!
I'm going to test snowflaking the customer dimension in my DW development environment. I'll need to provide a view that presents a flat customer dimension so as not to break existing SQL-based reports which are expecting a star schema. For the BObj universe I'll change the dimension attribute objects to use fields from the new snowflaked tables (territory, etc.) as needed, but the presentation to the user will still be flat (customer class folder.)If you present the dimension as a flat file in your user interface AND all your browses run fast, then you have our permission to snowflake!
I'll also do some before and after performance meansurements to see if query time is affected.
My objective is to factor out a few common sub-dimensions to facilitate drill-across; I'm not trying to normalize each dimension into 3NF. So I’m not really snowflaking, just creating a few outriggers!
Last edited by VHF on Wed Aug 10, 2011 11:37 am; edited 1 time in total (Reason for editing : typo)
VHF- Posts : 236
Join date : 2009-04-28
Location : Wisconsin, US
slower than I thought!
As a test, I snowflaked my customer dimension out to 4 tables (customer, territory, region, channel) and did some performance testing... it had a bigger impact than I expected. Small queries aren't significantly impacted (5% slower), but queries that touch a lot of records take a bigger hit (30% slower). I would expect it to get worse as the DW grows.
I might still do an outrigger or two, but rather than snowflake extensively I'm going to explore some of the design alternatives I mentioned earlier.
I started this thread to discuss using special rows to represent higher levels in a dimension.
I might still do an outrigger or two, but rather than snowflake extensively I'm going to explore some of the design alternatives I mentioned earlier.
I started this thread to discuss using special rows to represent higher levels in a dimension.
Last edited by VHF on Thu Aug 11, 2011 11:44 am; edited 1 time in total (Reason for editing : added link to other thread)
VHF- Posts : 236
Join date : 2009-04-28
Location : Wisconsin, US
Re: Snowflake to support different grains with BObj universe?
Microsoft SQL Server 2008 Standard Edition.
The production DW is running on a new dual 4-core CPU server with 32GB of RAM. The dev environment where I did the testing is a VM with 16GB allocated, so it is possible I might get different performance results in production.
The perfomance drop in my testing would actually be acceptable, but I don't want to snowflake heavily and bring the DW to its knees!
The production DW is running on a new dual 4-core CPU server with 32GB of RAM. The dev environment where I did the testing is a VM with 16GB allocated, so it is possible I might get different performance results in production.
The perfomance drop in my testing would actually be acceptable, but I don't want to snowflake heavily and bring the DW to its knees!
VHF- Posts : 236
Join date : 2009-04-28
Location : Wisconsin, US
Re: Snowflake to support different grains with BObj universe?
VHF wrote:Microsoft SQL Server 2008 Standard Edition.
The production DW is running on a new dual 4-core CPU server with 32GB of RAM. The dev environment where I did the testing is a VM with 16GB allocated, so it is possible I might get different performance results in production.
The perfomance drop in my testing would actually be acceptable, but I don't want to snowflake heavily and bring the DW to its knees!
Oh. I thought it may have been Oracle, as it is pretty sensitive to snowflaking. But, SQLServer has improved its query strategies against proper stars, so I'm not surprised to see some hit when using snowflakes. In the newer versions of SQLServer, when you do a star join, it internally generates bit vectors of the fact FKs to quickly isolate rows of interest. When you snowflake, it has to do traditional joins, two sets at a time.
Re: Snowflake to support different grains with BObj universe?
ngalemmo wrote:
Oh. I thought it may have been Oracle, as it is pretty sensitive to snowflaking.
@ngalemmo, would you, please, give us more details why do you decide that Oracle is pretty sensitive to snowflaking ? Thank you in advance!
hayrabedian- Posts : 7
Join date : 2011-04-01
Re: Snowflake to support different grains with BObj universe?
Oracle's optimizer will perform star joins if conditions are right. Bitmap indexes on the fact foreign keys and standard fact-to-dimension joins. When you snowflake it will not do this. Instead, it will perform more traditional relational joins, which is slower for large queries.
Re: Snowflake to support different grains with BObj universe?
I have to disagree with your statement.ngalemmo wrote:Oracle's optimizer will perform star joins if conditions are right. Bitmap indexes on the fact foreign keys and standard fact-to-dimension joins. When you snowflake it will not do this.
I assume you are mentioning the "star transformation" feature, when you are talking about "performing star joins".
You are absolutelly right, that Oracle Optimiser will use this transformation, only when some conditions are satisfied, like existence of bitmap indexes, foreign keys and so on..
But, these conditions are not related to a snowflaking. The star transformation is still possible, even in a snowflaking model. The Optimiser is smart enough to find the needed dimension keys from the the first phase of the optimisation, just by joining the dimension tables first and then using the result set to "merge" the bitmaps.
Again, I have to disagree..ngalemmo wrote:Instead, it will perform more traditional relational joins, which is slower for large queries.
The "traditional joins" are not slower for large queries, actually, the oposite is true in most of the cases. If the referenced result set (after applying of all the predicates) is of big cardinality (which is usually the case in an analitical queries), then the index access path (even through a bitmap index) would be much more slower than "traditional" joins (like hash join, which involves a full table scan of the fact table).
Please, correct me if I am wrong. Thanks in advance!
hayrabedian- Posts : 7
Join date : 2011-04-01
Similar topics
» Dimensional Model Vs Views as Structure for BOBJ Universe
» Handling new grains for an existing model
» Fact tables at different grains with measures in each
» Two Grains for Time
» Date Dimension at Various Grains
» Handling new grains for an existing model
» Fact tables at different grains with measures in each
» Two Grains for Time
» Date Dimension at Various Grains
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum