Account attributes in separate dimensions
3 posters
Page 1 of 1
Account attributes in separate dimensions
An account can have many attributes such as the account type, marketing category, status. There are about 10 in total. Each of the attributes is sourced from separate reference tables in the OLTP that contain (for example):
- account type code
- account type description
- the date this record was activated
- the date this record ended
Initially we flattened all of this out in the account dimension showing the current values (scd1). The issue was that all of the descriptions and codes made the dimension quite large and performance running queries was slow.
Next we put each of them into their own dimensions just as they are represented in the source. This meant we could put the account type surrogate key into the aggregate fact table for must faster queries rather than have to navigate the larger account dimension (which would never be in the aggregate). Users are now happier (until they want the account!) but it has never felt right from a modelling perspective and tends to mean the fact table contains a lot of keys.
We are due to put the facts and dimensions into SSAS MOLAP cubes to be accessed by Business Objects (rather than BO query them directly). Assuming performance is improved I'm thinking of reverting back to how we initially starting and either flattening into the account dimension or at least store the surrogate keys in it.
Any views?
- account type code
- account type description
- the date this record was activated
- the date this record ended
Initially we flattened all of this out in the account dimension showing the current values (scd1). The issue was that all of the descriptions and codes made the dimension quite large and performance running queries was slow.
Next we put each of them into their own dimensions just as they are represented in the source. This meant we could put the account type surrogate key into the aggregate fact table for must faster queries rather than have to navigate the larger account dimension (which would never be in the aggregate). Users are now happier (until they want the account!) but it has never felt right from a modelling perspective and tends to mean the fact table contains a lot of keys.
We are due to put the facts and dimensions into SSAS MOLAP cubes to be accessed by Business Objects (rather than BO query them directly). Assuming performance is improved I'm thinking of reverting back to how we initially starting and either flattening into the account dimension or at least store the surrogate keys in it.
Any views?
Willow- Posts : 5
Join date : 2012-10-24
Re: Account attributes in separate dimensions
Hi Willow
Can you define "quite large" and "slow"?
What flavour of database is this stored in?
Are there indexes on the Fact FK and Dimension PK? Are there indexes on the other dimension attributes?
Mike
Can you define "quite large" and "slow"?
What flavour of database is this stored in?
Are there indexes on the Fact FK and Dimension PK? Are there indexes on the other dimension attributes?
Mike
Re: Account attributes in separate dimensions
Hi Mike,
The account dimension contains 6m records and is stored in SQL Server 2008 R2. The fact tables are anything from a few million to about 700 million records.
There is a clustered index on the PK (the surrogate) of the dimension. We don't have indexes on the FK's in the fact table - we could look at this but I'd need to check the overhead on maintaining the indexes as data changes / inserts.
Our universes are meant to be 'self-serve' so in theory there would be a high number of candidates for indexes on individual attributes within the dimension (arguably all of them)
The answer in the past has been to move attributes into their own dimensions and then put those keys into the aggregate fact tables - but there are now a lot of them. I suspect the answer may be to put everything (aggregate and detail facts) into a cube - at least then the performance will be consistent. The cube might be enormous though!
Regards
The account dimension contains 6m records and is stored in SQL Server 2008 R2. The fact tables are anything from a few million to about 700 million records.
There is a clustered index on the PK (the surrogate) of the dimension. We don't have indexes on the FK's in the fact table - we could look at this but I'd need to check the overhead on maintaining the indexes as data changes / inserts.
Our universes are meant to be 'self-serve' so in theory there would be a high number of candidates for indexes on individual attributes within the dimension (arguably all of them)
The answer in the past has been to move attributes into their own dimensions and then put those keys into the aggregate fact tables - but there are now a lot of them. I suspect the answer may be to put everything (aggregate and detail facts) into a cube - at least then the performance will be consistent. The cube might be enormous though!
Regards
Willow- Posts : 5
Join date : 2012-10-24
Re: Account attributes in separate dimensions
Hi Willow
I'd say indexes on the FK's in the fact tables would help your queries a lot - otherwise you are probably incurring table scans of your 700m rows every time.
Indexes on individual dimension attributes could be limited to the most frequently used. I prefer single-column indexes as they are more broadly useful than multi-column indexes.
I love SSAS but it is a lot more work to define and maintain than a few indexes. The SSAS build process will probably benefit from some of those indexes anyway. It's also going to be easier than changing your schema, ETL, universe, reports etc.
Good luck!
Mike
I'd say indexes on the FK's in the fact tables would help your queries a lot - otherwise you are probably incurring table scans of your 700m rows every time.
Indexes on individual dimension attributes could be limited to the most frequently used. I prefer single-column indexes as they are more broadly useful than multi-column indexes.
I love SSAS but it is a lot more work to define and maintain than a few indexes. The SSAS build process will probably benefit from some of those indexes anyway. It's also going to be easier than changing your schema, ETL, universe, reports etc.
Good luck!
Mike
Re: Account attributes in separate dimensions
Is your fact table partitioned? If it is not, it should be. This will address your index maintenance issue all improve query performance.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Account attributes in separate dimensions
Yup, partitioned by month. Now getting the users to actually include the some kind of month filter in order to make use of the partition is another thing....
We will look at adding indexes to the FKs in the fact - thanks Mike.
Reading up, I think the mini-dimension approach we have currently adopted is fine. The only thing we could perhaps do is consolidate some of the attributes that have low cardinality. However I'm not sure how we would do this based on our source data. As an example, the source data is stored as follows:
Customer type table
Customer type code
Customer type desc
Start Date
Stop Date
Customer segment table
Segment code
Segment desc
Start Date
Stop Date
Getting all the combinations of customer type and segment is OK but how do we deal with the dates which may overlap or have gaps, etc.?
We will look at adding indexes to the FKs in the fact - thanks Mike.
Reading up, I think the mini-dimension approach we have currently adopted is fine. The only thing we could perhaps do is consolidate some of the attributes that have low cardinality. However I'm not sure how we would do this based on our source data. As an example, the source data is stored as follows:
Customer type table
Customer type code
Customer type desc
Start Date
Stop Date
Customer segment table
Segment code
Segment desc
Start Date
Stop Date
Getting all the combinations of customer type and segment is OK but how do we deal with the dates which may overlap or have gaps, etc.?
Willow- Posts : 5
Join date : 2012-10-24
Similar topics
» Model with Attributes Based on Two Separate Dimensions
» Correlated - Separate Dimensions OR Single Dimensions ?
» Design Tip #142 - Building Bridges Dilemma with Diagnosis - additional account specific attributes
» Customer and Account dimensions
» A single Dimension table Or separate the Dimensions?
» Correlated - Separate Dimensions OR Single Dimensions ?
» Design Tip #142 - Building Bridges Dilemma with Diagnosis - additional account specific attributes
» Customer and Account dimensions
» A single Dimension table Or separate the Dimensions?
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum