Dimension with million of records - Performance on delivery
3 posters
Page 1 of 1
Dimension with million of records - Performance on delivery
Hello everyone,
I have a dimension table for Clients and we have million of clients. However, once we nedd to track changings in records, the deliverance is really slow!
Whould somebody suggest a way how to gain performance?
I have a dimension table for Clients and we have million of clients. However, once we nedd to track changings in records, the deliverance is really slow!
Whould somebody suggest a way how to gain performance?
Luiz Jonata- Posts : 1
Join date : 2011-05-18
Re: Dimension with million of records - Performance on delivery
Break the dimension down. Look at static versus time sensitive attributes. Can the time sensitive ones be broken out to junk dimensions and avoid creating a type 2 dimension? Are their a number of commonly used low cardinality attributes (static or otherwise) that can be placed in their own dimension?
Should you remove the attributes from the client dimension or keep them there, but still implement the other smaller dimensions depends on how you get your facts. If all you get from the fact feed is the client business key, you need to maintain a cohesive client dimension, but you do not need to expose all the attributes in that table to the user, you may also avoid the need to implement a type 2.
In such a scenario, when loading facts you would use the client dimension as a source for current attribute values, covered in the other junk dimensions, for the fact. You would then derive the appropriate FKs to the junk dimensions for the fact row. You will see significant performance improvement when most of the query filtering can be done against these much smaller dimension tables.
You may also consider implementing a second version of the client dimension that only contains attributes not covered in the other dimensions (or split the data between two tables (vertical paritioning)). You would still have the full client table in the background for loading purposes, but users would only see the smaller version for query purposes. This will help performance for queries that need attributes from the client dimension as the table would be narrower.
Should you remove the attributes from the client dimension or keep them there, but still implement the other smaller dimensions depends on how you get your facts. If all you get from the fact feed is the client business key, you need to maintain a cohesive client dimension, but you do not need to expose all the attributes in that table to the user, you may also avoid the need to implement a type 2.
In such a scenario, when loading facts you would use the client dimension as a source for current attribute values, covered in the other junk dimensions, for the fact. You would then derive the appropriate FKs to the junk dimensions for the fact row. You will see significant performance improvement when most of the query filtering can be done against these much smaller dimension tables.
You may also consider implementing a second version of the client dimension that only contains attributes not covered in the other dimensions (or split the data between two tables (vertical paritioning)). You would still have the full client table in the background for loading purposes, but users would only see the smaller version for query purposes. This will help performance for queries that need attributes from the client dimension as the table would be narrower.
Re: Dimension with million of records - Performance on delivery
There's also database tuning and hardware upgrade. Your indexing strategy could be bad, your hardware maybe insufficient, and your queries maybe bad as well. This is especially true if you are using a BI tool.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Similar topics
» Index Strategy on FACT Table with 300 Million records
» Dimension more records than fact
» Question on Deleting records from dimension tables
» Handling records in Fact when dimension is Type 2
» Grouping a subset of dimension records for a report
» Dimension more records than fact
» Question on Deleting records from dimension tables
» Handling records in Fact when dimension is Type 2
» Grouping a subset of dimension records for a report
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum