Fact table, strings and row size
4 posters
Page 1 of 1
Fact table, strings and row size
I'm very new to dimensional modelling, but I'm looking at a design made by someone very experienced.
That design has in each fact table a set of metadata attributes, e.g. "data provider", along with denormalised versions of things are really dimensions but which no one is ever going to query and make the design too complex to have as dimensions, e.g. currency code, currency description.
So an example row would be:
"RETAIL", "Import from retail transactions", "123.45", "USD", "United State Dollars", 11/22/2010
The upshot of this is that each fact table row is over 1kB, but the actual numeric facts only amount to about 100B. To me that seems madness, meaning I have to spend a huge amount of money on disc bandwidth to feed the processors during table scans (this is SQL Server, so no column oriented compression).
Is this normal/usual/correct?
That design has in each fact table a set of metadata attributes, e.g. "data provider", along with denormalised versions of things are really dimensions but which no one is ever going to query and make the design too complex to have as dimensions, e.g. currency code, currency description.
So an example row would be:
"RETAIL", "Import from retail transactions", "123.45", "USD", "United State Dollars", 11/22/2010
The upshot of this is that each fact table row is over 1kB, but the actual numeric facts only amount to about 100B. To me that seems madness, meaning I have to spend a huge amount of money on disc bandwidth to feed the processors during table scans (this is SQL Server, so no column oriented compression).
Is this normal/usual/correct?
adrianw- Posts : 3
Join date : 2011-01-26
Re: Fact table, strings and row size
It's terrible. It wouldn't matter if it was a columnar database, but in a row-based database like SQL Server it's extremely bad practice. Essentially if you run a query for a sum of amount, you have to go through more pages than you need to, adversely affecting performance.
Rip them out.
Rip them out.
Re: Fact table, strings and row size
Yep... ditch the text, use dimensions.
I have seen this before, and the common complaint is that queries run for a very long time. Most of the time this type of thing is done by ex 3NF modelers who believe the issue is reducing the number of joins. Actually, the real issue is how you join. A properly designed star schema on a database system that supports star joins will outperform big flat tables any day.
I have seen this before, and the common complaint is that queries run for a very long time. Most of the time this type of thing is done by ex 3NF modelers who believe the issue is reducing the number of joins. Actually, the real issue is how you join. A properly designed star schema on a database system that supports star joins will outperform big flat tables any day.
Re: Fact table, strings and row size
The thing that bothers me is that the person who designed has an impeccable industry reputation and decades of experience.
Maybe I misled with my example, there are also dimensions (and copies of the dimension values) involved so it is more like:
"RETAIL", "Import from retail transactions", customerid, "JOHN SMITH", customeraddressid, productid, 123.45, "USD", "United State Dollars", 11/22/2010
Maybe I misled with my example, there are also dimensions (and copies of the dimension values) involved so it is more like:
"RETAIL", "Import from retail transactions", customerid, "JOHN SMITH", customeraddressid, productid, 123.45, "USD", "United State Dollars", 11/22/2010
Last edited by adrianw on Thu Jan 27, 2011 3:26 pm; edited 1 time in total
adrianw- Posts : 3
Join date : 2011-01-26
Re: Fact table, strings and row size
Yep, I've seen it before as well. Create a dimensional model, but denormalize all the dimension attributes to the fact table to "improve performance". It will actually degrade performance over time. Hopefully none of the denormalized columns are from type 2 dimensions or you'll really be in a world of hurt.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Fact table, strings and row size
In this instance the denomalising is not to improve performance but to improve readability of the fact table.
adrianw- Posts : 3
Join date : 2011-01-26
Re: Fact table, strings and row size
If you want to improve readability then use a view. These additional attributes will adversely impact performance.
Re: Fact table, strings and row size
adrianw wrote:In this instance the denomalising is not to improve performance but to improve readability of the fact table.
This doesn't make sense to me. Why wouldn't you just do a join to pull the columns in that you want to see?
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Fact table, strings and row size
Bottom line, its a bad design. My comment is not intended to vilify the person that did it. Sometimes, particularly if you are a consultant, have to compromise designs to suit the client. Politics often drive decisions, not technical purity.
Similar topics
» Dimension same size as Fact
» Design a snapshot Fact table in data mart from compliance-enable fact table in enterprise data warehouse
» 'Routing' the grain of the fact table to multpile members of multiple dimensions causes the fact table to 'explode'
» Integrating new fact table which has one to many relationship with the main fact table in existing star schema
» How to create fact table with measures derived from comparing two fact table rows
» Design a snapshot Fact table in data mart from compliance-enable fact table in enterprise data warehouse
» 'Routing' the grain of the fact table to multpile members of multiple dimensions causes the fact table to 'explode'
» Integrating new fact table which has one to many relationship with the main fact table in existing star schema
» How to create fact table with measures derived from comparing two fact table rows
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum