Star Schema vs All in one table
+2
ngalemmo
daredevil
6 posters
Page 1 of 1
Star Schema vs All in one table
Hi,
I have a scenario where we have dimensional information and the facts in one table rather than a traditional break up into facts and dimensions. What could be the problem i could run into with having all information in one table. Thanks.
I have a scenario where we have dimensional information and the facts in one table rather than a traditional break up into facts and dimensions. What could be the problem i could run into with having all information in one table. Thanks.
daredevil- Posts : 9
Join date : 2010-08-05
Re: Star Schema vs All in one table
Poor performance, difficulty updating attributes, difficult to integrate with other facts...
Re: Star Schema vs All in one table
I agree with the last two, but, was wondering how it would pull down performance. If i have everything in one table then i wouldn't have to make any joins and things should be faster. Prompts though would be slower, but, report would be faster. I might be wrong and missed something here. Thanks!!
daredevil- Posts : 9
Join date : 2010-08-05
It's not just the joins
What DB are you running on? It probably doesn't matter, because it will still run slower. I've done a few "fix-it" jobs where people have tried your approach and the performance was terrible for both ETL and querying.
Why? Well if your table is very wide - which is probably the case because you have all of your dimensional attributes in it - then it takes more disk i/o for each read because you will get less rows per page. So if for example, you want to sum your sales figures and group by location, then you will have to read many more pages using your method than you would using a proper star-schema.
It's not just the joins, it's the number of reads that slow down your query performance. Having everything in one table increases your reads. Don't do it.
Why? Well if your table is very wide - which is probably the case because you have all of your dimensional attributes in it - then it takes more disk i/o for each read because you will get less rows per page. So if for example, you want to sum your sales figures and group by location, then you will have to read many more pages using your method than you would using a proper star-schema.
It's not just the joins, it's the number of reads that slow down your query performance. Having everything in one table increases your reads. Don't do it.
Database assumption
The previous posts sum up the issues well, if you are using a relational database. On the other hand, if you are using a column-oriented database like Sybase IQ, or C-Store, many of these issues go away in One Big Table scenario. I would still use separate tables in my ETL area in order to track slowly changing attributes, do lookups, etc., but it may be OK to put it all in one table in the user database. I don't think even the Sybase folks recommend the One big Table approach, but I'm just making sure we consider all the options here. ;-)
See Ralph's design tip Columnar Databases: Game Changers for DW/BI Deployment? for more info.
--Warren
See Ralph's design tip Columnar Databases: Game Changers for DW/BI Deployment? for more info.
--Warren
warrent- Posts : 41
Join date : 2008-08-18
Re: Star Schema vs All in one table
Interestingly, the next version of SQL Server, SQL Server 11, will have column store indexes. Using the same table without the column store index, an aggregation of over 1.4 Billions records returned in 501 seconds. Using a Column Store Index it returned in 1 second. Amazing stuff.
Re: Star Schema vs All in one table
Basically, one big table approach is against the principle of both relational and dimensional modelling. Interestingly many OLTP systems have created this kind of quick and dirty monster transaction table in order to cater for reporting purposes. I think one of the up-front challenges in designing a proper BI system is to remodel the legacy system based on the best practice modelling techniques, and to us, it’s Kimball’s method.
Dimensional modelling is about denormalising dimensions and normalising facts, and it’s everything to do with performance. When the data is massive in DW, the performance has more to do with physical fact table size, as John pointed out, and less to do with joins with dimensions as dimensions are comparatively tiny anyway, unless they are heavily snowflaked.
Dimensional modelling is about denormalising dimensions and normalising facts, and it’s everything to do with performance. When the data is massive in DW, the performance has more to do with physical fact table size, as John pointed out, and less to do with joins with dimensions as dimensions are comparatively tiny anyway, unless they are heavily snowflaked.
hang- Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia
Re: Star Schema vs All in one table
Dimensional modelling is about denormalising dimensions and normalising facts, and it’s everything to do with performance
What do you mean with "normalising facts" ?
umutiscan- Posts : 51
Join date : 2010-11-26
Age : 44
Location : Istanbul, Turkey
Re: Star Schema vs All in one table
By facts I referred to fact tables. So the fact tables should be normalized by replacing all the repeating groups(attributes) with dimension keys.
hang- Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia
Re: Star Schema vs All in one table
That's ok, thanks for your response.
umutiscan- Posts : 51
Join date : 2010-11-26
Age : 44
Location : Istanbul, Turkey
Re: Star Schema vs All in one table
Your comments are still valid in the case of exadata HCC technology?
When I have a look at the explain plan, it looks more costly but I can not see any problem in query execution time.
When I have a look at the explain plan, it looks more costly but I can not see any problem in query execution time.
umutiscan- Posts : 51
Join date : 2010-11-26
Age : 44
Location : Istanbul, Turkey
Similar topics
» Flat Table vs star schema
» Huge table meets star schema
» Reasons to use a 3NF design over a Dimensional Model design for an EDW - Discussion
» Integrating new fact table which has one to many relationship with the main fact table in existing star schema
» Snowflake or Star Schema?
» Huge table meets star schema
» Reasons to use a 3NF design over a Dimensional Model design for an EDW - Discussion
» Integrating new fact table which has one to many relationship with the main fact table in existing star schema
» Snowflake or Star Schema?
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum