Best practice for creating tables that support trend analysis
5 posters
Page 1 of 1
Best practice for creating tables that support trend analysis
Hi,
I have designed a transaction fact table for Sales and Sales Order, now I am thinking of creating snapshot table , what would be the best practice for creating such a table ? At what level, the snapshot table can be created that could be prove useful ? Is it at a month, week or Quarter ?
Also, I am thinking of creating tables that support trend analysis. For example: Last year vs Current Year, Last week vs Current Week or any weeks or months for that matter. What would be the best practice for creating such trend tables ?
Thanks
Thirumalai
I have designed a transaction fact table for Sales and Sales Order, now I am thinking of creating snapshot table , what would be the best practice for creating such a table ? At what level, the snapshot table can be created that could be prove useful ? Is it at a month, week or Quarter ?
Also, I am thinking of creating tables that support trend analysis. For example: Last year vs Current Year, Last week vs Current Week or any weeks or months for that matter. What would be the best practice for creating such trend tables ?
Thanks
Thirumalai
pthirum- Posts : 8
Join date : 2014-11-30
Re: Best practice for creating tables that support trend analysis
To clarify on the trend analysis table, I am thinking on whether a column based or row based approach is better.
For example: I could design a fact table with
1. Month
2. Week1
3. Week2
4. Week3
5. Week4
6. Week5
Or I could do store data on a row by row basis, which approach is better ?
For example: I could design a fact table with
1. Month
2. Week1
3. Week2
4. Week3
5. Week4
6. Week5
Or I could do store data on a row by row basis, which approach is better ?
pthirum- Posts : 8
Join date : 2014-11-30
Re: Best practice for creating tables that support trend analysis
Snapshot: my first question would be why are thinking creating one if you do not why/when it would prove useful? If you don't have a clear requirement to do something it's probably better not to do it and focus on things you do have requirements for.
A snapshot is what it says it is - a snapshot at a point in time and so would normally be created at the end of a time period - end of day/week/month - showing the state of the system at that point in time?
BTW - you don't mean aggregate rather than snapshot do you?
Trend analysis: why do you want to create a fact table to do this? Your transaction fact table should be able to deliver trend analysis without any problems
Regards,
A snapshot is what it says it is - a snapshot at a point in time and so would normally be created at the end of a time period - end of day/week/month - showing the state of the system at that point in time?
BTW - you don't mean aggregate rather than snapshot do you?
Trend analysis: why do you want to create a fact table to do this? Your transaction fact table should be able to deliver trend analysis without any problems
Regards,
nick_white- Posts : 364
Join date : 2014-01-06
Location : London
Re: Best practice for creating tables that support trend analysis
The key to trend analysis is a robust date dimension. You don't need a snapshot.
The date dimension should contain chronologically assigned sequence numbers for each time period (day, week, month, quarter, year as well as fiscal variants of the same) for all time as well as within a year. This allows you to easily calculate which days you need for a particular reporting requirement. For example, last year would be the current day's year sequence - 1. The inter-year sequences would be used to locate the same time period in another year. You may also need inter-period sequences depending on the nature of reporting (day of week, week of month, week of quarter and so on.
The date dimension should contain chronologically assigned sequence numbers for each time period (day, week, month, quarter, year as well as fiscal variants of the same) for all time as well as within a year. This allows you to easily calculate which days you need for a particular reporting requirement. For example, last year would be the current day's year sequence - 1. The inter-year sequences would be used to locate the same time period in another year. You may also need inter-period sequences depending on the nature of reporting (day of week, week of month, week of quarter and so on.
Re: Best practice for creating tables that support trend analysis
Thanks people, The reason I am planning for a snapshot or an aggregate table is that it will be useful for data to be quickly retrieved from the datawarehouse. Its purely for performance gains that I am looking at creating a snapshot or an aggregate table.
pthirum- Posts : 8
Join date : 2014-11-30
Re: Best practice for creating tables that support trend analysis
Hi - snapshots do not normally help with performance, they are there to answer specific questions. For example, a business might want to know what its debt liability is at month end so you'd create a fact table that takes a 'snapshot' of the outstanding debt at each month end - this is not a summary of debt created during the month, just the debt at a point in time - which can be difficult to reconstruct from a transaction fact table.
An aggregate table does help performance - and that is almost the only reason for creating one. Which aggregates you could usefully create can be found by analysing the queries being executed against your DW - particularly the GROUP BY clauses. If you process a lot of queries that group by month then a monthly aggregate fact table is probably appropriate; if you have no queries that group by week there is little benefit in creating a weekly aggregate fact table.
Hope this helps?
An aggregate table does help performance - and that is almost the only reason for creating one. Which aggregates you could usefully create can be found by analysing the queries being executed against your DW - particularly the GROUP BY clauses. If you process a lot of queries that group by month then a monthly aggregate fact table is probably appropriate; if you have no queries that group by week there is little benefit in creating a weekly aggregate fact table.
Hope this helps?
nick_white- Posts : 364
Join date : 2014-01-06
Location : London
Re: Best practice for creating tables that support trend analysis
Hi ngalemmo, Can you please provide us with a sample of the data dimension you are talking about. This would give me a sense and a thorough understanding of what you are referring to. thanks for sharing your expertise.
BIDW- Posts : 25
Join date : 2015-01-18
Re: Best practice for creating tables that support trend analysis
Hi - in case it wasn't a typo on your part, ngalemmo said date dimension, not data dimension. If you google "date dimension" you'll find loads of examples
Regards,
Regards,
nick_white- Posts : 364
Join date : 2014-01-06
Location : London
Re: Best practice for creating tables that support trend analysis
To support trend analysis, one thing every date dimension should have is chronologically assigned sequence numbers for the various time periods: year, fiscal year, fiscal period, week, quarter, etc… For example, if you have a date dimension with 15 years of dates, the year sequence number would range from 1 to 15. Each date in the year has the same number. So if you need to get last year's data you would look for dates whose year sequence is one less than the current year. Same process to get last week, 3 months ago and so on. It makes it very easy to construct queries to compare two time periods. These sequence numbers would be generated based on business rules that define the period.
To make queries simpler, you could also have companion relative sequences which are recalculated each day. This process would take the current day's sequence values and subtract them from the target date's sequence numbers. The result would be stored in a new column. Current periods would have zero values, prior periods negative values and future periods would have positive values. To report last quarter, you would select a relative quarter value of -1.
You would also have the usual day of year and other period based counts so you can do things like as of same day last year.
To make queries simpler, you could also have companion relative sequences which are recalculated each day. This process would take the current day's sequence values and subtract them from the target date's sequence numbers. The result would be stored in a new column. Current periods would have zero values, prior periods negative values and future periods would have positive values. To report last quarter, you would select a relative quarter value of -1.
You would also have the usual day of year and other period based counts so you can do things like as of same day last year.
Re: Best practice for creating tables that support trend analysis
The best solution to go about trend analysis is to leverage OLAP cube that builds all the necessary aggregates (weekly, monthly, yearly etc.) for you once processed. MDX is the de facto cube query language that supports all sorts of functions involved in relative period calculations once you have a proper designed date dimension in place.
Alternatively, you may create your aggregates manually (in ETL), and leverage SQL's WINDOW/OVER functions (eg. LAG/LEAD) to pivot values from other periods into columns on current period row in your fact table. WINDOW functions can deliver relative period values very efficiently, outperforming any sub-query approaches, or worst of all, row by row approach.
Alternatively, you may create your aggregates manually (in ETL), and leverage SQL's WINDOW/OVER functions (eg. LAG/LEAD) to pivot values from other periods into columns on current period row in your fact table. WINDOW functions can deliver relative period values very efficiently, outperforming any sub-query approaches, or worst of all, row by row approach.
hang- Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia
Similar topics
» Best practice for versioning tables/views?
» Best practice for date attributes of dimension tables
» Best practise in creating fact tables for health care
» Storing Date Keys in dimension tables versus fact tables
» Using a dimension in multiple fact tables with different grain and support SCD
» Best practice for date attributes of dimension tables
» Best practise in creating fact tables for health care
» Storing Date Keys in dimension tables versus fact tables
» Using a dimension in multiple fact tables with different grain and support SCD
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum