Real time datawarehousing
+6
clgentr
DanoSmith
Kateryna
BoxesAndLines
Jeff Smith
Rik Declercq
10 posters
Page 1 of 1
Real time datawarehousing
Hey,
There is a lot of buzz going on regarding real time datawarehousing. But how do you actually do this ? How can you implement conformed dimensions and conformed facts in real time ? How can you consolidate data real time ? Vendors promise applications that can make this happen but how is this possible ? Does anybody do this and if yes, how ?
Best regards,
Rik
There is a lot of buzz going on regarding real time datawarehousing. But how do you actually do this ? How can you implement conformed dimensions and conformed facts in real time ? How can you consolidate data real time ? Vendors promise applications that can make this happen but how is this possible ? Does anybody do this and if yes, how ?
Best regards,
Rik
Rik Declercq- Posts : 10
Join date : 2009-02-03
It's magic
With smoke and mirrors.
I think true real-time data warehousing is impossible. Transaction systems are constantly changing - 1 second after you query data it's changed. I think that it's possible to have close to real time DW on a a few key pieces of data as long as the volume is fairly small. You could probably pick up only data that has changed, but I would think it would be as strain on the transactional system.
I've heard of companies that load twice a day.
We had someone that was talking about realtime data - until some told him that are systems run in batch overnight.
You can't even get real time quotes on mutual funds, which is one of the only things that I can think of in which changes in a short time can have a huge impact.
I think true real-time data warehousing is impossible. Transaction systems are constantly changing - 1 second after you query data it's changed. I think that it's possible to have close to real time DW on a a few key pieces of data as long as the volume is fairly small. You could probably pick up only data that has changed, but I would think it would be as strain on the transactional system.
I've heard of companies that load twice a day.
We had someone that was talking about realtime data - until some told him that are systems run in batch overnight.
You can't even get real time quotes on mutual funds, which is one of the only things that I can think of in which changes in a short time can have a huge impact.
Jeff Smith- Posts : 471
Join date : 2009-02-03
Re: Real time datawarehousing
LOL. I always get a chuckle when I read about RDW. Isn't real time just the application database? It would probably be easier to consolidate the front end applications into a unified database than try to implement a real time DW. Fortunately, my users are happy with business plus 1 day.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Real time datawarehousing
We have something updated every 3 hours or 2-3 times a day. But it's a small portion of data and there are not a complicated metric calculations.
Kateryna- Posts : 5
Join date : 2009-05-05
Where real time makes sense
Everyone throwing "real time data warehousing" terms around - is just "me too" in many cases.
We should really use "near time", unless it IS the OLTP system.
But think about a manufacturing process. Perhaps when the variance on a Widget length goes from 1 inch to 1.1 inch.
This may mean that a machine cutting blade is dulling, and could break and take a production line down [very costly]. If the information were to be reported in near time, it may mean that the machine can be down for a few minutes [planned] and the blade replaced and minimal work stoppage down the line.
Keep this in mind - if you can describe a business process improvement that can be obtained by "near time" DW, then you need it.
Otherwise it really is just "me too".
Dano Smith
We should really use "near time", unless it IS the OLTP system.
But think about a manufacturing process. Perhaps when the variance on a Widget length goes from 1 inch to 1.1 inch.
This may mean that a machine cutting blade is dulling, and could break and take a production line down [very costly]. If the information were to be reported in near time, it may mean that the machine can be down for a few minutes [planned] and the blade replaced and minimal work stoppage down the line.
Keep this in mind - if you can describe a business process improvement that can be obtained by "near time" DW, then you need it.
Otherwise it really is just "me too".
Dano Smith
DanoSmith- Posts : 1
Join date : 2009-07-30
Age : 54
Location : Dayton, OH
Near Real Time Data
Can anyone tell me how to handle loading data to a fact table several times a day when dimensions are only loaded nightly? We run into date logic problems with this. The fact table has a dimension ID that relates to the current record when the fact was loaded. A batch job that run nightly then updates the dimension with new records that are effective dated for the same day. Now my fact table dimensions are tied to the wrong dimension record actually the day before.
clgentr
clgentr
clgentr- Posts : 1
Join date : 2009-08-21
Near Real Time Data Warehouse
I am designing a near real time datawarehouse using a set of conformed dimensions (business measures) and related fact tables as the business requirements come in. My source system has been modelled and implemented to manage the OLTP data so that we can tell the last update date and if the row has been marked as deleted or not. This allows me to perform accurate extraction. After reading the Chapter 11 of the DW ETL Toolkit book I feel that using a separate fact table to capture the 'trickle in' data which will be purged nightly once its data is merged into the main fact table. I will use a view to pull these two data sets together so Cognos sees only one 'fact' table....This is all good and I have the data flow figured out. It is the dimension data that has me baffled. I could create the same scenario with the 'daily - realtime' version of the dimension table and have it capture the 'trickle in' data throughout the day and merge it in the evening but how do I ensure that my facts are synchronized with the correct dimension id throughout the day.....
CHERPOTT- Posts : 2
Join date : 2009-11-10
Re: Real time datawarehousing
I don't recall why Ralph suggests the 'day' table for realtime feeds (I don't think he explained it in the book), but it is my opinion the reason you want to do it is so that the transaction states and index maintenance doesn't impact data visibility and query performance. All of which is due to the fact that almost all the data volume is in the facts.
Dimensions, on the other hand, tend to update relatively infrequently, and how you handle them will depend on the business requriements. Assuming you stick with one dimension table, you must certainly need to add new rows when needed based on the facts coming in. If it is a type 1 dimension, it is a matter of when you want attribute changes to become visible. If the business wants real time updating, so be it. Simply update in a one-of transaction state, othewise hold updates off to the end of day. If it is a type 2 dimension, it doesn't matter since any updates do not affect old rows.
Dimension tables tend to be smaller and do not have as extensive an index burden as fact tables do. There is no need to hold off updating them in a real time feed nor is there a need to have a 'day' version. But, then again, there are probably exceptions.
Dimensions, on the other hand, tend to update relatively infrequently, and how you handle them will depend on the business requriements. Assuming you stick with one dimension table, you must certainly need to add new rows when needed based on the facts coming in. If it is a type 1 dimension, it is a matter of when you want attribute changes to become visible. If the business wants real time updating, so be it. Simply update in a one-of transaction state, othewise hold updates off to the end of day. If it is a type 2 dimension, it doesn't matter since any updates do not affect old rows.
Dimension tables tend to be smaller and do not have as extensive an index burden as fact tables do. There is no need to hold off updating them in a real time feed nor is there a need to have a 'day' version. But, then again, there are probably exceptions.
Real Time Data warehousing
Thank you for your quick response but perhaps I did not ask my question properly. If I want to bring in type 2 dimension data 'real time' from our source systems and load them directly (or not, if there is no change) into the conformed dimension tables and I have designed my fact table as a 'set' (meaning there is the main fact table that gets loaded nightly from the 'real time fact table', and a real time table receiving the trickle in data throughout the day) I assume, as my 'real time fact table' is loaded throughout the day the dimension ids will be pulled from the most current record in the dimension table and linked to this 'real time' fact. When I load the main fact table in the evening I would not change any of the associated dimension ids loaded throughout the day. Is this the best practice for this type of design?
CHERPOTT- Posts : 2
Join date : 2009-11-10
Re: Real time datawarehousing
Yes. If you are maintaing the dimension in real time as well. your facts would naturally be keyed by the current state of the dimension table. At the end of the day you should be able to simply append that day's facts to the main fact table.
Re: Real time datawarehousing
I think true real-time data warehousing is impossible. Transaction systems are constantly changing - 1 second after you query data it's changed. I think that it's possible to have close toa a few key pieces of dataas long as the volume is fairly small. You could probably pick up only data that has changed, but I would think it would be as strain on the transactional system.
___________
watson
___________
watson
Cock145- Posts : 1
Join date : 2015-01-17
Re: Real time datawarehousing
IMO the real-time data warehousing will be possible when all of its components would be able to fit into RAM memory of a logical machine.
If you would be following REPRODUCIBLE benchmarks of transformation tools and DB drivers you will be quite up-to-date in that matter.
If you would be following REPRODUCIBLE benchmarks of transformation tools and DB drivers you will be quite up-to-date in that matter.
jangorecki- Posts : 2
Join date : 2016-01-14
Similar topics
» Near Real time ETL and ETL cloud
» Real time datawarehouse replacing operational reporting
» ETL Design Problems for Real time
» Real time DWH - Choosing Right Technologies
» Buildiong Star Schema in real time ?
» Real time datawarehouse replacing operational reporting
» ETL Design Problems for Real time
» Real time DWH - Choosing Right Technologies
» Buildiong Star Schema in real time ?
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum