Large number of late arriving facts
2 posters
Page 1 of 1
Large number of late arriving facts
I am loading a Star Schema from a data base that acts as the staging area. We are converting our source systems to a new transaction system. We are converting data to the new transaction systems in large chunks with conversion dates spread over 3 years. I am only loading data from the staging area only after it has been converted to the new transaction system. When converted, transaction system will contain 3 years of history. As a result, after each conversion, I get 3 years of data.
On the fact table, i want to create a clustered index on Paid Date (of the insurance claim). I am afraid that after a 3 year chunk of data gets converted, the load process will be brought to it's knees. I'd drop the index, load the 3 years worth of history and then rebuild the index. I'm afraid that rebuilding the index will bring the load process to it's knees.
I'm trying to figure out ways to facilitate the historical data. I was thinking about partitioning the table based on the Paid Date, which would limit the amount of data that had to be resorted at one time when the clustered index was rebuilt. I was also considering partitioning the table based on 2 factors - paid date on clients - the conversion schedule is based on clients and client data is a typical filter when querying the fact table.
Thoughts? How would I partition a fact table on 2 unrelated attributes?
On the fact table, i want to create a clustered index on Paid Date (of the insurance claim). I am afraid that after a 3 year chunk of data gets converted, the load process will be brought to it's knees. I'd drop the index, load the 3 years worth of history and then rebuild the index. I'm afraid that rebuilding the index will bring the load process to it's knees.
I'm trying to figure out ways to facilitate the historical data. I was thinking about partitioning the table based on the Paid Date, which would limit the amount of data that had to be resorted at one time when the clustered index was rebuilt. I was also considering partitioning the table based on 2 factors - paid date on clients - the conversion schedule is based on clients and client data is a typical filter when querying the fact table.
Thoughts? How would I partition a fact table on 2 unrelated attributes?
Jeff Smith- Posts : 471
Join date : 2009-02-03
Re: Large number of late arriving facts
I wouldn't focus on historical conversions. I would focus how the table is loaded and queried normally. Historical conversions are always painful which is why they are often done over the weekend.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Similar topics
» Late Arriving Facts
» alternate approaches for late arriving dimension attributes
» Part of fact information arives later
» Late arriving dim *change* - how to update the fact?
» Not quite late arriving dimensions but similar case....
» alternate approaches for late arriving dimension attributes
» Part of fact information arives later
» Late arriving dim *change* - how to update the fact?
» Not quite late arriving dimensions but similar case....
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum