Natural Key in the fact table for ETL purpose ?
3 posters
Page 1 of 1
Natural Key in the fact table for ETL purpose ?
I am just wondering how common is to use a Natural Key of a grain in the fact table ? For Ex: If a grain of the fact table is 1 row per policy then I would like to have Policy Number in the fact table along with Policy Dimesion ID. I understand that Policy Number would be in the Policy Dimension but having this in the fact table makes the ETL much more efficient. Idea is to hide these keys from the users but use it for ETL.
One main reason is that when we don't have a Policy in the Policy Dimension for whatever reason but we got it from the Fact extract then we have to default the record to say -1 and then when we have that
Policy in Policy Dimension, we need to go back and update the fact record with the Policy number. If we don't have the Policy Number then we can't do that as there could be multiple -1's are there.
I guess another option is to just omit the records if the main dimension is missing it but is that the standard practice ?
How have you handled this scneario ?
Thanks
One main reason is that when we don't have a Policy in the Policy Dimension for whatever reason but we got it from the Fact extract then we have to default the record to say -1 and then when we have that
Policy in Policy Dimension, we need to go back and update the fact record with the Policy number. If we don't have the Policy Number then we can't do that as there could be multiple -1's are there.
I guess another option is to just omit the records if the main dimension is missing it but is that the standard practice ?
How have you handled this scneario ?
Thanks
VTK- Posts : 50
Join date : 2011-07-15
Re: Natural Key in the fact table for ETL purpose ?
No. Storing extraneous columns in a fact table significantly degrades the performance of the table in queries. A typical query almost always winds up doing a scan of the fact table. Extra columns means less rows retrieved per block meaning more reads and longer query times. Disk I/O is the slowest operation in any query.
If you receive a fact with a policy number that does not exist, infer a new policy dimension row using that number and reference that row.
If you receive a fact with a policy number that does not exist, infer a new policy dimension row using that number and reference that row.
Re: Natural Key in the fact table for ETL purpose ?
Are you saying that we need to maintain a cross reference table where we store the Natural key and an ID and use the ID in the fact table as a SK.
Later, when we have that in the dimension, we have to lookup to this table and then take the assigned SK and replace it with actual SK.
Thanks
Later, when we have that in the dimension, we have to lookup to this table and then take the assigned SK and replace it with actual SK.
Thanks
VTK- Posts : 50
Join date : 2011-07-15
Re: Natural Key in the fact table for ETL purpose ?
The dimension is the cross reference as it contains both the natural key and the surrogate key. When you infer a dimension you create a new dimension row with a surrogate key and the natural key. When the dimension information finally shows up in the normal load, it updates the other attributes.
Re: Natural Key in the fact table for ETL purpose ?
Another thing to consider if using the -1 value as the surrogate key to connect fact records to dimensions is, if that SK is used in the fact table as part of the primary key for those records, then you will eventually get a unique constraint violation error (database level) and then the process will fail. By assigning (infering) a dimension record, you avoid that.
svmayor- Posts : 3
Join date : 2013-03-06
Similar topics
» Integrating new fact table which has one to many relationship with the main fact table in existing star schema
» Business keys or Natural keys in the Fact table
» Can a FACT table contains Natural Primary keys and text columns
» 'Routing' the grain of the fact table to multpile members of multiple dimensions causes the fact table to 'explode'
» How to create fact table with measures derived from comparing two fact table rows
» Business keys or Natural keys in the Fact table
» Can a FACT table contains Natural Primary keys and text columns
» 'Routing' the grain of the fact table to multpile members of multiple dimensions causes the fact table to 'explode'
» How to create fact table with measures derived from comparing two fact table rows
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum