Kimball Forum
Would you like to react to this message? Create an account in a few clicks or log in to continue.

Fix Orphaned Dim Keys in Facts - Fastest

2 posters

Go down

Fix Orphaned Dim Keys in Facts - Fastest  Empty Fix Orphaned Dim Keys in Facts - Fastest

Post  nlulla Thu May 24, 2012 2:12 pm

I have about 30 Dimensions Joining to my FactTable. I assign -3 as dimkey (also store bk_naturalkey) in the fact if the dimension doesn't exist but the fact does. The requirement is to try to fix this orphaned key during the daily morning load for the next 7 days if it arrives within a weeks time. I use below code to perform this action. but i will have to write the below almost 30 times for 30 dimensions keys in fact seprately.

SET t.DimEmployeeID = s.DimEmployeeID
dbo.FactTransaction AS t WITH(NOLOCK)
dbo.DimEmployee AS s WITH(NOLOCK)
ON t.BK_EmployeeNumber = s.EmployeeNumber
AND t.DimEmployeeID = -3
AND t.[DimTransactionDateKey] BETWEEN (@Today-7) AND @Today
AND t.[DimTransactionDateKey] BETWEEN s.RowStartDate AND s.RowEndDate

Is there a better way then repeating above for 30 times ?


Posts : 1
Join date : 2012-05-24

Back to top Go down

Fix Orphaned Dim Keys in Facts - Fastest  Empty Re: Fix Orphaned Dim Keys in Facts - Fastest

Post  ngalemmo Thu May 24, 2012 3:05 pm

Looking at your code, what I can gather from it is:

1. You are storing natural keys in the fact table
2. You are rekeying facts because you could not find that natural key when you loaded the facts in the first place.

Best practice says you don't store natural keys in the fact table and you don't rekey facts if you can avoid it. Rekeying fact foreign keys should be an exceptional occurance.

The mistake you are making is using a bogus FK (-3) instead of inferring dimension rows. When you are loading facts and you find a natural key that does not exist in the dimension table, you should infer the dimension row... in other words create a row in the dimension table and assign it the natural key. This will give you a proper surrogate key for that instance of a natural key. Later, when the source dimension data catches up, you update the dimension row. There is no need to rekey facts because they already reference the correct row.

End result is nothing is rekeyed and dimensional information gets to where it needs to go through the normal update cycle. No special cases, no special processes.

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

Back to top Go down

Back to top

- Similar topics

Permissions in this forum:
You cannot reply to topics in this forum