How to share attributes between conformed dimension entities
3 posters
Page 1 of 1
How to share attributes between conformed dimension entities
Hi,
I'm designing a Staff dimension. I want to use it to conform entities across many source systems. I'm including a source_system column, and every new Staff code is loaded into the dimension table (along with any useful attributes relating to the staff member) as soon as it is discovered by a data extract. (I'm using a "Readsert" Lookup in SSIS). We then rely on Data Stewards to enter a reporting name which conforms that row to the other rows for the same individual staff member. Until they do that, the system's business key code is available for reporting.
My question is; when one system has a useful attribute, e.g. working_status, but other systems don't hold it, how best to ensure that the code is propagated into the other rows in the dimension for that staff member, or reported against other rows?
Many thanks in advance,
Al
I'm designing a Staff dimension. I want to use it to conform entities across many source systems. I'm including a source_system column, and every new Staff code is loaded into the dimension table (along with any useful attributes relating to the staff member) as soon as it is discovered by a data extract. (I'm using a "Readsert" Lookup in SSIS). We then rely on Data Stewards to enter a reporting name which conforms that row to the other rows for the same individual staff member. Until they do that, the system's business key code is available for reporting.
My question is; when one system has a useful attribute, e.g. working_status, but other systems don't hold it, how best to ensure that the code is propagated into the other rows in the dimension for that staff member, or reported against other rows?
Many thanks in advance,
Al
Last edited by Al Wood on Wed May 25, 2011 9:30 am; edited 1 time in total (Reason for editing : Clarity)
Al Wood- Posts : 46
Join date : 2010-12-08
Re: How to share attributes between conformed dimension entities
I'm confused. What are you propogating to other rows? You update the latest row representing a staff member and then you're done.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: How to share attributes between conformed dimension entities
There will be several rows per staff member, one for each source system where they have a code.
Al Wood- Posts : 46
Join date : 2010-12-08
Re: How to share attributes between conformed dimension entities
The table needs an additional key, I call it the 'update key' which represents a unique staff member. Once you have figured out who are the same (based on a cross reference or something like it), you populate those rows with the same key value and use that key to propagate common values to all rows.
But what if...
Thanks for the reply. I was thinking I'd use a Reporting_Name field, which sounds the same as your Update_Key. We would ask the Data Stewards to agree to make them the same so we would have:
----System---- | NaturalKey | Reporting_Name | Gender
SourceSystem1 | AS | A. Smith | Unknown
SourceSystem2 | Smith | A. Smith | Unknown
SourceSystem3 | ASMITH | A. Smith | Unknown
Then when we load SourceSystem4's data, we have:-
SourceSystem1 | AS | A. Smith | Unknown
SourceSystem2 | Smith | A. Smith | Unknown
SourceSystem3 | ASMITH | A. Smith | Unknown
SourceSystem4 | smith a | Unknown | Female
the DS for that system gets an email and soon we have: -
SourceSystem1 | AS | A. Smith | Unknown
SourceSystem2 | Smith | A. Smith | Unknown
SourceSystem3 | ASMITH | A. Smith | Unknown
SourceSystem4 | smith a | A. Smith | Female
then the db runs an update and we get :-
SourceSystem1 | AS | A. Smith | Female
SourceSystem2 | Smith | A. Smith | Female
SourceSystem3 | ASMITH | A. Smith | Female
SourceSystem4 | smith a | A. Smith | Female
then the SourceSystem4 DS realises that Amy Smith is not the same person as Al Smith,
SourceSystem1 | AS | A. Smith | Female
SourceSystem2 | Smith | A. Smith | Female
SourceSystem3 | ASMITH | A. Smith | Female
SourceSystem4 | smith a | Amy Smith | Female
So how do I undo the gender assignment for Al Smith!
----System---- | NaturalKey | Reporting_Name | Gender
SourceSystem1 | AS | A. Smith | Unknown
SourceSystem2 | Smith | A. Smith | Unknown
SourceSystem3 | ASMITH | A. Smith | Unknown
Then when we load SourceSystem4's data, we have:-
SourceSystem1 | AS | A. Smith | Unknown
SourceSystem2 | Smith | A. Smith | Unknown
SourceSystem3 | ASMITH | A. Smith | Unknown
SourceSystem4 | smith a | Unknown | Female
the DS for that system gets an email and soon we have: -
SourceSystem1 | AS | A. Smith | Unknown
SourceSystem2 | Smith | A. Smith | Unknown
SourceSystem3 | ASMITH | A. Smith | Unknown
SourceSystem4 | smith a | A. Smith | Female
then the db runs an update and we get :-
SourceSystem1 | AS | A. Smith | Female
SourceSystem2 | Smith | A. Smith | Female
SourceSystem3 | ASMITH | A. Smith | Female
SourceSystem4 | smith a | A. Smith | Female
then the SourceSystem4 DS realises that Amy Smith is not the same person as Al Smith,
SourceSystem1 | AS | A. Smith | Female
SourceSystem2 | Smith | A. Smith | Female
SourceSystem3 | ASMITH | A. Smith | Female
SourceSystem4 | smith a | Amy Smith | Female
So how do I undo the gender assignment for Al Smith!
Al Wood- Posts : 46
Join date : 2010-12-08
Re: How to share attributes between conformed dimension entities
Are you manually trying to do this or are you using a data quality toolset? The DQ toolset will manage this for you. I would also approach this slightly differently. I would not propogate information across all sources. I would build a golden record that combined all sources. In your case, the one record would break into two records once enough information was provided to distinguish one customer from another.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
I considered that, but
I have to avoid doing it manually, too much work, and I guess I'm trying to build a DQ toolset using SQL server, SSIS, Sharepoint etc.
Do you mean, when you say you would not propogate rows, that you would just derive the golden record from the data at this stage without overwriting the 'Unknown' values?
SourceSystem1 | AS | A. Smith | Unknown
SourceSystem2 | Smith | A. Smith | Unknown
SourceSystem3 | ASMITH | A. Smith | Unknown
SourceSystem4 | smith a | A. Smith | Female
- so when it was corrected, the golden record for A. Smith would go back to Unknown?
Good idea, but then the table I described isn't really a dimension any more. It's a type of mapping table, isn't it? We would expose the derived table to the end users as the dimension?
Al
Do you mean, when you say you would not propogate rows, that you would just derive the golden record from the data at this stage without overwriting the 'Unknown' values?
SourceSystem1 | AS | A. Smith | Unknown
SourceSystem2 | Smith | A. Smith | Unknown
SourceSystem3 | ASMITH | A. Smith | Unknown
SourceSystem4 | smith a | A. Smith | Female
- so when it was corrected, the golden record for A. Smith would go back to Unknown?
Good idea, but then the table I described isn't really a dimension any more. It's a type of mapping table, isn't it? We would expose the derived table to the end users as the dimension?
Al
Al Wood- Posts : 46
Join date : 2010-12-08
Re: How to share attributes between conformed dimension entities
Yes, the table is a cross reference table. If there are three rows for the same agent, then I really want to only expose the one row, with the "best" information. Best is determined based on data analysis. It could be as simple as the most recent update or it could be by source. That is, for some information, it is only available from one source. This is what you do for MDM implementations. The cross reference table is the source for the dimension table.
The hard part for coding your own is all the name (and address) standardization. You have to cleanse and standardize the data before you match and consolidate.
The hard part for coding your own is all the name (and address) standardization. You have to cleanse and standardize the data before you match and consolidate.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Similar topics
» Conformed Dimension Partial Set of Attributes
» date attributes in a conformed dimension
» Advice on Dimensional Modeling where dimensions share attributes
» Best Practice: two Cube share one Dimension?
» dimension table design question for around 100 attributes and higher level calculated attributes
» date attributes in a conformed dimension
» Advice on Dimensional Modeling where dimensions share attributes
» Best Practice: two Cube share one Dimension?
» dimension table design question for around 100 attributes and higher level calculated attributes
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum