ETL : check child-parent relation integrity
2 posters
Page 1 of 1
ETL : check child-parent relation integrity
Hello !
We are writing our own ETL, and I have a question about the best practices to manage child-parent relationship integrity in a SCD 2.
Let say I have this table (product referential) :
Basically, I want to check that a product type belongs to only one product line (for records with current flag set to "Y"), to respect hierarchy integrity.
How could I achieve that in a good way, at ETL & DB level ? (I'm using a SQL Server 2012 Standard Edition)
For the moment, the ETL is working in this way :
1. Read flat file line by line & check format integrity
2. If there is no error in the entire flat file :
- insert new lines
- modify existing lines if needed (put flag to "N" & insert end date)
I suppose I have to do something between step 1 & 2 to check dimension integrity.
Thanks for any advice =)
We are writing our own ETL, and I have a question about the best practices to manage child-parent relationship integrity in a SCD 2.
Let say I have this table (product referential) :
Basically, I want to check that a product type belongs to only one product line (for records with current flag set to "Y"), to respect hierarchy integrity.
How could I achieve that in a good way, at ETL & DB level ? (I'm using a SQL Server 2012 Standard Edition)
For the moment, the ETL is working in this way :
1. Read flat file line by line & check format integrity
2. If there is no error in the entire flat file :
- insert new lines
- modify existing lines if needed (put flag to "N" & insert end date)
I suppose I have to do something between step 1 & 2 to check dimension integrity.
Thanks for any advice =)
Skualys- Posts : 13
Join date : 2014-04-04
Re: ETL : check child-parent relation integrity
How do you decide to insert a new line? Simply add product type to the natural key.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: ETL : check child-parent relation integrity
I insert a new line if :
- product code doesn't exist,
- product code exists, but informations changed (product type code or product line code).
Why should I add product type to the natural key ? I want that at a time T1, one product type belongs to only one product line for all current records (I don't want a record with T1 to belong to both L1 & L2, for example), like it would be with a normalized model.
- product code doesn't exist,
- product code exists, but informations changed (product type code or product line code).
Why should I add product type to the natural key ? I want that at a time T1, one product type belongs to only one product line for all current records (I don't want a record with T1 to belong to both L1 & L2, for example), like it would be with a normalized model.
Skualys- Posts : 13
Join date : 2014-04-04
Re: ETL : check child-parent relation integrity
Ahh, then yes, you need something else to enforce the constraint. If the volume of changes is small, a DB trigger is fine. Just make sure the ETL process can recover gracefully from a failed insert.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Similar topics
» ETL Load - Dropping Indexes and Constraints
» Table column analysis/profiling tool?
» DataRelease after manual quality check
» How many fact columns is too many? 80-100? Grain Check.
» Foreign key referential integrity in the fact tables
» Table column analysis/profiling tool?
» DataRelease after manual quality check
» How many fact columns is too many? 80-100? Grain Check.
» Foreign key referential integrity in the fact tables
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum