Designing data marts from an EAV data source
5 posters
Page 1 of 1
Designing data marts from an EAV data source
I am working on designing data marts that will import data from a datasource which uses the EAV pattern extensively.
This is the EAV pattern for your reference:
http://en.wikipedia.org/wiki/Entity%E2%80%93attribute%E2%80%93value_model
Let's get past my concerns about the dangers of using the EAV pattern. It's very flexible and quite dangerous if the appropriate governance is not put in place. I have no control over this architecture choice as the source system is outside of my jurisdiction.
Does anyone know what's the best way to address this when designing Dimensions and Fact tables when the source is a mix of normal tables and EAV data elements?
Is there a pattern or some kind of approach I can take?
I am going with the worst possible scenario where
1 - the users can change the existing attributes or add new attributes at any time
2 - the data elements stored in the EAV tables in the source system can contain both measures and and attributes (so they need to be split up and feed data to facts and dimensions separately)
I can propose to impose rules on the users so that they are not able to change existing attributes (For example) or come up with other rules such as, no measures can be stored in the EAV tables, only attributes.
For now I am just looking for some guidance from you more experienced folks to figure out if there some kind of preferred approach or best practice when dealing with extracting data for a DW from the EAV model in the source system.
Thanks!!!
Marco
This is the EAV pattern for your reference:
http://en.wikipedia.org/wiki/Entity%E2%80%93attribute%E2%80%93value_model
Let's get past my concerns about the dangers of using the EAV pattern. It's very flexible and quite dangerous if the appropriate governance is not put in place. I have no control over this architecture choice as the source system is outside of my jurisdiction.
Does anyone know what's the best way to address this when designing Dimensions and Fact tables when the source is a mix of normal tables and EAV data elements?
Is there a pattern or some kind of approach I can take?
I am going with the worst possible scenario where
1 - the users can change the existing attributes or add new attributes at any time
2 - the data elements stored in the EAV tables in the source system can contain both measures and and attributes (so they need to be split up and feed data to facts and dimensions separately)
I can propose to impose rules on the users so that they are not able to change existing attributes (For example) or come up with other rules such as, no measures can be stored in the EAV tables, only attributes.
For now I am just looking for some guidance from you more experienced folks to figure out if there some kind of preferred approach or best practice when dealing with extracting data for a DW from the EAV model in the source system.
Thanks!!!
Marco
mrispoli- Posts : 1
Join date : 2013-02-11
Re: Designing data marts from an EAV data source
First, accept my condolences. Obviously you can't build your dimensional model based on the source model. Your facts would make no sense. First, you have to understand the facts and dimensions hiding with the EAV model. Once you've identified these objects, you need to pivot the data into meaningful structure. So instead of Characteristic and Characteristic Value, you end up with with Product with its associated attributes. This can normally be accomplished via views or ETL logic. Once you've done that, then you can look at the resulting structure to determine what entities and attribute constitute a business process.
As the source data structure is dynamic, you will need to closely monitor new content being added to ensure you don't pull in new undefined metrics.
As the source data structure is dynamic, you will need to closely monitor new content being added to ensure you don't pull in new undefined metrics.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Designing data marts from an EAV data source
Usually these kind of systems come with significant metadata support and mechanisms to publish the data in more conventional formats. Is this a product or a home-grown system?
Re: Designing data marts from an EAV data source
As Boxesandlines points out, you need to understand how the source EAV model translates into a dimensional model and pivot out values and attributes using ETL.
I would be worried if the end-user could change the structure of the model as they like. Is there a process around extending the EAV model?
We use a similar approach on the current project I am on. It works quite well as there is a common source data model which is a standard relational 3NF. Then attached to the common model is an attribute / value set of entities that allow us to extend the common model to cater for new requirements.
The down side to this approach is the overhead in unpivoting / pivoting data in and out of the model. It can be a bit labourious but the advantage is that we get an extensible model.
I would be worried if the end-user could change the structure of the model as they like. Is there a process around extending the EAV model?
We use a similar approach on the current project I am on. It works quite well as there is a common source data model which is a standard relational 3NF. Then attached to the common model is an attribute / value set of entities that allow us to extend the common model to cater for new requirements.
The down side to this approach is the overhead in unpivoting / pivoting data in and out of the model. It can be a bit labourious but the advantage is that we get an extensible model.
thedude- Posts : 21
Join date : 2009-02-03
Location : London
Re: Designing data marts from an EAV data source
My condolences also - in my experience these designs often conceal shoddy data. But anyway ...
I have found it useful to prototype such data by pumping it raw into a cube tool e.g. SSAS or Power Pivot, with just a simple count measure. The end result will blow the mind of any end-user but will probably be your quickest route to understanding the true (as opposed to documented) relationships in the data.
Good luck!
I have found it useful to prototype such data by pumping it raw into a cube tool e.g. SSAS or Power Pivot, with just a simple count measure. The end result will blow the mind of any end-user but will probably be your quickest route to understanding the true (as opposed to documented) relationships in the data.
Good luck!
Similar topics
» Designing Single set of dimensions for disparate source data
» Data Marts Built Upon Data Marts
» Data Marts, Conformed dimensions and Data Warehouse
» Dimension Attributes and Fact attributes storing same data in multiple data marts??
» Modelling Problem of Data Marts
» Data Marts Built Upon Data Marts
» Data Marts, Conformed dimensions and Data Warehouse
» Dimension Attributes and Fact attributes storing same data in multiple data marts??
» Modelling Problem of Data Marts
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum