Kimball Forum
Would you like to react to this message? Create an account in a few clicks or log in to continue.

Data Marts Built Upon Data Marts

5 posters

Go down

Data Marts Built Upon Data Marts Empty Data Marts Built Upon Data Marts

Post  trbarnes182 Fri Mar 28, 2014 7:40 am

This is a theoretical question as opposed to a specific, real example.

I often hear cautions that building a Data Mart upon a Data Mart is a bad idea. For the purpose of this discussion, let's assume the first Data Mart is derived from an underlying Data Warehouse.

What are the cons to sourcing another Data Mart from the first one? What kind of best practices does this conflict with? Are there any situations where this could be considered an acceptable solution?

Thanks for any guidance you can provide.

trbarnes182

Posts : 1
Join date : 2014-03-28

Back to top Go down

Data Marts Built Upon Data Marts Empty Re: Data Marts Built Upon Data Marts

Post  BoxesAndLines Fri Mar 28, 2014 9:06 am

The main "con" is time. You have to wait for the first ETL cycle to complete before you can start the second cycle. If the first cycle crashes, everything dependent on it will not load. The biggest boost in ETL processing is setting up the workflows to process data in parallel.
BoxesAndLines
BoxesAndLines

Posts : 1212
Join date : 2009-02-03
Location : USA

Back to top Go down

Data Marts Built Upon Data Marts Empty Re: Data Marts Built Upon Data Marts

Post  nick_white Fri Mar 28, 2014 9:12 am

Assuming you are following the Kimball methodology and have an Enterprise Data warehouse then a Data Mart is just a logical collection of tables grouped into a subject area; a data mart has no physical representation - the fact tables could exist in multiple data marts and the dimension tables almost certainly will (as they are conformed dimensions).

Therefore the idea of building a Data Mart on top of another Data Mart makes no sense (to me). You can build dimensions from existing dimensions (derive a month dimension from a date dimension) and facts from facts (create aggregates or join data in multiple facts by their conformed dimensions to create a single new fact table).

There is nothing wrong with doing this and it is often the best approach. For example, if you have already built a "quotes" fact table and a "sales" fact table and then your users ask for a report that needs both quote and sales data but is not easily satisfied by querying across both tables ("show me the average time between a quote being created and a sale being closed") then you might want to build a quote-sales fact table. It's much easier to do this by using the two fact tables as your source (assuming that between them they hold all the information you need) rather than going back to your staging layer, or even your source systems, to get the data you need.

Hope this helps?

nick_white

Posts : 364
Join date : 2014-01-06
Location : London

Back to top Go down

Data Marts Built Upon Data Marts Empty Re: Data Marts Built Upon Data Marts

Post  ngalemmo Fri Mar 28, 2014 12:27 pm

There is a big problem with the term 'Data Mart' as it has so many different interpretations.

In a dimensional model, the term is used to refer to a single star schema: one fact table and its dimensions. Collectively these are the data warehouse.

In a properly implemented dimensional data warehouse the fact tables should be atomic, at the lowest level of detail available. For performance reasons it may be necessary to summarize or combine (aggregate) facts, as such you are creating a data mart from another data mart or marts.

As B&L points out, you don't want to go too far with it as the dependencies and timing may introduce operational issues.
ngalemmo
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

http://aginity.com

Back to top Go down

Data Marts Built Upon Data Marts Empty Re: Data Marts Built Upon Data Marts

Post  Jeff Smith Fri Mar 28, 2014 1:14 pm

But going with the concept of a Mart being a single Fact and Dimensions, then conceptually an aggregated fact table would be a data mart that was built off of other data marts.


Jeff Smith

Posts : 471
Join date : 2009-02-03

Back to top Go down

Data Marts Built Upon Data Marts Empty Re: Data Marts Built Upon Data Marts

Post  ngalemmo Fri Mar 28, 2014 1:45 pm

Jeff Smith wrote:But going with the concept of a Mart being a single Fact and Dimensions, then conceptually an aggregated fact table would be a data mart that was built off of other data marts.  


Yes. There is nothing wrong with that. The issue is the operational complexities if you go too far. There is also a matter of diminishing returns. Would you gain very much if you built an aggregate of an aggregate, for example.
ngalemmo
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

http://aginity.com

Back to top Go down

Data Marts Built Upon Data Marts Empty Re: Data Marts Built Upon Data Marts

Post  ngalemmo Fri Mar 28, 2014 2:07 pm

trbarnes182 wrote:For the purpose of this discussion, let's assume the first Data Mart is derived from an underlying Data Warehouse.

This comment leads me to think this is more an Inmon style implementation. Building dimensional marts from a normalized DW is not a trivial task (not really difficult either, but not as simple as when the DW is dimensional to begin with).

Assuming there has been some investment in time and effort to work out what went into the mart and work out the transformations necessary to convert it to a dimensional model and apply whatever business rules that pertain to it, it can make sense to leverage this previous work to build a further aggregation of this data.

Wither this is the correct thing to do depends on why it is being done. If it is solely to provide a higher level of summarization, does it result in a significant improvement in performance for a substantial number of queries?

Other reasons to do this may be to localize a subset of data for a particular audience, to distribute query loads, for security, or to obfuscate or remove sensitive data that certain users should not see.

There are no hard rules on this. It is a matter of balancing a need versus operational impact and alternate methods of accomplishing the result.
ngalemmo
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

http://aginity.com

Back to top Go down

Data Marts Built Upon Data Marts Empty Re: Data Marts Built Upon Data Marts

Post  Sponsored content


Sponsored content


Back to top Go down

Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum