LEFT OUTER join in source extract SQL versus downstream lookup?
3 posters
Page 1 of 1
LEFT OUTER join in source extract SQL versus downstream lookup?
I am working on a packaged BI Apps solution for a enterprise HR system. My question is, when I want to add a dimensional attribute, it seems I can either add it directly to the source SQL extracts (in my case a App SQL Qualifier in Informatica) using a LEFT OUTER join, or I can use a lookup transformation downstream in the mapping. I would use the same logic in the lookup as I would in the SQ LEFT OUTER JOIN condition.
Question is..does it matter which approach I take? It seems the lookup would be easier given that I avoid any cartesian product issues (too many rows returned). A lookup will ensure only one row gets returned. Can someone shed some light into this. Is it always best to add fields directly to the extract SQL? What is the best practice?
Thanks.
Question is..does it matter which approach I take? It seems the lookup would be easier given that I avoid any cartesian product issues (too many rows returned). A lookup will ensure only one row gets returned. Can someone shed some light into this. Is it always best to add fields directly to the extract SQL? What is the best practice?
Thanks.
obiapps- Posts : 21
Join date : 2010-09-28
Re: LEFT OUTER join in source extract SQL versus downstream lookup?
How many rows in the lookup table?
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: LEFT OUTER join in source extract SQL versus downstream lookup?
You shouldn't get a cartesian product from the join provided it is not a many-to-many relationship, and if it was, a lookup wouldn't work anyway, unless you used some form of aggregation in the lookup source query.
If you are pulling data during a quiet time in the system, either approach should work. If, however, the source system is up and there are active users, you tend to be better off keeping queries as simple as possible to avoid degredation in the operational system.
If the lookup is very large, you could preload it in a separate mapping into a persistant lookup prior to running the main load process to reduce the time the main process is hitting the source system.
If you are pulling data during a quiet time in the system, either approach should work. If, however, the source system is up and there are active users, you tend to be better off keeping queries as simple as possible to avoid degredation in the operational system.
If the lookup is very large, you could preload it in a separate mapping into a persistant lookup prior to running the main load process to reduce the time the main process is hitting the source system.
LEFT OUTER join in source extract SQL versus downstream lookup?
Thanks for your comments. The ETLs are run from a daily copy of PROD so the source system is static. Also, I am extracting to a TEMP table as using that as the source of my lookup. Size or performance is not really the issue..the temp table has only 11k rows. My main issue is really just understanding best practices with regard to how and where to add additional data attributes. In this case, the source system is a PSoft enterprise and each source dependent mapping has MANY tables and a number of LEFT OUTER joins. I am just wondering if I am required to add new fields/tables at the extract point (SQL Qualifier) or if its ok to avoid modifying the vanilla ERP logic and instead using a downstream lookup from a temp table I created.
Is there a best practice or advantage/disadvantage of each?
Is there a best practice or advantage/disadvantage of each?
obiapps- Posts : 21
Join date : 2010-09-28
Re: LEFT OUTER join in source extract SQL versus downstream lookup?
Best practice is to use a lookup transformation (reusable if it is used more than once). It is easier to understand and maintain the mapping that way. I would only use a join if dealing with some sort of performance issue (for Informatica lookups, rowcount > 500K).
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
obiapps- Posts : 21
Join date : 2010-09-28
Similar topics
» Why extract?
» Extract to staging: use constraints or not?
» Question asked about ETL (little explanation needed)
» Right Outer Join with Fact table
» Kimball SCD vs Extract from trxn System
» Extract to staging: use constraints or not?
» Question asked about ETL (little explanation needed)
» Right Outer Join with Fact table
» Kimball SCD vs Extract from trxn System
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum