Modeling tool with support for hierarchies
5 posters
Page 1 of 1
Modeling tool with support for hierarchies
Hi,
this is my first post to this forum so please be nice! hehe
I'm at the beginning of a fairly large DWH project (by Norwegian standards) and I have been reviewing data modeling tools recently. However, there seems to be a general lack of support for creating dimensional hierarchies in these tools and I was wondering if you guys know of any that support this? I have been reviewing CA ERwin and ER/Studio from Embarcadero which both seems to lack the functionality, while PowerDesigner from Sybase seems to have it. Oracle SQL Data Modeler also has it but this tools doesn't really seem "enterprise friendly"...
Do any of you guys have any recomandations?
this is my first post to this forum so please be nice! hehe
I'm at the beginning of a fairly large DWH project (by Norwegian standards) and I have been reviewing data modeling tools recently. However, there seems to be a general lack of support for creating dimensional hierarchies in these tools and I was wondering if you guys know of any that support this? I have been reviewing CA ERwin and ER/Studio from Embarcadero which both seems to lack the functionality, while PowerDesigner from Sybase seems to have it. Oracle SQL Data Modeler also has it but this tools doesn't really seem "enterprise friendly"...
Do any of you guys have any recomandations?
Lumbago- Posts : 3
Join date : 2011-09-05
Re: Modeling tool with support for hierarchies
What specific functionality is lacking? One solution is to specify the normalized hierarchy in the logical model and then flatten the structure in the physical model.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Modeling tool with support for hierarchies
I’m so glad to hear I’m not the only person frustrated by this lacking functionality. I simply can’t build a project-wide ETL developer friendly STT (source to target) Lineage using standard modeling tools! The detail required is simply too cumbersome to document using these tools!
Anyway, here is what I’ve ended up doing…
I build the “pretty” illustration in a modeling tool without worries of the details. Then I use a custom application to build out the STT and Lineage for the ETL develpers. This allows me to build out the levels within the separate hierarchies of the facts and dimensions (yes… multilevel facts). I can build out an entire project with dozens of facts and dimensions without overlap or confusion and hand the ETL developers a nice and organized (easy to understand) STT Lineage.
I’ve worked projects with some of the largest delivered EDWs available for PeopleSoft, tax systems, GIS, etc… they’ve ALL used Excel to document the lineage.
If a modeling tool salesperson says their tool can do it… make them illustrate it!
Hope this helps.
Anyway, here is what I’ve ended up doing…
I build the “pretty” illustration in a modeling tool without worries of the details. Then I use a custom application to build out the STT and Lineage for the ETL develpers. This allows me to build out the levels within the separate hierarchies of the facts and dimensions (yes… multilevel facts). I can build out an entire project with dozens of facts and dimensions without overlap or confusion and hand the ETL developers a nice and organized (easy to understand) STT Lineage.
I’ve worked projects with some of the largest delivered EDWs available for PeopleSoft, tax systems, GIS, etc… they’ve ALL used Excel to document the lineage.
If a modeling tool salesperson says their tool can do it… make them illustrate it!
Hope this helps.
KS_EDW- Posts : 20
Join date : 2011-09-07
Age : 49
Location : Kansas
Re: Modeling tool with support for hierarchies
I appreciate your answers fellas. I'm not overly worried about the data lineage actually...I have never heard of a modeling tool that has such capabilities although it would be *really* cool and extremely helpful to have it. The ETL-tool we are using (Business Objects Data Services) is supposedly able to do it, but I have some doubts...our solution will with 95% certainty involve Excel as well
The issue I'm more worried about however is the ability to model hierarchies (i.e. Year-Month-Day or Year-Week) and keep track of them directly in the modeling tool. PowerDesigner from Sybase is up to the task but I would really like some alternatives to choose from. Data Modeler from Oracle is suboptimal in the sense that their platform support is really bad (it only supports oracle9+, db2 and sql server 2000/2005)
@KS_EDW: This custom application you are talking about...can you please elaborate a bit about it?
The issue I'm more worried about however is the ability to model hierarchies (i.e. Year-Month-Day or Year-Week) and keep track of them directly in the modeling tool. PowerDesigner from Sybase is up to the task but I would really like some alternatives to choose from. Data Modeler from Oracle is suboptimal in the sense that their platform support is really bad (it only supports oracle9+, db2 and sql server 2000/2005)
@KS_EDW: This custom application you are talking about...can you please elaborate a bit about it?
Lumbago- Posts : 3
Join date : 2011-09-05
Re: Modeling tool with support for hierarchies
Spreadsheets are clumsy at best but are just about our only cost-effective tool. Mapping source to target and transformations is not a problem in Excel. Building a lineage, ETL procedures, hierarchies, and levels for an entire enterprise warehouse is! Keeping all that straight in 10s of thousands of rows is a headache! Creating reports for the ODS, ETL, and semantic modelers from Excel models is also very time consuming.
In my experience, the developers were always fumbling through the spreadsheets, hiding columns, sorting rows… causing mistakes…. It was just humanly impossible to keep it all straight for me or the modelers…
Essentially, four distinct build sections need modeled – operational data stage, ETL, DW, and the semantic layer. This application captures all four with one effort. I’ve tried to attach an illustration but "new member" status is blocking it (changed jobs/email addr since last membership). I'll update in a vew days.
The output from the application is very “report” oriented models, mappings, lineages, etc… in that, the developers from all sides can get what they need without sorting through huge spreadsheets. Reports can include DW Table builds, DW table/structure modifications, ETL build, Conforming procedures, Semantic build, level and hierarchy architecture, etc…. PLUS… when there is a change ALL of the reports are updated… PLUS agile can be implemented by separating an enterprise project into sub projects.
One word though... the application does not make it any easier to conceptualize a model while building it. It only makes the organization of documentation easier. It’s very much a mad-scientist approach, but it works beautifully.
I was hoping to send the idea off to a vendor… I don’t have the capacity to create a solid application or distribute it. Any suggestions?
In my experience, the developers were always fumbling through the spreadsheets, hiding columns, sorting rows… causing mistakes…. It was just humanly impossible to keep it all straight for me or the modelers…
Essentially, four distinct build sections need modeled – operational data stage, ETL, DW, and the semantic layer. This application captures all four with one effort. I’ve tried to attach an illustration but "new member" status is blocking it (changed jobs/email addr since last membership). I'll update in a vew days.
The output from the application is very “report” oriented models, mappings, lineages, etc… in that, the developers from all sides can get what they need without sorting through huge spreadsheets. Reports can include DW Table builds, DW table/structure modifications, ETL build, Conforming procedures, Semantic build, level and hierarchy architecture, etc…. PLUS… when there is a change ALL of the reports are updated… PLUS agile can be implemented by separating an enterprise project into sub projects.
One word though... the application does not make it any easier to conceptualize a model while building it. It only makes the organization of documentation easier. It’s very much a mad-scientist approach, but it works beautifully.
I was hoping to send the idea off to a vendor… I don’t have the capacity to create a solid application or distribute it. Any suggestions?
KS_EDW- Posts : 20
Join date : 2011-09-07
Age : 49
Location : Kansas
metadata driven DW development
I've been using a DW development application for a number of years which automatically generates documentation including data lineage and all Source to Target mappings including any transformations. I've got some sample documentation on my website: data warehouse documentation
Have a look and let me know what you think. For example select technical documentation, General Ledger, Fact Tables, fact_gl_transactions
there you can see procedures, dimensions, columns, star schema, source diagram (aka data lineage). This data lineage is at the table level, you can view column level lineage elsewhere.
The metadata is the core, what you design is stored in the metadata repository, what you generate (facts, dims, staging, procedures, etc.) are all generated from the metadata. It generates native db objects, no black box or any extra steps involved.
Let me know what you think
Have a look and let me know what you think. For example select technical documentation, General Ledger, Fact Tables, fact_gl_transactions
there you can see procedures, dimensions, columns, star schema, source diagram (aka data lineage). This data lineage is at the table level, you can view column level lineage elsewhere.
The metadata is the core, what you design is stored in the metadata repository, what you generate (facts, dims, staging, procedures, etc.) are all generated from the metadata. It generates native db objects, no black box or any extra steps involved.
Let me know what you think
robber- Posts : 41
Join date : 2009-02-28
Location : Canada
Re: Modeling tool with support for hierarchies
just re-read the post, i was working bottom up and replied to the data lineage request...
from a hierarchy point of view, same thing, stored in the metadata so look at the cube group in same structure and you will see a hierarchy link...
from a hierarchy point of view, same thing, stored in the metadata so look at the cube group in same structure and you will see a hierarchy link...
robber- Posts : 41
Join date : 2009-02-28
Location : Canada
Re: Modeling tool with support for hierarchies
@robber: I looked at the documentation you provided and I've also looked a bit on the WhereScape website and from what I've seen so far it looks intriguing. I was surprised however that it wasn't possible to download a trial-version if the software and I was also not able to find any licensing information (pricing) which kind of puts me off. Is it also the case that all ETL is implemented as stored procedures in the/a database?
Lumbago- Posts : 3
Join date : 2011-09-05
Re: Modeling tool with support for hierarchies
Lumbago wrote:@robber: I looked at the documentation you provided and I've also looked a bit on the WhereScape website and from what I've seen so far it looks intriguing. I was surprised however that it wasn't possible to download a trial-version if the software and I was also not able to find any licensing information (pricing) which kind of puts me off. Is it also the case that all ETL is implemented as stored procedures in the/a database?
You can download an eval version, pricing is reasonable (100k - 200k USD), ETL is ELT which for many of us is a good thing, work is done in the database, open and transparent.
robber- Posts : 41
Join date : 2009-02-28
Location : Canada
Re: Modeling tool with support for hierarchies
That kind of pricing was exactly our fear. Not to say anything about the product… it looks excellent, but the output is not so unlike our home-grown solution. Usually, for the three projects/organizations I’ve been working for, they have 800K – 2 million for the entire hardware/operating system/database/software budget. We usually can stretch 20K for a “wanted software” (software for which there are practical work arounds for).
I really [REALLY] like the tool and have tagged it for future reference.
I really [REALLY] like the tool and have tagged it for future reference.
KS_EDW- Posts : 20
Join date : 2011-09-07
Age : 49
Location : Kansas
Re: Modeling tool with support for hierarchies
KS_EDW wrote:That kind of pricing was exactly our fear. Not to say anything about the product… it looks excellent, but the output is not so unlike our home-grown solution. Usually, for the three projects/organizations I’ve been working for, they have 800K – 2 million for the entire hardware/operating system/database/software budget. We usually can stretch 20K for a “wanted software” (software for which there are practical work arounds for).
I really [REALLY] like the tool and have tagged it for future reference.
I hear ya but consulting fee's are not cheap either and the productivity increases not to mention ongoing support, maintenance, and enhancements more than pay for the upfront cost. The metadata is a side benefit, you've also built your data warehouse along the way.
But for those smaller gigs where they just cannot pony up have a look at Talend Open Source...
robber- Posts : 41
Join date : 2009-02-28
Location : Canada
Re: Modeling tool with support for hierarchies
Absolutely! Talend looked like a great tool… unfortunately, I no more than installed it than got told to uninstall it. (Warning… I’m going to vent here…) I work in Government…I often find myself between a rock, hard place, and politics. I can deal with “rock” and “hard place” but the “politics” of a public organization are sometimes ludicrous. “We don’t use open source”… someone in functional administration had a bad run-in with an open source tool in the 90s… “really…, the 90s! Come ON!” So, whatever we use must come through a vendor. I'll tell you... if Talend had been bundled with the BI, it would have been okay to use. Arg!
Hate to be so negative… despite my personal take on things…, all of our projects (though modeled by spreadsheet) have been wildly successful. One of the data marts I modeled found $64 million in new revenues in the first 5 days of production… we collected almost $25 million of that within year one with letters - produced from the EDW. That alone paid for the entire 6-month project twenty five times over! We’ve also had a data mart that caused a complete redesign of individual income tax policy, reducing April 15th annual debt by almost 10%. It’s amazing what these things can do when properly designed!
Hate to be so negative… despite my personal take on things…, all of our projects (though modeled by spreadsheet) have been wildly successful. One of the data marts I modeled found $64 million in new revenues in the first 5 days of production… we collected almost $25 million of that within year one with letters - produced from the EDW. That alone paid for the entire 6-month project twenty five times over! We’ve also had a data mart that caused a complete redesign of individual income tax policy, reducing April 15th annual debt by almost 10%. It’s amazing what these things can do when properly designed!
KS_EDW- Posts : 20
Join date : 2011-09-07
Age : 49
Location : Kansas
Re: Modeling tool with support for hierarchies
I'm currently working in Government as well and yes there are some challenges.
There are BI players who will bundle a deal with Talend so you could try that approach - or considering the ROI that you are demonstrating why don't you buy the ETL personally and then charge them a royalty, say 10%, on the new revenue streams. That would be a win/win in my books.
There are BI players who will bundle a deal with Talend so you could try that approach - or considering the ROI that you are demonstrating why don't you buy the ETL personally and then charge them a royalty, say 10%, on the new revenue streams. That would be a win/win in my books.
robber- Posts : 41
Join date : 2009-02-28
Location : Canada
Re: Modeling tool with support for hierarchies
Very good suggestions! I'll keep that suggestion in my pocket for the next project! Thanks for your insight and suggestions!
KS_EDW- Posts : 20
Join date : 2011-09-07
Age : 49
Location : Kansas
SQL Server Master Data Services
This has been a very interesting thread.
I'm wondering if anyone has tried or considered the SQL Server Master Data Services module:
http://msdn.microsoft.com/en-us/sqlserver/ff943581
http://www.microsoft.com/sqlserver/en/us/solutions-technologies/data-warehousing/master-data-services.aspx#
It cites hierarchy management as one of the key features. Its effectively free (with SQL Server 2008 R2 Enterprise+ Editions).
I'm about to embark on an eval of it for a client - they are similar or smaller scale than the projects KS_EDW described above. So I'd be interested to hear of other people's experience and/or opinions on this.
Thanks - Mike
I'm wondering if anyone has tried or considered the SQL Server Master Data Services module:
http://msdn.microsoft.com/en-us/sqlserver/ff943581
http://www.microsoft.com/sqlserver/en/us/solutions-technologies/data-warehousing/master-data-services.aspx#
It cites hierarchy management as one of the key features. Its effectively free (with SQL Server 2008 R2 Enterprise+ Editions).
I'm about to embark on an eval of it for a client - they are similar or smaller scale than the projects KS_EDW described above. So I'd be interested to hear of other people's experience and/or opinions on this.
Thanks - Mike
Re: Modeling tool with support for hierarchies
Hey folks… I thought I would post that diagram.
This is how the modeling tool I discussed earlier works (illustrated). It’s designed around current employer’s operation – using a 7-stage ETL process using OWB as the ETL tool. We operate with a LOT of geographical, financial, and event data. There are standardized object naming conventions also (not the “Naming Standard Validations) node. Projects are handled via “agile” development, so projects are often segmented into distinct sections. Currently we’re running about 4,000 distinct columns through ETL, 100 or so subject areas, and a little over that in dimensions.
https://i.servimg.com/u/f44/16/84/97/03/presen11.jpg
[img][/img]
This is how the modeling tool I discussed earlier works (illustrated). It’s designed around current employer’s operation – using a 7-stage ETL process using OWB as the ETL tool. We operate with a LOT of geographical, financial, and event data. There are standardized object naming conventions also (not the “Naming Standard Validations) node. Projects are handled via “agile” development, so projects are often segmented into distinct sections. Currently we’re running about 4,000 distinct columns through ETL, 100 or so subject areas, and a little over that in dimensions.
https://i.servimg.com/u/f44/16/84/97/03/presen11.jpg
[img][/img]
KS_EDW- Posts : 20
Join date : 2011-09-07
Age : 49
Location : Kansas
Re: Modeling tool with support for hierarchies
How do you define "subject area"? With 4000 columns and 100 subject areas, that leave ~40 columns per subject area. That seems a little light to implement any sort of business functionality.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Modeling tool with support for hierarchies
Ah, good call, some clarity would probably help… we have a HUGE volume of conformed dimensions so the dimensions often end up being 10 or fewer columns. (I think I confused things when I desribed columns as "distinct ETL columns"... should have said "distinct EDW columns".)
Looking at it mathematically, say dividing total columns by subject areas, kind of skews the appears of EDW’s actual distribution of columns throught the surrogates.
Distinct contributing columns is much higher than 4,000 (many more...). But, EDW only maintains about 4,000 distinct columns mapped from or derived in the ETLs. Many subject areas use the same dimension thus, EDW only maintains… for example… say 10 distinct columns but has 100 contributing columns, and, when flattened by the fact’s surrogates, they appear as 100+ to the users ("+" = derived data), but are actually only 10 in EDW. I hope this makes some sense of the situation. Most subject areas maintain roughly 150+ columns (date and location usually take up a lot of those columns).
To expand…. Many (most) of our subject areas use linear referencing methods (GIS) location, date, person, and notes dimensions. These alone are 100s of contributing columns funneled into 10s of EDW columns. Often we have a fact table with 10 or more surrogates for location and another 10 or more for date and another 10 for persons (bridge), and yet another 10 or more for notes (bridge). All together, there are hundreds of distinct contributing columns, but EDW only holds 10s of distinct columns.
To explain this particular fact… This type of fact enables users to interact with EDW through mapping software… plotting the points of progress on a project (1 project = one row) for example, then drill down occurs through bridge tables, but the bridge tables are a distinct fact, having reporting requirements outside of mapping – person for example, having aggregates of salary based on project time. Because location (“GIS”) is the center point and purpose of much of this EDW, it makes the model a little different than the typical EDW model (financial, customer, sales, or HR for example). When illustrated it turns into a huge web with the center points being date, GIS, and resources.
Overall, our typical fact table has 2 – 30 (or so) surrogates. Our facts have between 4 and 20 measures. The dimensions range from 3 to 50 (most having 10 or less) columns. There are of course a couple exceptions – several dimensions with 80 or so columns and a couple facts with 100+ surrogates.
Hope this clarifies… oh, and let me know if something seems wrong… I’m always up for some constructive criticism.
Looking at it mathematically, say dividing total columns by subject areas, kind of skews the appears of EDW’s actual distribution of columns throught the surrogates.
Distinct contributing columns is much higher than 4,000 (many more...). But, EDW only maintains about 4,000 distinct columns mapped from or derived in the ETLs. Many subject areas use the same dimension thus, EDW only maintains… for example… say 10 distinct columns but has 100 contributing columns, and, when flattened by the fact’s surrogates, they appear as 100+ to the users ("+" = derived data), but are actually only 10 in EDW. I hope this makes some sense of the situation. Most subject areas maintain roughly 150+ columns (date and location usually take up a lot of those columns).
To expand…. Many (most) of our subject areas use linear referencing methods (GIS) location, date, person, and notes dimensions. These alone are 100s of contributing columns funneled into 10s of EDW columns. Often we have a fact table with 10 or more surrogates for location and another 10 or more for date and another 10 for persons (bridge), and yet another 10 or more for notes (bridge). All together, there are hundreds of distinct contributing columns, but EDW only holds 10s of distinct columns.
To explain this particular fact… This type of fact enables users to interact with EDW through mapping software… plotting the points of progress on a project (1 project = one row) for example, then drill down occurs through bridge tables, but the bridge tables are a distinct fact, having reporting requirements outside of mapping – person for example, having aggregates of salary based on project time. Because location (“GIS”) is the center point and purpose of much of this EDW, it makes the model a little different than the typical EDW model (financial, customer, sales, or HR for example). When illustrated it turns into a huge web with the center points being date, GIS, and resources.
Overall, our typical fact table has 2 – 30 (or so) surrogates. Our facts have between 4 and 20 measures. The dimensions range from 3 to 50 (most having 10 or less) columns. There are of course a couple exceptions – several dimensions with 80 or so columns and a couple facts with 100+ surrogates.
Hope this clarifies… oh, and let me know if something seems wrong… I’m always up for some constructive criticism.
KS_EDW- Posts : 20
Join date : 2011-09-07
Age : 49
Location : Kansas
Similar topics
» What to do if BI tool doesn't support drill-across?
» Data modeling for Multi-language support
» Hierarchies in dimensional modeling
» Uses of a data modeling tool
» Need for a dimensional modeling tool?
» Data modeling for Multi-language support
» Hierarchies in dimensional modeling
» Uses of a data modeling tool
» Need for a dimensional modeling tool?
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum