metadata management
3 posters
Page 1 of 1
metadata management
What are others using for comprehensive metadata management? Some of the ETL vendors (informatica and IBM come to mind) have reasonably good metadata management products but what about the rest? I'm currently working on a project that is using SSIS for ETL with metadata scattered throughout a variety of tools.
By comprehensive I refer to standard fact and dimension table and column level business and technical descriptions, data lineage, transformation logic, etc. Additionally operational information like how many rows are processed, processing time, unmatched dimension rows, errors, etc.
There are a couple commercial products we are considering, ASG Rochdale is one, but sadly a custom built repository is looking like the best option at this point.
Once we resolve what I consider the back end metadata challenge then we have to work out how to expose the metadata in Cognos or Microsoft BI applications but that is a separate issue.
By comprehensive I refer to standard fact and dimension table and column level business and technical descriptions, data lineage, transformation logic, etc. Additionally operational information like how many rows are processed, processing time, unmatched dimension rows, errors, etc.
There are a couple commercial products we are considering, ASG Rochdale is one, but sadly a custom built repository is looking like the best option at this point.
Once we resolve what I consider the back end metadata challenge then we have to work out how to expose the metadata in Cognos or Microsoft BI applications but that is a separate issue.
robber- Posts : 41
Join date : 2009-02-28
Location : Canada
Re: metadata management
While schema metadata is easy, and plenty of tools do it, process metadata is far more difficult.
The basic problem with data lineage and transformation metadata is any useful information (for human consumption) needs to be created manually and entered into the tool... in other words, someone needs to write documentation. If you used a tool that allows you to develop the source-to-target mapping, it can be leveraged to serve as the meta repository for process information.
But, if you are hoping for useful metadata coming from the process itself, it simply isn't going to happen. Many ETL tools provide such metadata, but, except for the simplest of processes, is almost useless for a non-technical audience (and not very useful for a technical one as well). Many ETL tools provide language capabilites to perform custom data manipulation. If the lineage path passes through such code, you need to be able to understand the code in order to figure out the true lineage of the data. At this point, only a human can provide such interpretation and tie it back to business requirements. Also, the appearance of such code and the use of multiple fields into such code blocks causes the lineage backtracking to fan out into a multitude of source columns and tables, many of which have no direct bearing on the final column.
The basic problem with data lineage and transformation metadata is any useful information (for human consumption) needs to be created manually and entered into the tool... in other words, someone needs to write documentation. If you used a tool that allows you to develop the source-to-target mapping, it can be leveraged to serve as the meta repository for process information.
But, if you are hoping for useful metadata coming from the process itself, it simply isn't going to happen. Many ETL tools provide such metadata, but, except for the simplest of processes, is almost useless for a non-technical audience (and not very useful for a technical one as well). Many ETL tools provide language capabilites to perform custom data manipulation. If the lineage path passes through such code, you need to be able to understand the code in order to figure out the true lineage of the data. At this point, only a human can provide such interpretation and tie it back to business requirements. Also, the appearance of such code and the use of multiple fields into such code blocks causes the lineage backtracking to fan out into a multitude of source columns and tables, many of which have no direct bearing on the final column.
Re: metadata management
thanks Nick, always appreciate your sage advice!
FYI, my benchmark is a little biased as I've used a metadata driven development environment for the majority of my projects and not had to worry about this scenario, for the back end at least.
FYI, my benchmark is a little biased as I've used a metadata driven development environment for the majority of my projects and not had to worry about this scenario, for the back end at least.
robber- Posts : 41
Join date : 2009-02-28
Location : Canada
Re: metadata management
Hi Robber,
I've been using the utilities that came with the Kimball MSDWTK book for several years on many projects with quite good results. For software you only need Excel so you can save a lot of time and money on software, training etc.
Here's the link:
http://www.kimballgroup.com/html/booksMDWTtools.html
In particular, Chapter 2, Dimensional modeling spreadsheet - this covers most of your requirements. You can then simply publish from Excel, or feed your metadata from Excel to the Metadata database and SSRS Reports (see Chapter 15—Metadata Plan).
Excel is actually quite a good tool for the input/design task, as the spreadsheet layout encourages comparison and consistency of definitions.
I've been using the utilities that came with the Kimball MSDWTK book for several years on many projects with quite good results. For software you only need Excel so you can save a lot of time and money on software, training etc.
Here's the link:
http://www.kimballgroup.com/html/booksMDWTtools.html
In particular, Chapter 2, Dimensional modeling spreadsheet - this covers most of your requirements. You can then simply publish from Excel, or feed your metadata from Excel to the Metadata database and SSRS Reports (see Chapter 15—Metadata Plan).
Excel is actually quite a good tool for the input/design task, as the spreadsheet layout encourages comparison and consistency of definitions.
Similar topics
» SCD 2 Audit / Metadata Columns
» Microsoft Metadata Management
» Instead of Order Management, this dimensional modeling is more for Quotation and Purchase Order management
» Data Vault v's Dimensional Model
» metadata driven ETL architecture
» Microsoft Metadata Management
» Instead of Order Management, this dimensional modeling is more for Quotation and Purchase Order management
» Data Vault v's Dimensional Model
» metadata driven ETL architecture
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum