Kimball Forum
Would you like to react to this message? Create an account in a few clicks or log in to continue.

Data Quality Strategy

3 posters

Go down

Data Quality Strategy Empty Data Quality Strategy

Post  dwcurious Wed Jul 06, 2011 6:39 am

How do we go about getting Data Quality Analysis? What aspects to be touch upon?


Posts : 20
Join date : 2011-04-14

Back to top Go down

Data Quality Strategy Empty Re: Data Quality Strategy

Post  ian.coetzer Thu Jul 07, 2011 6:38 am


Firstly, when you are inserting records during ETL - into either Dimensions or Facts:

1) Rather insert the records and those fields that you can't find (containing nulls etc.) set them to 'Unknown'
DO NOT ignore records from the source system that are not fully populated - rather bring them to the data warehouse - but set those fields to 'Unknown'

2) Logging - every ETL run should log the # of records inserted, updated and deleted for every dimension and fact.

3) Ask the business!
The should give you rules - (that you can define in the source-to-target mapping) that will raise a flag!
- either in the warehouse you have an extra field that can be set if a certan business rule (check on record fails etc.)

4) Otherwise if you really cannot insert a Fact record because the keys, (say all foreign keys) to dimensions cannot be looked up - then think of adding an exceptions fact table where you can at least insert all those records - hopefully something pulls through like the measure! etc. that you can later on use to correlate with the source system.
so if you have FactAffordability, you also need FactAffordabilityExceptions where those really really bad records can be inserted into.

Bottom line you want everything to be pulled into the data warehouse somewhere, UNLESS the business explicitly defined that you can ignore those 'faulty' records.

BUT more than often if the DataWarehouse integrity comes into question - you must be able to prove that the ETL process did not fail - by pulling ALL source records into the DataWarehouse - even those where most foreign keys / values were unknown - that is why you must also keep a staging environment alive for the DataWarehouse so that you can show them what source records resulted in those funny fact records / fact exception records!

Hope this helps a bit?

- IF you have specific rules setup to identify faulty records - sit with the business and ask them for quality score on each of these rules - then you can when you insert or update records have a field to indicate the quality score for that record, simply S1; S2; S2 or whatever - then you can group records also based on their quality

Posts : 57
Join date : 2010-05-31
Age : 42
Location : South Africa

Back to top Go down

Data Quality Strategy Empty Re: Data Quality Strategy

Post  bbjumpman Tue Nov 29, 2011 5:35 am

Everyone in any organisation has a role to play in improving data quality and sustaining that improvement:
The maintainer standing in front of a pump is the expert on the data representing that pump;
The call handler dealing with a complaint is the expert on that customer and their data issues;
Business Intelligence staff can spot unusual data patterns and validity issues
Executives can spot areas where cross business reports are not providing expected result
Etc. etc.
Data quality technology is only one part of the overall solution. A fundamental requirement is for that overall DQ solution to be defined, with clear standards, expectations and individual responsibilities clearly defined. The role of technology withing this solution should be defined, as should the areas that technology cannot solve. Delegated ownership responsibilities should be clear to all.
Data quality improvement should involve the whole organisation, and should address the root causes of data quality problems. It is unlikely that any organisation can reach an improved level of data quality, and sustain this quality, without the majority of the organisation being actively involved.


Posts : 3
Join date : 2011-11-04

Back to top Go down

Data Quality Strategy Empty Re: Data Quality Strategy

Post  Sponsored content

Sponsored content

Back to top Go down

Back to top

- Similar topics

Permissions in this forum:
You cannot reply to topics in this forum