DW/BI QA and Testing
2 posters
Page 1 of 1
DW/BI QA and Testing
Hi all,
Can anyone give some tips on QA and Testing on DW/BI?
I already read the Kimball ETL book that talks a little about the subject but I would like to have a more QA specific view on the subject.
What would be your main concerns when testing a DW/BI application? I think testing continues to be not so popular in the DW/BI realm.
What types of tests or QA best practices from common Software Engineering Development make sense to apply on DW/BI?
Any resource will be greatly appreciated.
Thanks!
Can anyone give some tips on QA and Testing on DW/BI?
I already read the Kimball ETL book that talks a little about the subject but I would like to have a more QA specific view on the subject.
What would be your main concerns when testing a DW/BI application? I think testing continues to be not so popular in the DW/BI realm.
What types of tests or QA best practices from common Software Engineering Development make sense to apply on DW/BI?
Any resource will be greatly appreciated.
Thanks!
apc- Posts : 8
Join date : 2012-10-11
Re: DW/BI QA and Testing
Hi,
at a high level, testing a DW/BI system is no different from testing any other type of computer system: component testing, integration testing, performance testing, etc.
Are there specific BI/DW testing issues you are concerned about?
Regards,
at a high level, testing a DW/BI system is no different from testing any other type of computer system: component testing, integration testing, performance testing, etc.
Are there specific BI/DW testing issues you are concerned about?
Regards,
nick_white- Posts : 364
Join date : 2014-01-06
Location : London
DW/BI QA
Hi!
Thanks for responding!
I guess when we're talking about the front-end the type of testing we make is no different from what we would do for a normal website or another type of interface.
However, I'm looking for insight on specific testing peculiarities of DW/BI systems, like the ETL phase.
What should I look for when testing ETL?
Any special considerations? Particularly regarding integration and regression type of testing. What to automate?
Maybe I'm not being very specific but I'm new to the process and was looking for insight on the subject (or study resources).
Hope I made it more clear.
Thanks for responding!
I guess when we're talking about the front-end the type of testing we make is no different from what we would do for a normal website or another type of interface.
However, I'm looking for insight on specific testing peculiarities of DW/BI systems, like the ETL phase.
What should I look for when testing ETL?
Any special considerations? Particularly regarding integration and regression type of testing. What to automate?
Maybe I'm not being very specific but I'm new to the process and was looking for insight on the subject (or study resources).
Hope I made it more clear.
apc- Posts : 8
Join date : 2012-10-11
Re: DW/BI QA and Testing
If I was looking at how to define ETL testing at a high level then I would be considering the following:
1. Unit testing
Test every source field to target field and for each field ensure you test every possible type of data - and by type I mean each logic path data can pass through in your ETL. So if you treat null values differently from non-nulls ensure you test both; if certain values can throw errors then ensure you test values that will throw each type of error, etc.
2. Integration testing
Test initial loads and then incremental loads
If you have multiple sources for the same target then ensure that you test them in every combination that the source data can arrive in - don't just assume that System A will always deliver its data before System B - make sure you know what happens if System B delivers its data first e.g. the ETL waits for System A or loading System B's data first doesn't cause a problem
3. Recovery (and error) Testing
At some point in its life the ETL process will go wrong - a code change has unexpected consequences, a source system changes without telling the ETL team etc. This will either cause your ETL process to fail at some point or you will get incorrect data in your DW.
You need to know how you will recover this cleanly and be sure your process works. You really don't want to be faced with a corrupt DW on a Monday morning with the business screaming at you because they can't run their reports and your trying to work out how to recover the system
4. Performance testing
This is vital and it must be performed with realistic volumes of data being loaded and already existing in your DW - and it must happen in an environment that exactly mirrors your Production environment - if it doesn't all your performance testing is a waste of time
For example, loading 10m records into an empty DB might take an hour but loading 1m records into the DB that now holds 10m records might take 24 hours. You should also test with realistic growth numbers so you know how the system is likely to perform in 6/12/18 months time - so you know when you are likely to have to review your indexing/partioning/archiving/etc. strategy and can pro-actively plan for it - rather than the business telling you in 12 months time that the load performance has degraded
Hope this helps and gives you a few pointers for what you should be looking at? I'm sure others can expand on this and comment on things I've missed
Regards,
1. Unit testing
Test every source field to target field and for each field ensure you test every possible type of data - and by type I mean each logic path data can pass through in your ETL. So if you treat null values differently from non-nulls ensure you test both; if certain values can throw errors then ensure you test values that will throw each type of error, etc.
2. Integration testing
Test initial loads and then incremental loads
If you have multiple sources for the same target then ensure that you test them in every combination that the source data can arrive in - don't just assume that System A will always deliver its data before System B - make sure you know what happens if System B delivers its data first e.g. the ETL waits for System A or loading System B's data first doesn't cause a problem
3. Recovery (and error) Testing
At some point in its life the ETL process will go wrong - a code change has unexpected consequences, a source system changes without telling the ETL team etc. This will either cause your ETL process to fail at some point or you will get incorrect data in your DW.
You need to know how you will recover this cleanly and be sure your process works. You really don't want to be faced with a corrupt DW on a Monday morning with the business screaming at you because they can't run their reports and your trying to work out how to recover the system
4. Performance testing
This is vital and it must be performed with realistic volumes of data being loaded and already existing in your DW - and it must happen in an environment that exactly mirrors your Production environment - if it doesn't all your performance testing is a waste of time
For example, loading 10m records into an empty DB might take an hour but loading 1m records into the DB that now holds 10m records might take 24 hours. You should also test with realistic growth numbers so you know how the system is likely to perform in 6/12/18 months time - so you know when you are likely to have to review your indexing/partioning/archiving/etc. strategy and can pro-actively plan for it - rather than the business telling you in 12 months time that the load performance has degraded
Hope this helps and gives you a few pointers for what you should be looking at? I'm sure others can expand on this and comment on things I've missed
Regards,
nick_white- Posts : 364
Join date : 2014-01-06
Location : London
DW/BI QA
Thanks a lot on the pointers!
Great insight!
I've been reading profusely on the subject trying to grasp everything I can in the shortest amount of time possible.
Any study resources you'd recommend?
Thanks!
Great insight!
I've been reading profusely on the subject trying to grasp everything I can in the shortest amount of time possible.
Any study resources you'd recommend?
Thanks!
apc- Posts : 8
Join date : 2012-10-11
Similar topics
» Testing ETL pipeline(s)
» DW/BI Solution Testing
» Regression testing a data warehouse.
» ETL Testing coverage
» Integration testing
» DW/BI Solution Testing
» Regression testing a data warehouse.
» ETL Testing coverage
» Integration testing
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum