settingsLogin | Registersettings
Show Menu

Quality Assurance

+1 vote
I am here to request another Category 'QA', which can be useful for user's like me who is looking for ways to validate or test data vault.

Here is my question:

We are currently using SQL queries to verify data in RDV and our tests are based on the knowledge we gained reading 'Super Charge Your Warehouse' (including all reference rules) and other online forums & other related experience testing Kimball\Inmon data models.

However, i am curious to find out if you are aware of any pre established approach or guidelines surrounding this subject.

Also, we are looking to automate testing of datavault and i will appreciate if anyone has any pointers in this area. I did order a copy of DV2.0 and waiting for it.


Thank You,

asked Sep 16, 2015 in Quality Assurance by Venkata Kalisetty
recategorized Sep 16, 2015 by molschimke

1 Answer

+1 vote

Hi Venkata,

There are acceptable and standard procedures for testing and quality assurance in Data Vault, including DV2.  My thoughts are as follows:

  1. it's no different than QA testing of standard data models - follow the best practices here (in terms of table structures and referential integrity, etc..)
  2. it's no different than testing standard ETL and ELT routines
  3. It's no different than best practices for reconciling and balancing business data and raw data back to the source systems.

That said: let me shed a little insight: (my partners) have an Automated Test Suite engine / software, and using that software with Mapping Manager, you can automate your test cases against staging, data vault warehouses, star schema's and even Big Data / Hadoop storage.

Ok - here are a few of the tests I generally setup and run:

  1. Data Structure Validation - ensure PK is unique and enforced on all tables, ensure FK is enforced and not broken, ensure there are no orphan records, ensure the business keys in the data vault hubs are unique and enforced, ensure the business key surrogate combinations are enforced and unique in the link structures  -
    ALL of these types of tests can be automated with STATIC test data and static test cases.
  2. Structure Validation - THESE tests should be run as reports in your data modeling tool, standard queries like "two parents, cycle relationships, missing PK's, naming conventions, etc..." but, you can also run these against the metadata catalogue in the database too.
  3. Process Validation - Again, the question is: are the source fields mapped properly to the targets? AND on the way out, are the aggregations correct? the joins correct? the calculations correct? the business rules correct?  Are errors handled properly?  Are DELTA processes working?  Are Hash Key calculations working?
    ALL of these tests as well, can be automatically generated, using STATIC data sets.

The "tricks" if you really want to know, are in developing the DATA SET TEST CASES / Static source data, to run - and then determining how to measure the results of the test.

Again, check out and their Test Manager tool.  I think you'll be pleasantly surprised.

answered Sep 16, 2015 by dlinstedt (400 points)
Thank You Dan, It looks like we are heading in the right path based on your inputs, that was helpful.
Scalefree Scalefree

Upcoming Trainings

  • July 01 to 03 - Amsterdam (English) Data Vault 2.0 Boot Camp and Certification

  • August 05 to 07 - Berlin (German) Data Vault 2.0 Boot Camp and Certification

  • September 9 to 13 - Hanover WWDVC EU

  • September 16 to 18 - Brussels (English) Data Vault 2.0 Boot Camp and Certification

  • October 21 to 23 - Hanover (German) Data Vault 2.0 Boot Camp and Certification

  • November 18 to 20 - Amsterdam (English) Data Vault 2.0 Boot Camp and Certification

  • December 02 to 04 - Dusseldorf (German) Data Vault 2.0 Boot Camp and Certification

  • December 16 to 18 - Zurich (English) Data Vault 2.0 Boot Camp and Certification

  • Contact us for inhouse training engagements.
    Visual Data Vault: Logical Modelling for Data Vault - Free Download!
    Scalefree is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
    Permissions beyond the scope of this license are available in our license. | DWH Wiki | Recent questions RSS feed | Imprint | Provided by Michael Olschimke