settingsLogin | Registersettings
Scalefree
Show Menu

Handling Mutations to Business Keys in Data Vault

+1 vote

Hi eveyone,

Just a short background...

I have already bought and read the Super Charge Your Data Warehouse book. I have been reading articles and watching videos about the Data Vault Modeling approach and methodology for over a month now as I shall use it to build a Raw Data Vault to be the core of the new EDW in my company.

By coincidence, there are several business transformations happening at the company (a merge, full replacement of several legacy systems with integrated suites, organization changes, ...) which are expected to be resident at least during the upcoming 2-3 years. Unfortunate and very challenging for an EDW initiative. That was one of the essential reasons that pushed me away from the classical Kimball-only approach looking for an alternative till I learned about the Data Vault (which also just happended by coincidence smiley).

Now the question:

In such circumstances, it is inevitable to face the business key changes, either through old-to-new 1:1 mapping, 1:M split or M:1 consolidation. As Data Vault modeling considers the business (natural) key to be immutable. Is there an official method or a best practice to handle such changes?

There is of course the horrible option of killing all old affected relationships using effectivity satellites and just keep old data as a historical view and start with fresh hubs/links/satellites. However, there should be an option to enable the users to report from the data using both the old and new keys. Have not anybody face this issue previously? Any thoughts?

Thanks.

 

asked Sep 28, 2015 in Modelling by maged-mostafa (170 points)

1 Answer

+2 votes
 
Best answer
Dear Maged,

this problem is actually common in enterprise data warehousing. We have clients where there are 5+ CRM systems in place and keys are mapped back and forth. They just bought other organizations, merged the business without merging the operational systems.

There is an official method which is described in "Building a Scalable Data Warehouse with Data Vault 2.0" in section 5.1.1 (Business Key Consolidation) on page 124. Basically, the idea is to keep all business keys that have been used in any source system without consolidation in the first step. After all business keys have been loaded into the Raw Data Vault, a same-as-link (p129) is used to identify master and duplicate records (business keys) in the hub. This same-as-link is sourced from your mapping tables.

This way, you can have both: the historic view (based on all the business keys in the hub) and a consolidated view (based on the same-as-link).

Hope that helps.

Mike
answered Oct 22, 2015 by molschimke (1,890 points)
selected Nov 1, 2015 by maged-mostafa
Thanks a lot. I already purchased the kindle version book and read this method and it seems logical as it keeps the mapping from the old setup to the new setup. Back in September, I did not find a talk about it anywhere.
I came across an article by Martijn Evers at
http://dm-unseen.blogspot.com.eg/2012/10/data-vault-business-key-mutations-matter.html
that also uses the same approach but the book describes it clearer.
Scalefree Scalefree

Upcoming Trainings

  • July 01 to 03 - Amsterdam (English) Data Vault 2.0 Boot Camp and Certification

  • August 05 to 07 - Berlin (German) Data Vault 2.0 Boot Camp and Certification

  • September 9 to 13 - Hanover WWDVC EU

  • September 16 to 18 - Brussels (English) Data Vault 2.0 Boot Camp and Certification

  • October 21 to 23 - Hanover (German) Data Vault 2.0 Boot Camp and Certification

  • November 18 to 20 - Amsterdam (English) Data Vault 2.0 Boot Camp and Certification

  • December 02 to 04 - Dusseldorf (German) Data Vault 2.0 Boot Camp and Certification

  • December 16 to 18 - Zurich (English) Data Vault 2.0 Boot Camp and Certification

  • Contact us for inhouse training engagements.
    Visual Data Vault: Logical Modelling for Data Vault - Free Download!
    Scalefree
    DataVault.guru is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
    Permissions beyond the scope of this license are available in our license.

    DataVault.guru | DWH Wiki | Recent questions RSS feed | Imprint | Provided by Michael Olschimke
    ...