As we all understand,
Data Vault architecture allows one to establish load and recovery standards, query time-based information and establish synchronicity across content for the modeling aspects of a warehouse. I have a question concerning
Data Vault.
I have completed the first segment of a
design for a data structure based on DV techniques. I am now commencing with
testing the
design and load tables. I have run into a problem, and was wondering if other programmers have encountered the same problem and what approach they took to solve it.
I want to load a
satellite for employee information from the main production system. This system is the core system used at the company. The record's effective load date is becoming a major problem for me. It is not possible to use the load end date for the employee's record because the legacy data has already been migrated into the production system. As a result, if I attempt to query with a date parameter, all records from before we started using the
satellite will be excluded. There is absolutely no indication of when they started with the company using this approach, unless you want to obtain the data from the HR Department; this department maintains this type of history. Now I could just take the profile on the legacy system and upload the records following DV methods, but there is one
big stumbling block.
In addition to a migration of the legacy data, some of the data was modified or enhanced to live in the current production system. There are several fields from the old records that I will need to access in order to draw an accurate image. Here's an example to show what I mean:
Employee=Niccole
Hire Date=1/12/2018
Record Created Date in Production=1/1/2020
Records back-filled in system>=1/12/2018
Is it a bad idea to do a preliminary load using a place holder and update record-effective dates according to the data HR has on record?