I have to create a Data Vault
model based on data that is arriving in flat files.
These flat files do not contain any timestamps at the row level.
Part of the data is related to cars, with information like registration number, model, mileage, etc.
My problem is that in the same file, the same vehicle - that is, the same registration number - could have different attribute values. For example, different car manufacturer, because there has been a manual error, saved but immediately corrected in the source system that provided the flat file.
The question is: how do I model this data? I started by creating a vehicle hub
(with the registration number as the business key) and a vehicle satellite
with the attributes. The satellite
has a LoadDate and a LoadEndDate.
Let's imagine that I receive a file with the following two rows:
Registration Number | Manufacturer
AA12 ABC | Renault
AA12 ABC | BMW
Shall I insert the two rows in the satellite
with the same LoadDate? If I do that, how can I reconstruct the original data later, knowing that the join with the link
tables will be done on the hash key plus the LoadDate - which are the same for both rows?
Let me know if you need more details.
Thanks in advance,