Hello,
I have a question on how to model
Hub(s) when data comes from multiple sources. We have client details coming from 3 sources (Sources A, B and C).
Some clients exist in all three source systems but with different ids. Source B and C also have corresponding Client IDs of Source A, so we can use that
to identify common clients.
Following table diagrams are formatted for fixed-width fonts and best viewed by copying and pasting on to Notepad
Eg.
Source_A:
_____________________________
|ClientID | ClientName |
|===========|===============|
|A100 | Client XXXX |
|A200 | Client YYYY |
|A300 | Client ZZZZ |
|A400 | Client FFFF |
Source_B:
_______________________________________________________
|ClientID | Source_A ClientID | ClientAddress |
|===========|=======================|=================|
|B20 | A100 | ..... |
|B30 | A200 | ..... |
|B40 | - | ..... |
Source_C:
_______________________________________________________
|ClientID | Source_A ClientID | Client Phone |
|===========|=======================|=================|
|C55 | A100 | ..... |
|C60 | A300 | ..... |
|C65 | - | ..... |
How do we model "
HubClient" in this situation? What are the correct ways to model
HubClient (or Clients) according to
Data Vault modelling?
What about a
design like below? Here, the com
bination of the 3 Source System Keys is the effective business key of the
hub.
There will be 3
satellites (one for each source)
HubClient:
__________________________________________________________________
|Client Surrogate Key| Source_A ID | Source_B ID | Source_C ID|
|====================|==============|===============|============|
|SK001 | A100 | B20 | C55 |
|SK002 | A200 | B30 | -1 |
|SK003 | A300 | -1 | C60 |
|SK004 | A400 | -1 | -1 |
|SK005 | -1 | B40 | -1 |
|SK006 | -1 | -1 | C65 |
Does this
design comply with
Data Vault modelling?
Thanks