From what I can tell (and I could be wrong) you'd need= your Employee and Department objects, two satellites
to match and a link
to contain the DV keys from each both Employee and Department. The nice thing about this setup is that you shouldn't need any reference tables.
If you're following DV 2.0 certified practice for generating keys using MD5 hashing (i.e., generating unique keys, keeping the keys consistent, not using MD5 as a primary key and only using MD5 as a BK when necessary) , then you should be able to calculate and load it for each object simultaneously. A huge benefit to joining objects this way is that it lets you load them in parallel. In our case, we calculate them when staging the raw DV. The hashed keys also make it easier to join and pull data from the satellites
later on. During the ETL
, you can use the DepartmentId FK in Employee to access the BK from Department and generate the key used in the join to populate it. Even if you're using sequence numbers, the ETL
would only need to add the join on Department and obtain the key for the join. The only downside is that you'd have to sequence the loads to populate Employee and Department first, or you could run into issues when joining them.