
Reengineering the data warehouse

+2 votes
We seem to have hit a roadblock in making our warehouse more agile. Using Data Vault over a 3NF or dimensional warehouse is a huge benefit since it provides future-proofing and agility, but what happens when we reach the presentation layer? When we deliver our dimensional model to the BI team and it contains new attributes or a new data source, aren't we retaining the same problems and inflexible structure that we would've had if the warehouse were dimensional? Can the warehouse really be considered flexible if only a part of it is? Yes, the warehouse can handle a new attribute or data source with no problems, but then the BI/Analytics team will go ballistic when they have to update their objects. How can we overcome this?
asked Dec 12, 2013 in Architecture by TheFlash (260 points)
recategorized Jan 9, 2015 by molschimke

3 Answers

+1 vote
Agility at the final layer depends on how well your dimensional model is implemented. If you follow Agile Model Driven Development, you can create a general overview up front and flesh out the details of the model as they're needed. If you construct your dimensions and facts in DV style – hub-conformed satellites for the dimensions and link satellites for the facts – you shouldn't have to worry about how flexible your model is. From there you can expose a star schema through views for the business user.
answered Jan 10, 2014 by Richy (600 points)
It's still not completely clear to me. Say, for example, the BI team is connecting to a delivery database using MicroStrategy or SSAS. They build 500 different cubes, dashboards, reports, etc. based on those objects. Down the road, we decide to add new Hubs, Sats and Links. I've got no problem adding these to the warehouse, but the dimension and fact tables will have to be recreated, and then the BI team will have to change their reports to fit the new schema.

Whether the database consists of physical tables built through ETL or is abstracted through views, I'm still constrained by BI objects. It seems strange to have this limitation where our warehouse can only be as agile as its downstream consumers. I can update the database with no problem, but when all the downstream dependencies no longer fit the schema, I'm going to be the one who gets blamed for it.

I can't imagine I'm the first person to have this issue. Has this issue cropped up for anyone else, and do you have suggestions for resolving conflicts between the warehouse and BI/Analytics?
+2 votes
There are a few ways this can be avoided:

- Create virtual views to abstract any underlying table changes. Changing the data warehouse won't affect the views. You also avoid traditional problems of dimensional warehouses, such as surrogate key management and history maintenance.
- When new Hubs and Links are added, just create additional dimensional views. That way, you avoid duplicating data or using more space.
- Avoid grain changes or adding systems if possible. This will result in fewer downstream changes.
- Avoid using source-specific attributes and use conformed dimensions. Conformed dimensions use one copy of a shared dimension across different subject areas, which helps to reduce complexity.
- Recreate the dimensional views with intelligent business logic. This is more complicated, but if designed well it can support existing requirements as well as new requirements down the road.
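The first point above can be demonstrated in a few lines. This is a sketch with hypothetical names, using Python's sqlite3: because the view pins the exact columns BI consumes, adding a new attribute to the underlying satellite changes nothing for consumers bound to the view.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE sat_product (product_hk TEXT, name TEXT, price REAL);
    INSERT INTO sat_product VALUES ('hk1', 'Widget', 9.99);
    -- The view lists columns explicitly: this is the BI contract.
    CREATE VIEW dim_product AS
    SELECT product_hk, name, price FROM sat_product;
""")
before = con.execute("SELECT * FROM dim_product").fetchall()

# Warehouse change: a new attribute arrives from a new source system.
con.execute("ALTER TABLE sat_product ADD COLUMN weight_kg REAL")

after = con.execute("SELECT * FROM dim_product").fetchall()
assert before == after  # reports bound to dim_product are unaffected
print(after)
# → [('hk1', 'Widget', 9.99)]
```

The new column stays invisible until someone deliberately recreates `dim_product` to expose it, so downstream objects are never broken by the warehouse change itself.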
answered Feb 8, 2014 by DanWhit (290 points)
0 votes
Changing the data mart will always impact the presentation layer, since the data mart is the last layer before the data goes out to the user. In our case we realized that what happens beyond the data mart is really out of our control, which is why we started using DV to try to contain the ripple effect.

Some BI tools, including Business Objects, Cognos and MicroStrategy, have built-in abstraction layers (Catalogs and Universes) that take care of this for you. Short of renaming a column or removing one entirely, this should successfully isolate any changes. The only time this won't help is if your BI team created a different catalog for each report, in which case there's nothing we can implement on the warehouse side to account for that.
answered Mar 9, 2014 by Okinawa (260 points)