A data warehouse can be defined simply as a centralized repository for all the data that an enterprise or organization collects. In the healthcare industry, having the right type of data warehouse is important because data warehousing emphasizes capturing data from diverse sources so that it can be accessed and analyzed. This matters because a huge amount of data is mined every day, and the existence of big data has made it important to choose the right model. The old approach of cobbling reports together from different sources is time-consuming and inefficient. The better approach is to rely on complete and accurate information from across the enterprise-wide data ecosystem, from beginning to end, and that requires a healthcare-specific data warehouse.


There are several ways to arrive at the right healthcare data warehouse, because several models are now available. Before these models were developed, enterprises had just two approaches to choose from: top-down and bottom-up. The top-down approach creates the complete data warehouse first and then spins off data marts for specific groups of users. The bottom-up approach builds the data marts first and then combines them into a single, all-encompassing data warehouse. This distinction is related to the choice between early binding and late binding of data.


Data binding is the process of mapping data that has been aggregated from source systems to standardized vocabularies and business rules in the data warehouse. In other words, it optimizes data from different sources so that it can be used together for analysis. Examples of standardized vocabularies are SNOMED and RxNorm, and examples of business rules are ADT (admission, discharge, transfer) rules and length-of-stay calculations. Data binding is required, and very important, in any relational database model. Two major data warehouse models have proven themselves and are generally preferred by healthcare organizations, doctors, and clinicians. These models are:


  • Enterprise Data Model Approach
  • Independent Data Mart Approach
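
To make the idea of data binding described above more concrete, here is a minimal sketch in Python. It maps local codes from two source systems to one standardized code and applies a simple length-of-stay rule. Everything in it is an assumption for illustration: the source system names, the local codes, the `STD-0001` placeholder (not a real SNOMED or RxNorm identifier), and the "minimum one day" rule are all hypothetical.

```python
from datetime import date

# Hypothetical mapping from (source system, local code) to a standard code.
# "STD-0001" is a placeholder, not a real SNOMED or RxNorm identifier.
LOCAL_TO_STANDARD = {
    ("lab_system", "DM2"): "STD-0001",
    ("billing_system", "DIAB-II"): "STD-0001",
}


def bind_code(source, local_code):
    """Return the standardized code, or None when no mapping exists yet."""
    return LOCAL_TO_STANDARD.get((source, local_code))


def length_of_stay(admit, discharge):
    """Assumed business rule: whole days between dates, with a minimum of one."""
    return max((discharge - admit).days, 1)


# Hypothetical records aggregated from two source systems.
records = [
    {"source": "lab_system", "code": "DM2",
     "admit": date(2020, 1, 1), "discharge": date(2020, 1, 4)},
    {"source": "billing_system", "code": "DIAB-II",
     "admit": date(2020, 2, 10), "discharge": date(2020, 2, 10)},
]

# Binding step: apply the vocabulary mapping and the business rule.
bound = [
    {
        "standard_code": bind_code(r["source"], r["code"]),
        "los_days": length_of_stay(r["admit"], r["discharge"]),
    }
    for r in records
]
```

After binding, both records carry the same standardized code and can be analyzed together, even though the source systems named the condition differently.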


Enterprise Data Model Approach


The enterprise data model approach handles data top-down. Its goal is to model the database perfectly from the outset: it determines in advance everything that will need to be analyzed in order to improve outcomes and patient satisfaction.

This approach involves building a secondary system that receives data from systems that already exist. Extracting data from existing systems and making it all work together in a net-new system can get very complicated. With patience and the right skills it is possible, but it is incredibly time-consuming and expensive. The complexity of the model, and the time it takes to deliver any value, is its first significant downside. A second downside is that the model binds data very early, and once data is bound, changes become difficult and time-consuming. In healthcare, business rules, use cases, and vocabularies change rapidly, so an organization can get stuck with outdated terminology. A third downside is that the model tends to disregard the realities of the data in an organization.

Independent Data Mart Approach


The independent data mart approach is a bottom-up approach in which health organizations start small and build individual data marts as they are needed. Its advantage is that healthcare organizations can start implementing and measuring much more quickly than with the enterprise data model. It has two main drawbacks. First, because there are so many isolated data marts in place, there is no atomic-level data warehouse from which to build additional data marts in the future. Data marts typically do not contain data at the lowest level of granularity; data transformed in a data mart is usually summarized up a level or two. This means that a data mart may show that a certain metric is below the benchmark without retaining the detail needed to drill down and find out why. Second, this model queries the source systems repeatedly and unnecessarily, because each data mart needs its own, often redundant, feed.
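
The granularity problem can be sketched with a small Python example. It contrasts a summarized data mart, which keeps only a unit-level average, with atomic-level rows that can still be broken down further. The encounter rows, unit names, and physician labels are made up for illustration.

```python
from collections import defaultdict

# Hypothetical atomic-level rows: one row per patient encounter.
encounters = [
    {"unit": "cardiology", "physician": "A", "los_days": 2},
    {"unit": "cardiology", "physician": "B", "los_days": 8},
    {"unit": "oncology", "physician": "C", "los_days": 4},
]


def average_by(rows, key):
    """Average length of stay, grouped by the given column."""
    totals = defaultdict(lambda: [0, 0])  # group -> [sum, count]
    for row in rows:
        totals[row[key]][0] += row["los_days"]
        totals[row[key]][1] += 1
    return {group: s / n for group, (s, n) in totals.items()}


# A summarized data mart keeps only the unit-level average.
mart_summary = average_by(encounters, "unit")

# With atomic rows still available, the same metric can be drilled into.
cardiology_rows = [r for r in encounters if r["unit"] == "cardiology"]
by_physician = average_by(cardiology_rows, "physician")
```

If only `mart_summary` were stored, the cardiology average would be visible but the per-physician breakdown in `by_physician` could not be recovered from it.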