Dimensional Modelling vs One Big Table : Which is better and why?
While designing a Data Warehouse one of the key decision to make is whether to use Dimensional Modelling or One Big Table (OBT) approach. Lets discuss all plus and minus of both the approaches.
What is Dimensional Modelling?
Dimensional Modelling, pioneered by Ralph Kimbal, is a widely used approach for designing Data Warehouses (DWH). Data is structured in Facts and Dimension Tables.
Dimension Tables
Provide descriptive attributes for analysis, such as time, location, product details, vendor details,customer details.
Fact Tables
It contains data which can be measured like Sales amount, number of sales etc.
Advantages of Dimensional Modelling
- Query performance: Star and Snowflake schemas reduce data redundency and optimizes query joins as a result it makes queries faster.
- Data Governance: Separating Facts and Dimensions provide better schema structure, having a better managebility.
- Scalabiliy: Since most of the Transactional data goes idenpendently in Fact tables, hence this structure can easily handle big data sets.
- Anlytical Flexibility: It supports advance report like slice-dice operation and drill-down capability.
Disadvantages of Dimensional Modelling
- Slow Query Performance: Sometimes if query joins are not optimized properly, it can cause slow performance.
- Complex ETL/ELT process: It requires complex ETL/ELT processes for maintaining a relation between Fact and Dimension tables.
What is One Big Table (OBT) approach?
One big table approach have just one big table with all related attributes/fields as columns. It is also known as Flat Table or Denormalized approach as well. Generally alot of duplicate values for common columns. It requires almost no or minimal joins.
Advantages of One Big Table
- Faster Query Performance: Since the requirement of joins is very minimal, hence the query performances are really good.
- Easier Data Ingestion: Since there is no dependency, so the data ingestion process is really very easy.
Disadvantages of One Big Table
- Inefficient Storage: Since there is a lot of duplicate data, hence it increases the storage cost.
- Managebility issue: Due to high no of columns, it becomes difficult to manage.
- Data Integrity Issue: Since there are a lot of dupicate data, so there will be higher risk of incosistencies.
Conclusion
Both Dimensional modelling and One Big Table have their own importance in Data Warehouse. If the priority is performance, governance and scalability then Dimensional Modelling is the answer. But if speed and simlicity is required then One Big table is the solution.