See your company like never before
Power BI deep-dives, migration playbooks, and data strategy for enterprise teams.


Join dozens of organizations who have moved to Beyond The Analytics. Book your personalized demo today
Star schema vs snowflake schema in Power BI — how VertiPaq compression, filter propagation, and DAX performance differ, and when each design applies.
Quick answer: Star schemas deliver better compression, faster queries, and more predictable DAX behaviour in Power BI because the VertiPaq engine and DAX language are explicitly optimised for dimensional models with separated fact and dimension tables.
Every Power BI report visual generates a query against the semantic model. That query filters, groups, and summarises data. The shape of your data model determines how efficiently the VertiPaq storage engine answers those queries. Choose the wrong schema and you get bloated models, slow slicers, and subtle calculation errors that only surface when a finance director questions why the numbers do not add up.
Microsoft's official guidance is unambiguous: star schema design is "highly relevant to developing Power BI models that are optimized for performance and usability." SQLBI's research goes further, stating that "a star schema is always the best choice" based on benchmarks against a four-billion-row fact table.
This guide breaks down exactly why — grounded in VertiPaq engine internals — and covers the scenarios where snowflake schemas might still be acceptable.
Quick answer: A star schema places a central fact table (transactions, events, measurements) surrounded by denormalised dimension tables (products, customers, dates, geographies) connected via one-to-many relationships.
Dimension tables describe business entities (products, customers, dates), while fact tables record events (sales, shipments, inventory snapshots). Each dimension connects directly to the fact table through a single relationship, creating the star shape.
Key characteristics for Power BI:
This separation directly aligns with how the VertiPaq columnar engine compresses and scans data. A Category column with 12 distinct values compresses dramatically better in a small dimension table than when repeated across billions of fact rows.
For a deeper look at how relationships work in Power BI, see our relationships and semantic model guide.
Quick answer: A snowflake schema normalises dimension tables into multiple related sub-tables — for example, splitting a single Product dimension into separate Product, Subcategory, and Category tables — creating longer relationship chains between the fact table and the outermost dimension attributes.
In a classic snowflake design, a product dimension becomes three tables: DimProduct links to DimProductSubcategory, which links to DimProductCategory. The fact table connects only to DimProduct, and filters must propagate through two additional hops to reach category-level attributes.
This normalisation reduces data redundancy within the dimension layer. In a disk-based relational warehouse, that reduction can matter. In Power BI's in-memory VertiPaq engine, the calculus is different.
Microsoft Learn states: "Generally, the benefits of a single model table outweigh the benefits of multiple model tables." The documentation lists specific drawbacks:
Quick answer: VertiPaq's columnar compression (dictionary encoding, value encoding, and run-length encoding) works most efficiently when dimension attributes live in small, separate dimension tables rather than repeated across fact table rows or fragmented across chains of normalised sub-tables.
VertiPaq compresses each column independently, choosing between two primary encoding strategies:
Country column with 195 distinct values stores a dictionary of 195 entries and replaces every row's text with a small integer index.In a star schema, the Country column lives in a Geography dimension with perhaps 195 rows. The dictionary has 195 entries, and the column is tiny. In a flat or denormalised model, that same Country column sits in the fact table across millions or billions of rows — the dictionary is the same size, but the index column storing row-level references is vastly larger.
After dictionary or value encoding, VertiPaq applies run-length encoding (RLE) to columns where consecutive rows share the same value. Sorted columns compress spectacularly under RLE — a million consecutive rows with the value "Electronics" become a single entry: "Electronics, count: 1,000,000."
Dimension tables benefit disproportionately from RLE because they are small, often sorted, and contain columns with low cardinality. Fact tables benefit too, but only for their narrow foreign key columns — not for denormalised text attributes that would otherwise live in dimensions.
SQLBI's benchmark quantified this on a four-billion-row dataset:
| Metric | Star Schema ("Slim") | Single Table ("Fat") |
|---|---|---|
| Model size in RAM | 16.80 GB | 44.51 GB |
| Size ratio | 1x | 2.65x larger |
The star schema model consumed less than 40% of the memory required by the denormalised equivalent. Snowflake schemas fall between these two extremes — they avoid fact-table bloat but introduce additional relationship overhead and redundant key columns across the normalised dimension chain.
Quick answer: Star schemas outperform snowflake schemas in Power BI across nearly every metric that matters for real-world reporting: slicer speed, DAX calculation time, filter propagation efficiency, and model size.
| Criteria | Star Schema | Snowflake Schema |
|---|---|---|
| VertiPaq compression | Optimal — dimension attributes in small tables compress well via RLE and dictionary encoding | Moderate — more tables and key columns add overhead, though better than flat models |
| Filter propagation | Single hop from dimension to fact | Multi-hop chains (e.g., Category to Subcategory to Product to Fact) |
| Slicer performance | Scans dimension table (thousands of rows) | Scans outermost table then propagates through chain — additional latency |
| DAX complexity | Standard patterns work as expected | Longer chains may require explicit CROSSFILTER or TREATAS patterns |
| Model size | Smallest for equivalent data | Larger due to additional tables, key columns, and relationship metadata |
| Hierarchy support | Single-table hierarchies (Category > Subcategory > Product) | Cannot create cross-table hierarchies — must denormalise anyway |
| Data pane usability | Fewer tables, cleaner navigation | More tables, potentially confusing for report authors |
| Auto-exist behaviour | Predictable — dimensions are separate tables | Can produce unexpected filter interactions in certain configurations |
SQLBI's testing revealed the starkest contrast in slicer queries. To populate a Year slicer, the star schema retrieves ten distinct values from a Date dimension table containing a few thousand rows — execution time is approximately one millisecond. A flat model must scan the entire four-billion-row fact table to find those same ten distinct values, taking roughly five seconds. Snowflake schemas perform closer to star schemas here (the outermost dimension table is still small), but the additional relationship traversal adds measurable overhead when multiple slicers interact.
Quick answer: Flat (denormalised) tables trigger DAX auto-exist behaviour that silently filters out valid dimension combinations, producing incorrect totals and misleading reports — a problem that star schemas avoid entirely.
SQLBI's research on auto-exist documents a subtle but critical issue. When multiple columns from the same table are used as filters (which happens constantly in flat models), DAX applies an optimisation called auto-exist that intersects filter values based on combinations that actually exist in the data. If a product has never been sold in a particular region, that combination is silently excluded — even when business logic requires it to appear.
In a star schema, filters on Product[Category] and Geography[Region] operate on different tables. Auto-exist does not apply across tables, so cross-joins produce the complete set of combinations and the result is correct.
A related issue occurs when fact table rows are absent for certain dimension members. In a flat model, if no transactions exist for a particular customer, that customer disappears entirely from slicers and visuals — there is no dimension row to anchor on. In a star schema, the Customer dimension table independently contains all customers regardless of whether they have transactions, ensuring complete visibility.
If you have inherited a flat table, Power Query provides the tools to restructure it:
For organisations migrating from spreadsheet-based reporting, our Excel to Power BI migration guide covers this flat-to-star transformation in the context of replacing workbook-based analytics.
Quick answer: Snowflake schemas are acceptable when a dimension is extremely large (millions of rows), when row-level security requires a separate security mapping table, or when the source system's normalised structure provides a compelling maintenance benefit that outweighs the performance cost.
Microsoft does not say "never use snowflake schemas" — it says the benefits of denormalising "generally" outweigh keeping the normalised structure. Here are the scenarios where a snowflake dimension can be justified:
If a dimension table itself contains millions of rows (for example, a customer dimension in a telecom company with 50 million subscribers), normalising stable attributes like Region > City > District into a separate geography sub-table can reduce the overall dimension size. The storage savings may offset the filter propagation cost.
When implementing dynamic row-level security, you often need a mapping table that connects user principals to dimension values (for example, which sales regions each user can see). This table naturally forms a snowflake extension off the main dimension. SQLBI has documented how moving security filters to a smaller table can reduce the performance cost of RLS evaluation.
If data arrives from a third-normal-form relational database and the dimension is small enough that performance impact is negligible, the maintenance cost of writing Power Query denormalisation logic may not justify the marginal gain. This is a pragmatic trade-off, not a best practice.
In all cases, snowflake schemas cannot support cross-table hierarchies. Microsoft recommends creating a denormalised view (using Power Query merge) for hierarchy support, even if you keep the snowflake structure for filtering.
Quick answer: Default to star schema unless you have a documented, specific reason to normalise a dimension — the performance, correctness, and usability benefits are too significant to give up without justification.
For every new semantic model, begin with a star schema. Create denormalised dimension tables for each business entity (Date, Product, Customer, Geography) connected to fact tables. Consider snowflake extensions only if a dimension exceeds one million rows, RLS requires a separate mapping table, or source system constraints make denormalisation impractical.
After building the model, use VertiPaq Analyzer (available in DAX Studio and Tabular Editor) to measure actual compression ratios and column sizes. Use Performance Analyzer in Power BI Desktop to measure query execution times — pay special attention to slicer population queries and visuals that filter across dimension chains. Our DAX best practices guide covers performance profiling workflows with DAX Studio and VertiPaq Analyzer in detail.
Organisations migrating from Qlik Sense, which uses an associative engine with different modelling assumptions, should pay particular attention to schema design during the transition. Our Qlik to Power BI migration guide covers how to translate associative models into proper star schemas.
Yes, for the vast majority of cases. Microsoft's guidance, SQLBI's benchmarks, and VertiPaq's internal optimisations all align on this. Star schemas deliver better compression, faster queries, correct DAX behaviour, and a cleaner authoring experience. The only exceptions involve very large dimensions, RLS mapping tables, or source-system constraints — and even then, snowflake portions should be limited to specific dimensions, not the overall architecture.
Not typically. VertiPaq already handles redundancy efficiently through dictionary encoding — repeated category names compress well because the engine stores distinct values once and uses compact integer indices for each row. The additional tables, key columns, and relationship metadata that snowflake schemas introduce often consume more memory than the redundancy they eliminate.
In a star schema, a filter propagates through a single one-to-many relationship to reach the fact table — one hop. In a snowflake schema, a filter on Category must propagate through Subcategory, then Product, then to the fact table — three hops. Each hop adds processing overhead. The difference is negligible for simple queries but compounds in complex reports with multiple slicers and cross-filtering.
No. Power BI hierarchies require all columns to belong to a single table. If your Product dimension is split across three tables, you cannot create a Category > Subcategory > Product drill-down. The workaround is merging the snowflake tables into a single dimension in Power Query for hierarchy support — which effectively recreates a star schema for that dimension.
Auto-exist is a DAX optimisation that filters dimension combinations based on what actually exists in the data when multiple columns from the same table are used as filters. In flat models, auto-exist can silently exclude valid business combinations (like a new product in a new region with no sales yet), producing incorrect totals. In a star schema, filters on different dimensions operate on separate tables, so auto-exist does not cross-filter between them — the result includes all expected combinations correctly.
Use Power Query's Reference feature to create dimension queries from the existing flat table. For each dimension, reference the source query, use "Remove Other Columns" to keep only relevant attributes, apply "Remove Duplicates," and add an Index column as a surrogate key. Merge each dimension back into the fact query to replace text columns with key references. After loading, verify that one-to-many relationships are correctly established.
No. Fabric semantic models use the same VertiPaq engine as Power BI Desktop and the Power BI service. The Fabric dimensional modelling documentation explicitly recommends star schema design for Fabric warehouses and lakehouses. Direct Lake mode — which reads Parquet files directly from OneLake into VertiPaq — benefits from the same columnar compression advantages. The guidance remains consistent: denormalise dimensions, separate facts from dimensions, keep relationship chains short.
Microsoft Partner · Dubai
Your business intelligence partner for the GCC
Have a data challenge or a project in mind? Reach out and let's explore how we can help.
Clients we've worked with







Topics