Leveraging pure Direct Lake mode for maximum query performance | Microsoft Power BI Blog


Co-authored with Kay Unkroth

We are happy to announce a new Direct Lake semantic model property to control Direct Lake behavior. Direct Lake, by default, will transparently fallback to DirectQuery whenever a DAX query exceeds the limits on a SKU or uses features not supported by Direct Lake mode, such as SQL views from the Warehouse. This makes sure your reports show data, but it could be at the cost of performance. Now you can choose what happens instead.

This property can be viewed or set when you open the data model in the Fabric workspace. In the Data pane navigate to the Model Explorer, click on Semantic model, and then in the Properties pane choose the Direct Lake behavior on your custom Direct Lake semantic model.

  • Automatic will allow Direct Lake with fallback to DirectQuery. This is the default behavior.
  • Direct Lake only will ensure no fallback.
  • DirectQuery only will always fallback to DirectQuery, running all DAX queries as SQL queries through the Warehouse.

Advanced BI pros working with Direct Lake models using third-party tools or custom solutions based on Tabular Object Model (TOM) or Tabular Model Scripting Language (TMSL) can also use this property to control query processing. Specifically, BI pros can now configure their semantic models to use pure Direct Lake mode. The new property is called DirectLakeBehavior. It can be configured to allow Direct Lake and DirectQuery mode (Automatic), Direct Lake-only mode, or DirectQuery-only mode, as in the following code listing.

The default setting for DirectLakeBehavior is Automatic, which means Direct Lake models read delta tables directly from OneLake — unless they encounter an issue that prevents efficient loading of data into memory. Reasons include delta tables that exceed the SKU-based max number of parquet files or row groups, memory pressure, and feature limitations, such as model tables based on Data Warehouse views. The presence of Row-level security (RLS) or Object-level security (OLS) at the source also falls into the category of feature limitations. Another frequently encountered reason is an unprocessed table in the Direct Lake model. If you programmatically add a table to a model by using a third-party tool, TOM, or TMSL, don’t forget to process (refresh) the table so that Direct Lake mode can load the columns into memory when a DAX query needs them. Otherwise, if a column needed for a DAX query can’t be loaded, Direct Lake mode falls back to DirectQuery mode to process the DAX query, which can impact DAX query performance. For more details, check out the Fallback and Known issues and limitations sections in the Direct Lake overview article in the product documentation.

Fallback to DirectQuery mode ensures that reports built on top of a Direct Lake model show results despite any Direct Lake-impacting issues. On the flip side, this fallback can also hide critical problems. Perhaps your delta tables require optimization to keep the number of parquet files and row groups at reasonable levels. Perhaps, the data volume necessitates a higher Fabric SKU to leverage Direct Lake model more consistently. Or perhaps you would just like a reminder that you need to process your tables after schema changes. With fallback enabled, how can you be sure that your semantic models really leverage Direct Lake mode?

Up until now, you had to analyze the query processing in Performance analyzer or by using SQL Server Profiler to discover fallbacks to DirectQuery mode. But the new DirectLakeBehavior property greatly simplifies this. Just set the DirectLakeBehavior property to DirectLakeOnly and your DAX queries will fail when the data can’t be loaded. The error message conveniently reveals the reasons. Or put another way, if your DAX queries succeed in pure Direct Lake mode, you can rest assured that your semantic model can load the columns.

Moreover, did you notice the DirectQueryOnly option? Yes, you can now configure a Direct Lake model to always operate in DirectQuery mode. This can make sense for performance measurements. If you test query perf in DirectQueryOnly mode, you can reliably evaluate the impact of fallback. If DirectQueryOnly perf is acceptable, it makes sense to enable fallback for your semantic model in production. If it isn’t, DirectLakeOnly might be the better option.

As a best practice, we recommend that you work with your Direct Lake models in pure Direct Lake mode during the model authoring phase and then evaluate fallback performance in pure DirectQuery mode to determine the best DirectLakeBehavior configuration when deploying your solution to production. Fabric is redefining how customers can build their BI solutions for faster performance at big-data scale while at the same time reducing Total Cost of Ownership (TCO) and infrastructure complexity, and Direct Lake models are a key to delivering actionable insights with the speed of thought on top of your Fabric investment. So, don’t delay and leverage pure Direct Lake mode in your model authoring. We hope this new level of control helps you streamline your Direct Lake model authoring processes. And as always, please provide us with feedback if you want to help deliver additional enhancements. We hope you are as excited about Fabric and semantic models in Direct Lake mode as we are. We think this is a massive innovation and are looking forward to hearing from you!



Source link

Be the first to comment

Leave a Reply

Your email address will not be published.


*