Three methods to get refresh history of Power BI d…


Power BI dataflows enable you to connect to, transform, combine, and distribute data for downstream analytics. Some users hope to monitor refresh history and status of dataflows so that they can make refresh plans for datasets that use these dataflows.

In this article, I will introduce three methods to get refresh history of Power BI dataflows.

 

Method 1

The simplest method is to review Refresh History on dataflow settings page. Navigate to Settings > Dataflows > Refresh History. You can also select a specific dataflow in the Workspace > context menu (…) > Refresh History.

 

The Refresh History provides an overview of refreshes, including the run time, the refresh type, the duration, and the run status. To see details in the form of a CSV file, select the download icon on the far right of the refresh description’s row. In this CSV file, there is detailed refresh history for each table (also known as entity) included in this dataflow. It also contains error messages for a table if its refresh failed. To better understand all attributes in the CSV file, please refer to Understanding and optimizing dataflows refresh – Power BI | Microsoft Docs.

Vlianlmsft_0-1639117712948.jpeg

 

Method 2

If you have a premium Power Automate license, you could create an automated cloud flow to record the refresh history of a dataflow. This has been well introduced in official document Power Automate templates for the dataflows connector | Microsoft Docs. It introduces the following three templates.

You could follow the detailed steps in these template tutorials to create a cloud flow. The flow records refresh metadata of every refresh for a dataflow, including dataflow id, dataflow name, refresh type, refresh status, start time and end time. It does not provide refresh details of every table in the dataflow.

 

One point you need to notice is that the automated cloud flow is triggered when a dataflow refresh is completed (either succeeded or failed) and it adds a row to a table or dataset you have set up, therefore it only records history data for refreshes occurring after this flow is created. It will not have data for previous refresh history. If you turn off this flow, it will not record refresh history either.

 

You could add multiple dataflows’ refresh history data to the same table or dataset, then use it to create a monitoring report.

Vlianlmsft_1-1639117829386.jpegVlianlmsft_2-1639117836973.jpeg

 

 

Method 3

The third method is to use Power BI REST APIs for dataflows. You can use Dataflows – Get Dataflow Transactions API to get a list of refresh transactions for the specified dataflow. Similar to the earlier Method 2, it returns refresh metadata for the overall dataflow, not for each table in the dataflow. It requires a GET request as below.

GET https://api.powerbi.com/v1.0/myorg/groups/{groupId}/dataflows/{dataflowId}/transactions

 

This method is more flexible than the other two. You can combine it with other Power BI REST APIs to get data per your need. You can use Power Query Editor to get data from these APIs and transform the returned data. Then using them to create a custom monitoring report is also possible.

 

Below is an example of M codes about how to get the most recent refresh history of every dataflow in a new workspace.

 

1. Create a custom function getRefreshHistory to get the most recent refresh record for a specific dataflow. I added a duration column to it.

(groupId as text, dataflowId as text)=>

let

Source = Json.Document(Web.Contents("https://api.powerbi.com/v1.0/myorg/groups/"&groupId&"/dataflows/"&dataflowId&"/transactions", [Headers = [Authorization = "Bearer " & GetAccessToken()]])),

#"Converted to Table" = Record.ToTable(Source),

Value = #"Converted to Table"{1}[Value],

#"Converted to Table1" = Table.FromList(Value, Splitter.SplitByNothing(), null, null, ExtraValues.Error),

#"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to Table1", "Column1", {"id", "refreshType", "startTime", "endTime", "status"}, {"id", "refreshType", "startTime", "endTime", "status"}),

#"Changed Type" = Table.TransformColumnTypes(#"Expanded Column1",{{"id", type text}, {"refreshType", type text}, {"startTime", type datetime}, {"endTime", type datetime}, {"status", type text}}),

#"Added Custom" = Table.AddColumn(#"Changed Type", "duration", each [endTime] - [startTime]),

#"Changed Type1" = Table.TransformColumnTypes(#"Added Custom",{{"duration", type duration}}),

#"Filtered Rows" = Table.SelectRows(#"Changed Type1", let latest = List.Max(#"Changed Type1"[startTime]) in each [startTime] = latest)

in

#"Filtered Rows"

 

2. Use Dataflows – Get Dataflows API to get all dataflows in a workspace and invoke the function getRefreshHistory to get the most recent refresh history for each dataflow.

let

groupId = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",

Source = Json.Document(Web.Contents("https://api.powerbi.com/v1.0/myorg/groups/"&groupId&"/dataflows", [Headers=[Authorization="Bearer "&GetAccessToken()]])),

#"Converted to Table" = Record.ToTable(Source),

Value = #"Converted to Table"{1}[Value],

#"Converted to Table1" = Table.FromList(Value, Splitter.SplitByNothing(), null, null, ExtraValues.Error),

#"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to Table1", "Column1", {"objectId", "name"}, {"objectId", "name"}),

#"Renamed Columns" = Table.RenameColumns(#"Expanded Column1",{{"objectId", "dataflowId"}, {"name", "dataflowName"}}),

#"Changed Type" = Table.TransformColumnTypes(#"Renamed Columns",{{"dataflowId", type text}, {"dataflowName", type text}}),

#"Invoked Custom Function" = Table.AddColumn(#"Changed Type", "refreshHistory", each getRefreshHistory(groupId, [dataflowId])),

#"Expanded refreshHistory" = Table.ExpandTableColumn(#"Invoked Custom Function", "refreshHistory", {"refreshType", "startTime", "endTime", "status", "duration"}, {"refreshType", "startTime", "endTime", "status", "duration"})

in

#"Expanded refreshHistory"

Vlianlmsft_3-1639118088844.jpeg

 

Here I used a custom function GetAccessToken() to get access token (a bearer token) from Azure service. Below code is one way to do that. In order to get an access token, you need to register an Azure Active Directory (Azure AD) application in Azure portal and grant relevant permissions for it. Please refer to Register an Azure AD application to use with Power BI – Power BI | Microsoft Docs.

() =>

let

apiUrl = "https://login.windows.net/common/oauth2/token",

body = [

grant_type = "password",

resource="https://analysis.windows.net/powerbi/api",

username="your power bi account",

password="your power bi password",

client_id = "your client id for registered application",

client_secret = " your client secret for registered application"

],



Source = Json.Document(Web.Contents(apiUrl, [Content = Text.ToBinary(Uri.BuildQueryString(body))])),

access_token = Source[access_token]

in

access_token

Vlianlmsft_4-1639118127819.jpeg

 

 

Conclusion

If you would like to have an overview of refresh history of dataflows in a tenant or a workspace, you could use Power Automate Templates or Power BI REST APIs to get refresh data and build a monitoring report on it. For detailed information to troubleshoot if a refresh fails, you could download the CSV file from Refresh History on Settings page.

 

 

Author: Jing Zhang

Reviewer: Kerry Wang & Ula Huang



Source link

Be the first to comment

Leave a Reply

Your email address will not be published.


*