As data analysts, we are often required to group and rank values from a data set in order to generate trends and insights. However, not everyone knows how to do these tasks the easy way using Power BI.
In today’s blog, we will examine a scenario that requires Power BI rank and custom grouping functions to show you how to do exactly that. You can watch the full video of this tutorial at the bottom of this blog.
For this tutorial, we will use a table of data with three columns, namely Record IDs, Priorities, and Minutes. Our goal is to identify the top 5% of the records with priorities 1 or 2 based on the maximum number of minutes.
Data Set To Create Power BI Rank and Grouping
This is how the dataset looks in Power Query. It has no attribute to separate priorities 1 and 2 from the rest, thus requiring us to use the M function Table.Partition. This will generate a helper column to segregate priorities 1 and 2 from other rows.
Power BI Function: Table.Partition
To examine how to use Table.Partition to segment our data, we will start by opening the Advanced Editor.
In the window that will appear, add a new line to create a new variable. Let’s call the variable Partitions. Add the function Table.Partition which requires the following information.
- The table that we want to partition. In our case, is the Source.
- The column to evaluate. This is the column in the return table in which the rules will be placed. That will be the Priority column.
- The number of groups to create. This also refers to the number of tables to partition, and we want to have 2 tables.
- A hash function. To obtain a hash value, we need to use a number. Let’s start the hash function with each if followed by an underscore representing the current value. Then, let’s add the condition is less than three, then return zero else one.
In line 5, we will change Source to Partitions, giving us the following solution.
Click Done to see how it affects our table. As shown below, the Table.Partition function creates a list with two nested tables.
Clicking Table 1 opens a preview table containing all records with a priority smaller than three (priorities 1 and 2).
All remaining rows are in Table 2.
Since priorities 1 and 2 are now grouped in a single table, we can now use the table to calculate our top 5%.
Power BI Function: Table.RowCount
The Table.RowCount can be used to determine the number of rows needed to meet the requirement of 5%. It will count the number of rows with priorities 1 and 2, then get the 5% of the number obtained.
To start, open the Advanced Editor window from the upper left corner.
Then, we will add a new line and a new variable, which we will call nRow. We need to extract our first table by copying its variable name Partitions from line 3.
Next, we will use the positional index operator to access the first item in the list of the first table. Then, wrap the Table.RowCount operator around Partitions(0). We will then multiply it by 0.05 to get 5%.
Also, round the number to zero decimal using the Number.Round function. Finally, we will change the last line from Partitions to nRow.
This should return two as shown below.
The result means that to meet the 5% requirement, we need to mark two rows from Table 1. To do that, we can use the new M function Table.AddRankColumn.
Power BI Function: Table.AddRankColumn
In this section, we will use the Power BI Rank feature to rank our rows. The Table.AddRankColumn is used to append a new ranking column to a table based on our comparison criteria.
To start, open the Advanced Editor window. Create Table 1 and name it t1 on a new line. Then, use the Table.AddRankColumn and add our first table, Partitions (0). We also need to assign a new column name which we will call i.
For the comparison criteria, we want to rank the minutes from high to low using {“Minutes”, Order.Descending}.
Finally, add a RankKind=RankKind.Ordinal. This will force all items to be given a unique number, even if they are evaluated as equal. Then, we will change nRow to t1 to return Table 1.
Our solution should look like this.
Click Done to get a table similar to this.
Adding a Boolean Filter
We will also add a boolean filter to our table because we do not want to keep the rank column anymore. Instead, what we want is to add a boolean filter to identify only the top two rows that we need to exclude.
So, let’s go back to the Advanced Editor window and add a custom column. Add a new line with the Table.AddColumn function and call the new column Exclude.
For the columnGenerator, we can look at the newly created column using each i smaller than or equal to our nRow number, which is the number of rows to exclude. We will also assign them a data type using type logical.
Our solution should look like this.
Click Done to get a table that is similar to the one below.
We no longer need our helper column so we can remove it from our table.
So, in the Advanced Editor window, let us use the Table.RemoveColumns and remove column i.
Then click Done.
Combining Tables
We have created a new version of our nested Table 1 and need to create the same layout for our nested Table 2. And for that table, we need to add an exclude column where all values should be equal to TRUE.
In the Advanced Editor and create a new variable called t2. We will add a custom column using Table.AddColumn and call Partitions (1). Then, add an Exclude column and set the columnGenerator as each true, type logical.
We want to return t2 by editing line 13.
We should get a table like this.
The last thing we need to do is combine these two tables again. In the Advanced Editor window, add a new variable called t which equals t1 & t2.
Click Done to get our final table that looks like the one below. Now, we have a table ranked and grouped based on our given criteria, and with the top 5% marked for exclusion.
***** Related Links *****
Ranking Analysis In Power BI: Advanced Insights From Combined Data Sets
Ranking Visualization In Power BI – Dynamic Visual
Power BI Dimensions & Groups For Filtering Tables In Reports
Conclusion
In this blog, you learned how to use the Table.Partition, Table.AddRankColumn, and Table.RemoveColumns functions in Power Query. Our sample dataset has 44 rows of priorities 1 and 2, which we successfully grouped and ranked, with the top 5% marked accordingly.
With this newfound knowledge of using these functions, you can now use Power BI rank and custom grouping features for your next data set.
All the best,
Melissa de Korte
[youtube https://www.youtube.com/watch?v=Cypemxd0spQ&w=784&h=441]
Be the first to comment