Scenario:
In many cases, when we import data into Power BI, Power Query will automatically add suffixes if there are identical column names. However, even if there are multiple duplicate column names in the table, it will add them in numerical order without recognizing the fields and adding them based on the fields. If there isn’t much data in the table, it is not a trouble for us to adjust the following names manually, but when there are many fields in the table, it will still be easy for us to improve them in the following ways.
Table Used:
The following is base table:
Expect output:
Current output:
Solution:
The problem is mainly solved by the flexible use of pivot and unpivot. However, in order to make them be matched correspondingly, we first create an index.
Step 2,
Then we unpivot the table (the data in the first column is the number of rows they were originally in):
Step 3,
Since many columns have the same name, we need to create another index to sort the columns with the same name correctly.
Step 4,
Then group the table by values:
Step 5,
Then add custom column to all with index:
Expand table ,we get:
Step 6,
Delete the column we don’t need:
Step 7 ,
Then add a suffix to the initial column name.
Step 8 ,
Remove the unwanted rows,then pivot table,and remove the columns we don’t need,then promot headers:
The full code could refer the Advanced editor:
let
Source = Excel.Workbook(File.Contents("\\flielink\import.xlsx"), null, true),
Sheet1_Sheet = Source{[Item="Sheet1",Kind="Sheet"]}[Data],
#"Added Index1" = Table.AddIndexColumn(Sheet1_Sheet, "IndexALL", 1, 1, Int64.Type),
#"Unpivoted Only Selected Columns" = Table.Unpivot(#"Added Index1", {"Column1", "Column2", "Column3", "Column4", "Column5", "Column6"}, "Attribute", "Value"),
#"Added Index" = Table.AddIndexColumn(#"Unpivoted Only Selected Columns", "Index", 1, 1, Int64.Type),
#"Grouped Rows" = Table.Group(#"Added Index", {"Value"}, {{"all", each _, type table [Attribute=text, Value=any, Index=number]}}),
#"Added Custom" = Table.AddColumn(#"Grouped Rows", "Custom", each Table.AddIndexColumn([all],"Index1",1)),
#"Expanded Custom1" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"IndexALL", "Attribute", "Value", "Index", "Index1"}, {"Custom.IndexALL", "Custom.Attribute", "Custom.Value", "Custom.Index", "Custom.Index1"}),
#"Removed Columns" = Table.RemoveColumns(#"Expanded Custom1",{"all", "Custom.Value", "Custom.Index"}),
#"Added Custom1" = Table.AddColumn(#"Removed Columns", "newvalue", each if Value.Is([Value],type text)=true then [Value]&"_"&Number.ToText([Custom.Index1]) else [Value]),
#"Removed Columns1" = Table.RemoveColumns(#"Added Custom1",{"Value", "Custom.Index1"}),
#"Filtered Rows" = Table.SelectRows(#"Removed Columns1", each true),
#"Pivoted Column" = Table.Pivot(#"Filtered Rows", List.Distinct(#"Filtered Rows"[Custom.Attribute]), "Custom.Attribute", "newvalue"),
#"Removed Columns2" = Table.RemoveColumns(#"Pivoted Column",{"Custom.IndexALL"}),
#"Promoted Headers" = Table.PromoteHeaders(#"Removed Columns2", [PromoteAllScalars=true])
in
#"Promoted Headers"
Result:
If we have more columns in our table, how can we get the results quickly and easily based on the above steps? Referring to the image below, I have added several new columns containing the same column names as the previous columns as well as different column names. Using the above Power Query steps, I only need to complete column names when unpivoting to get the wanted data.
Hope this article helps everyone with similar questions here.
Author: Lucien Wang
Reviewer: Kerry Wang & Ula Huang
Be the first to comment